foreign_key_info is un-optimized for large sets of databases #326

SineSwiper · 2021-01-21T01:16:52Z

Due to the way MySQL 5 supports information_schema, the SQL statement used for foreign_key_info requires a full scan of all databases. This can break servers in environments with a large amount of databases (literally die from an OOM), or take a long time to complete in the best case. Even with filters for key_column_usage.constraint_schema, the table_constraints scan isn't filtered properly and results in an EXPLAIN plan like this:

           id: 1
  select_type: SIMPLE
        table: A
         type: ALL
possible_keys: NULL
          key: TABLE_SCHEMA
      key_len: NULL
          ref: NULL
         rows: NULL
        Extra: Using where; Open_full_table; Scanned 1 database
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: B
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
        Extra: Using where; Open_full_table; Scanned all databases; Using join buffer (Block Nested Loop)
2 rows in set (0.00 sec)

So, for table B, the optimizer Scanned all databases, which is bad. The table_constraints inclusion is unnecessary, anyway, since all foreign/unique/PK data is available in key_column_usage. Better to just take out the problematic join, and fix the optimization problem at the same time.

The text was updated successfully, but these errors were encountered:

SineSwiper · 2021-01-22T16:38:07Z

The code could include JOINs to fill in the NULL gaps in the SELECT clause, but those JOIN statements would need to apply the filters directly in order to get the EXPLAIN plan benefit. This is because MySQL 5's implementation of filtering is rather kludgy and doesn't take advantage of the optimizer. For example:

JOIN INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS R ON (
    R.CONSTRAINT_SCHEMA = $database AND
    R.CONSTRAINT_NAME = A.CONSTRAINT_NAME
)

Not R.CONSTRAINT_SCHEMA = A.CONSTRAINT_SCHEMA.

Anyway, I can provide a PR if you like.

dveeden · 2024-11-11T07:07:31Z

The code for this can be found here:

DBD-mysql/lib/DBD/mysql.pm

Lines 624 to 705 in 5a669fb

    
           sub foreign_key_info { 
        
               my ($dbh, 
        
                   $pk_catalog, $pk_schema, $pk_table, 
        
                   $fk_catalog, $fk_schema, $fk_table, 
        
                  ) = @_; 
        
               return unless $dbh->func('_async_check'); 
        
               # INFORMATION_SCHEMA.KEY_COLUMN_USAGE was added in 5.0.6 
        
               # no one is going to be running 5.0.6, taking out the check for $point > .6 
        
               my ($maj, $min, $point) = _version($dbh); 
        
               return if $maj < 5 ; 
        
               my $sql = <<'EOF'; 
        
           SELECT NULL AS PKTABLE_CAT, 
        
                  A.REFERENCED_TABLE_SCHEMA AS PKTABLE_SCHEM, 
        
                  A.REFERENCED_TABLE_NAME AS PKTABLE_NAME, 
        
                  A.REFERENCED_COLUMN_NAME AS PKCOLUMN_NAME, 
        
                  A.TABLE_CATALOG AS FKTABLE_CAT, 
        
                  A.TABLE_SCHEMA AS FKTABLE_SCHEM, 
        
                  A.TABLE_NAME AS FKTABLE_NAME, 
        
                  A.COLUMN_NAME AS FKCOLUMN_NAME, 
        
                  A.ORDINAL_POSITION AS KEY_SEQ, 
        
                  NULL AS UPDATE_RULE, 
        
                  NULL AS DELETE_RULE, 
        
                  A.CONSTRAINT_NAME AS FK_NAME, 
        
                  NULL AS PK_NAME, 
        
                  NULL AS DEFERABILITY, 
        
                  NULL AS UNIQUE_OR_PRIMARY 
        
             FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE A, 
        
                  INFORMATION_SCHEMA.TABLE_CONSTRAINTS B 
        
            WHERE A.TABLE_SCHEMA = B.TABLE_SCHEMA AND A.TABLE_NAME = B.TABLE_NAME 
        
              AND A.CONSTRAINT_NAME = B.CONSTRAINT_NAME AND B.CONSTRAINT_TYPE IS NOT NULL 
        
           EOF 
        
               my @where; 
        
               my @bind; 
        
               # catalogs are not yet supported by MySQL 
        
           #    if (defined $pk_catalog) { 
        
           #        push @where, 'A.REFERENCED_TABLE_CATALOG = ?'; 
        
           #        push @bind, $pk_catalog; 
        
           #    } 
        
               if (defined $pk_schema) { 
        
                   push @where, 'A.REFERENCED_TABLE_SCHEMA = ?'; 
        
                   push @bind, $pk_schema; 
        
               } 
        
               if (defined $pk_table) { 
        
                   push @where, 'A.REFERENCED_TABLE_NAME = ?'; 
        
                   push @bind, $pk_table; 
        
               } 
        
           #    if (defined $fk_catalog) { 
        
           #        push @where, 'A.TABLE_CATALOG = ?'; 
        
           #        push @bind,  $fk_schema; 
        
           #    } 
        
               if (defined $fk_schema) { 
        
                   push @where, 'A.TABLE_SCHEMA = ?'; 
        
                   push @bind,  $fk_schema; 
        
               } 
        
               if (defined $fk_table) { 
        
                   push @where, 'A.TABLE_NAME = ?'; 
        
                   push @bind,  $fk_table; 
        
               } 
        
               if (@where) { 
        
                   $sql .= ' AND '; 
        
                   $sql .= join ' AND ', @where; 
        
               } 
        
               $sql .= " ORDER BY A.TABLE_SCHEMA, A.TABLE_NAME, A.ORDINAL_POSITION"; 
        
               local $dbh->{FetchHashKeyName} = 'NAME_uc'; 
        
               my $sth = $dbh->prepare($sql); 
        
               $sth->execute(@bind); 
        
               return $sth; 
        
           }

@SineSwiper would you be interested in submitting a PR to improve this?

Also MySQL 8.0 switched to an InnoDB based data dictionary, which improves performance for at least some information_schema queries (see also https://dev.mysql.com/blog-archive/mysql-8-0-scaling-and-performance-of-information_schema/ ). This may or may not improve the performance of the queries mentioned here. Maybe someone could run some tests to benchmark the performance of the original query vs a modified one and with various MySQL major versions?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

foreign_key_info is un-optimized for large sets of databases #326

foreign_key_info is un-optimized for large sets of databases #326

SineSwiper commented Jan 21, 2021

SineSwiper commented Jan 22, 2021 •

edited

Loading

dveeden commented Nov 11, 2024

foreign_key_info is un-optimized for large sets of databases #326

foreign_key_info is un-optimized for large sets of databases #326

Comments

SineSwiper commented Jan 21, 2021

SineSwiper commented Jan 22, 2021 • edited Loading

dveeden commented Nov 11, 2024

SineSwiper commented Jan 22, 2021 •

edited

Loading