MYSQL

NAVIGATION
CATEGORIES
REFERRENCE
LINKS
  • implicit cast forces table scan?

    6 answers - 3498 bytes - related search similar search Add To My Delicious Add To My Stumble Upon Add To My Google Mark Add To My Facebook Add To My Digg Add To My Reddit

    Hi,
    I am currently using mysql 4.0.18 as distributed with red hat Linux. I find
    when I perform a select on a varchar(30) field, the index is used only if I
    have quoted the value in the where clause. , mysql performs a
    table scan.
    The queries in question are:
    This query uses the index:
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = "170807";
    This query performs a table scan:
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = 170807;
    My question is this: is the issue here that mysql is converting every single
    itran_log_actionid value, from all 1.5 million rows, and hence the index is
    not useful and not used? My initial assumption was that the constant value
    170807 in the second query, would be converted to text before the query was
    executed, and so the index could be used. This does not seem to be the
    case.
    I ask both for my own edification, and also because it seems to me this
    should be mentioned in the manual for newbies like myself.
    thanks,
    Details on versions, table structures, indexes, etc. below
    $ rpm -qa | grep -i mysql
    MySQL-shared-compat-4.0.15-0
    MySQL-client-4.0.18-0
    php-mysql-4.1.2-7.2.6
    MySQL-server-4.0.18-0
    $ /usr/bin/mysql -V
    /usr/bin/mysql Ver 12.22 Distrib 4.0.18, for pc-linux (i686)
    mysqldescribe itran_log;
    +
    | Field | Type | Null | Key | Default | Extra
    |
    +
    | itran_user_id | varchar(100) | | | |
    |
    | itran_log_date | date | | MUL | 0000-00-00 |
    |
    | itran_log_time | time | | | 00:00:00 |
    |
    | itran_log_filename | varchar(100) | | | |
    |
    | itran_log_action | varchar(25) | | MUL | |
    |
    | itran_log_actionid | varchar(30) | | MUL | |
    |
    | itran_site_id | varchar(100) | YES | MUL | NULL |
    |
    | itran_log_instructions | text | | | |
    |
    | itran_log_id | bigint(20) | | PRI | NULL |
    auto_increment |
    +
    mysqlshow indexes from itran_log;
    +
    | Table | Non_unique | Key_name | Seq_in_index |
    Column_name | Collation | Cardinality | Sub_part | Packed | Null |
    Index_type | Comment |
    +
    | itran_log | 0 | PRIMARY | 1 |
    itran_log_id | A | 1500793 | NULL | NULL | |
    BTREE | |
    | itran_log | 1 | itran_site_id_ix | 1 |
    itran_site_id | A | NULL | 15 | NULL | YES |
    BTREE | |
    | itran_log | 1 | itran_log_action_ix | 1 |
    itran_log_action | A | NULL | 3 | NULL | |
    BTREE | |
    | itran_log | 1 | itran_log_actionid_ix | 1 |
    itran_log_actionid | A | NULL | NULL | NULL | |
    BTREE | |
    | itran_log | 1 | itran_log_date_ix | 1 |
    itran_log_date | A | NULL | NULL | NULL | |
    BTREE | |
    +
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = 170807;
    | table | type | possible_keys | key | key_len | ref | rows
    | Extra |
    | itran_log | ALL | itran_log_actionid_ix | NULL | NULL | NULL | 1500775
    | Using where |
    1 row in set (0.02 sec)
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = "170807";
    | table | type | possible_keys | key | key_len
    | ref | rows | Extra |
    | itran_log | ref | itran_log_actionid_ix | itran_log_actionid_ix | 30
    | const | 4 | Using where |
    1 row in set (0.00 sec)
    Faaland
    Sovran Inc.
  • No.1 | | 1470 bytes | |

    Wednesday 19 2005 01:15 pm, Faaland wrote:
    The queries in question are:

    This query uses the index:
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = "170807";

    This query performs a table scan:
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = 170807;

    My question is this: is the issue here that mysql is converting every
    single itran_log_actionid value, from all 1.5 million rows, and hence the
    index is not useful and not used? My initial assumption was that the
    constant value 170807 in the second query, would be converted to text
    before the query was executed, and so the index could be used. This does
    not seem to be the case.

    I ask both for my own edification, and also because it seems to me this
    should be mentioned in the manual for newbies like myself.

    It doesnt know what value your giving it. If it thought to assume converting
    the data, you could have "17h120", and it would fail converting the data.
    Mysql, nor any DB for that matter, should not, and do not, assume anything.
    It just happens in the case your dealing with numeric data. If thats the
    case, you should have made the column numeric in type. (int whatever)

    Jeff

    Jeff

    PGP SIGNATURE
    Version: GnuPG v1.4.1 (GNU/Linux)

    qa/VghnUIFhtkboTG48/t/I=
    =+prH
    PGP SIGNATURE
  • No.2 | | 3442 bytes | |

    Faaland wrote:
    Hi,

    I am currently using mysql 4.0.18 as distributed with red hat Linux. I find
    when I perform a select on a varchar(30) field, the index is used only if I
    have quoted the value in the where clause. , mysql performs a
    table scan.

    A varchar(30) field is a string, so only comparisons to strings really make
    sense. If you don't quote the comparison value in the WHERE clause, you're
    asking for trouble. First, the unquoted value must make sense. There are 3
    possibilities: it's a number, it's a function, or it's a column name.
    Hence, you can get away with

    WHERE varchar_field = 170807

    but probably not

    WHERE varachar_field = dogs

    (unless you have a column named "dogs").

    I expect you know all that. My point is, why aren't you quoting the
    comparison string?

    The queries in question are:

    This query uses the index:
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = "170807";

    This is a string column compared to a string constant, so the index can be used.

    This query performs a table scan:
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = 170807;

    This is a string compared to a number. You are asking mysql to make an
    implicit conversion so it can make the comparison. You expected the number
    to be converted to a string, but that's not how it works -- it's the other
    way around, the string is converted to a number. Why? Because many strings
    convert to the same number. Consider:

    mysqlSELECT '17' + 0, '17.0' + 0, '+17' + 0, '17,345' + 0, '17 dogs' + 0;

    | '17' + 0 | '17.0' + 0 | '+17' + 0 | '17,345' + 0 | '17 dogs' + 0 |

    | 17 | 17 | 17 | 17 | 17 |

    1 row in set (0.00 sec)

    The map from string to number is many-to-one, so the only safe course is to
    convert the string to a number, then do a numeric comparison. course,
    that renders the string index useless, so the full table scan is required.

    My question is this: is the issue here that mysql is converting every single
    itran_log_actionid value, from all 1.5 million rows, and hence the index is
    not useful and not used?

    Yes.

    My initial assumption was that the constant value
    170807 in the second query, would be converted to text before the query was
    executed, and so the index could be used. This does not seem to be the
    case.

    Right, it's not.

    I ask both for my own edification, and also because it seems to me this
    should be mentioned in the manual for newbies like myself.

    I thought it was mentioned in the manual, though I must admit I didn't find
    it just now after a brief search.

    I'm wondering, however, if we're really approaching this from the right
    angle. Your column is named itran_log_actionid, and it seems to contain
    numbers such as 170807. The obvious question, then, is why is
    itran_log_actionid a VARCHAR(30) instead of one of the INT types? If it
    were, you wouldn't need to quote the constant in the WHERE clause, and the
    comparison would be numeric (faster than string) with no type conversion

    thanks,

    Michael
  • No.3 | | 3623 bytes | |

    Hello.

    I suggest you to check you query with MySQL 4.1.14 which might has a bit
    clever optimizer.

    Faaland wrote:
    Hi,

    I am currently using mysql 4.0.18 as distributed with red hat Linux. I find
    when I perform a select on a varchar(30) field, the index is used only if I
    have quoted the value in the where clause. , mysql performs a
    table scan.

    The queries in question are:

    This query uses the index:
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = "170807";

    This query performs a table scan:
    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = 170807;

    My question is this: is the issue here that mysql is converting every single
    itran_log_actionid value, from all 1.5 million rows, and hence the index is
    not useful and not used? My initial assumption was that the constant value
    170807 in the second query, would be converted to text before the query was
    executed, and so the index could be used. This does not seem to be the
    case.

    I ask both for my own edification, and also because it seems to me this
    should be mentioned in the manual for newbies like myself.

    thanks,

    Details on versions, table structures, indexes, etc. below

    $ rpm -qa | grep -i mysql
    MySQL-shared-compat-4.0.15-0
    MySQL-client-4.0.18-0
    php-mysql-4.1.2-7.2.6
    MySQL-server-4.0.18-0

    $ /usr/bin/mysql -V
    /usr/bin/mysql Ver 12.22 Distrib 4.0.18, for pc-linux (i686)

    mysqldescribe itran_log;

    +
    | Field | Type | Null | Key | Default | Extra
    |

    +
    | itran_user_id | varchar(100) | | | |
    |
    | itran_log_date | date | | MUL | 0000-00-00 |
    |
    | itran_log_time | time | | | 00:00:00 |
    |
    | itran_log_filename | varchar(100) | | | |
    |
    | itran_log_action | varchar(25) | | MUL | |
    |
    | itran_log_actionid | varchar(30) | | MUL | |
    |
    | itran_site_id | varchar(100) | YES | MUL | NULL |
    |
    | itran_log_instructions | text | | | |
    |
    | itran_log_id | bigint(20) | | PRI | NULL |
    auto_increment |

    +

    mysqlshow indexes from itran_log;

    +
    | Table | Non_unique | Key_name | Seq_in_index |
    Column_name | Collation | Cardinality | Sub_part | Packed | Null |
    Index_type | Comment |

    +
    | itran_log | 0 | PRIMARY | 1 |
    itran_log_id | A | 1500793 | NULL | NULL | |
    BTREE | |
    | itran_log | 1 | itran_site_id_ix | 1 |
    itran_site_id | A | NULL | 15 | NULL | YES |
    BTREE | |
    | itran_log | 1 | itran_log_action_ix | 1 |
    itran_log_action | A | NULL | 3 | NULL | |
    BTREE | |
    | itran_log | 1 | itran_log_actionid_ix | 1 |
    itran_log_actionid | A | NULL | NULL | NULL | |
    BTREE | |
    | itran_log | 1 | itran_log_date_ix | 1 |
    itran_log_date | A | NULL | NULL | NULL | |
    BTREE | |

    +

    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = 170807;

    | table | type | possible_keys | key | key_len | ref | rows
    | Extra |

    | itran_log | ALL | itran_log_actionid_ix | NULL | NULL | NULL | 1500775
    | Using where |

    1 row in set (0.02 sec)

    mysqlexplain
    -select itran_log_date, itran_log_actionid from itran_log where
    -itran_log_actionid = "170807";

    | table | type | possible_keys | key | key_len
    | ref | rows | Extra |

    | itran_log | ref | itran_log_actionid_ix | itran_log_actionid_ix | 30
    | const | 4 | Using where |

    1 row in set (0.00 sec)

    Faaland
    Sovran Inc.
  • No.4 | | 2137 bytes | |

    Gleb Paharenko wrote:
    Hello.
    >

    I suggest you to check you query with MySQL 4.1.14 which might has a bit
    clever optimizer.

    Did you read my earlier reply? This is neither an optimizer nor version issue.
    Consider:

    mysqlSELECT VERSIN();
    ++
    | VERSIN() |
    ++
    | 4.1.15 |
    ++
    1 row in set (0.01 sec)

    CREATE TABLE ict (id INT NT NULL AUTINCREMENT PRIMARY KEY,
    vcf VARCHAR(30),
    INDEX (vcf)
    );

    INSERT INT ict (vcf)
    VALUES ('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),(' 9'),('10'),
    ('11'),('12'),('13'),('14'),('15'),('16'),('17'),( '18'),('19'),('20');

    EXPLAIN SELECT * FRM ict WHERE vcf = '17';

    | id | select_type | table | type | p_keys | key | ref | rows | Extra |

    | 1 | SIMPLE | ict | ref | vcf | vcf | const | 1 | Using where |

    1 row in set (0.01 sec)

    EXPLAIN SELECT * FRM ict WHERE vcf = 17;

    | id | select_type | table | type | p_keys | key | ref | rows | Extra |

    | 1 | SIMPLE | ict | ALL | vcf | NULL | NULL | 20 | Using where |

    1 row in set (0.01 sec)

    Why? Because there are many strings which evaluate to a given number. For example:

    INSERT INT ict (vcf)
    VALUES (' 17'), ('+17'), ('17.0'), ('17,34'), ('17 dogs');

    SELECT * FRM ict WHERE vcf = '17';

    | id | vcf |

    | 17 | 17 |

    1 row in set (0.00 sec)

    SELECT * FRM ict WHERE vcf = 17;

    | id | vcf |

    | 17 | 17 |
    | 21 | 17 |
    | 22 | +17 |
    | 23 | 17.0 |
    | 24 | 17,34 |
    | 25 | 17 dogs |

    6 rows in set (0.00 sec)

    You see? The two WHERE clauses are actually different, so the optimizer must
    treat them differently. No amount of optimizer cleverness can change that. The
    moral of the story: Use numeric types to store numbers, not string types.

    Michael
  • No.5 | | 2675 bytes | |

    Dear, Michael!

    Did you read my earlier reply? This is neither an optimizer nor >
    version issue.

    I've sent my e-mail before I've read your. The problem is that I was
    testing my guessing on the table with a bit different structure than
    your (mine didn't have a primary key field 'id'). And optimizer has
    shown that it uses indexes! There is a verified bug:

    Thank you for good explanations!

    Michael Stassen wrote:
    Gleb Paharenko wrote:
    >Hello.
    >>

    >I suggest you to check you query with MySQL 4.1.14 which might has a bit
    >clever optimizer.


    Did you read my earlier reply? This is neither an optimizer nor version
    issue. Consider:

    mysqlSELECT VERSIN();
    ++
    | VERSIN() |
    ++
    | 4.1.15 |
    ++
    1 row in set (0.01 sec)

    CREATE TABLE ict (id INT NT NULL AUTINCREMENT PRIMARY KEY,
    vcf VARCHAR(30),
    INDEX (vcf)
    );

    INSERT INT ict (vcf)
    VALUES ('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),(' 9'),('10'),

    ('11'),('12'),('13'),('14'),('15'),('16'),('17'),( '18'),('19'),('20');

    EXPLAIN SELECT * FRM ict WHERE vcf = '17';

    | id | select_type | table | type | p_keys | key | ref | rows |
    Extra |

    | 1 | SIMPLE | ict | ref | vcf | vcf | const | 1 | Using
    where |

    1 row in set (0.01 sec)

    EXPLAIN SELECT * FRM ict WHERE vcf = 17;

    | id | select_type | table | type | p_keys | key | ref | rows |
    Extra |

    | 1 | SIMPLE | ict | ALL | vcf | NULL | NULL | 20 | Using
    where |

    1 row in set (0.01 sec)

    Why? Because there are many strings which evaluate to a given number.
    For example:

    INSERT INT ict (vcf)
    VALUES (' 17'), ('+17'), ('17.0'), ('17,34'), ('17 dogs');

    SELECT * FRM ict WHERE vcf = '17';

    | id | vcf |

    | 17 | 17 |

    1 row in set (0.00 sec)

    SELECT * FRM ict WHERE vcf = 17;

    | id | vcf |

    | 17 | 17 |
    | 21 | 17 |
    | 22 | +17 |
    | 23 | 17.0 |
    | 24 | 17,34 |
    | 25 | 17 dogs |

    6 rows in set (0.00 sec)

    You see? The two WHERE clauses are actually different, so the optimizer
    must treat them differently. No amount of optimizer cleverness can
    change that. The moral of the story: Use numeric types to store
    numbers, not string types.

    Michael
  • No.6 | | 1021 bytes | |

    Gleb Paharenko wrote:
    Dear, Michael!

    >>Did you read my earlier reply? This is neither an optimizer nor >
    >>version issue.


    I've sent my e-mail before I've read your. The problem is that I was
    testing my guessing on the table with a bit different structure than
    your (mine didn't have a primary key field 'id'). And optimizer has
    shown that it uses indexes! There is a verified bug:

    Thank you for good explanations!

    I've added a comment to bug 14220 explaining why I do not believe this is a bug.
    The short version is that without the id column, "SELECT *" is the same as
    "SELECT vcf". In that case, the index on vcf is a covering index, so mysql
    reads the data from the index, rather than the table. You will note in the
    EXPLAIN output that it is still doing a full scan. It's just that it's a full
    index scan rather than a full table scan.

    Michael

Re: implicit cast forces table scan?


max 4000 letters.
Your nickname that display:
In order to stop the spam: 4 + 3 =
QUESTION ON "MYSQL"

EMSDN.COM