OSGalaxy

published by noreply@blogger.com (milek) on 2009-07-03 10:13:34
I came across an interesting problem today. A perl script running a mysql query and it takes too much time to complete. It spends almost all its time while waiting for this mysql query (anonymized):

select a, b, registered, c, d from XXX where date(registered) >= date_sub(date(NOW()),interval 7 day)

The problem is that there are over 70 million rows in the XXX table and the query takes over 7 minutes to complete mostly waiting for a disk I/O.

explain select a, b, registered, c, d from XXX where date(registered) >= date_sub(date(NOW()),interval 7 day)G

id: 1
select_type: SIMPLE
table: XXX
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 72077742
Extra: Using where
1 row in set (0.02 sec)


So the reason it is so slow is that mysql does not use an index here. It turned out that if you use a function on a column in a where statement then mysql won't use an index!

There is a reason why the statement is using date() and date_sub() functions as it is expected to compare dates where time is 00:00:00 so I can't . But one can cast() functions to timestamp and used registered directly which will allow mysql to use index:
explain select a, b, registered, c, d from XXX where registered >= cast(date_sub(date(NOW()),interval 7 day) as datetime)

id: 1
select_type: SIMPLE
table: XXX
type: range
possible_keys: YYY
key: YYY
key_len: 9
ref: NULL
rows: 1413504
Extra: Using where
1 row in set (0.10 sec)
After the modification the script takes about 50s to execute compared to over 7 minutes which is a very nice 8x performance improvement! Not to mention a much less impact on a database server.


> Read More... | Digg This!