MySQL: počasen NOT IN query

Zdravo,
imam dve tabeli.
Product
in
visitstat

SELECT id, name FROM Product ORDER BY name ASC LIMIT 100

100 rows in set (0.05 sec)

SELECT DISTINCT refId FROM visitstat WHERE type='P' AND date > '2010-10-02'

9 rows in set (0.08 sec)

SELECT id, name FROM Product WHERE id NOT IN (SELECT DISTINCT refId FROM visitstat WHERE type='P' AND date > '2010-10-02') ORDER BY name ASC LIMIT 100

100 rows in set (4 min 15.10 sec)

Ve kdo, v čem bi lahko bil problem?

15 odgovorov

Daj, pripopaj še EXPLAIN.

brez explaina ne boš prišel do dna zadevi, drugače pa... distinct zna bit problem... koliko imaš vseh vrstic v teh 2 tabelah, očitno precej...

+----+--------------------+-----------+------+---------------+------+---------+------+-------+------------------------------+
| id | select_type        | table     | type | possible_keys | key  | key_len | ref  | rows  | Extra                        |
+----+--------------------+-----------+------+---------------+------+---------+------+-------+------------------------------+
|  1 | PRIMARY            | Product   | ALL  | NULL          | NULL | NULL    | NULL |  1519 | Using where; Using filesort  |
|  2 | DEPENDENT SUBQUERY | visitstat | ALL  | NULL          | NULL | NULL    | NULL | 25947 | Using where; Using temporary |
+----+--------------------+-----------+------+---------------+------+---------+------+-------+------------------------------+

Product count: 1519
Visitstat count: 25947

Poglej possible_keys, je prazen. Imaš indexe na visitstat?

Ni indexa :)
DB model je star, vse je MyISAM, tabele nimajo primarnih ključev, indeksov...

Pol si si odgovoril na vprašanje zakaj toliko časa rabi:)

Nope ne pomaga :)
100 rows in set (4 min 9.50 sec)

mysql> show indexes from visitstat;
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table     | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| visitstat |          0 | PRIMARY  |            1 | id          | A         |       25947 |     NULL | NULL   |      | BTREE      |         |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

Bi moral še kaj glede indexa popraviti?

Explain pa še vedno kaže isto zadevo kot zgoraj.

name tudi rabi index če delaš sort na njem

Letos so na PHPkonferenci povedali, da se včasih splača zadevo naredit v 2 korakih.

In sicer, najprej
SELECT DISTINCT refId FROM visitstat WHERE type='P' AND date > '2010-10-02'
rezultate lepo spravit v string ločen z vejicami,
v 2. koraku pa
SELECT id, name FROM Product WHERE id NOT IN (23,45,67,...)
Tukaj moraš pazit, da dolžina queirja ne preseže 'maxallowedpacket'

Drugače pa malo experimentiraj z indexi. (indexe na tisto po čemer sortiraš, tisto kar selectaš).

Tabele lahko probaš spremenit v innodb, da vidiš če kaj profitiraš...

1

Prepiši na LEFT JOIN, mislim, da bi šlo. Subqueryji so pri MySQLu strahotno počasni, majo polno ticketov odprtih na to temo. :)