hiroohiのメモ

はてななのでITやスタートアップ周りの話(ほとんどが自分への備忘録だけど)を書いています。

フレームワーク開発時代のSQLチューニング基礎(3)SQLのやっちゃだめ後編

この記事はhttp://mtl.recruit.co.jp/blog/2011/02/sqlsql.htmlの転載です。

研究員の棚橋です。最終回は「SQLのやっちゃ駄目」後編です。

これまでの記事

3.SQLのやっちゃ駄目(後編)

  • 条件指定で複合インデックスのキー順番を間違えない。
  • ORは極力INに置き換える。
  • 後方一致または中間一致のLIKEは使わない。
  • INを使う場合、引数リストは確率の高い順に左から記述する。
  • サブクエリを使う場合、INよりEXISTSを優先して使う?
  • INSERTやUPDATEでJOINを使うテクニック
条件指定で複合インデックスのキー順番を間違えない。
せっかく作った複合インデックスでも、キー指定の順番を間違えるとインデックスが使用されません。インデックスを有効活用するためにキー順番に注意しましょう。
mysql> create index users_idx01 on users(user_level,user_type);
Query OK, 900000 rows affected (16.86 sec)
Records: 900000  Duplicates: 0  Warnings: 0

select * from users where user_level=30 and user_type=2;

| 898322 | 0000898321 | USER898321 |         2 |         30 |     2900 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 898509 | 0000898508 | USER898508 |         2 |         30 |     2996 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 898540 | 0000898539 | USER898539 |         2 |         30 |     2986 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 898921 | 0000898920 | USER898920 |         2 |         30 |     2998 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
749 rows in set (0.01 sec)

mysql> explain select * from users where user_level=30 and user_type=2;
+----+-------------+-------+------+---------------+-------------+---------+-------------+------+-------+
| id | select_type | table | type | possible_keys | key         | key_len | ref         | rows | Extra |
+----+-------------+-------+------+---------------+-------------+---------+-------------+------+-------+
|  1 | SIMPLE      | users | ref  | users_idx01   | users_idx01 | 8       | const,const |  748 |       | 
+----+-------------+-------+------+---------------+-------------+---------+-------------+------+-------+
1 row in set (0.00 sec)

mysql> select * from users where user_type=2 and user_level=30;

| 898322 | 0000898321 | USER898321 |         2 |         30 |     2900 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 898509 | 0000898508 | USER898508 |         2 |         30 |     2996 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 898540 | 0000898539 | USER898539 |         2 |         30 |     2986 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 898921 | 0000898920 | USER898920 |         2 |         30 |     2998 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
749 rows in set (0.01 sec)

mysql> explain select * from users where user_type=2 and user_level=30;
+----+-------------+-------+------+---------------+-------------+---------+-------------+------+-------+
| id | select_type | table | type | possible_keys | key         | key_len | ref         | rows | Extra |
+----+-------------+-------+------+---------------+-------------+---------+-------------+------+-------+
|  1 | SIMPLE      | users | ref  | users_idx01   | users_idx01 | 8       | const,const |  748 |       | 
+----+-------------+-------+------+---------------+-------------+---------+-------------+------+-------+
1 row in set (0.00 sec)

今回の結果は、検索時間、explainの結果とも同じになりました。これはMySQLが内部的に条件順番を入れ替えてインデックスを使用してくれたおかげです。ただし、このような動作は必ず行われるわけではありません。データベースの機能に期待するのではなく、インデックスの存在とその定義を意識する事は重要です。

ORは極力INに置き換える。

特定項目の条件が複数の場合、ORで条件を列挙しますが、このとき出来るならばINに置き換えてみるとSQLクエリが高速化する事があります。

mysql> select * from users where (user_level=10 or user_level=11 or user_level=12) and user_type=2;

| 896881 | 0000896880 | USER896880 |         2 |         12 |     1188 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 897386 | 0000897385 | USER897385 |         2 |         12 |     1195 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 898269 | 0000898268 | USER898268 |         2 |         12 |     1132 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899330 | 0000899329 | USER899329 |         2 |         12 |     1124 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
2259 rows in set (0.02 sec)

mysql> explain select * from users where (user_level=10 or user_level=11 or user_level=12) and user_type=2;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | users | range | users_idx01   | users_idx01 | 8       | NULL | 2257 | Using where | 
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> select * from users where user_level in (10,11,12) and user_type=2;

| 896881 | 0000896880 | USER896880 |         2 |         12 |     1188 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 897386 | 0000897385 | USER897385 |         2 |         12 |     1195 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 898269 | 0000898268 | USER898268 |         2 |         12 |     1132 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899330 | 0000899329 | USER899329 |         2 |         12 |     1124 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
2259 rows in set (0.01 sec)

mysql> explain select * from users where user_level in (10,11,12) and user_type=2;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | users | range | users_idx01   | users_idx01 | 8       | NULL | 2257 | Using where | 
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.00 sec)

 今回の結果ではあまり大きな差異は現れませんでしたが、explainの結果が同じことから内部的には同じように実行されているからです。ただし、ORをINに置き換える事で高速化する事は確かにありますので、ぎりぎりのチューニングでは必要かもしれません。

後方一致または中間一致のLIKEは使わない。

 部分一致を検索するときに利用されるLIKEですが、その条件の指定の仕方でインデックスが、利用されたりされなかったりします。例をみてみると。

mysql> create index users_idx01 on users(user_name);
Query OK, 900000 rows affected (17.61 sec)
Records: 900000  Duplicates: 0  Warnings: 0

mysql> select * from users where user_name like 'USER88%';

| 889997 | 0000889996 | USER889996 |         2 |        207 |    20662 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 889998 | 0000889997 | USER889997 |         2 |        382 |    38127 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 889999 | 0000889998 | USER889998 |         2 |         88 |     8750 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 890000 | 0000889999 | USER889999 |         2 |         92 |     9170 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
10000 rows in set (0.07 sec)

mysql> explain select * from users where user_name like 'USER88%';
+----+-------------+-------+-------+---------------+-------------+---------+------+-------+-------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows  | Extra       |
+----+-------------+-------+-------+---------------+-------------+---------+------+-------+-------------+
|  1 | SIMPLE      | users | range | users_idx01   | users_idx01 | 98      | NULL | 20592 | Using where | 
+----+-------------+-------+-------+---------------+-------------+---------+------+-------+-------------+
1 row in set (0.00 sec)

mysql> select * from users where user_name like '�';

| 899689 | 0000899688 | USER899688 |         2 |         60 |     5976 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899789 | 0000899788 | USER899788 |         2 |        393 |    39278 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899889 | 0000899888 | USER899888 |         2 |        397 |    39618 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899989 | 0000899988 | USER899988 |         2 |         93 |     9247 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
9000 rows in set (0.40 sec)

mysql> explain select * from users where user_name like '�';
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | users | ALL  | NULL          | NULL | NULL    | NULL | 900191 | Using where | 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

mysql> select * from users where user_name like 'USER�';


| 899689 | 0000899688 | USER899688 |         2 |         60 |     5976 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899789 | 0000899788 | USER899788 |         2 |        393 |    39278 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899889 | 0000899888 | USER899888 |         2 |        397 |    39618 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899989 | 0000899988 | USER899988 |         2 |         93 |     9247 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
9000 rows in set (0.61 sec)

mysql> explain select * from users where user_name like 'USER�';
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | users | ALL  | users_idx01   | NULL | NULL    | NULL | 900191 | Using where | 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

 前方一致の場合はインデックスが使われていますが、中間一致や後方一致ではインデックスが使用されず、全表検索を行っています。LIKEの場合、インデックスが張ってあっても使用されない場合がある事を覚えておいてください。

INを使う場合、引数リストは確率の高い順に左から記述する。

INは条件リストの左から検査を行い、合致したらそこで終了します。そのため、INのリストには左から出現確率の多い順に記述すると若干ですが高速化します。

mysql> select * from users where user_level in (68,18,269);

| 899490 | 0000899489 | USER899489 |         2 |         68 |     6729 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899550 | 0000899549 | USER899549 |         2 |         18 |     1751 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899778 | 0000899777 | USER899777 |         2 |        269 |    26801 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899973 | 0000899972 | USER899972 |         2 |         18 |     1719 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
6753 rows in set (0.40 sec)

mysql> explain select * from users where user_level in (68,18,269);
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | users | ALL  | NULL          | NULL | NULL    | NULL | 900339 | Using where | 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

mysql> select * from users where user_level in (269,18,68);

| 899490 | 0000899489 | USER899489 |         2 |         68 |     6729 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899550 | 0000899549 | USER899549 |         2 |         18 |     1751 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899778 | 0000899777 | USER899777 |         2 |        269 |    26801 | NULL         | 2011-01-27 00:00:00 |          0 | 
| 899973 | 0000899972 | USER899972 |         2 |         18 |     1719 | NULL         | 2011-01-27 00:00:00 |          0 | 
+--------+------------+------------+-----------+------------+----------+--------------+---------------------+------------+
6753 rows in set (0.55 sec)

mysql> explain select * from users where user_level in (269,18,68);
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | users | ALL  | NULL          | NULL | NULL    | NULL | 900339 | Using where | 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

explainの結果は全く同じですが、検索時間は0.40秒と0.55秒とで若干ですが差異が出ています。データにあまりばらつきがないのでこのようになりましたが、ばらつきの激しいデータであればより顕著になるかと思います。

サブクエリを使う場合、INよりEXISTSを優先して使う?

INとEXISTSはたいていの場合、全く同じ結果を返します。ただし、EXISTSの方が高速に動作する事があり、またINをEXISTSに置き換えた場合遅くなるケースもあります。わかりづらいので例をみてみましょう。

mysql> create index users_idx01 on users(user_level,user_type);
Query OK, 900000 rows affected (16.74 sec)
Records: 900000  Duplicates: 0  Warnings: 0
mysql> create index users_idx02 on users(user_id);
Query OK, 900000 rows affected (31.47 sec)
Records: 900000  Duplicates: 0  Warnings: 0
mysql> create index sales_history_idx01 on sales_history(user_id);
Query OK, 3600000 rows affected (1 min 36.75 sec)
Records: 3600000  Duplicates: 0  Warnings: 0

mysql> create index sales_history_idx02 on sales_history(item_id);
Query OK, 3600000 rows affected (2 min 9.89 sec)
Records: 3600000  Duplicates: 0  Warnings: 0

mysql> select * from sales_history where user_id in (select user_id from users where user_level=40 and user_type=2);

| 3595628 | 0000821381 |    4859 |          0 |            0 | 2010-10-30 00:00:00 | 
| 3595857 | 0000658738 |    3410 |          0 |            0 | 2010-03-30 00:00:00 | 
| 3598488 | 0000863509 |    5956 |          0 |            0 | 2010-08-24 00:00:00 | 
| 3598896 | 0000879872 |    2261 |          0 |            0 | 2010-07-28 00:00:00 | 
+---------+------------+---------+------------+--------------+---------------------+
2936 rows in set (5.54 sec)

mysql> explain select * from sales_history where user_id in (select user_id from users where user_level=40 and user_type=2);
+----+--------------------+---------------+----------------+-------------------------+-------------+---------+------+---------+-------------+
| id | select_type        | table         | type           | possible_keys           | key         | key_len | ref  | rows    | Extra       |
+----+--------------------+---------------+----------------+-------------------------+-------------+---------+------+---------+-------------+
|  1 | PRIMARY            | sales_history | ALL            | NULL                    | NULL        | NULL    | NULL | 3600159 | Using where | 
|  2 | DEPENDENT SUBQUERY | users         | index_subquery | users_idx01,users_idx02 | users_idx02 | 32      | func |       1 | Using where | 
+----+--------------------+---------------+----------------+-------------------------+-------------+---------+------+---------+-------------+
2 rows in set (0.00 sec)

mysql> select * from sales_history as a where exists (select * from users as b where a.user_id=b.user_id and user_level=40 and user_type=2);

| 3595628 | 0000821381 |    4859 |          0 |            0 | 2010-10-30 00:00:00 | 
| 3595857 | 0000658738 |    3410 |          0 |            0 | 2010-03-30 00:00:00 | 
| 3598488 | 0000863509 |    5956 |          0 |            0 | 2010-08-24 00:00:00 | 
| 3598896 | 0000879872 |    2261 |          0 |            0 | 2010-07-28 00:00:00 | 
+---------+------------+---------+------------+--------------+---------------------+
2936 rows in set (6.51 sec)

mysql> explain select * from sales_history as a where exists (select * from users as b where a.user_id=b.user_id and user_level=40 and user_type=2);
+----+--------------------+-------+------+-------------------------+-------------+---------+----------------+---------+-------------+
| id | select_type        | table | type | possible_keys           | key         | key_len | ref            | rows    | Extra       |
+----+--------------------+-------+------+-------------------------+-------------+---------+----------------+---------+-------------+
|  1 | PRIMARY            | a     | ALL  | NULL                    | NULL        | NULL    | NULL           | 3600159 | Using where | 
|  2 | DEPENDENT SUBQUERY | b     | ref  | users_idx01,users_idx02 | users_idx02 | 32    | exam.a.user_id | 1 | Using where | 
+----+--------------------+-------+------+-------------------------+-------------+---------+----------------+---------+-------------+
2 rows in set (0.00 sec)

若干ですがEXISTSの方がINより遅くなっています。そして、EXISTSが同等か早くなる例は、

mysql> select * from sales_history as a where item_id=489 and exists (select * from users as b where a.user_id=b.user_id);

| 3555724 | 0000491392 |     489 |          0 |            0 | 2009-10-01 00:00:00 | 
| 3560001 | 0000537794 |     489 |          0 |            0 | 2010-09-26 00:00:00 | 
| 3571143 | 0000169593 |     489 |          0 |            0 | 2010-01-07 00:00:00 | 
| 3579831 | 0000892020 |     489 |          0 |            0 | 2010-03-26 00:00:00 | 
+---------+------------+---------+------------+--------------+---------------------+
623 rows in set (0.00 sec)

mysql> explain select * from sales_history as a where item_id=489 and exists (select * from users as b where a.user_id=b.user_id);
+----+--------------------+-------+------+---------------------+---------------------+---------+----------------+------+-------------+
| id | select_type        | table | type | possible_keys       | key                 | key_len | ref            | rows | Extra       |
+----+--------------------+-------+------+---------------------+---------------------+---------+----------------+------+-------------+
|  1 | PRIMARY            | a     | ref  | sales_history_idx02 | sales_history_idx02 | 5       | const          |  623 | Using where | 
|  2 | DEPENDENT SUBQUERY | b     | ref  | users_idx02         | users_idx02         | 32      | exam.a.user_id |    1 | Using index | 
+----+--------------------+-------+------+---------------------+---------------------+---------+----------------+------+-------------+
2 rows in set (0.00 sec)

mysql> select * from sales_history as a where item_id=489 and a.user_id in (select b.user_id from users as b);

| 3555724 | 0000491392 |     489 |          0 |            0 | 2009-10-01 00:00:00 | 
| 3560001 | 0000537794 |     489 |          0 |            0 | 2010-09-26 00:00:00 | 
| 3571143 | 0000169593 |     489 |          0 |            0 | 2010-01-07 00:00:00 | 
| 3579831 | 0000892020 |     489 |          0 |            0 | 2010-03-26 00:00:00 | 
+---------+------------+---------+------------+--------------+---------------------+
623 rows in set (0.01 sec)

mysql> explain select * from sales_history as a where item_id=489 and a.user_id in (select b.user_id from users as b);
+----+--------------------+-------+----------------+---------------------+---------------------+---------+-------+------+-------------+
| id | select_type        | table | type           | possible_keys       | key                 | key_len | ref   | rows | Extra       |
+----+--------------------+-------+----------------+---------------------+---------------------+---------+-------+------+-------------+
|  1 | PRIMARY            | a     | ref            | sales_history_idx02 | sales_history_idx02 | 5       | const |  623 | Using where | 
|  2 | DEPENDENT SUBQUERY | b     | index_subquery | users_idx02         | users_idx02         | 32      | func  |    1 | Using index | 
+----+--------------------+-------+----------------+---------------------+---------------------+---------+-------+------+-------------+
2 rows in set (0.00 sec)

 あまり有為な差は出ませんでした。どちらのケースもexplainでみる限りMySQLがうまく処理してしまっているようです。

 本来的にはINとEXISTSは以下のように動きます。 INはサブクエリを実行し、その結果を持って親表を検索します。 EXISTSは親表を検索し、都度EXISTSの条件を評価して検索します。

 このような違いから、INが早いパターンは、サブクエリの結果が条件でしぼられている場合。EXISTSが早いパターンは、親表が条件でしぼられている場合となります。

 ただし、MySQLやその他のDBMSは実行方法の最適化のため絶えず進化しているため、INとEXISTSはどちらが優れているとは一概にはいえないのです。

最後にINやEXISTSをJOINに置き換えるという事も出来ます。

mysql> select * from sales_history where user_id in (select user_id from users where user_level=40 and user_type=2);

| 3595628 | 0000821381 |    4859 |          0 |            0 | 2010-10-30 00:00:00 | 
| 3595857 | 0000658738 |    3410 |          0 |            0 | 2010-03-30 00:00:00 | 
| 3598488 | 0000863509 |    5956 |          0 |            0 | 2010-08-24 00:00:00 | 
| 3598896 | 0000879872 |    2261 |          0 |            0 | 2010-07-28 00:00:00 | 
+---------+------------+---------+------------+--------------+---------------------+
2936 rows in set (6.80 sec)

mysql> explain select * from sales_history where user_id in (select user_id from users where user_level=40 and user_type=2);
+----+--------------------+---------------+----------------+-------------------------+-------------+---------+------+---------+-------------+
| id | select_type        | table         | type           | possible_keys           | key         | key_len | ref  | rows    | Extra       |
+----+--------------------+---------------+----------------+-------------------------+-------------+---------+------+---------+-------------+
|  1 | PRIMARY            | sales_history | ALL            | NULL                    | NULL        | NULL    | NULL | 3600189 | Using where | 
|  2 | DEPENDENT SUBQUERY | users         | index_subquery | users_idx01,users_idx02 | users_idx02 | 32      | func |       1 | Using where | 
+----+--------------------+---------------+----------------+-------------------------+-------------+---------+------+---------+-------------+
2 rows in set (0.00 sec)

mysql> select * from sales_history as a where exists (select * from users as b where a.user_id=b.user_id and user_level=40 and user_type=2);

| 3595628 | 0000821381 |    4859 |          0 |            0 | 2010-10-30 00:00:00 | 
| 3595857 | 0000658738 |    3410 |          0 |            0 | 2010-03-30 00:00:00 | 
| 3598488 | 0000863509 |    5956 |          0 |            0 | 2010-08-24 00:00:00 | 
| 3598896 | 0000879872 |    2261 |          0 |            0 | 2010-07-28 00:00:00 | 
+---------+------------+---------+------------+--------------+---------------------+
2936 rows in set (8.11 sec)

mysql> explain select * from sales_history as a where exists (select * from users as b where a.user_id=b.user_id and user_level=40 and user_type=2);
+----+--------------------+-------+------+-------------------------+-------------+---------+----------------+---------+-------------+
| id | select_type        | table | type | possible_keys           | key         | key_len | ref            | rows    | Extra       |
+----+--------------------+-------+------+-------------------------+-------------+---------+----------------+---------+-------------+
|  1 | PRIMARY            | a     | ALL  | NULL                    | NULL        | NULL    | NULL           | 3600189 | Using where | 
|  2 | DEPENDENT SUBQUERY | b     | ref  | users_idx01,users_idx02 | users_idx02 | 32      | exam.a.user_id |       1 | Using where | 
+----+--------------------+-------+------+-------------------------+-------------+---------+----------------+---------+-------------+
2 rows in set (0.00 sec)

mysql> select a.* from sales_history as a inner join users as b on a.user_id=b.user_id and b.user_level=40 and b.user_type=2;

| 3361398 | 0000899637 |    1118 |          0 |            0 | 2009-12-16 00:00:00 | 
| 3426495 | 0000899637 |    4602 |          0 |            0 | 2010-09-22 00:00:00 | 
| 1976341 | 0000899650 |    1391 |          0 |            0 | 2010-08-14 00:00:00 | 
| 3484618 | 0000899650 |    4565 |          0 |            0 | 2010-07-27 00:00:00 | 
+---------+------------+---------+------------+--------------+---------------------+
2936 rows in set (0.02 sec)

mysql> explain select a.* from sales_history as a inner join users as b on a.user_id=b.user_id and b.user_level=40 and b.user_type=2;
+----+-------------+-------+------+-------------------------+---------------------+---------+----------------+------+-------------+
| id | select_type | table | type | possible_keys           | key                 | key_len | ref            | rows | Extra       |
+----+-------------+-------+------+-------------------------+---------------------+---------+----------------+------+-------------+
|  1 | SIMPLE      | b     | ref  | users_idx01,users_idx02 | users_idx01         | 8       | const,const    |  716 |             | 
|  1 | SIMPLE      | a     | ref  | sales_history_idx01     | sales_history_idx01 | 33      | exam.b.user_id |    2 | Using where | 
+----+-------------+-------+------+-------------------------+---------------------+---------+----------------+------+-------------+
2 rows in set (0.10 sec)

INもEXISTSもお世辞にも早いとはいえませんが、JOINに置き換えると明らかに高速化しています。ただし、JOINに置き換えると並び順は滅茶苦茶ですし、JOINよって結果行数が増えてしまう事もあるので注意が必要です。その事をクリアできるならばINやEXISTSのJOINへの置き換えは非常に有効です。

INSERTやUPDATEでJOINを使うテクニック

ここまで、SELECT文にしぼって解説してきましたが、INSERT文はおいといてUPDATEとDELETEについてちょっとしたテクニックを最後に示して終わりたいと思います。

ちょっとしたテクニックといってもUPDATEやDELETEの構文をよく見ると気づくのですが、とりあえずみてください。

mysql> update sales_history as a inner join items as b set a.complete_flg=1 where b.id=a.item_id and b.item_price1<=100;
Query OK, 0 rows affected (2.16 sec)
Rows matched: 351637  Changed: 0  Warnings: 0

mysql> update sales_history as a set complete_flg=0 where exists (select * from items as b where b.id=a.item_id and b.item_price1<=100);
Query OK, 351637 rows affected (15.74 sec)
Rows matched: 351637  Changed: 351637  Warnings: 0

mysql> update sales_history as a set complete_flg=0 where a.item_id in (select id from items as b where b.item_price1<=100);
Query OK, 0 rows affected (6.59 sec)
Rows matched: 351637  Changed: 0  Warnings: 0

INもしくはEXISTSをJOINに置き換えて検索したように、UPDATEやDELETEでINやEXISTSを使う場合もJOINを使って代用できます。もちろんSELECTのときと同じ注意点がありますが、UPDATEやDELETEで更新削除対象のデータを高速に検索するために利用するとよいでしょう。

以上、サワリ程度ですがSQLクエリのチューニングの基礎でした。