mysql - sql 优化问题,between比in好?
问题描述
看到网上的资料说:
in 和 not in 也要慎用,否则会导致全表扫描,如:select id from t where num in(1,2,3)对于连续的数值,能用 between 就不要用 in 了:select id from t where num between 1 and 3
问题解答
回答1:连续数值当然between好了 减少解析 并且in的范围默认超过一定数目就会走全表 9个还是多少忘了
针对楼下评论再补充一下答案:in走全表是分情况的,上面的解答只是凭记忆大概说了一下,楼下小伙伴比较认真,这里就再解释一下,这个不是个数而应该是个比例,大概25%-35%左右,你要再问到底多少不好意思水平有限不读源码没办法确认。然后这个30左右的比例也并不是说一定走全表扫描,因为mysql还有一个索引扫描,就是说如果select的内容在你的索引里面就能找到的话当然不会去扫全表了,比如下面的例子中select id from ttt where id in (..);和select * from ttt where id in (...);前面这个肯定是走主键扫描,即使你in了所有id值他也是走主键,而后面的情况就是这种百分比的情况了,具体看下面示例,欢迎指正^_^额,然后还想说一下between的情况,为什么好是因为除了索引段上连续存取减少解析以外,还有一个情况就是在磁盘寻址检索数据的时候,会默认读取第一次取值附近的部分数据(有这么一个概率算法说的是当一个数据被检索到的时候,他附近的数据也很大概率会被用到)所以就有了这么一个一次性取出冗余数据避免多次寻址的情况,这时候使用between的连续取值就正适用了
mysql> select * from ttt;+----+-------------------+| id | name |+----+-------------------+| 1 | I17021234001 || 2 | IC17031234002 || 3 | C17041234003 || 4 | IAsEw1234001 || 5 | I17021234001A2 || 6 | IC17031234002A2 || 7 | C17041234003A2 || 8 | IAsEw1234001A2 || 9 | I17021234001A2 || 10 | IC17031234002A2 || 11 | C17041234003A2 || 12 | IAsEw1234001A2 || 13 | I17021234001A2A2 || 14 | IC17031234002A2A2 || 15 | C17041234003A2A2 || 16 | IAsEw1234001A2A2 || 17 | I17021234001A2 || 18 | IC17031234002A2 || 19 | C17041234003A2 || 20 | IAsEw1234001A2 |+----+-------------------+20 rows in set (0.00 sec)mysql> show create table ttt;+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+| Table | Create Table |+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+| ttt | CREATE TABLE `ttt` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` char(32) DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=32 DEFAULT CHARSET=utf8 |+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+1 row in set (0.00 sec)mysql> explain select * from ttt where id in (1,2,3,4,5,6);+----+-------------+-------+------+---------------+------+---------+------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-------+------+---------------+------+---------+------+------+-------------+| 1 | SIMPLE | ttt | ALL | PRIMARY | NULL | NULL | NULL | 20 | Using where |+----+-------------+-------+------+---------------+------+---------+------+------+-------------+1 row in set (0.00 sec)mysql> explain select * from ttt where id in (1,2,3,4,5);+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+| 1 | SIMPLE | ttt | range | PRIMARY | PRIMARY | 4 | NULL | 5 | Using where |+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+1 row in set (0.00 sec)mysql> explain select * from ttt where id in (1,2,3);+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+| 1 | SIMPLE | ttt | range | PRIMARY | PRIMARY | 4 | NULL | 3 | Using where |+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+1 row in set (0.00 sec)mysql> explain select * from ttt where id in (1,2,3,4,5,6,7);+----+-------------+-------+------+---------------+------+---------+------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-------+------+---------------+------+---------+------+------+-------------+| 1 | SIMPLE | ttt | ALL | PRIMARY | NULL | NULL | NULL | 20 | Using where |+----+-------------+-------+------+---------------+------+---------+------+------+-------------+1 row in set (0.00 sec)mysql> explain select id from ttt where id in (1,2,3,4,5,6,7);+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+| 1 | SIMPLE | ttt | index | PRIMARY | PRIMARY | 4 | NULL | 20 | Using where; Using index |+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+1 row in set (0.00 sec)mysql> explain select name from ttt where id in (1,2,3,4,5,6,7);+----+-------------+-------+------+---------------+------+---------+------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-------+------+---------------+------+---------+------+------+-------------+| 1 | SIMPLE | ttt | ALL | PRIMARY | NULL | NULL | NULL | 20 | Using where |+----+-------------+-------+------+---------------+------+---------+------+------+-------------+1 row in set (0.00 sec)回答2:
根据数据库中B树索引的存储结构,在叶节点存储指向数据的物理地址,而这个物理地址当存在聚簇索引时是有序的。
如果是连续数值,between在找到第一个匹配值后,则直接从该地址往后搜索,直到最后一个元素为止。这样就不会对后续值进行索引扫描,因此速度快了。对于in操作,不大清楚,但是估计应该会对全索引进行扫描吧。回答3:
EXPLAIN mysql 语句 看下输出
回答4:使用between时只需要匹配上下界,故而会快一点;in每个都要看一遍,会造成全表扫描。