mysql中innodb表中count()优化

小明SEO博客 小明SEO博客,新时代SEO博客

count()是用来统计数据表中所有记录的一个函数了,但在此函数在innodb中性能不怎么样了,下面我们来看看mysql中innodb表中count()优化,希望例子对各位有帮助.

起因:在innodb表上做count(*)统计实在是太慢了,因此想办法看能不能再快点.

现象:先来看几个测试案例,如下.

一、sbtest 表上的测试.

  1. showcreatetablesbtest\G
  2. ***************************1.row***************************
  3. Table:sbtest
  4. CreateTable:CREATETABLE`sbtest`(
  5. `aid`bigint(20)unsignedNOTNULLauto_increment,
  6. `id`int(10)unsignedNOTNULLdefault'0',
  7. `k`int(10)unsignedNOTNULLdefault'0',
  8. `c`char(120)NOTNULLdefault'',
  9. `pad`char(60)NOTNULLdefault'',
  10. PRIMARYKEY(`aid`),
  11. KEY`k`(`k`),
  12. KEY`id`(`id`)
  13. )ENGINE=InnoDBAUTO_INCREMENT=1000001DEFAULTCHARSET=latin1
  14. showindexfromsbtest;
  15. +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
  16. |Table|Non_unique|Key_name|Seq_in_index|Column_name|Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|--phpfensi.com
  17. +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
  18. |sbtest|0|PRIMARY|1|aid|A|1000099|NULL|NULL||BTREE||
  19. |sbtest|1|k|1|k|A|18|NULL|NULL||BTREE||
  20. |sbtest|1|id|1|id|A|1000099|NULL|NULL||BTREE||
  21. +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

填充了 100万条 记录.

1、直接 count(*)

  1. explainSELECTCOUNT(*)FROMsbtest;
  2. +----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+
  3. |id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|
  4. +----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+
  5. |1|SIMPLE|sbtest|index|NULL|PRIMARY|8|NULL|1000099|Usingindex|
  6. +----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+
  7. SELECTCOUNT(*)FROMsbtest;
  8. +----------+
  9. |COUNT(*)|
  10. +----------+
  11. |1000000|
  12. +----------+
  13. 1rowinset(1.42sec)

可以看到,如果不加任何条件,那么优化器优先采用 primary key 来进行扫描.

2、count(*) 使用 primary key 字段做条件.

  1. explainSELECTCOUNT(*)FROMsbtestWHEREaid>=0;
  2. +----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+
  3. |id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|
  4. +----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+
  5. |1|SIMPLE|sbtest|range|PRIMARY|PRIMARY|8|NULL|485600|Usingwhere;Usingindex|
  6. +----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+
  7. SELECTCOUNT(*)FROMsbtestWHEREaid>=0;
  8. +----------+
  9. |COUNT(*)|
  10. +----------+
  11. |1000000|
  12. +----------+
  13. 1rowinset(1.39sec)

可以看到,尽管优化器认为只需要扫描 485600 条记录(其实是索引),比刚才少多了,但其实仍然要做全表(索引)扫描,因此耗时和第一种相当.

3、count(*) 使用 secondary index 字段做条件

  1. explainSELECTCOUNT(*)FROMsbtestWHEREid>=0;
  2. +----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+
  3. |id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|
  4. +----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+
  5. |1|SIMPLE|sbtest|range|id|id|4|NULL|500049|Usingwhere;Usingindex|
  6. +----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+
  7. SELECTCOUNT(*)FROMsbtestWHEREid>=0;
  8. +----------+
  9. |COUNT(*)|
  10. +----------+
  11. |1000000|
  12. +----------+
  13. 1rowinset(0.43sec)

可以看到,采用这种方式查询会非常快,有人也许会问了,会不会是因为 id 字段的长度比 aid 字段的长度来的小,导致它扫描起来比较快呢?先不着急下结论,咱们来看看下面的测试例子.

二、sbtest1 表上的测试

  1. showcreatetablesbtest1\G
  2. ***************************1.row***************************
  3. Table:sbtest1
  4. CreateTable:CREATETABLE`sbtest1`(
  5. `aid`int(10)unsignedNOTNULLAUTO_INCREMENT,
  6. `id`bigint(20)unsignedNOTNULLDEFAULT'0',
  7. `k`int(10)unsignedNOTNULLDEFAULT'0',
  8. `c`char(120)NOTNULLDEFAULT'',
  9. `pad`char(60)NOTNULLDEFAULT'',
  10. PRIMARYKEY(`aid`),
  11. KEY`k`(`k`),
  12. KEY`id`(`id`)
  13. )ENGINE=InnoDBAUTO_INCREMENT=1000001DEFAULTCHARSET=latin1
  14. showindexfromsbtest1;
  15. +---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
  16. |Table|Non_unique|Key_name|Seq_in_index|Column_name|Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|
  17. +---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
  18. |sbtest1|0|PRIMARY|1|aid|A|1000099|NULL|NULL||BTREE||
  19. |sbtest1|1|k|1|k|A|18|NULL|NULL||BTREE||
  20. |sbtest1|1|id|1|id|A|1000099|NULL|NULL||BTREE||
  21. +---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

这个表里,把 aid 和 id 的字段长度调换了一下,也填充了 1000万条记录.

1、直接 count(*).

  1. explainSELECTCOUNT(*)FROMsbtest1;
  2. +----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+
  3. |id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|
  4. +----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+
  5. |1|SIMPLE|sbtest1|index|NULL|PRIMARY|4|NULL|1000099|Usingindex|
  6. +----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+
  7. SELECTCOUNT(*)FROMsbtest1;
  8. +----------+
  9. |COUNT(*)|
  10. +----------+
  11. |1000000|
  12. +----------+
  13. 1rowinset(1.42sec)

可以看到,如果不加任何条件,那么优化器优先采用 primary key 来进行扫描.

2、count(*) 使用 primary key 字段做条件.

  1. explainSELECTCOUNT(*)FROMsbtest1WHEREaid>=0;
  2. +----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
  3. |id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|
  4. +----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
  5. |1|SIMPLE|sbtest1|range|PRIMARY|PRIMARY|4|NULL|316200|Usingwhere;Usingindex|
  6. +----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
  7. 1rowinset(0.00sec)
  8. SELECTCOUNT(*)FROMsbtest1WHEREaid>=0;
  9. +----------+
  10. |COUNT(*)|
  11. +----------+
  12. |1000000|
  13. +----------+
  14. 1rowinset(1.42sec)

可以看到,尽管优化器认为只需要扫描 485600 条记录(其实是索引),比刚才少多了,但其实仍然要做全表(索引)扫描,因此耗时和第一种相当.

3、count(*) 使用 secondary index 字段做条件.

  1. explainSELECTCOUNT(*)FROMsbtest1WHEREid>=0;
  2. +----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+
  3. |id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|
  4. +----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+
  5. |1|SIMPLE|sbtest1|range|id|id|8|NULL|500049|Usingwhere;Usingindex|
  6. +----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+
  7. 1rowinset(0.00sec)
  8. SELECTCOUNT(*)FROMsbtest1WHEREid>=0;
  9. +----------+
  10. |COUNT(*)|
  11. +----------+
  12. |1000000|
  13. +----------+
  14. 1rowinset(0.45sec)

可以看到,采用这种方式查询会非常快,上面的所有测试,均在 mysql 5.1.24 环境下通过,并且每次查询前都重启了 mysqld.

可以看到,把 aid 和 id 的长度调换之后,采用 secondary index 查询仍然是要比用 primary key 查询来的快很多。看来主要不是字段长度引起的索引扫描快慢,而是采用 primary key 以及 secondary index 引起的区别,那么,为什么用 secondary index 扫描反而比 primary key 扫描来的要快呢?我们就需要了解innodb的 clustered index 和secondary index 之间的区别了.

innodb 的 clustered index 是把 primary key 以及 row data 保存在一起的,而 secondary index 则是单独存放,然后有个指针指向 primary key,因此,需要进行 count(*) 统计表记录总数时,利用 secondary index 扫描起来,显然更快,而primary key则主要在扫描索引,同时要返回结果记录时的作用较大,例如:

SELECT * FROM sbtest WHERE aid = xxx;

那既然是使用 secondary index 会比 primary key 更快,为何优化器却优先选择 primary key 来扫描呢,Heikki Tuuri 的回答是:

  1. intheexampletable,thesecondaryindexisinsertedintoinaperfectorder!Thatis
  2. veryunusual.Normallythesecondaryindexwouldbefragmented,causingrandomdiskI/O,
  3. andthescanwouldbeslowerthanintheprimaryindex.
  4. Iamchangingthistoafeaturerequest:keep'clusteringratio'statisticsonasecondary
  5. indexanddothescanthereiftheorderisalmostthesameasintheprimaryindex.I
  6. doubtthisfeaturewilleverbeimplemented,though.

相关广告
  • mysql中innodb表中count()优化 mysql中innodb表中count()优化 mysql中innodb表中count()优化
相关阅读

mysql中innodb表中count()优化

2019/10/10 17:32:37 | 谷歌SEO算法 | 百度K站