mysql数据库分表性能优化分析

木木老贼 互联网营销达人~

在mysql使用时到超大千万级数据量时我们大多会考虑到分表查询,分库查询,下面我们来介绍一下关于分表查询时的优化.

我们的项目中有好多不等于的情况,今天写这篇文章简单的分析一下怎么个优化法,这里的分表逻辑是根据t_group表的user_name组的个数来分的.

因为这种情况单独user_name字段上的索引就属于烂索引,起不了啥名明显的效果.

1、试验PROCEDURE,代码如下:

  1. DELIMITER$$
  2. DROPPROCEDURE`t_girl`.`sp_split_table`$$
  3. CREATEPROCEDURE`t_girl`.`sp_split_table`()
  4. BEGIN
  5. declaredoneintdefault0;
  6. declarev_user_namevarchar(20)default'';
  7. declarev_table_namevarchar(64)default'';
  8. --Getallusers'name.
  9. declarecur1cursorforselectuser_namefromt_groupgroupbyuser_name;
  10. --Dealwitherrororwarnings.
  11. declarecontinuehandlerfor1329setdone=1;
  12. --Opencursor.
  13. opencur1;
  14. whiledone<>1
  15. do
  16. fetchcur1intov_user_name;
  17. ifnotdonethen
  18. --Gettablename.
  19. setv_table_name=concat('t_group_',v_user_name);
  20. --Createnewextratable.
  21. set@stmt=concat('createtable',v_table_name,'liket_group');
  22. prepares1from@stmt;
  23. executes1;
  24. dropprepares1;
  25. --Loaddataintoit.
  26. set@stmt=concat('insertinto',v_table_name,'select*fromt_groupwhereuser_name=''',v_user_name,'''');
  27. prepares1from@stmt;
  28. executes1;
  29. dropprepares1;
  30. endif;
  31. endwhile;
  32. --Closecursor.
  33. closecur1;
  34. --Freevariablefrommemory.
  35. set@stmt=NULL;
  36. END$$
  37. DELIMITER;

2、试验表,我们用一个有一千万条记录的表来做测试,代码如下:

  1. mysql>selectcount(*)fromt_group;
  2. +----------+
  3. |count(*)|
  4. +----------+
  5. |10388608|
  6. +----------+
  7. 1rowinset(0.00sec)

表结构,代码如下:

  1. mysql>desct_group;
  2. +-------------+------------------+------+-----+-------------------+----------------+
  3. |Field|Type|Null|Key|Default|Extra|
  4. +-------------+------------------+------+-----+-------------------+----------------+
  5. |id|int(10)unsigned|NO|PRI|NULL|auto_increment|
  6. |money|decimal(10,2)|NO||||
  7. |user_name|varchar(20)|NO|MUL|||
  8. |create_time|timestamp|NO||CURRENT_TIMESTAMP||
  9. +-------------+------------------+------+-----+-------------------+----------------+
  10. 4rowsinset(0.00sec)

索引情况.

  1. mysql>showindexfromt_group;
  2. +---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
  3. |Table|Non_unique|Key_name|Seq_in_index|Column_name|Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|
  4. +---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
  5. |t_group|0|PRIMARY|1|id|A|10388608|NULL|NULL||BTREE||
  6. |t_group|1|idx_user_name|1|user_name|A|8|NULL|NULL||BTREE||
  7. |t_group|1|idx_combination1|1|user_name|A|8|NULL|NULL||BTREE||
  8. |t_group|1|idx_combination1|2|money|A|3776|NULL|NULL||BTREE||
  9. +---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
  10. 4rowsinset(0.00sec)--phpfensi.com

PS:idx_combination1 这个索引是必须的,因为要对user_name来GROUP BY,此时属于松散索引扫描,当然完了后你可以干掉她.

idx_user_name 这个索引是为了加快单独执行constant这种类型的查询.

我们要根据用户名来分表,代码如下:

  1. mysql>selectuser_namefromt_groupwhere1groupbyuser_name;
  2. +-----------+
  3. |user_name|
  4. +-----------+
  5. |david|
  6. |leo|
  7. |livia|
  8. |lucy|
  9. |sarah|
  10. |simon|
  11. |sony|
  12. |sunny|
  13. +-----------+
  14. 8rowsinset(0.00sec)

所以结果表应该是这样的,代码如下:

  1. mysql>showtableslike't_group_%';
  2. +------------------------------+
  3. |Tables_in_t_girl(t_group_%)|
  4. +------------------------------+
  5. |t_group_david|
  6. |t_group_leo|
  7. |t_group_livia|
  8. |t_group_lucy|
  9. |t_group_sarah|
  10. |t_group_simon|
  11. |t_group_sony|
  12. |t_group_sunny|
  13. +------------------------------+
  14. 8rowsinset(0.00sec)

3、对比结果,代码如下:

  1. mysql>selectcount(*)fromt_groupwhereuser_name='david';
  2. +----------+
  3. |count(*)|
  4. +----------+
  5. |1298576|
  6. +----------+
  7. 1rowinset(1.71sec)

执行了将近2秒.

  1. mysql>selectcount(*)fromt_group_david;
  2. +----------+
  3. |count(*)|
  4. +----------+
  5. |1298576|
  6. +----------+
  7. 1rowinset(0.00sec)

几乎是瞬间的.

  1. mysql>selectcount(*)fromt_groupwhereuser_name<>'david';
  2. +----------+
  3. |count(*)|
  4. +----------+
  5. |9090032|
  6. +----------+
  7. 1rowinset(9.26sec)

执行了将近10秒,可以想象,这个是实际的项目中是不能忍受的.

  1. mysql>select(selectcount(*)fromt_group)-(selectcount(*)fromt_group_david)astotal;
  2. +---------+
  3. |total|
  4. +---------+
  5. |9090032|
  6. +---------+
  7. 1rowinset(0.00sec)

几乎是瞬间的.

我们来看看聚集函数,对于原表的操作,代码如下:

  1. mysql>selectmin(money),max(money)fromt_groupwhereuser_name='david';
  2. +------------+------------+
  3. |min(money)|max(money)|
  4. +------------+------------+
  5. |-6.41|500.59|
  6. +------------+------------+
  7. 1rowinset(0.00sec)

最小,最大值都是FULL INDEX SCAN,所以是瞬间的.

  1. mysql>selectsum(money),avg(money)fromt_groupwhereuser_name='david';
  2. +--------------+------------+
  3. |sum(money)|avg(money)|
  4. +--------------+------------+
  5. |319992383.84|246.417910|
  6. +--------------+------------+
  7. 1rowinset(2.15sec)

其他聚集函数的结果就不是FULL INDEX SCAN了,耗时2.15秒,对于小表的操作,代码如下:

  1. mysql>selectmin(money),max(money)fromt_group_david;
  2. +------------+------------+
  3. |min(money)|max(money)|
  4. +------------+------------+
  5. |-6.41|500.59|
  6. +------------+------------+
  7. 1rowinset(1.50sec)

最大最小值完全是FULL TABLE SCAN,耗时1.50秒,不划算,以此看来,代码如下:

  1. mysql>selectsum(money),avg(money)fromt_group_david;
  2. +--------------+------------+
  3. |sum(money)|avg(money)|
  4. +--------------+------------+
  5. |319992383.84|246.417910|
  6. +--------------+------------+
  7. 1rowinset(1.68sec)

取得这两个结果也是花了快2秒,快了一点,我们来看看这个小表的结构,代码如下:

  1. mysql>desct_group_david;
  2. +-------------+------------------+------+-----+-------------------+----------------+
  3. |Field|Type|Null|Key|Default|Extra|
  4. +-------------+------------------+------+-----+-------------------+----------------+
  5. |id|int(10)unsigned|NO|PRI|NULL|auto_increment|
  6. |money|decimal(10,2)|NO||||
  7. |user_name|varchar(20)|NO|MUL|||
  8. |create_time|timestamp|NO||CURRENT_TIMESTAMP||
  9. +-------------+------------------+------+-----+-------------------+----------------+
  10. 4rowsinset(0.00sec)

明显的user_name属性是多余的,那么就干掉它,代码如下:

  1. mysql>altertablet_group_daviddropuser_name;
  2. QueryOK,1298576rowsaffected(7.58sec)
  3. Records:1298576Duplicates:0Warnings:0

现在来重新对小表运行查询,代码如下:

  1. mysql>selectmin(money),max(money)fromt_group_david;
  2. +------------+------------+
  3. |min(money)|max(money)|
  4. +------------+------------+
  5. |-6.41|500.59|
  6. +------------+------------+
  7. 1rowinset(0.00sec)

此时是瞬间的.

  1. mysql>selectsum(money),avg(money)fromt_group_david;
  2. +--------------+------------+
  3. |sum(money)|avg(money)|
  4. +--------------+------------+
  5. |319992383.84|246.417910|
  6. +--------------+------------+
  7. 1rowinset(0.94sec)

这次算是控制在一秒以内了.

mysql> Aborted

小总结一下:分出的小表的属性尽量越少越好,大胆的去干吧.

相关广告
  • mysql数据库分表性能优化分析 mysql数据库分表性能优化分析 mysql数据库分表性能优化分析
相关阅读

mysql数据库分表性能优化分析

2019/10/10 17:33:43 | 谷歌SEO算法 | SEO培训