北京SEO

Mysql中如何查找并删除重复数据

2019/10/10/17:36:42  阅读:1863  来源:谷歌SEO算法  标签: 微信公众号

在数据库中我们经常会碰到重复数据了,下面我来介绍利用相关的mysql语句实现查找重复记录并且实现删除重复记录的sql语句.

考试系统中做了一个用户导入试题功能,导致用户导入了很多重复的试题,我需要查询及删除一下重复的记录,于是有了这篇文章.

(一)单个字段

1、查找表中多余的重复记录,根据(question_title)字段来判断,代码如下:

select * from questions where question_title in (select question_title from people group by question_title having count(question_title) > 1)

2、删除表中多余的重复记录,根据(question_title)字段来判断,只留有一个记录,代码如下:

  1. deletefromquestions
  2. wherepeopleIdin(selectpeopleIdfrompeoplegroupbypeopleIdhavingcount(question_title)>1)
  3. andmin(id)notin(selectquestion_idfromquestionsgroupbyquestion_titlehavingcount(question_title)>1)

(二)多个字段

删除表中多余的重复记录(多个字段),只留有rowid最小的记录,代码如下:

  1. DELETEFROMquestionsWHERE(questions_title,questions_scope)IN(SELECTquestions_title,questions_scopeFROMquestionsGROUPBYquestions_title,questions_scopeHAVINGCOUNT(*)>1)ANDquestion_idNOTIN(SELECTMIN(question_id)FROMquestionsGROUPBYquestions_scope,questions_titleHAVINGCOUNT(*)>1)

用上述语句无法删除,创建了临时表才删的,求各位达人解释一下,代码如下:

  1. CREATETABLEtmpASSELECTquestion_idFROMquestionsWHERE(questions_title,questions_scope)IN(SELECTquestions_title,questions_scopeFROMquestionsGROUPBYquestions_title,questions_scopeHAVINGCOUNT(*)>1)ANDquestion_idNOTIN(SELECTMIN(question_id)FROMquestionsGROUPBYquestions_scope,questions_titleHAVINGCOUNT(*)>1);
  2. DELETEFROMquestionsWHEREquestion_idIN(SELECTquestion_idFROMtmp);
  3. DROPTABLEtmp;

(三),代码如下:

  1. declare@maxinteger,@idinteger
  2. declarecur_rowscursorlocalforselect主字段,count(*)from表名groupby主字段havingcount(*)>;1
  3. opencur_rows
  4. fetchcur_rowsinto@id,@max
  5. while@@fetch_status=0
  6. begin
  7. select@max=@max-1
  8. setrowcount@max
  9. deletefrom表名where主字段=@id
  10. fetchcur_rowsinto@id,@max
  11. end
  12. closecur_rows
  13. setrowcount0

上面讲到了很多,下面我们一起来看实例删除重复记录实例.

例1,表中有主键(可唯一标识的字段),且该字段为数字类型,代码如下:

  1. /*表结构*/
  2. DROPTABLEIFEXISTS`t1`;
  3. CREATETABLEIFNOTEXISTS`t1`(
  4. `id`INT(1)NOTNULLAUTO_INCREMENT,
  5. `name`VARCHAR(20)NOTNULL,
  6. `add`VARCHAR(20)NOTNULL,
  7. PRIMARYKEY(`id`)
  8. )Engine=InnoDB;
  9. /*插入测试数据*/
  10. INSERTINTO`t1`(`name`,`add`)VALUES
  11. ('abc',"123"),
  12. ('abc',"123"),
  13. ('abc',"321"),
  14. ('abc',"123"),
  15. ('xzy',"123"),
  16. ('xzy',"456"),
  17. ('xzy',"456"),
  18. ('xzy',"456"),
  19. ('xzy',"789"),
  20. ('xzy',"987"),
  21. ('xzy',"789"),
  22. ('ijk',"147"),
  23. ('ijk',"147"),
  24. ('ijk',"852"),
  25. ('opq',"852"),
  26. ('opq',"963"),
  27. ('opq',"741"),
  28. ('tpk',"741"),
  29. ('tpk',"963"),
  30. ('tpk',"963"),
  31. ('wer',"546"),
  32. ('wer',"546"),
  33. ('once',"546");
  34. SELECT*FROM`t1`;
  35. +----+------+-----+
  36. |id|name|add|
  37. +----+------+-----+
  38. |1|abc|123|
  39. |2|abc|123|
  40. |3|abc|321|
  41. |4|abc|123|
  42. |5|xzy|123|
  43. |6|xzy|456|
  44. |7|xzy|456|
  45. |8|xzy|456|
  46. |9|xzy|789|
  47. |10|xzy|987|
  48. |11|xzy|789|
  49. |12|ijk|147|
  50. |13|ijk|147|
  51. |14|ijk|852|
  52. |15|opq|852|
  53. |16|opq|963|
  54. |17|opq|741|
  55. |18|tpk|741|
  56. |19|tpk|963|
  57. |20|tpk|963|
  58. |21|wer|546|
  59. |22|wer|546|
  60. |23|once|546|
  61. +----+------+-----+
  62. rowsinset(0.00sec)

查找id最小的重复数据(只查找id字段),代码如下:

  1. /*查找id最小的重复数据(只查找id字段)*/
  2. SELECTDISTINCTMIN(`id`)AS`id`
  3. FROM`t1`
  4. GROUPBY`name`,`add`
  5. HAVINGCOUNT(1)>1;
  6. +------+
  7. |id|
  8. +------+
  9. |1|
  10. |12|
  11. |19|
  12. |21|
  13. |6|
  14. |9|
  15. +------+
  16. rowsinset(0.00sec)

查找所有重复数据,代码如下:

  1. /*查找所有重复数据*/
  2. SELECT`t1`.*
  3. FROM`t1`,(
  4. SELECT`name`,`add`
  5. FROM`t1`
  6. GROUPBY`name`,`add`
  7. HAVINGCOUNT(1)>1
  8. )AS`t2`
  9. WHERE`t1`.`name`=`t2`.`name`
  10. AND`t1`.`add`=`t2`.`add`;
  11. +----+------+-----+
  12. |id|name|add|
  13. +----+------+-----+
  14. |1|abc|123|
  15. |2|abc|123|
  16. |4|abc|123|
  17. |6|xzy|456|
  18. |7|xzy|456|
  19. |8|xzy|456|
  20. |9|xzy|789|
  21. |11|xzy|789|
  22. |12|ijk|147|
  23. |13|ijk|147|
  24. |19|tpk|963|
  25. |20|tpk|963|
  26. |21|wer|546|
  27. |22|wer|546|
  28. +----+------+-----+
  29. rowsinset(0.00sec)

查找除id最小的数据外的重复数据,代码如下:

  1. /*查找除id最小的数据外的重复数据*/
  2. SELECT`t1`.*
  3. FROM`t1`,(
  4. SELECTDISTINCTMIN(`id`)AS`id`,`name`,`add`
  5. FROM`t1`
  6. GROUPBY`name`,`add`
  7. HAVINGCOUNT(1)>1
  8. )AS`t2`
  9. WHERE`t1`.`name`=`t2`.`name`
  10. AND`t1`.`add`=`t2`.`add`
  11. AND`t1`.`id`<>`t2`.`id`;
  12. +----+------+-----+
  13. |id|name|add|
  14. +----+------+-----+
  15. |2|abc|123|
  16. |4|abc|123|
  17. |7|xzy|456|
  18. |8|xzy|456|
  19. |11|xzy|789|
  20. |13|ijk|147|
  21. |20|tpk|963|
  22. |22|wer|546|
  23. +----+------+-----+
  24. rowsinset(0.00sec)

例2,表中没有主键(可唯一标识的字段),或者主键并非数字类型(也可以删除重复数据,但效率上肯定比较慢.

例2测试数据,代码如下:

  1. /*表结构*/
  2. DROPTABLEIFEXISTS`noid`;
  3. CREATETABLEIFNOTEXISTS`noid`(
  4. `pk`VARCHAR(20)NOTNULLCOMMENT'字符串主键',
  5. `name`VARCHAR(20)NOTNULL,
  6. `add`VARCHAR(20)NOTNULL,
  7. PRIMARYKEY(`pk`)
  8. )Engine=InnoDB;
  9. /*测试数据,与上例一样的测试数据,只是主键变为字符串形式*/
  10. INSERTINTO`noid`(`pk`,`name`,`add`)VALUES
  11. ('a','abc',"123"),
  12. ('b','abc',"123"),
  13. ('c','abc',"321"),
  14. ('d','abc',"123"),
  15. ('e','xzy',"123"),
  16. ('f','xzy',"456"),
  17. ('g','xzy',"456"),
  18. ('h','xzy',"456"),
  19. ('i','xzy',"789"),
  20. ('j','xzy',"987"),
  21. ('k','xzy',"789"),
  22. ('l','ijk',"147"),
  23. ('m','ijk',"147"),
  24. ('n','ijk',"852"),
  25. ('o','opq',"852"),
  26. ('p','opq',"963"),
  27. ('q','opq',"741"),
  28. ('r','tpk',"741"),
  29. ('s','tpk',"963"),
  30. ('t','tpk',"963"),
  31. ('u','wer',"546"),
  32. ('v','wer',"546"),
  33. ('w','once',"546");
  34. SELECT*FROM`noid`;
  35. +----+------+-----+
  36. |pk|name|add|
  37. +----+------+-----+
  38. |a|abc|123|
  39. |b|abc|123|
  40. |c|abc|321|
  41. |d|abc|123|
  42. |e|xzy|123|
  43. |f|xzy|456|
  44. |g|xzy|456|
  45. |h|xzy|456|
  46. |i|xzy|789|
  47. |j|xzy|987|
  48. |k|xzy|789|
  49. |l|ijk|147|
  50. |m|ijk|147|
  51. |n|ijk|852|
  52. |o|opq|852|
  53. |p|opq|963|
  54. |q|opq|741|
  55. |r|tpk|741|
  56. |s|tpk|963|
  57. |t|tpk|963|
  58. |u|wer|546|
  59. |v|wer|546|
  60. |w|once|546|
  61. +----+------+-----+
  62. rowsinset(0.00sec)

为表添加自增长的id字段,代码如下:

  1. /*为表添加自增长的id字段*/
  2. ALTERTABLE`noid`ADD`id`INT(1)NOTNULLAUTO_INCREMENT,ADDINDEX`id`(`id`);
  3. QueryOK,23rowsaffected(0.16sec)
  4. Records:23Duplicates:0Warnings:0
  5. SELECT*FROM`noid`;
  6. +----+------+-----+----+
  7. |pk|name|add|id|
  8. +----+------+-----+----+
  9. |a|abc|123|1|
  10. |b|abc|123|2|
  11. |c|abc|321|3|
  12. |d|abc|123|4|
  13. |e|xzy|123|5|
  14. |f|xzy|456|6|
  15. |g|xzy|456|7|
  16. |h|xzy|456|8|
  17. |i|xzy|789|9|
  18. |j|xzy|987|10|
  19. |k|xzy|789|11|
  20. |l|ijk|147|12|
  21. |m|ijk|147|13|
  22. |n|ijk|852|14|
  23. |o|opq|852|15|
  24. |p|opq|963|16|
  25. |q|opq|741|17|
  26. |r|tpk|741|18|
  27. |s|tpk|963|19|
  28. |t|tpk|963|20|
  29. |u|wer|546|21|
  30. |v|wer|546|22|
  31. |w|once|546|23|
  32. +----+------+-----+----+
  33. rowsinset(0.00sec)

MySQL中必须是有索引的字段才可以使用AUTO_INCREMENT,删除重复数据与上例一样,记得删除完数据把id字段也删除了,删除重复数据,只保留一条数据,代码如下:

  1. /*删除重复数据,只保留一条数据*/
  2. DELETEFROM`noid`
  3. USING`noid`,(
  4. SELECTDISTINCTMIN(`id`)AS`id`,`name`,`add`
  5. FROM`noid`
  6. GROUPBY`name`,`add`
  7. HAVINGCOUNT(1)>1
  8. )AS`t2`
  9. WHERE`noid`.`name`=`t2`.`name`
  10. AND`noid`.`add`=`t2`.`add`
  11. AND`noid`.`id`<>`t2`.`id`;
  12. QueryOK,8rowsaffected(0.05sec)
  13. /*删除id字段*/
  14. ALTERTABLE`noid`DROP`id`;
  15. QueryOK,15rowsaffected(0.16sec)
  16. Records:15Duplicates:0Warnings:0
  17. SELECT*FROM`noid`;
  18. +----+------+-----+
  19. |pk|name|add|
  20. +----+------+-----+
  21. |a|abc|123|
  22. |c|abc|321|
  23. |e|xzy|123|
  24. |f|xzy|456|
  25. |i|xzy|789|
  26. |j|xzy|987|
  27. |l|ijk|147|
  28. |n|ijk|852|
  29. |o|opq|852|
  30. |p|opq|963|
  31. |q|opq|741|
  32. |r|tpk|741|
  33. |s|tpk|963|
  34. |u|wer|546|//phpfensi.com
  35. |w|once|546|
  36. +----+------+-----+
  37. rowsinset(0.00sec)

广告内容

Mysql中如何查找并删除重复数据 Mysql中如何查找并删除重复数据 Mysql中如何查找并删除重复数据

相关阅读

热门评论

sunshine技术博客 sunshine技术博客

sunshine技术博客

总篇数164

精选文章

RMAN中catalog和nocatalog区别介绍 小技巧:为Linux下的文件分配多个权限 zimbra8.5.1安装第三方签名ssl证书的步骤 解决mysql不能远程连接数据库方法 windows服务器mysql增量备份批处理数据库 mysql中slow query log慢日志查询分析 JavaScript跨域问题总结 Linux下负载均衡软件LVS配置(VS/DR)教程 mysql中权限参数说明 MYSQL(错误1053)无法正常启动

SEO最新算法