在mysql中出现重复的数据时我们有需要进行处理,下面我来给大家介绍关于mysql重复数据的一些处理方法,包括删除重复数据、排序数据且不重复、查询等等.
今日做一个功能需求,从表中排序取出前十个且数据不一致的数据,如下列a表中的值有以下数据:
- mysql>select*froma;
- +----+----------+
- |id|user|
- +----+----------+
- |1|zhangsan|
- |2|lisi|
- |3|wangwu|
- |4|zhangsan|
- |5|zhaosi|
- |6|wangwu|
- |7|lisi|
- |8|lisi|
- |9|zhaosi|
- +----+----------+
- 9rowsinset(0.00sec)
我们需要取出id最大的前四位,且user不能一致的信息,按照以上信息以及需求,我们要搜索出结果为:
- zhaosi
- lisi
- wangwu
- zhangsan
不能按照普通的做法,代码如下:
- mysql>select*fromaorderbyiddesclimit4;
- +----+--------+
- |id|user|
- +----+--------+
- |9|zhaosi|
- |8|lisi|
- |7|lisi|
- |6|wangwu|
- +----+--------+
- 4rowsinset(0.00sec)
这样搜索出来的有重复值,得使用distinct关键字,代码如下:
- mysql>selectdistinctuserfromaorderbyiddesclimit4;
- +----------+
- |user|
- +----------+
- |zhaosi|
- |wangwu|
- |lisi|
- |zhangsan|
- +----------+
- 4rowsinset(0.00sec)
其实应该是lisi与wangwu互换一下才是比较理想的,因为lisi最大的ID是8,而wangwu最大的ID是6,可能是lisi有一个ID为2导致的,我们把ID为2的删除,在来试试,代码如下:
- mysql>deletefromawhereid=2;
- QueryOK,1rowaffected(0.02sec)
- mysql>select*froma;
- +----+----------+
- |id|user|
- +----+----------+
- |1|zhangsan|
- |3|wangwu|
- |4|zhangsan|
- |5|zhaosi|
- |6|wangwu|
- |7|lisi|
- |8|lisi|
- |9|zhaosi|
- +----+----------+
- 8rowsinset(0.00sec)
- mysql>selectdistinctuserfromaorderbyiddesclimit4;
- +----------+--phpfensi.com
- |user|
- +----------+
- |lisi|
- |zhaosi|
- |wangwu|
- |zhangsan|
- +----------+
- 4rowsinset(0.00sec)
结果正是由于前边有较低的ID记录影响了排序.
虽然这条语句能搜索正确的效果,但可能排序不是那么理想,也就是ID最大的前四位能搜索出来,但在这四位数据里并不是按照ID大小排序的.
例1测试数据,代码如下:
- /*表结构*/
- DROPTABLEIFEXISTS`t1`;
- CREATETABLEIFNOTEXISTS`t1`(
- `id`INT(1)NOTNULLAUTO_INCREMENT,
- `name`VARCHAR(20)NOTNULL,
- `add`VARCHAR(20)NOTNULL,
- PRIMARYKEY(`id`)
- )Engine=InnoDB;
- /*插入测试数据*/
- INSERTINTO`t1`(`name`,`add`)VALUES
- ('abc',"123"),
- ('abc',"123"),
- ('abc',"321"),
- ('abc',"123"),
- ('xzy',"123"),
- ('xzy',"456"),
- ('xzy',"456"),
- ('xzy',"456"),
- ('xzy',"789"),
- ('xzy',"987"),
- ('xzy',"789"),
- ('ijk',"147"),
- ('ijk',"147"),
- ('ijk',"852"),
- ('opq',"852"),
- ('opq',"963"),
- ('opq',"741"),
- ('tpk',"741"),
- ('tpk',"963"),
- ('tpk',"963"),
- ('wer',"546"),
- ('wer',"546"),
- ('once',"546");
- SELECT*FROM`t1`;
- +----+------+-----+
- |id|name|add|
- +----+------+-----+
- |1|abc|123|
- |2|abc|123|
- |3|abc|321|
- |4|abc|123|
- |5|xzy|123|
- |6|xzy|456|
- |7|xzy|456|
- |8|xzy|456|
- |9|xzy|789|
- |10|xzy|987|
- |11|xzy|789|
- |12|ijk|147|
- |13|ijk|147|
- |14|ijk|852|
- |15|opq|852|
- |16|opq|963|
- |17|opq|741|
- |18|tpk|741|
- |19|tpk|963|
- |20|tpk|963|
- |21|wer|546|
- |22|wer|546|
- |23|once|546|
- +----+------+-----+
- rowsinset(0.00sec)
查找id最小的重复数据(只查找id字段),代码如下:
- /*查找id最小的重复数据(只查找id字段)*/
- SELECTDISTINCTMIN(`id`)AS`id`
- FROM`t1`
- GROUPBY`name`,`add`
- HAVINGCOUNT(1)>1;
- +------+
- |id|
- +------+
- |1|
- |12|
- |19|
- |21|
- |6|
- |9|
- +------+
- rowsinset(0.00sec)
查找所有重复数据,代码如下:
- /*查找所有重复数据*/
- SELECT`t1`.*
- FROM`t1`,(
- SELECT`name`,`add`
- FROM`t1`
- GROUPBY`name`,`add`
- HAVINGCOUNT(1)>1
- )AS`t2`
- WHERE`t1`.`name`=`t2`.`name`
- AND`t1`.`add`=`t2`.`add`;
- +----+------+-----+
- |id|name|add|
- +----+------+-----+
- |1|abc|123|
- |2|abc|123|
- |4|abc|123|
- |6|xzy|456|
- |7|xzy|456|
- |8|xzy|456|
- |9|xzy|789|
- |11|xzy|789|
- |12|ijk|147|
- |13|ijk|147|
- |19|tpk|963|
- |20|tpk|963|
- |21|wer|546|
- |22|wer|546|
- +----+------+-----+
- rowsinset(0.00sec)
查找除id最小的数据外的重复数据,代码如下:
- /*查找除id最小的数据外的重复数据*/
- SELECT`t1`.*
- FROM`t1`,(
- SELECTDISTINCTMIN(`id`)AS`id`,`name`,`add`
- FROM`t1`
- GROUPBY`name`,`add`
- HAVINGCOUNT(1)>1
- )AS`t2`
- WHERE`t1`.`name`=`t2`.`name`
- AND`t1`.`add`=`t2`.`add`
- AND`t1`.`id`<>`t2`.`id`;
- +----+------+-----+
- |id|name|add|
- +----+------+-----+
- |2|abc|123|
- |4|abc|123|
- |7|xzy|456|
- |8|xzy|456|
- |11|xzy|789|
- |13|ijk|147|
- |20|tpk|963|
- |22|wer|546|
- +----+------+-----+
- rowsinset(0.00sec)
例2,表中没有主键(可唯一标识的字段),或者主键并非数字类型(也可以删除重复数据,但效率上肯定比较慢).
例2测试数据,代码如下:
- /*表结构*/
- DROPTABLEIFEXISTS`noid`;
- CREATETABLEIFNOTEXISTS`noid`(
- `pk`VARCHAR(20)NOTNULLCOMMENT'字符串主键',
- `name`VARCHAR(20)NOTNULL,
- `add`VARCHAR(20)NOTNULL,
- PRIMARYKEY(`pk`)
- )Engine=InnoDB;
- /*测试数据,与上例一样的测试数据,只是主键变为字符串形式*/
- INSERTINTO`noid`(`pk`,`name`,`add`)VALUES
- ('a','abc',"123"),
- ('b','abc',"123"),
- ('c','abc',"321"),
- ('d','abc',"123"),
- ('e','xzy',"123"),
- ('f','xzy',"456"),
- ('g','xzy',"456"),
- ('h','xzy',"456"),
- ('i','xzy',"789"),
- ('j','xzy',"987"),
- ('k','xzy',"789"),
- ('l','ijk',"147"),
- ('m','ijk',"147"),
- ('n','ijk',"852"),
- ('o','opq',"852"),
- ('p','opq',"963"),
- ('q','opq',"741"),
- ('r','tpk',"741"),
- ('s','tpk',"963"),
- ('t','tpk',"963"),
- ('u','wer',"546"),
- ('v','wer',"546"),
- ('w','once',"546");
- SELECT*FROM`noid`;
- +----+------+-----+
- |pk|name|add|
- +----+------+-----+
- |a|abc|123|
- |b|abc|123|
- |c|abc|321|
- |d|abc|123|
- |e|xzy|123|
- |f|xzy|456|
- |g|xzy|456|
- |h|xzy|456|
- |i|xzy|789|
- |j|xzy|987|
- |k|xzy|789|
- |l|ijk|147|
- |m|ijk|147|
- |n|ijk|852|
- |o|opq|852|
- |p|opq|963|
- |q|opq|741|
- |r|tpk|741|
- |s|tpk|963|
- |t|tpk|963|
- |u|wer|546|
- |v|wer|546|
- |w|once|546|
- +----+------+-----+
- rowsinset(0.00sec)
为表添加自增长的id字段,代码如下:
- /*为表添加自增长的id字段*/
- ALTERTABLE`noid`ADD`id`INT(1)NOTNULLAUTO_INCREMENT,ADDINDEX`id`(`id`);
- QueryOK,23rowsaffected(0.16sec)
- Records:23Duplicates:0Warnings:0
- SELECT*FROM`noid`;
- +----+------+-----+----+
- |pk|name|add|id|
- +----+------+-----+----+
- |a|abc|123|1|
- |b|abc|123|2|
- |c|abc|321|3|
- |d|abc|123|4|
- |e|xzy|123|5|
- |f|xzy|456|6|
- |g|xzy|456|7|
- |h|xzy|456|8|
- |i|xzy|789|9|
- |j|xzy|987|10|
- |k|xzy|789|11|
- |l|ijk|147|12|
- |m|ijk|147|13|
- |n|ijk|852|14|
- |o|opq|852|15|
- |p|opq|963|16|
- |q|opq|741|17|
- |r|tpk|741|18|
- |s|tpk|963|19|
- |t|tpk|963|20|
- |u|wer|546|21|
- |v|wer|546|22|
- |w|once|546|23|
- +----+------+-----+----+
- rowsinset(0.00sec)
MySQL中必须是有索引的字段才可以使用AUTO_INCREMENT.
删除重复数据与上例一样,记得删除完数据把id字段也删除了,删除重复数据,只保留一条数据,代码如下:
- /*删除重复数据,只保留一条数据*/
- DELETEFROM`noid`
- USING`noid`,(
- SELECTDISTINCTMIN(`id`)AS`id`,`name`,`add`
- FROM`noid`
- GROUPBY`name`,`add`
- HAVINGCOUNT(1)>1
- )AS`t2`
- WHERE`noid`.`name`=`t2`.`name`
- AND`noid`.`add`=`t2`.`add`
- AND`noid`.`id`<>`t2`.`id`;
- QueryOK,8rowsaffected(0.05sec)
- /*删除id字段*/
- ALTERTABLE`noid`DROP`id`;
- QueryOK,15rowsaffected(0.16sec)
- Records:15Duplicates:0Warnings:0
- SELECT*FROM`noid`;
- +----+------+-----+
- |pk|name|add|
- +----+------+-----+
- |a|abc|123|
- |c|abc|321|
- |e|xzy|123|
- |f|xzy|456|
- |i|xzy|789|
- |j|xzy|987|
- |l|ijk|147|
- |n|ijk|852|
- |o|opq|852|
- |p|opq|963|
- |q|opq|741|
- |r|tpk|741|
- |s|tpk|963|
- |u|wer|546|
- |w|once|546|
- +----+------+-----+
- rowsinset(0.00sec)