在应用中一个表中出现大量重复记录是常事,但有的时间我们希望过滤重复数据并取重复记录的一条记录,下面我来给大家介绍一个取重复记录其中一条的方法.
如下表,代码如下:
- CREATETABLE`t1`(
- `userid`INT(11)DEFAULTNULL,
- `atime`datetimeDEFAULTNULL,
- KEY`idx_userid`(`userid`)
- )ENGINE=InnoDBDEFAULTCHARSET=utf8;
数据如下,代码如下:
- MySQL>SELECT*FROMt1;
- +--------+---------------------+
- |userid|atime|
- +--------+---------------------+
- |1|2013-08-1211:05:25|
- |2|2013-08-1211:05:29|
- |3|2013-08-1211:05:32|
- |5|2013-08-1211:05:34|
- |1|2013-08-1211:05:40|
- |2|2013-08-1211:05:43|
- |3|2013-08-1211:05:48|
- |5|2013-08-1211:06:03|
- +--------+---------------------+
- 8ROWSINSET(0.00sec)
其中userid不唯一,要求取表中每个userid对应的时间离现在最近的一条记录,初看到一个这条件一般都会想到借用临时表及添加主建借助于join操作之类的.
给一个简方法,代码如下:
- MySQL>SELECTuserid,substring_index(group_concat(atimeORDERBYatimeDESC),",",1)ASatimeFROMt1GROUPBYuserid;
- +--------+---------------------+
- |userid|atime|
- +--------+---------------------+
- |1|2013-08-1211:05:40|
- |2|2013-08-1211:05:43|
- |3|2013-08-1211:05:48|
- |5|2013-08-1211:06:03|--phpfensi.com
- +--------+---------------------+
- 4ROWSINSET(0.03sec)
查询及删除重复记录,删除表中多余的重复记录,重复记录是根据单个字段(peopleId)来判断,只留有rowid最小的记录,代码如下:
- deletefrompeople
- wherepeopleIdin(selectpeopleIdfrompeoplegroupbypeopleIdhavingcount(peopleId)>1)
- androwidnotin(selectmin(rowid)frompeoplegroupbypeopleIdhavingcount(peopleId)>1)
查找表中多余的重复记录(多个字段),代码如下:
- select*fromvitaea
- where(a.peopleId,a.seq)in(selectpeopleId,seqfromvitaegroupbypeopleId,seqhavingcount(*)>1)
删除表中多余的重复记录(多个字段),只留有rowid最小的记录,代码如下:
- deletefromvitaea
- where(a.peopleId,a.seq)in(selectpeopleId,seqfromvitaegroupbypeopleId,seqhavingcount(*)>1)
- androwidnotin(selectmin(rowid)fromvitaegroupbypeopleId,seqhavingcount(*)>1)
查找表中多余的重复记录(多个字段),不包含rowid最小的记录,代码如下:
- select*fromvitaea
- where(a.peopleId,a.seq)in(selectpeopleId,seqfromvitaegroupbypeopleId,seqhavingcount(*)>1)
- androwidnotin(selectmin(rowid)fromvitaegroupbypeopleId,seqhavingcount(*)>1)