access快速删除重复数据？

2024-11-19 07:31:54

推荐回答（2个）

回答1：

not in 逻辑上很容易理解，但是碰上大数据表时，不管有无可被利用的索引运行起来都会发生效率悲剧！应尽量避免使用。如果记录行达到数万、数十万行以上时，语句运行花上几个小时甚至数天才有结果都是很正常的。

如果Id字段是自增ID(建有索引)，下列写法的效率将会很高：

delete from zhifu a where not exists(select 1 from (select min(id) as idmin from zhifu group by bianhao)b where b.idmin=a.id)

如果Id字段没有索引，但是没有重复值，前面的写法效率也不行。推荐下列写法：

delete from zhifu c where exists(select 1 from (select a.id
FROM zhifu AS a LEFT JOIN (select min(id) as idmin from zhifu group by bianhao)b ON b.idmin=a.id
WHERE b.idmin is null)d where d.id=c.id)

如果Id字段不是自增ID，又没有索引，而且有重复值。这样就比较麻烦了，建议分多步走：
先创建一个临时表，将有重复的记录只取一条存入临时表，然后删除表中有重复的记录，再将临时表中的数据插入会原始表，最后在删除临时表。

假设用TEMP做临时表名
1)取有重复的唯一记录存入临时表（注意各个字段的顺序不要搞乱，而且要写齐）
select first(id),first(bianhao),....into temp from zhifu group by bianhao having count(*)>1
2)删除原始表中的有重复记录
delete from zhifu a where exists(select 1 from (select bianhao from zhifu group by bianhao having count(*)>1)b where b.bianhao=a.bianhao)
3)将临时表中的记录插入回原始表
insert into zhifu select * from temp
4)删除临时表
drop table temp

当然对付小数据表用NOT IN就行了，不用搞得那么麻烦。

回答2：

delete from zhifu t1
inner join (select bianhao, min(id) mid from zhifu group by bianhao) t2
on t1.bianhao=t2.bianhao
where t1.id>t2.mid
没测试过，自己试下