Created
May 24, 2018 10:02
-
-
Save jometho/7732ee518f8752e6320618c0ccad3af0 to your computer and use it in GitHub Desktop.
Handling duplicates records in postgresql
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--Given table coverage with field name which is guaranteed uniqueness | |
-- You can select duplicated records with this "sequel" on the table | |
select coverage.name, count(*) from catalog.coverage GROUP BY coverage.name | |
HAVING count(*) > 1; | |
--Same scenario now say you want to delete the redundant tables and remain with only one record for each | |
--You can use this SQL, where name is the unique column | |
--ctid is a postgresql system columns for the physical row location of a record | |
DELETE FROM catalog.coverage a USING ( | |
SELECT MIN(ctid) as ctid, name | |
FROM catalog.coverage | |
GROUP BY name HAVING COUNT(*) > 1 | |
) b | |
WHERE a.name = b.name | |
AND a.ctid <> b.ctid |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment