Library

How to find and delete duplicate records in Firebird database

DELETE command (as it is defined in SQL standard) always deletes all records which satisfy the condition. Usually, we want to leave one of the duplicates and delete others. To do it, we need to use RDB$DB_KEY - an implicit physical record number in Firebird database, it can be used to create an artificial difference between identical records.

 DELETE FROM XXX T1 WHERE EXISTS 
  (SELECT * FROM XXX T2 WHERE 
     (T2.column1 = T1.column1 or (T2.column1 is null and T2.column1 is null)) AND 
     (T2.column2 = T1.column2 or (T2.column2 is null and T2.column2 is null)) AND 
     (.......) AND 
     (T2.RDB$DB_KEY > T1.RDB$DB_KEY))


In this case, we are using physical RDB$DB_KEY to distinguish 2 records with identical contents in specified fields (column1, column1, etc), and record with the bigger RDB$DB_KEY.

To find duplicates you can use the following query with RDB$DB_KEY condition:

SELECT * FROM TABLE T1 
WHERE EXISTS 
   (SELECT FIELD FROM TABLE T2 
    WHERE T1.FIELD = T2.FIELD AND  
          T1.RDB$DB_KEY != T2.RDB$DB_KEY)