Let me first explain what this question is all about. Let’s say you have table which has names as shown in the below figure. Now in the names table you have duplicate records (ex. Shiv) and this table does not have a primary key.
So the question is how can we delete duplicate records and keep one of the records from the duplicates. For example from the below table how can we delete one “Shiv” and keep the other one.
Now there are lots of ways of doing and the best I found personally was by using “Row_Number” and “CTE” (common table expression). In case you are not aware of CTE and row_number please refer(What is CTE ) SQL Server interview question and (What is row_number ?) SQL Server interview question.
It’s a 2 step process:-
• Create a temp result set using CTE which has a new column which uses “row_number” using partition.
Now the “row_number” with partition will create unique numbers for unique records with same. In case the records are same it will increment the number. For instance you can see for duplicate Shiv record, he has numbered them 1 and 2.But for “Raju” and “Shyam” he has created fresh number sequence.
• Once the CTE is created delete the records whose row sequence number is greater than one.
Below is the complete code snippet for the same.
Here are some SQL Server interview question around database design taken from the book SQL Server interview question by Shivprasadkoirala.