Monday, March 22, 2010

SQL SERVER – Delete Duplicate Records

SQL SERVER – Delete Duplicate Records – Rows

March 1, 2007 by pinaldave

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn2)

Reference : Pinal Dave (http://blog.SQLAuthority.com)
Ads by Google
Compare SQL databases
Easily compare and synchronize your SQL databases. Free 14-day trial.
www.idera.com



Posted in Pinal Dave, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology | Tagged Duplicate Records | 232 Comments
232 Responses

1.
on April 16, 2007 at 5:28 am | Reply Rahul

if there is no key in the table then what ?

I have on table named [Duplicate] {ID int,FNAME varchar(10),MNAME varchar(10)}

Here there is no key and here are duplicate rows. so hoca can i delete this duplicate rows.

Check Data

ID FNAME LNAME
1 AAA CCC
2 BBB DDD
1 AAA CCC
2 BBB DDD
1 AAA CCC
2 BBB DDD
3 BCB DGD

Remove duplicate rows and keep the data in to the table like this using single query.

ID FNAME LNAME
1 AAA CCC
2 BBB DDD
3 BCB DGD

PLease if possible help because i faced this question in many interviews.
Reply me

2.
on April 16, 2007 at 6:31 am | Reply pinaldave

Good Question. I have been asking this question in interview many times to new candidates. Answer is:
1) Add Identity Col and perform the operation displayed in this blog and drop Identity Col.
2) User Cursor and save your values in temp table. If you receive the same row which you put in temp table delete it.
3) You can use while loop as well do the same as cursor.

Pinal Dave
(http://www.sqlauthority.com)

*
on December 9, 2009 at 1:14 pm | Reply siddhartha

Hi Pinal,

Could you please answer ,what will be the exact query for Rahul’s posted question where he has listed an example.I am stuck with the same example.Provide One or More answers if possible and by writing the exact query.
Thanks.

o
on December 9, 2009 at 4:36 pm | Reply Tejas Shah

Hi Siddhatha,

Pinal already mentioned steps for that.
I would like to use ROW_NUMBER which is available from SQL SERVER 2005.

DECLARE @Duplicate TABLE (
ID INT,
FNAME VARCHAR(10),
MNAME VARCHAR(10)
)

INSERT INTO @Duplicate VALUES(1, ‘AAA’,'CCC’)
INSERT INTO @Duplicate VALUES(2, ‘BBB’,'DDD’)
INSERT INTO @Duplicate VALUES(1, ‘AAA’,'CCC’)
INSERT INTO @Duplicate VALUES(2, ‘BBB’,'DDD’)
INSERT INTO @Duplicate VALUES(1, ‘AAA’,'CCC’)
INSERT INTO @Duplicate VALUES(2, ‘BBB’,'DDD’)
INSERT INTO @Duplicate VALUES(3, ‘BCB’,'DGD’)

–FOR SQL SERVER 2005 and above
;WITH CTE as(
SELECT ROW_NUMBER() OVER(PARTITION BY ID, FName, MName ORDER BY (SELECT 1)) AS RowID,
*
FROM @Duplicate
)
SELECT ID, FName, MName
FROM CTE
WHERE RowID = 1

*
on February 23, 2010 at 1:25 pm | Reply Abhijit

Pinal,
How about using the ROW_NUMBER() to delete the duplicate records?

Thanks,
Abhijit

o
on February 23, 2010 at 9:14 pm | Reply Pinal Dave

Hello Abhijit,

Yes, the method describe just above by Tejas is a good alternative to remove duplicate records.

Regards,
Pinal Dave

+
on February 25, 2010 at 3:55 pm | Reply Haripriya

sir ,

pls temme 4 sql server 2000.Me too Having the same probs.
i have nearly some lakhs record… currently am working… by my small mistake i had executed two times the same insert query… donno wat to do….pls help

#
on February 25, 2010 at 3:56 pm Haripriya

i wanna delete the duplicated records …. and that too all are got shuffled…

+
on February 26, 2010 at 9:30 am | Reply Haripriya

pinal,

help me…

#
on February 26, 2010 at 3:09 pm Pinal Dave

Hello Haripriya,

The best method depends on the structure and data in table. But for now I can provide you the easiest method that is storing the distinct records (SELECT DISTINCT * FROM table) in another table and then replace the original table with new table.
Another method is describe here:
http://support.microsoft.com/kb/70956

Regards,
Pinal Dave

3.
on April 20, 2007 at 2:11 pm | Reply Dan W

Wouldn’t it be a little more simple to use Group By or Distinct to get the unique records in that situation?
SELECT ID, FNAME, LNAME
FROM Duplicate
GROUP BY ID, FNAME, LNAME
would give you the unique rows of Duplicate

Rough Steps of one possibility to fix the table:
1) Save the results of the group by query to a temp table
2) Delete rows from the original table
3) Fill original table with rows of temp table

That you way you don’t have to use a cursor or change the design of table Duplicate. However, in the real world, you’d want to add a unique constraint to the ID column of table Duplicate to make sure it doesn’t get any more duplicates by ID.

4.
on April 20, 2007 at 2:17 pm | Reply pinaldave

David,

Thanks! That is good idea.

Regards,
Pinal Dave
(http://www.SQLAuthority.com)

5.
on April 20, 2007 at 2:31 pm | Reply Dan W

Whoops, I almost forgot…

Thanks for the post about deleting duplicates! Your code was easy find with a Google search. It’s exactly what I was needing. It’s simple and elegant! I sometimes find it difficult to think of how to do things like that, so it was nice to find your post about it!

6.
on May 1, 2007 at 8:19 pm | Reply Ben M

While searching for this solution I found many others which use cursors and complicated SQL statements but in the end they do the same thing as your simple & powerful solution. Thank you for posting it online!

7.
on May 1, 2007 at 8:21 pm | Reply pinaldave

My pleasure.

Kind Regards,
Pinal Dave
(http://www.SQLAuthority.com)

8.
on May 24, 2007 at 4:23 am | Reply Jelle

Seriously thanks,

Keep plopping my head why i didnt come up with simplicity.

– note
Plopping is a merely profound and uncatastrophic way of activating the tiny brain members that are usually on vacation.

– note 2
nevermind the note

9.
on May 24, 2007 at 9:29 am | Reply jaort2007

Pinal,

Once again you are the man! I looked at other sites to answer this question and I found this SQL statement to be the best and easiest. Now I know to just come straight here.

10.
on May 31, 2007 at 4:56 am | Reply Ravi

If you are looking for deleting duplicates of more than 2 records, then the following code can be used…

set nocount on
drop table tbl1
create table tbl1
(
col1 int
)
insert into tbl1 values(1)
insert into tbl1 values(1)
insert into tbl1 values(1)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(3)
insert into tbl1 values(3)

set rowcount 1
select ’start’
while @@rowcount > 0 delete a from tbl1 a where (select count(*) from tbl1 b where a.col1 = b.col1)>1
set rowcount 0
select * from tbl1
set nocount off

11.
on May 31, 2007 at 5:01 am | Reply Ravi

If you are looking for deleting multiple records (duplicate with more than 2 records like 3,4,5 duplicate records, etc) and without key column then the following method can be adopted.

Lets say your table is tab1.

SELECT DISTINCT * INTO tab2 FROM tab1 — distinct of all columns

DROP TABLE tab1

EXEC sp_rename ‘tab2′,’tab1′

Note:
Take a back up of your original table before using dropping for your reference. You may need to grant permission of tab2 as given for tab1

12.
on May 31, 2007 at 2:51 pm | Reply Izzak

Simple is good! You Rock!

(I found so many answers elsewhere and it’s incredible how 6 lines of code and one push of a button can do with your code)

13.
on June 14, 2007 at 12:23 am | Reply Amit Jain

Ravi,
your solution seems to be excelent but what about triggers and indexes defined on that table tab1.
will they remain intact??
Please repsond ….

Amit

14.
on June 14, 2007 at 6:03 am | Reply pinaldave

Amit,

Answer to your question is that will not remain intact. You may have to recreate all of them as well as all the constraints.

It is recommended to use the method explained in the original post to delete duplicate records.

Regards,
Pinal Dave(http://www.SQLAuthority.com)

15.
on June 26, 2007 at 3:36 pm | Reply Vamshi

Well to answer Rahul’s question and modify Ravi’s and Pinal Dave query…
Here is a single query which will answer all your questions but this still needs Primary key or identity column to be added to the table though.
DELETE FROM MyTable
WHERE EXISTS (
SELECT * FROM MyTable AS b
WHERE
b.[col1] = MyTable.[col1]
AND b.[col2] = MyTable.[col2]
AND b.[col3] = MyTable.[col3]
GROUP BY
b.[col1], b.[col2], b.[col3]
HAVING
MyTable.[ID] > MIN(b.[ID])
)
If you want most recent records to be present in your DB and delete multiple Old duplicate records, You can make use of
DELETE FROM MyTable
WHERE EXISTS (
SELECT * FROM MyTable AS b
WHERE
b.[col1] = MyTable.[col1]
AND b.[col2] = MyTable.[col2]
AND b.[col3] = MyTable.[col3]
GROUP BY
b.[col1], b.[col2], b.[col3]
HAVING
MyTable.[ID] MIN(b.[ID])
)
Hope this answers all your questions.
If you dont have Primary key or Identity Column then you can always create it and run this query and delete the Primary Key or identity column.
“there are always better things to do to make lives easier.”
enjoy your day
Vamshi

16.
on July 3, 2007 at 1:49 am | Reply Senthilnathan

IN SQL SERVER 2005, This can be easily achieved without crating unique identifier by using CTE and ROW_NUMBER (), the modified query for sql server 2005 goes here
***********************************************
WITH T1 AS (SELECT ROW_NUMBER ( ) OVER ( PARTITION BY ID, FNAME, LNAME ORDER BY ID ) AS RNUM FROM DUPLICATE )
DELETE FROM T1 WHERE RNUM > 1
***********************************************
To get a detail grasp on ROW_NUMBER () OVER () … Refer MSDN http://msdn2.microsoft.com/en-us/library/ms186734.aspx for.

17.
on July 9, 2007 at 5:02 am | Reply Ankur

I have by mistake duplicated all my records.

Through export/import wizard I selected my database (e.g pension) and copied tables to the desination, later I came to know that source and destination where the same.

Therefore, in all the table all rows are duplicated.

Can someone solve the problem.

18.
on July 11, 2007 at 7:11 am | Reply SQL SERVER - Count Duplicate Records - Rows Journey to SQL Authority with Pinal Dave

[...] 11th, 2007 by pinaldave In my previous article SQL SERVER – Delete Duplicate Records – Rows, we have seen how we can delete all the duplicate records in one simple query. In this article we [...]

19.
on July 13, 2007 at 4:28 pm | Reply satish ramahnujan

Thanks Ravi, your information helped. I modified the query in the following manner and it worked fine:

Select distinct * into temp from tab1;
Delete tab1;

Insert tab1 Select * from temp;

drop table temp;

20.
on July 24, 2007 at 10:23 am | Reply Dipak

how can we delete duplicate rows if table doesn’t have any identity column and requirement is you don’t have to insert one column as identity and you don’t have to use cursor and you don’t have to use temp table. so query should be a single

Please reply

21.
on July 24, 2007 at 10:27 am | Reply pinaldave

Dipak,

What in application makes you have this many requirement?
I would do what I have suggested in initial post or add identity column and drop it afterwards.

Regards,
Pinal Dave (SQLAuthority.com)

*
on November 10, 2009 at 9:09 pm | Reply Aditya

Pinal rather than adding a ID col and dropping it I think WHILE loop (which you already suggested) will be better opeion and even better than that is use ROW_NUMBER() function and simply delete the records which has row_number count > 1

select row_number() over (partition by col order by col from tbl) and just delete where row_number() is greater than one and if one can set rowcount 1 also so if 2 records are exactly identical only 1 will be deleted….

thanks

22.
on July 24, 2007 at 10:33 am | Reply Dipak

this question i have asked in one of my interview, and requirement is like delete record should be in single query its ok if you can write subquery, but without use of temp table, without use cursor, and without use of insert identity column. still i am waiting for the answer,

23.
on July 24, 2007 at 10:35 am | Reply pinaldave

Please read the comments above there are solutions to your question. This blog readers have already answered that.

24.
on July 24, 2007 at 12:05 pm | Reply Tayeb

I’m trying to keep only the recent SaleDate in the following table. How can I do that?

ID SaleDate SaleAmount
40 6/23/2003 242
40 12/28/2001 212
40 6/13/1994 111
41 11/30/2001 233
41 10/25/1996 15
41 4/21/1994 132
42 6/17/2005 2765
42 5/14/1994 147
43 5/9/1994 145

Thanks much

25.
on July 31, 2007 at 11:42 am | Reply Vamshi

Tayeb below query should work for you

DELETE FROM #tempTab WHERE
EXISTS
(SELECT ID FROM #tempTab AS b WHERE #tempTab.ID = b.ID
GROUP BY b.ID
HAVING #tempTab.SaleDate < MAX(b.SaleDate)) 26. on August 3, 2007 at 10:00 am | Reply Vamshi Depak, to answer your question here is a solution in SQL Server 2005 CREATE TABLE #Table1 (col1 int, col2 int) INSERT INTO #Table1 VALUES (1, 1000) INSERT INTO #Table1 VALUES (2, 2000) INSERT INTO #Table1 VALUES (2, 2000) INSERT INTO #Table1 VALUES (3, 3000) INSERT INTO #Table1 VALUES (3, 3000) INSERT INTO #Table1 VALUES (4, 4000) SELECT * FROM #Table1; WITH T1 AS (SELECT (ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col1)) AS RNum FROM #Table1) DELETE FROM T1 WHERE RNum IN (SELECT a.RNum FROM T1 AS a, T1 AS b WHERE a.RNum > b.RNum GROUP BY a.RNum)

SELECT * FROM #Table1

DROP TABLE #Table1

ROW_NUMBER() will solve your purpose.

27.
on August 3, 2007 at 10:22 am | Reply pinaldave

Vamshi,

Excellent post. If you prefer, you can write post with little more explanation. I will be happy to post on my blog with your name.

Regards,
Pinal Dave (http://www.SQLAuthority.com)

28.
on August 10, 2007 at 8:12 am | Reply Geetha

Hi
I had problem with deleting dupilicate rows. i have used your code. work fine. thanks a lot.
Geetha

29.
on September 7, 2007 at 8:24 am | Reply Jaanoo

Awesome man. I have used you delete duplicate records query. And it works great.

Thanks

30.
on September 14, 2007 at 11:30 am | Reply Scollege

Unfortunately, this query doesn’t seem to work when attempting to delete a row that has duplicate id’s but different data.

Can’t believe how hard it is to find something simple that will work.

31.
on September 23, 2007 at 2:07 am | Reply Aditya

Why don’e we do a self union to the table.

select * from mytable union select * from mytable,

thsi query returns the exact records, and i guess we can use this data to build a new table or replacing the previous one.

Thoughts on this please

32.
on September 25, 2007 at 10:38 am | Reply bingo

I am trying to design an query which functions like ICETOOL of mainframe it also requires this kind of duplicate deletions…

ICETOOL in my application (sorts, merges and separates) duplicates and uniques from 4 input file into two output files(one containing unique ids and other containing its duplicate ids with different data) and discards the duplicates which occurs more than twice…

i have the 4 files as tables now i need to get two output tables

help will be much appreciated………

33.
on September 26, 2007 at 7:05 am | Reply Santosh

Hi
Sir can u please let me how to pass aliase name in Stored
procedure when I am using table as Parameter in A SP

34.
on September 26, 2007 at 11:51 am | Reply Emil

How would your solution work if I had a single table called Customers that looks like this:

ID LastName FirstName
1 Smith John Adam
2 Smith John A.
3 Jones Jane Mary
4 Jones Jane M.

No other fields are in this table. Just what you see. The ID field is unique for each record. In this example, both records for Smith are the same person. The same goes for Jones. I want to view in a table the resulting unique records (i.e. only one record for John Smith, and one record for Jane Jones).

I am a complete newbie and would appreciate guidance.

Thanks!

35.
on September 27, 2007 at 5:48 am | Reply Aditya

If you have Key already, you have seen many solutions in this post before, you need to group by on the repeating/columns that can be duplicated and delete them. The answered solution is if you dont have a unique key in the table.

well to add more to that, once you do self union and insert into temp table, you can delete every thing form main table and insert every thing from temp table in to the original/main table. I think this should help..

-Aditya

36.
on October 8, 2007 at 4:38 am | Reply Joe

SELECT id,name,mNumber,sNumber,partOfSpeech,type FROM keyword where id IN (Select max(id) from keyword GROUP BY name)

The above is if you want to just select a column without duplicates in it.

37.
on October 11, 2007 at 4:15 am | Reply Ravish Bharti

create table T1 (C1 int, C2 varchar(3))

–Fill it with duplicate Records

declare @as Table (c1 int,c2 varchar(3))
insert into @as select Distinct * from T1
Truncate table T1
Alter Table T1 Disable Trigger All
insert into T1 select * from @As
Alter Table T1 Enable Trigger All

38.
on October 19, 2007 at 10:13 pm | Reply Firdaus Juzup

Thanks Mr. Dave for the codes.
I’ve searched inside out in the internet but still couldn’t find a solution for my problem.
Your code manage to solved it, your truly a master in SQL. Thanks again

39.
on November 12, 2007 at 12:18 am | Reply Mhlove

Can you show how to make this a select Into, so that one can save the deleted records in another table

DELETE
FROM MyTable
WHERE ID NOT IN
(SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicatevalueColumn1, DuplicateValueColumn2, DuplicateValueColumn2

40.
on November 13, 2007 at 5:46 pm | Reply raja

I thing this one alos workout
====================
set rowcount 1
delete test
from test a
where(select count(*) from test b where b.age = a.age and b.name = a.name)
> 1
while @@rowcount>0
delete test
from test a
where(select count(*) from test b where b.age = a.age and b.name = a.name)
>1
set rowcount 0

41.
on November 13, 2007 at 6:58 pm | Reply Dilip Mohapatro

//Here n is maximum allowable number of a record
create or replace procedure t_dili1_2
(n in int,q in out sys_refcursor)
as

begin

open q for

select a,b
from dil1 where (a,b) in(
select a,b from (select a,b,count(*) from t group by a,b having count(*) 1) and AddressCount = 0

cheers,
anand.

58.
on December 25, 2007 at 3:49 pm | Reply jaya

very nice logic…u r great

59.
on December 26, 2007 at 11:57 am | Reply ATIN

–For finding second highest salary
select max(salary) from emp where salary 1

drop view abc

62.
on January 2, 2008 at 3:42 pm | Reply Kanhaiya

Hi All,

I would like to share one suggestion that is :
should’t We conclude each topic with one best answer (if we can).

63.
on January 4, 2008 at 12:35 pm | Reply Priya

Hi,

Thank you so much for sharing your knowledge.
Great Work.

64.
on January 4, 2008 at 6:18 pm | Reply satish

hi,
realy this is super i have not found any where very nice.
cheers,
Satish

65.
on January 5, 2008 at 1:43 pm | Reply Suchita

Hello Sir,

I recently joined your site, and found it really very helpful.

How about using ‘ROWID’ to delete the duplicate rows.
Please check this query.

DELETE
FROM MyTable
WHERE ROWID NOT IN
(SELECT MIN(ROWID)
FROM MyTable
GROUP BY DUPL_COL1,DUPL_COL2,DUPL_COL3)…
–All col names

Now, my question is that if I have more than two duplicate records I want to keep 2 of them and to remove rest.
How can I do it?

Please help out.

Thank you.

66.
on January 5, 2008 at 2:01 pm | Reply ssatish kumar

Hi

its a great thing to share knowledge

thanks for your help

ssatish kumar

67.
on January 8, 2008 at 11:49 am | Reply Angadi Doddappa

Hi ATIN(59),

For finding only second highest salary – -

select * from
(select * from employee orderby salary desc
where rownum>=2)
minus
select * from
(select * from employee orderby salary desc
where rownum>=1) ;

And to get Only Top 2 Highest salary –

select * from
(select * from employee orderby salary desc
where rownum>=2);

Thanks & Regards
Angadi Doddappa

68.
on January 12, 2008 at 1:19 pm | Reply Sameer Bhatnagar

Thanks Pinal Dave….

U are doing a great Job…

All the Best all Of u…

Jai Hind…

69.
on January 15, 2008 at 12:01 pm | Reply Ranjith

Hi,

This page looks really cool, hope I will get answer for my question, I have a table with 35 columns and have duplicate rows based on 6 columns. So how do I remove duplicates and keep the original rows in the table, keep in mind table has around 500,000 rows.

70.
on January 25, 2008 at 5:44 pm | Reply mansi

hiii
i want to select the second highest salary from an Employee table!
how to do that!

Thanks!

71.
on January 29, 2008 at 11:15 am | Reply Rashmi

Hi, please help me…. I couldn’t find answer for this anywhere…

I have a table with a column having data type varchar and it has data with numbers and alphabets in it. I want to replace the data in the column with ‘ ‘ where there are alphabets.
(REPLACE((column),’S',”)) — I know this would replace s with ‘ ‘, but how to replace all the alphabets with ‘ ‘.

Please answer, I urgently need the answer.

Thanks to all in advance….

72.
on January 29, 2008 at 1:08 pm | Reply Besho

Hellooo

I have a table for example EMployees, it has a PID which is auto number, and other 2 fields one EmployeeID and other is Name.

what i want is to delete duplicated data in Employee ID and Name for sure, what to do plz….

Thanks

73.
on January 31, 2008 at 3:44 pm | Reply venkat

DELETE
FROM EMployees
WHERE Employee ID IN
(select Employee ID
from EMployees
group by EMployees
having count(EMployees) > 1)

just try this……;

74.
on January 31, 2008 at 3:46 pm | Reply venkat

can any budy explain trigger concept…

75.
on February 3, 2008 at 10:11 pm | Reply Girish

this question i have asked in one of my interview, and requirement is like delete record should be in single query its ok if you can write subquery, but without use of temp table, without use cursor, and without use of insert identity column. still i am waiting for the answer,

Pls give solution as in sql server 2000

76.
on February 4, 2008 at 12:01 pm | Reply Pankaj

Hi
I am having a problem in generating reports from two SQL tables.
eg.
table1.
loc1 10
loc2 10
loc3 40

Table2
loc1 xx
loc1 yy
loc1 zz
loc3 pp

I want to print.

Table1 + table2
loc1 10 loc1 xx
null loc1 yy
null loc1 zz
loc2 10 null
loc3 40 loc3 pp

Any way to generate this using joins or any other.Pl. advise

pankaj

77.
on February 8, 2008 at 4:41 pm | Reply Raja

U are doing a great Job…

All the Best all Of u…

i like this site

thanks

78.
on February 9, 2008 at 5:08 am | Reply Jude

Great solution, however….i have a table that has…………..

1] Duplicate Id’s
2] Unique id’s

Now on the basis of your query i would be able to delete the duplicate id’s(and records corresponding to that is)..but that would delete all subsequent unique id’s as well.

So basically i wanto find all duplicates, choose one, delete the rest and keep the unique records as well.

Could you or anyone please help me find a solution to this…

Thanks!

79.
on February 10, 2008 at 4:31 am | Reply Raju

Marvellous men!!!!!!!!1

80.
on February 11, 2008 at 6:56 pm | Reply pradeep

Hi ,

Pls help in this issue .
create table test1
( A int , B varchar(2) , C varchar(20) , D int , E varchar(20))

insert into test1 values (1,’A',’aaa’ , 111, ‘adf’)
insert into test1 values (2,’B',’bbb’ , 222, ‘adsf’)
insert into test1 values (3,’C',’ccc’ , 333, ‘cfe’)
insert into test1 values (1,’A',’aaa’ , 444, ‘vgd’)
insert into test1 values (4,’D',’ddd’ , 555, ‘qef’)

Now for the above table Columns A , B and C (combined) form the unique identifiers for a given record.

I would like a query to retrieve the unique records(identified by columns A , B and C in the table )
i.e
2,’B',’bbb’ , 222, ‘adsf’
3,’C',’ccc’ , 333, ‘cfe’
4,’D',’ddd’ , 555, ‘qef’

Constraint being that i cannot add any composite / identity for the three columns (A,B ,C)

Any help is highly appreciated……

81.
on February 12, 2008 at 5:20 pm | Reply Faisal qureshi

its really nice

regds
faisal qureshi

82.
on February 13, 2008 at 3:05 pm | Reply Tarun Pant

My solution is:

select * from DUPLICATE
union
select * from DUPLICATE

This query will filter out all duplicate records from DUPLICATE table

Tarun

83.
on February 13, 2008 at 4:05 pm | Reply A.K.Rastogi

71.
Hi, please help me…. I couldn’t find answer for this anywhere…

I have a table with a column having data type varchar and it has data with numbers and alphabets in it. I want to replace the data in the column with ‘ ‘ where there are alphabets.
(REPLACE((column),’S’,”)) — I know this would replace s with ‘ ‘, but how to replace all the alphabets with ‘ ‘.

Please answer, I urgently need the answer.

The Answer is as follow :
You have to use Stuff function in place of Replace.
e.g.

select stuff(column,1,len(column),”) from Table_name

A.K.Rastogi

84.
on March 4, 2008 at 9:54 pm | Reply gogetter

This soluton worked perfectly in my situation! I had read many other suggestions online but most of them were overly complicated and usually required creating additional tables. Thanks for posting this!

85.
on March 22, 2008 at 12:50 am | Reply Saroop Chand

Hi Pradeep,

Today I saw your question. It’s late, but if you haven’t find any solution below may help…
____________________________________________________________________________________________
SELECT * FROM test1 t1, (SELECT A, B, C FROM test1 GROUP BY A,B,C HAVING COUNT(1)>1) t2
WHERE t1.A t2.A AND t1.B t2.B AND t1.C t2.C
____________________________________________________________________________________________

Regards
Saroop

86.
on March 22, 2008 at 12:55 am | Reply Saroop Chand

____________________________________________________________________________________________
SELECT * FROM test1 t1, (SELECT A, B, C FROM test1 GROUP BY A,B,C HAVING COUNT(1)>1) t2
WHERE t1.A <> t2.A AND t1.B <> t2.B AND t1.C <> t2.C
____________________________________________________________________________________________

87.
on March 24, 2008 at 11:47 am | Reply Ashok

thanks satish
tahnks ravi
cool !!!!!!!

88.
on March 24, 2008 at 4:40 pm | Reply deepti

hi dave,
i m really imprssd with ur blog. n u knw d bst part is u rply to each possible post.
hoping to get best out dis blog
thanks
and all d best
deepti

89.
on March 28, 2008 at 11:36 am | Reply Kannan

Hi all,

i have doupt ..plz clarify that.

create table a(empid int)
create table b(empid1 int)

insert into a values(1)
insert into b values(1)

insert into a values(2)
insert into b values(3)

output:
empid
———–
1
2

empid1
———–
1
3

with out using temporary table to del the comman record from separate two table.

90.
on April 1, 2008 at 11:01 pm | Reply mgwalk

Wow this is nice and simple!

91.
on April 4, 2008 at 10:26 pm | Reply Pravin Phatangare

Hi All,

Those who have requested for
How to get 1st, 2nd, 3rd, 4th, nth topmost salary from an Employee table

The following solution is for getting 6th highest salary from Employee table ,

SELECT TOP 1 salary FROM (SELECT DISTINCT TOP 6 salary FROM employee
ORDER BY salary DESC) a ORDER BY salary

You can change and use it for getting nth highest salary from Employee table as follows

SELECT TOP 1 salary FROM (SELECT DISTINCT TOP n salary FROM employee
ORDER BY salary DESC) a ORDER BY salary

where n > 1 (n is always greater than one)

92.
on April 13, 2008 at 6:24 pm | Reply Jay

I need to be able to select only the record with the max(datetime) of the same guid. Is this possible? I only want the record with the date of 6:53:02 and not 5:07:32 and I want to keep the record with 5:07:19 (different guid)

b16569a1-e3dc-4f03-b2a0-bf3e2677c466 rptProviders_Provider2Location 4/13/2008 5:07:19 AM 4/11/2008 6:46:29 PM

36b6419a-a3a5-49ad-b84d-942c89c4754c rptProviders_Provider2Location 4/13/2008 5:07:32 AM 4/11/2008 6:46:29 PM

36b6419a-a3a5-49ad-b84d-942c89c4754c rptProviders_Provider2Location 4/13/2008 6:53:02 AM 4/11/2008 6:46:29 PM

93.
on April 13, 2008 at 6:26 pm | Reply Jay

Here is the query I have so far:

SELECT P2L.ProviderId, dbo.AUDIT.TableName, dbo.AUDIT.OccurredAt, P2L.xLastUpdated
FROM dbo.rptProviders_Provider2Location AS P2L WITH (NOLOCK) INNER JOIN
dbo.rptProviders_Provider2LocationCategory AS P2LC WITH (NOLOCK) ON P2L.xId = P2LC.Provider2LocationId INNER JOIN
dbo.AUDIT ON P2L.xId = dbo.AUDIT.RowId

94.
on April 23, 2008 at 4:16 am | Reply Dave

I’ve got a slightly different problem. I’m creating a stored procedure which will retrieve several different columns from a hospital table (month, year, patient name, intervention hours, and date changed) Those aren’t the real column names; the idea is that records are kept when a patient’s behavior requires staff intervention. The records are entered monthly, but frequently–in a given time range–the reports show patients’ names more than once, because people update the intervention time numbers (stuff is loaded in wrong, then corrected, or loaded in as 0 to meet a deadline, then updated, etc). I need to get rid of the duplications, meaning that when names are shown more than once for a given month and year, I only want one of the duplicated intervention records to show up. I want to filter it by “date changed,” meaning the most recent entry for each patient.

So when I run an “interventions” report for specified time frame at a given hospital, I want to see the intervention hours broken down by patient name (but I only want to see the most recently entered record for each patient who had intervention activity during the time parameters of the report).

How can I modify my SELECT statement to do this?

95.
on April 23, 2008 at 10:57 am | Reply saravanan.p

how to solve this problem?

ID(identity column) name
——————————————

1 david
2 john
3 ram
4 paul
5 aish
6 raj
7 mani
8 madu
9 cotter

now i deleted the 4′th row

4 paul

ok

now how can i display the remaining names as given below

ID(identity column) name
——————————————

1 david
2 john
3 ram
4 aish
5 raj
6 mani
7 madu
8 cotter

any body can help me sir?

saranpselvam@gmail.com

96.
on April 23, 2008 at 3:11 pm | Reply Mayank Mishra

Is it possible to delete a single column from a table in SQL 2000.

Thanks

97.
on May 13, 2008 at 1:45 am | Reply Ray McCoy

This works in 2000. WARNING: According to MS, SET ROWCOUNT will not work with INSERT, DELETE, and UPDATE in later versions.

– Create table with a number of values between zero and nine
select a+b+c as val
into dbo.rmtemp
from (select 0 a union all select 1 union all select 2 union all select 3) a
, (select 0 b union all select 1 union all select 2 union all select 3) b
, (select 0 c union all select 1 union all select 2 union all select 3) c

– Show what you’ve got
select val,count(*) row_count from dbo.rmtemp group by val

– Limit processing to a single row
set rowcount 1
– While you’ve got duplicates, delete a row
while (select top 1 val from dbo.rmtemp group by val having count(*) > 1) is not null
delete from dbo.rmtemp where val in (select top 1 val from dbo.rmtemp group by val having count(*) > 1);
– Remove single row processing limit
set rowcount 0

– Confirm that only uniques remain
select val,count(*) row_count from dbo.rmtemp group by val

– Clean up
drop table dbo.rmtemp

– Comment
I knew there had to be a better way.

98.
on May 13, 2008 at 9:48 pm | Reply SQL SERVER - 2000 - SQL SERVER - Delete Duplicate Records - Rows - Readers Contribution Journey to SQL Authority with Pinal Dave

[...] pinaldave I am proud on readers of this blog. One of the reader asked asked question on article SQL SERVER – Delete Duplicate Records – Rows and another reader followed up with nice quick answer. Let us read them both [...]

99.
on June 10, 2008 at 5:50 pm | Reply micheal

hi pinal dave ,

i have a question for u , i want to retrive the rows from a table by avoiding duplicate rows with all coloumns .i.e; duplicate row sholud be retrieved only once . any help reg…

100.
on June 17, 2008 at 4:14 pm | Reply Smitha

Its really a nice

101.
on June 18, 2008 at 8:51 pm | Reply EstebanD

Hi Pinal,

Thanks for your tips.

I have a question. We have a table that have many contacts that are duplicated more than once. And we need a query that excludes all duplicated people who has already registered is there a more efficient way of doing it than this?

SELECT CD.email
, CD.contactid
FROM contact CD
WHERE (CD.opt_out = 0)
AND (CD.email ‘NoEmail’)
AND (CD.email IS NOT NULL)
AND (CD.email ”)
AND (CD.registered = 0)
AND Lower (CD.email) + ‘||’ + Lower (cd.firstname) + ‘||’ + Lower (CD.lastname) NOT IN
(SELECT Lower (CD1.email) + ‘||’ + Lower (CD1.firstname) + ‘||’ + Lower (CD1.lastname)
FROM contact CD1
WHERE (CD.registered = 1)
)
GROUP BY CD.contactid
, CD.email

Thank you,

EstebanD

102.
on June 19, 2008 at 7:04 pm | Reply preethi

Dear Sir,
i a have a table with id as one column,name as another column , in name there r diffrent names is there is possible to display all the alphabets in the name column.it is very urgent.where i can see mu solution if else post it to my id.

103.
on June 20, 2008 at 12:39 pm | Reply Jegan

I am having a query inside the view which is returning more than one value. I need all the values to be returned in a separate row while running the view. Can ypu provide me a solution?

104.
on June 23, 2008 at 5:10 pm | Reply Asit

Dear sir
I am faceing a problem in sql query.
I have two table.
1st table is maste table, In master table total 180 records are stored.
our requirement total 180 rows – 20 rows are stored in 2nd table, then display 160 mark as ‘N’ bcos stored in master table and other records are mark as ‘Y’ bcos its stored in 2nd table.

Table 1st
AppID FunCode SubFuncode FunName SubFunName Mark
Dis 1 30 Main 0 N
Depos 2 30 &Main 0 N
Lox 3 30 &Main 0 N
LCredit 4 30 &Main 0 N

In 2nd table same records but only 20 records are stored.
AppID FunCode SubFuncode FunName SubFunName Mark
Dis 1 30 &Main 0 Y
Depos 2 30 &Main 0 Y

In query we try to fetch all Y marked row and N marked with Union query.

SELECT TBLAPPLICATIONS.Name, TBLGROUPFUNCTIONS.AppID, TBLGROUPFUNCTIONS.FunctionCode, TBLALLFUNCTIONS.DefaultSubFunctionName,
TBLGROUPFUNCTIONS.SubFunctionCode, ‘Y’ AS AssignedFunction
FROM TBLGROUPFUNCTIONS INNER JOIN
TBLALLFUNCTIONS ON TBLGROUPFUNCTIONS.AppID = TBLALLFUNCTIONS.AppID AND
TBLGROUPFUNCTIONS.FunctionCode = TBLALLFUNCTIONS.FunctionCode AND
TBLGROUPFUNCTIONS.SubFunctionCode = TBLALLFUNCTIONS.SubFunctionCode INNER JOIN
TBLAPPLICATIONS ON TBLGROUPFUNCTIONS.AppID = TBLAPPLICATIONS.AppID
WHERE TBLGROUPFUNCTIONS.GroupID = 12
UNION ALL
SELECT TBLAPPLICATIONS.Name, TBLALLFUNCTIONS.AppID, TBLALLFUNCTIONS.FunctionCode,
TBLALLFUNCTIONS.DefaultSubFunctionName, TBLALLFUNCTIONS.SubFunctionCode, ‘N’ AS AssignedFunction
FROM TBLALLFUNCTIONS INNER JOIN TBLAPPLICATIONS ON TBLALLFUNCTIONS.AppID = TBLAPPLICATIONS.AppID
WHERE
(TBLALLFUNCTIONS.AppID NOT IN (SELECT AppID FROM TBLGROUPFUNCTIONS WHERE GroupID = 12) OR
TBLALLFUNCTIONS.FunctionCode NOT IN (SELECT FunctionCode FROM TBLGROUPFUNCTIONS WHERE GroupID = 12) OR
TBLALLFUNCTIONS.SubFunctionCode NOT IN (SELECT SubFunctionCode FROM TBLGROUPFUNCTIONS WHERE GroupID = 12))
Order BY aPPID,FunctionCode,sUBfUNCTIONcODE

From this query record is not show correctley.
Please please immediate reply me.
I am waiting your reply please suggest me.
Asit Sinha

105.
on June 28, 2008 at 5:47 am | Reply maxtrebor

Thanx Mhlove:

Your solution was succesfull for me, for delete duplicate data in secondary column but the same ID.

Regards.

106.
on July 28, 2008 at 3:21 pm | Reply bogdan

Hi,

I have an issue with and sql query i’m trying to get right and I seem not to find the answer.

I have a table with 4 colums: id1,id2,value1,value2.
I need to select all rows with a unique pair of (id1,id2) and use a 3rd column in the select for something like : sum(value1*value2) for duplicate rows.

So basicaly, show 1 row for each pair and next get a sum of all the values for that pair (sum the duplicate rows on value1,value 2).

I only been able to select the unique pairs until now using a group by id1,id2 but i’m having troble with the sums of the duplicate rows for each pair.

I’d appreciate your help

107.
on August 16, 2008 at 1:35 am | Reply Vamshi

Hi Bogdon,

It would have been easy to provide solution if you would have provided proper Input and desired output. Anyways see it the below works for you based on my understanding

CREATE TABLE #T1 (ID1 INT, ID2 INT, Value1 INT, Value2 INT)
INSERT INTO #T1 VALUES (1,1,5,10)
INSERT INTO #T1 VALUES (1,1,6,12)
INSERT INTO #T1 VALUES (1,1,50,100)
INSERT INTO #T1 VALUES (1,2,7,14)
INSERT INTO #T1 VALUES (1,2,8,16)
INSERT INTO #T1 VALUES (2,1,2,5)
INSERT INTO #T1 VALUES (2,1,3,6)
INSERT INTO #T1 VALUES (2,3,10,20)
INSERT INTO #T1 VALUES (2,3,25,50)

SELECT * FROM #T1;

SELECT ID1, ID2, SUM(Value1*Value2) FROM #T1 GROUP BY ID1, ID2;

WITH T1 AS (SELECT (ROW_NUMBER() OVER (PARTITION BY ID1 ORDER BY ID1)) AS RNum, * FROM #T1 )
DELETE FROM T1 WHERE Rnum IN (SELECT a.RNum AS S FROM T1 AS a, T1 AS b WHERE a.RNum > b.RNum GROUP BY a.RNum)

SELECT * FROM #T1

DROP TABLE #T1

Post the input and desired output if the above is not what you want.

Vamshi

108.
on August 16, 2008 at 2:42 am | Reply Vamshi

Sorry WITH statement is not yours…

CREATE TABLE #T1 (ID1 INT, ID2 INT, Value1 INT, Value2 INT)
INSERT INTO #T1 VALUES (1,1,5,10)
INSERT INTO #T1 VALUES (1,1,6,12)
INSERT INTO #T1 VALUES (1,1,50,100)
INSERT INTO #T1 VALUES (1,2,7,14)
INSERT INTO #T1 VALUES (1,2,8,16)
INSERT INTO #T1 VALUES (2,1,2,5)
INSERT INTO #T1 VALUES (2,1,3,6)
INSERT INTO #T1 VALUES (2,3,10,20)
INSERT INTO #T1 VALUES (2,3,25,50)

SELECT * FROM #T1;

SELECT ID1, ID2, SUM(Value1*Value2) FROM #T1 GROUP BY ID1, ID2;

I dont know why you said GROUP BY didnt worked for you if the above is what you are expecting

109.
on August 16, 2008 at 3:20 am | Reply Vamshi

Hi Haneef and Preveen

Hope this would help you in getting all those employees with Nth Highest salary

DECLARE @n INT

SELECT @n = 2 — Change N value to get Nth highest value

CREATE TABLE #T1 (ID1 INT, [Name] NVARCHAR(50), Salary INT)
INSERT INTO #T1 VALUES (1, ‘Vamshi’, 1000)
INSERT INTO #T1 VALUES (2, ‘xxxxx’, 2000)
INSERT INTO #T1 VALUES (3, ‘yyyyy’, 3000)
INSERT INTO #T1 VALUES (4, ‘zzzzz’, 4000)
INSERT INTO #T1 VALUES (5, ’sssss’, 5000)
INSERT INTO #T1 VALUES (6, ‘ccccc’, 6000)
INSERT INTO #T1 VALUES (7, ‘ppppp’, 2000)
INSERT INTO #T1 VALUES (8, ‘aaaaa’, 4000)
INSERT INTO #T1 VALUES (9, ‘bbbbb’, 5000)

SELECT a.ID1, a.[Name], a.Salary
FROM #T1 AS a
WHERE (@n-1) = (
SELECT COUNT(DISTINCT(b.Salary))
FROM #T1 AS b
WHERE b.Salary > a.Salary)

DROP TABLE #T1

Vamshi

110.
on August 18, 2008 at 8:36 pm | Reply Ahmed

This is amazing Pinal.

It resolved my problem.

Cheers.

111.
on August 21, 2008 at 6:29 pm | Reply Mohan

Hi,
I need to Get a primary key in a particular table in MS-SQL SERVER 2000. How Can i get. Is there any query is available???

Thanks in advance
Mohan.V

112.
on August 21, 2008 at 10:09 pm | Reply Dave

Pinal, great information on your site!

I think Senthilnathan in comment #16 has the right solution that can be used for deleting duplicates and for finding the nth row in a group. The key is ROW_NUMBER() function with a Partition.

– EXAMPLE TO GET THE 5th highest salary of the “Manager” group.

– CREATE Common Table Expression CTE
With MyCTE AS (
SELECT
ROW_NUMBER() OVER (PARTITION EmployeeType
ORDER BY EmployeeType, Salary) AS OrderedGroupId,
EmployeeId,
EmployeeType,
Salary
FROM SalaryHistory)
SELECT EmployeeId, — Query results of CTE
Salary
FROM MyCTE
WHERE EmployeeType = “Manager”
AND OrderedGroupId = 5

This could also be a DELETE statement deleting duplicate items in a PARTITION (DELETE FROM MyCTE WHERE OrderedGroupId > 1)

113.
on August 22, 2008 at 7:29 am | Reply Imran Mohammed

@Mohan,

1. If you want to see the primary on a specific table then Execute this stored procedure,

Sp_help table_name — you have to give table name

It will display all the table properties, like columns, keys, indexes…
In constraints section it will show you all the keys details, on which column primary key is made.

2. If you want to see all the primary keys on all tables, then use this query,

SELECT A.CONSTRAINT_NAME ‘CONSTRAINT NAME’ , A.COLUMN_NAME ‘COLUMN NAME’, A.TABLE_NAME’TABLE NAME’, A.ORDINAL_POSITION ‘POSITION OF COLUMN’
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS B, INFORMATION_SCHEMA.KEY_COLUMN_USAGE A
WHERE B.CONSTRAINT_NAME =A.CONSTRAINT_NAME AND CONSTRAINT_TYPE = ‘PRIMARY KEY’
ORDER BY A.TABLE_NAME, ORDINAL_POSITION
– Run this script as is don’t change anything

This will give you primary key name, on which column it is made and which table it is made and also position of the column.

Sometimes, we create composite primary keys, meaning primary key on more than one column on one table in that case when you run the above query you will see table name more than once, dont get confuse, see the “position of the column” ( in the output) and then you will come to know that this table has a composite primary key.

** A table can have only one Primary key.

114.
on September 3, 2008 at 12:32 pm | Reply Jamil

Rahul ask how if there is no key in the table to delete the duplicate data,

Please use this sql to do it, @ID, @Lname and @fname is depend on our field in the table:

DECLARE @ID varchar(150), @lname varchar(150), @fname varchar(150), @MyCount int, @Counting Int

DECLARE authors_cursor CURSOR FOR
SELECT id, FNAME, LNAME, COUNT(*) AS myCount
FROM dbo.Table1
GROUP BY id, FNAME, LNAME HAVING (COUNT(*) > 1)

open authors_cursor

FETCH NEXT FROM authors_cursor
INTO @ID, @fname, @LName, @MYCount

WHILE @@FETCH_STATUS = 0 BEGIN
DECLARE aut2 CURSOR local scroll dynamic OPTIMISTIC FOR
SELECT *
FROM table1
WHERE id = @id AND lname = @lname AND fname = @fname
OPEN aut2
SET @counting = 1

FETCH next FROM aut2
begin

WHILE @Counting < @MyCount BEGIN print @counting DELETE FROM table1 WHERE CURRENT OF aut2 SET @Counting = @Counting + 1 FETCH next FROM aut2 END –end end CLOSE aut2 DEALLOCATE aut2 FETCH NEXT FROM authors_cursor INTO @ID, @Lname, @FName, @MYCount END CLOSE authors_cursor DEALLOCATE authors_cursor 115. on September 3, 2008 at 1:25 pm | Reply Jamil Answer for Rasmi, please study this one: declare @MyStr varchar(150), @pjg int DECLARE authors_cursor CURSOR FOR SELECT LNAME FROM dbo.Table1 open authors_cursor FETCH NEXT FROM authors_cursor INTO @mystr WHILE @@FETCH_STATUS = 0 BEGIN set @pjg=len(@mystr)-1 while @pjg>=0
begin
print substring(@mystr,len(@mystr)-@pjg,1)
if isnumeric(substring(@mystr,len(@mystr)-@pjg,1))=0

set @mystr=replace (@mystr,substring(@mystr,len(@mystr)-@pjg,1),”)

set @pjg=@pjg-1
end
print @myStr
update table1 set lname=@mystr where current of authors_cursor
FETCH NEXT FROM authors_cursor
INTO @mystr
end
print @myStr

CLOSE authors_cursor
DEALLOCATE authors_cursor

116.
on September 12, 2008 at 8:09 am | Reply Mike

In most cases, two duplicate records in a table may not be the same. For example, a customer may enter his name or address, or phone number in different ways, but only from his email, we know this is the same person.

My question is, if a table contains duplicate records like this, and I only want to keep latest record of the dups. I mean, based on customer’s email, and the date of registration, can we have a way to delete the duplicate records? for example:

table_customers contains “email”, “name”, and “date”:

ab@c.com John Smith 24/8/2006
ab@c.com John S. 12/9/2007


I only wants to keep the latest record if two records have the same email, no matter if the names are the same or not.

Thank you for your help.

Mike

117.
on September 18, 2008 at 2:29 pm | Reply Jamil

For Mike may be you can use this Query:

DELETE table_customers
FROM (SELECT Email, MAX(Date) AS Date
FROM table_customers
GROUP BY Email
HAVING Email = ‘ab@c.com’)) DERIVEDTBL
WHERE table_customers.Email = DERIVEDTBL.Email AND table_customers.Date DERIVEDTBL.Date

118.
on September 18, 2008 at 3:47 pm | Reply Aashish Mangal

Hi

I have a question. Below is the Employee table
EmpID EmpName Salary
———– —————————– ———-
1 Aashish 15000
1 Aashish 15000
3 Gunjan 25000
3 Gunjan 25000
5 Atul 35000
6 Animesh 20000

Now I have to keep only one record either of the two and remove the duplicate record. Condition is There should not be any use of temp tables, no primary keys, no identity columns, no cursors. Everything has to be handeled in a query . And Database is MS SQL Server 2000

119.
on September 19, 2008 at 6:17 am | Reply Jamil

For Aashish Mangal, You Can Use Query Analyzer Like This:

ALTER TABLE [Employee] ADD [MySpecialIdx] [int] IDENTITY (1, 1) NOT NULL

Go

DELETE [Employee]
FROM (SELECT EmpID, EmpName, Salary, MAX(MySpecialIdx) AS mySpecialIdx
FROM [Employee]
GROUP BY EmpID, EmpName, Salary)) DERIVEDTBL
WHERE table_customers.EmpID = DERIVEDTBL.EmpID AND [Employee].EmpName = DERIVEDTBL.EmpName AND AND
[Employee].EmpName = DERIVEDTBL.EmpName AND Employee.myspecialIdx DerivedTbl.myspecialidx

Go

ALTER TABLE [Employee] DROP COLUMN MySpecialIdx

Go

Or You Can Use In View and Execute Step By Step

120.
on September 24, 2008 at 9:32 am | Reply jamil

The Next for Aashish Mangal, we can use this query to delete the duplicate record:

WHILE (SELECT TOP 1 COUNT(id) AS Expr1
FROM dbo.Table1
GROUP BY id, FNAME, LNAME
HAVING (COUNT(id) > 1)) > 1
BEGIN
set rowcount 1
DELETE table1
FROM (SELECT TOP 1 id, FNAME, LNAME
FROM dbo.Table1
GROUP BY id, FNAME, LNAME
HAVING (COUNT(id) > 1)) DERIVEDTBL
WHERE table1.id = derivedtbl.id AND table1.fname = derivedtbl.fname AND table1.lname = derivedtbl.lname
END

==
If this query will be saved, the first time you must add any table to query than remove the query text and replaced with the above query text, then you can saved this query with your desired name.

121.
on September 24, 2008 at 9:36 am | Reply jamil

Sorry, in the fact, that query only can be executed and can not be saved.

122.
on October 1, 2008 at 3:10 pm | Reply Mit_2807

Hi Pinal,

I had the same query i tried using Rank() function.

It worked for me.

Attaching code here:

WITH DeleteDuplicates
AS
(
SELECT RANK() OVER (PARTITION BY DUPFIELD ORDER BY PKFIELD ) AS RANK, * FROM TABLENAME
)
DELETE FROM DeleteDuplicates WHERE RANK > 1

Wish add more value to the topic.

Thanks,

Mit_2807

123.
on October 2, 2008 at 7:40 pm | Reply rakesh

hi sir i did like this to delete duplicate rows…

sir i accomplished this using rank functions… and CTE

1)create table emp (id int,name varchar(20))

2) i inserted 1,’rakesh ‘ 3 times
and 2,’sagar’ 2 times

deleting duplicate rows…

with cte
as
(
select id,name,rank() over(order by id) r,row_number() over(order by id) rn from emp
)
delete from cte
where r rn

will this query degrade the performance or not ……

124.
on October 3, 2008 at 11:43 am | Reply rakesh

hi sir

could you please give script for sending SMTP mail

USING SQLSERVER….

125.
on October 3, 2008 at 5:29 pm | Reply Purushot

Dear All,

i have a table called Player_details in that i have a data’ like below without any key i want to delete one record and another in data how to do this.

100 chennaiRoyals 9841998470
101 DareDevils 9984725487
101 DareDevils 9984725487

Both are same as like.

can u pls help me

126.
on October 3, 2008 at 5:57 pm | Reply Purushot

To Find the 2nd Highest Salary:

2nd highest salary

1)SELECT max(salary) FROM Employee
WHERE salary < (SELECT max(salary) FROM employee) Purushot 127. on October 5, 2008 at 4:45 pm | Reply stefano your query works great !!! finally I found an elegant way to solve the problem without exploiting new sql commands which in SQL 2005 don’t exist yet. 128. on October 6, 2008 at 12:56 pm | Reply Purushot Dear All, i have a table called Player_details in that i have a data’ like below without any key i want to delete one record and another in data how to do this. 100 chennaiRoyals 9841998470 101 DareDevils 9984725487 101 DareDevils 9984725487 Both are same as like. can u pls help me 129. on October 7, 2008 at 2:31 pm | Reply jamil Dear Purushot Suppose that table Player_details contains fields idx, Name and ID, and the table contains records as follows: 100 chennaiRoyals 9841998470 101 DareDevils 9984725487 101 DareDevils 9984725487 103 DareDevilx 9984725487 103 DareDevilx 9984725487 104 DareDevily 9984725487 104 DareDevily 9984725487 104 DareDevily 9984725487 The we want delete the duplicate record, those are the records with idx=101 and idx=103 and idx=104, so that just 1 record will be kept, we can follow these step: 1. Please create this query and save as PlayerDetails_ForDeletingDuplicate SELECT * FROM dbo.Player_details a WHERE ((SELECT COUNT(*) FROM dbo.Player_details b WHERE a.idx = b.idx AND a.name = b.name AND a.id = b.id) > 1)

2. Please run this query, if we found any record viewed, then go to next step

3. Edit this query, change the line:
select *
with
delete dbo.Player_details

4. In the top this query add this line
set rowcount 1

5. Execute this query several times until no more records affected

6. close this query without save

130.
on October 10, 2008 at 7:36 pm | Reply Saravanan

Dear All,
I have a table which can accept duplicate records, but it should not accept duplicate within 20 seconds.I want to delete duplicate row which has been created back to back in 20 seconds. Is there any way to delete these records? Please let me know if you need more info about this.

Thanks in advance.
Cheers,
Saravanan

131.
on October 10, 2008 at 7:38 pm | Reply Saravanan

Dear All,
I forgot to tell you one thing. I got a column which stores created date in that table.
Cheers,
Saravanan

132.
on October 12, 2008 at 5:57 pm | Reply R.Rohini

i like 2 learn more about the sql queries ,so please send me any updated information about the sql.

133.
on October 24, 2008 at 5:27 pm | Reply Mxolisi Phiri

Good coding you guys have displayed. I have just added a while loop to eliminate all duplicates.

–create fruit table and populate it with duplicate fruits

CREATE TABLE #tblFruit
(
ID int identity,
Fruit varchar(7)
)
INSERT INTO #tblFruit
VALUES(‘Banana’)
INSERT INTO #tblFruit
VALUES(‘Banana’)
INSERT INTO #tblFruit
VALUES(‘Pear’)
INSERT INTO #tblFruit
VALUES(‘Orange’)
INSERT INTO #tblFruit
VALUES(‘Orange’)
INSERT INTO #tblFruit
VALUES(‘Apple’)
INSERT INTO #tblFruit
VALUES(‘Banana’)
INSERT INTO #tblFruit
VALUES(‘Apple’)

–display all the fruits you have, and count how many duplicates you have to see if your code works

SELECT * FROM #tblFruit
SELECT COUNT(FRUIT) FROM #tblFruit
SELECT COUNT(DISTINCT(FRUIT)) FROM #tblFruit

–loop through your table and delete any duplicates, display your new table w/o duplicates, count if you still have duplicates or not, delete your temp table

WHILE EXISTS
(
SELECT MIN(ID)
FROM #tblFruit
GROUP BY FRUIT
HAVING COUNT(FRUIT) > 1
)
BEGIN
DELETE
FROM #tblFruit
WHERE ID IN ( SELECT MIN(ID) FROM #tblFruit
GROUP BY Fruit
HAVING COUNT(Fruit) > 1
)

SELECT * FROM #tblFruit
SELECT COUNT(FRUIT) FROM #tblFruit
SELECT COUNT(DISTINCT(FRUIT)) FROM #tblFruit
DROP TABLE #tblFruit

END

134.
on November 13, 2008 at 1:00 pm | Reply Pavan Thakur

Thanks Vamshi…It helped me a lot

135.
on November 19, 2008 at 9:15 pm | Reply Ashwin Rau

Pinal,
Thanks for the excellent web site. Just wanted to point out a small typo.

You have group by DuplicateValueColumn2 twice. It should be DuplicateValueColumn3. I know most people would catch it. Just thought you should update the web site solution with correct code. Thanks.

Ash

136.
on November 20, 2008 at 3:03 pm | Reply Ravichandra

Hi ;
I have two queries which is below. I want to get the records which are excess in Region1.table don’t want display common records also

1). SELECT * FROM
Region1.table
WHERE POST_KNTNR = ‘6731′
AND SLSKB_INIT = ‘NL’
AND VAKD = ‘DKK’
AND year(MDTGL_TMSTM) = 2008

2). SELECT * FROM
Region2.table
WHERE POST_KNTNR = ‘6731′
AND SLSKB_INIT = ‘NL’
AND VAKD = ‘DKK’
AND year(MDTGL_TMSTM) = 2008

137.
on November 21, 2008 at 12:18 pm | Reply Srinivas Asapu

Hi Ravichandra,

Use the following code. Replace COLUMN_LIST with column names. Let me know if you have any issues.

SELECT COLUMN_LIST FROM (SELECT DISTINCT ‘U’ AS SETNAME,
* FROM REGION1.TABLE WHERE POST_KNTNR = ‘6731′
AND SLSKB_INIT = ‘NL’
AND VAKD = ‘DKK’
AND YEAR (MDTGL_TMSTM) = 2008
UNION ALL
SELECT DISTINCT NULL, * FROM
REGION2.TABLE
WHERE POST_KNTNR = ‘6731′
AND SLSKB_INIT = ‘NL’
AND VAKD = ‘DKK’
AND YEAR (MDTGL_TMSTM) = 2008) A
GROUP BY COLUMN_LIST
HAVING COUNT (*) = 1 AND MAX (SETNAME) = ‘U’

138.
on December 4, 2008 at 10:52 am | Reply GArun

Hi,
A table contains PRIMARY KEY CLUSTERED and it contains 20 partitions. How to delete a FIRST (1st) Partition.

Thanks
GAG

139.
on December 9, 2008 at 7:27 pm | Reply Ramesh sharma

hi sir
i m very impressed with your article ?
it is great sir

140.
on December 23, 2008 at 3:14 pm | Reply Asha Sharda

To delete the DuplicateRows when the ID column is UniqueIdentifier

DELETE
FROM MyTable where Id not in(
SELECT Top 1 b.Id
FROM MyTable as b
WHERE b.Col1= MyTable.Col1
AND b.Col2= MyTable.Col2
AND b.Col3= MyTable.Col3
)

141.
on December 29, 2008 at 9:30 pm | Reply Srinivas Madabushi

Hi,
Do the following steps:
1: Create a temp table with the same column names as the main table
2: Copy the duplicate records by using count>1 in the where clause
3: Delete the duplicate records in the main table using the count>1
4: Insert the records from the temp table to the original table
5: Clear the temp table

142.
on January 21, 2009 at 3:58 pm | Reply Prajot

If in the sql server table there is no primary key in the table and the data is huge and we want to delete the duplicate records from that data by using the query so tell me which is the feasible way for deleting duplicate records.

143.
on January 31, 2009 at 10:46 am | Reply dyamond

Please help with the following:

SELECT
date,
name,
desc,
‘Code’ as cdDesc
FROM activity
WHERE
date = ‘2009-01-05′ and
site = ‘80′ and
exists
(select code
from activity
where code in (‘55544′,’33333′, ‘66666′)
GROUP BY code HAVING COUNT(*) > 2)

UNION ALL

SELECT
date,
name,
desc,
‘Code2′ as cdDesc
FROM activity
WHERE
date = ‘2009-01-05′ and
site = ‘80′ and
exists
(select code
from activity
where code in (‘2222′,’3333′, ‘66666′, ‘88888′)
GROUP BY code HAVING COUNT (*) > 3)

The issue is that some code(s) exist in multiple rows and I only need the whole group to display. I used “AND’ but that gave me no results. ANY HELP would be appreciated.

144.
on January 31, 2009 at 7:14 pm | Reply qute

CREATE TABLE #temp
(SAP_FL varchar(40),
EQUIPMENT_NO varchar(18),
SHORT_DESCR varchar(100),
COMPL_DATE datetime,
NOTIFICATION_NO varchar(12),
ACTIVITY_CODE VARCHAR(10),
ACTIVITY_NAME VARCHAR(100),
TOTAL_COUNT float,
CONST_TYPE varchar(100),
UNIT varchar(50),
PriorityID int,
)

————————————————————-
insert into #temp values ( ‘1023-A615400-443401-00901-00401′ ,50001564 , ‘Ultrasonic stack #30 ‘ , ‘11/14/2005′ , ‘500722482′ , ‘REPL’ , ‘Replaced’ , 70628 , 80159257 , ‘EA’ , 2 )
insert into #temp values ( ‘1023-A614400-443401-00901-00501′ ,50001564 , ‘Ultrasonic stack #30 ‘ , ‘4/15/2005′ , ‘500357639′ , ‘REPL’ , ‘Replaced’ , 70404 , 80159257 , ‘EA’ , 3 )
insert into #temp values ( ‘1023-A615400-443401-00901-00201′ ,50001564 , ‘Ultrasonic stack #30 ‘ , ‘3/21/2005′ , ‘500329094′ , ‘REPL’ , ‘Repaired’ , 70321 , 80159257 , ‘EA’ , 2 )

insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘10/4/2008′ ,’502737991′,’REPR’, ‘Repaired’, 218242 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘8/29/2008′ ,’502675451′,’REPR’, ‘Repaired’, 218242 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘8/29/2008′ ,’502620150′,’REPR’, ‘Repaired’, 218242 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘6/18/2008′ ,’502495333′,’REPR’, ‘Repaired’, 208032 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘9/7/2007′ ,’501944796′,’REPR’, ‘Repaired’, 197999 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘9/23/2005′ ,’500648350′,’REPR’, ‘Repaired’, 159277 ,’80296971′ ,’EA’, 3)
insert into #temp values ( ‘1023-F306400-411401-00801′ ,’50001721′, ‘Module- Cutting Fluff Ovrnite/wing 295mm’, ‘12/17/2004′,’500182407′,’REPR’, ‘Repaired’, 118935 ,’80296971′ ,’EA’, 3)

—————————————————-

select
t1.SAP_FL as SAP_FL_ST ,
t2.SAP_FL as SAP_FL_ED,

t1.EQUIPMENT_NO as EQUIPMENT_NO_ST,
t2.EQUIPMENT_NO as EQUIPMENT_NO_ED,

t1.COMPL_DATE as COMPL_DATE_ST ,
t2.COMPL_DATE as COMPL_DATE_ED,

t1.SHORT_DESCR as SHORT_DESCR_ST,
t2.SHORT_DESCR as SHORT_DESCR_ED,

t1.NOTIFICATION_NO as NOTIFICATION_NO_ST,
t2.NOTIFICATION_NO as NOTIFICATION_NO_ED,

t1.ACTIVITY_CODE as ACTIVITY_CODE_ST,
t2.ACTIVITY_CODE as ACTIVITY_CODE_ED ,

t1.ACTIVITY_NAME as ACTIVITY_NAME_ST ,
t2.ACTIVITY_NAME as ACTIVITY_CODE_ED,

t1.TOTAL_COUNT as TOTAL_COUNT_ST ,
t2.TOTAL_COUNT as TOTAL_COUNT_ED ,

t1.TOTAL_COUNT -
t2.TOTAL_COUNT as TOTAL_COUNT_DIFFERENCE,

t1.CONST_TYPE as CONST_TYPE_ST ,
t2.CONST_TYPE as CONST_TYPE_ED ,
t1.UNIT

from (select SAP_FL as ‘SAP_FL’,
EQUIPMENT_NO as ‘EQUIPMENT_NO’,
SHORT_DESCR as ‘SHORT_DESCR’,
COMPL_DATE as ‘COMPL_DATE’,
NOTIFICATION_NO as ‘NOTIFICATION_NO’,
ACTIVITY_CODE as ‘ACTIVITY_CODE’,
ACTIVITY_NAME as ‘ACTIVITY_NAME’,
CONST_TYPE as ‘CONST_TYPE’,
UNIT as ‘UNIT’
,max(TOTAL_COUNT) as ‘TOTAL_COUNT’
from #temp
– where EQUIPMENT_NO = 50001721
group by SAP_FL ,
EQUIPMENT_NO ,
COMPL_DATE ,
NOTIFICATION_NO ,
ACTIVITY_CODE ,
ACTIVITY_NAME ,
SHORT_DESCR ,
CONST_TYPE ,
UNIT )t1
INNER JOIN #temp t2
ON t1.EQUIPMENT_NO=t2.EQUIPMENT_NO
AND t1.ACTIVITY_CODE=t2.ACTIVITY_CODE
AND t1.COMPL_DATE>t2.COMPL_DATE

=====================================
OUTPUT–if we considered equi no 50001721
==================================
SAP_FL_ST SAP_FL_ED EQUIPMENT_NO_ST EQUIPMENT_NO_ED COMPL_DATE_ST COMPL_DATE_ED NOTIFICATION_NO_ST NOTIFICATION_NO_ED ACTIVITY_CODE_ST ACTIVITY_CODE_ED ACTIVITY_NAME_ST ACTIVITY_NAME_ED TOTAL_COUNT_ED TOTAL_COUNT_ST TOTAL_COUNT_DIFFERENCE SHORT_DESCR_ST SHORT_DESCR_ED CONST_TYPE_ST CONST_TYPE_ED UNIT
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 8/29/2008 502737991 502675451 REPR REPR Repaired Repaired 218242 218242 0 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 8/29/2008 502737991 502620150 REPR REPR Repaired Repaired 218242 218242 0 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 8/29/2008 502675451 502620150 REPR REPR Repaired Repaired 218242 218242 0 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 6/18/2008 502737991 502495333 REPR REPR Repaired Repaired 218242 208032 10210 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 6/18/2008 502620150 502495333 REPR REPR Repaired Repaired 218242 208032 10210 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 6/18/2008 502675451 502495333 REPR REPR Repaired Repaired 218242 208032 10210 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 9/7/2007 502737991 501944796 REPR REPR Repaired Repaired 218242 197999 20243 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 6/18/2008 9/7/2007 502495333 501944796 REPR REPR Repaired Repaired 208032 197999 10033 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 9/7/2007 502620150 501944796 REPR REPR Repaired Repaired 218242 197999 20243 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 9/7/2007 502675451 501944796 REPR REPR Repaired Repaired 218242 197999 20243 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 9/23/2005 502737991 500648350 REPR REPR Repaired Repaired 218242 159277 58965 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 6/18/2008 9/23/2005 502495333 500648350 REPR REPR Repaired Repaired 208032 159277 48755 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 9/23/2005 502620150 500648350 REPR REPR Repaired Repaired 218242 159277 58965 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 9/23/2005 502675451 500648350 REPR REPR Repaired Repaired 218242 159277 58965 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 9/7/2007 9/23/2005 501944796 500648350 REPR REPR Repaired Repaired 197999 159277 38722 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 10/4/2008 12/17/2004 502737991 500182407 REPR REPR Repaired Repaired 218242 118935 99307 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 6/18/2008 12/17/2004 502495333 500182407 REPR REPR Repaired Repaired 208032 118935 89097 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 12/17/2004 502620150 500182407 REPR REPR Repaired Repaired 218242 118935 99307 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 8/29/2008 12/17/2004 502675451 500182407 REPR REPR Repaired Repaired 218242 118935 99307 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 9/23/2005 12/17/2004 500648350 500182407 REPR REPR Repaired Repaired 159277 118935 40342 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-F306400-411401-00801 1023-F306400-411401-00801 50001721 50001721 9/7/2007 12/17/2004 501944796 500182407 REPR REPR Repaired Repaired 197999 118935 79064 Module, Cutting Fluff Ovrnite/wing 295mm Module, Cutting Fluff Ovrnite/wing 295mm 80296971 80296971 EA
1023-A614400-443401-00901-00501 1023-A615400-443401-00901-00201 50001564 50001564 3/21/2005 4/15/2005 500357639 500329094 REPL REPL Replaced Repaired 70404 70321 83 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
1023-A615400-443401-00901-00401 1023-A614400-443401-00901-00501 50001564 50001564 4/15/2005 11/14/2005 500722482 500357639 REPL REPL Replaced Replaced 70628 70404 224 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
1023-A615400-443401-00901-00401 1023-A615400-443401-00901-00201 50001564 50001564 3/21/2005 11/14/2005 500722482 500329094 REPL REPL Replaced Repaired 70628 70321 307 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA

here in output it puts tries to take start date as max date and calcuates difference based on max(date)- all remaining dates

but i want in this sequence

1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 10/4/2008 502737991 REPR Repaired 218242 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 8/29/2008 502675451 REPR Repaired 218242 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 8/29/2008 502620150 REPR Repaired 218242 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 6/18/2008 502495333 REPR Repaired 208032 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 9/7/2007 501944796 REPR Repaired 197999 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 9/23/2005 500648350 REPR Repaired 159277 80296971 EA 3
1023-F306400-411401-00801 50001721 Module, Cutting Fluff Ovrnite/wing 295mm 12/17/2004 500182407 REPR Repaired 118935 80296971 EA 3
1023-A614400-443401-00901-00501 1023-A615400-443401-00901-00201 50001564 50001564 3/21/2005 4/15/2005 500357639 500329094 REPL REPL Replaced Repaired 70404 70321 83 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
1023-A615400-443401-00901-00401 1023-A614400-443401-00901-00501 50001564 50001564 4/15/2005 11/14/2005 500722482 500357639 REPL REPL Replaced Replaced 70628 70404 224 Ultrasonic stack #30 Ultrasonic stack #30 80159257 80159257 EA
—————————————————————————
can anybody has the solution for this?

145.
on February 2, 2009 at 9:47 am | Reply Imran Mohammed

@qute/cute

Dude, Your query is incomplete, I spent nearly an hour working on this, But I could not understand what you want in the select statement.

You say you are trying to calculate difference, where is the difference, you have 4 columns you call them Total_Count, you are not even using any function to calculate difference, query is not clear at all and also the sample data… I mean atleast give me the select query properly, data ( insert statements )you provided are not correct comparing to the out put what you have given in your post.

Please post your complete questions.

Regards,
IM.

146.
on February 2, 2009 at 10:01 am | Reply Imran Mohammed

@dyamond

When ever you use, Group BY, you need to have a aggregate function in your select list. The subquery you are trying to use has group by but no Aggregate function, also you are saying having Count(*) > 2 at one place and having Count(*) > 3 at another place, but where exactly is this count(*), are you writing this any place.

SELECT
[date],
[name],
[desc],
‘Code’ as cdDesc
FROM activity
WHERE date = ‘2009-01-05′
and site = ‘80′
and exists
(
select code , count(*) Counts
from activity
where code in (‘55544′,’33333′, ‘66666′)
GROUP BY code HAVING COUNT(*) > 2
)

UNION ALL

SELECT
[date],
[name],
[desc],
‘Code2′ as cdDesc
FROM activity
WHERE date = ‘2009-01-05′
and site = ‘80′
and exists
(
select code, count(*) Counts
from activity
where code in (‘2222′,’3333′, ‘66666′, ‘88888′)
GROUP BY code HAVING COUNT (*) > 3
)

Try using the code like above.

I did not test the script functionality, since I do not have sample dat, If script still does not work then please post table structure script and some sample data and expected output.

So that we can help you out.

Regards,
IM.

147.
on February 3, 2009 at 12:26 pm | Reply Sunil Kumar Basu

/*
Input Param:
@vchStartDate [in yyyymmdd format]
@intDuration [specifying the no. of days for end date calculation]
This code block will calculate the end date based on the supplied start date and duration.
Finally it will return the dates and the day starting from start date to end date both inclusive.
Ist Resultset- Returns All Dates in the range starting from start date to end date.
IInd Resultset- Returns All Dates in the range starting from start date to end date excluding saturdays and sundays.
IIIrd Resultset- Returns All Dates in the range starting from start date to end date excluding weekdays.
*/
DECLARE @vchStartDate VARCHAR(10)
DECLARE @intDuration INT
DECLARE @LclvchEndDate VARCHAR(10)

/* Set input values */
SET @vchStartDate = ‘20090101′
SET @intDuration = 50

/* Calculate End Date */
SET @LclvchEndDate = CONVERT(VARCHAR(10),DATEADD(dd,@intDuration, CAST(@vchStartDate AS DATETIME)),112);
WITH mycte AS
(
SELECT CAST(@vchStartDate AS DATETIME) DateValue
UNION ALL
SELECT DateValue + 1
FROM mycte
WHERE DateValue + 1 <= @LclvchEndDate ) SELECT DateValue INTO #tbl_data FROM mycte OPTION (MAXRECURSION 0) /* Ist Resultset- Returns All Dates in the range starting from start date to end date. */ SELECT CONVERT(VARCHAR(10),DateValue,121) Date, [Day] = CASE WHEN DATEPART(dw,DateValue)= 1 THEN ‘SUNDAY’ WHEN DATEPART(dw,DateValue)= 2 THEN ‘MONDAY’ WHEN DATEPART(dw,DateValue)= 3 THEN ‘TUESDAY’ WHEN DATEPART(dw,DateValue)= 4 THEN ‘WEDNESDAY’ WHEN DATEPART(dw,DateValue)= 5 THEN ‘THURSDAY’ WHEN DATEPART(dw,DateValue)= 6 THEN ‘FRIDAY’ WHEN DATEPART(dw,DateValue)= 7 THEN ‘SATURDAY’ END FROM #tbl_data /* IInd Resultset- Returns All Dates in the range starting from start date to end date excluding saturdays and sundays. */ SELECT CONVERT(VARCHAR(10),DateValue,121) Date, [Day] = CASE WHEN DATEPART(dw,DateValue)= 2 THEN ‘MONDAY’ WHEN DATEPART(dw,DateValue)= 3 THEN ‘TUESDAY’ WHEN DATEPART(dw,DateValue)= 4 THEN ‘WEDNESDAY’ WHEN DATEPART(dw,DateValue)= 5 THEN ‘THURSDAY’ WHEN DATEPART(dw,DateValue)= 6 THEN ‘FRIDAY’ END FROM #tbl_data WHERE DATEPART(dw,DateValue) NOT IN (1,7) /* IIIrd Resultset- Returns All Dates in the range starting from start date to end date excluding weekdays. */ SELECT CONVERT(VARCHAR(10),DateValue,121) Date, [Day] = CASE WHEN DATEPART(dw,DateValue)= 1 THEN ‘SUNDAY’ WHEN DATEPART(dw,DateValue)= 7 THEN ‘SATURDAY’ END FROM #tbl_data WHERE DATEPART(dw,DateValue) IN (1,7) DROP TABLE #tbl_data 148. on February 5, 2009 at 2:10 pm | Reply vishal kulshreshtha Hi pinal, your quary is good ut it has some limitation like identity col. i have a better idea to do delete duplicate record using CTE. like: CREATE TABLE [dbo].[testing3]( [id] [int] NULL, [name] [varchar](50) ) ON [PRIMARY] with dup as(select *,row_number() over(partition by id order by id)as TID from testing3) delete from dup where tid >1

149.
on February 7, 2009 at 7:45 pm | Reply Divya

Hello sir,

You r genious, I love your articles.

thanks

150.
on February 13, 2009 at 11:56 am | Reply Sitaram

Hi Pinal,

I want to learn sql server.I am new in this field.

Thanks

151.
on February 16, 2009 at 2:17 am | Reply Divya Mahendra Sikarwar

Execute these queries for learning how to remove duplicate record.
CREATE TABLE dbo.Test1 (
[ID] [int] ,
[FirstName] [varchar](25),
[LastName] [varchar](25)
) ON [PRIMARY]

INSERT INTO Test1 VALUES(1, ‘Bob’,'Smith’)
INSERT INTO Test1 VALUES(2, ‘Dave’,'Jones’)
INSERT INTO Test1 VALUES(3, ‘Karen’,'White’)
INSERT INTO Test1 VALUES(1, ‘Bob’,'Smith’)
INSERT INTO Test1 VALUES(4, ‘Bobby’,'Smita’)

select identity(int,1,1) as SlNo,* into #temp from Test1

DELETE
FROM #temp
WHERE SlNo NOT IN
(
SELECT MAX(SlNo)
FROM #temp
GROUP BY ID,FirstName,lastname
)

drop table test1

select * into test1 from #temp

alter table test1 drop column SlNo

select * from test1 order by id

152.
on February 24, 2009 at 5:36 pm | Reply SQLAuthority News - Best Articles on SQLAuthority.com Journey to SQL Authority with Pinal Dave

[...] SQL SERVER – Delete Duplicate Records – Rows [...]

153.
on February 24, 2009 at 8:25 pm | Reply Praney

–Try this to delete multiple records

;with t1 as
(
select id,Row_number() over (partition by a, b, c order by a, b, c) as rnum
from mytable with(nolock)
)
delete from t1 where rnum>1

154.
on February 25, 2009 at 3:32 pm | Reply SQL SERVER - Find Nth Highest Salary of Employee - Query to Retrieve the Nth Maximum value Journey to SQL Authority with Pinal Dave

[...] Same example converted in SQL Server 2005 to work with Database AdventureWorks. USE AdventureWorks; GO SELECT TOP 1 Rate FROM ( SELECT DISTINCT TOP 4 Rate FROM HumanResources.EmployeePayHistory ORDER BY Rate DESC) A ORDER BY Rate GO Reference : Pinal Dave (http://blog.SQLAuthority.com), Pravin Phatangare [...]

155.
on March 1, 2009 at 9:57 am | Reply gaurav moolani

hi guys
i have a problem ??? what if the entire row is duplicate ??
with i dentity column also same .

i am transfering data from a table having duplicate rows to a table having a primary key

pls help me out

156.
on March 2, 2009 at 6:36 am | Reply doug in fl

gaurav,
Praney’s sql in the comments will help you out. I have the same situation and that solved the problem.

;with t1 as
(
select id,Row_number() over (partition by a, b, c order by a, b, c) as rnum
from mytable with(nolock)
)
delete from t1 where rnum>1

157.
on March 4, 2009 at 12:25 pm | Reply Luke

Hi Gaurav,
I have duplicate records in a table,till recently i added a
datetime field in my table called timestamp.Now when
duplicates go into this table i will be able to see the current
date in the timestamp

Firstname Lastname memberno timestamp
Jack Hill 11111 2009-12-11
Jack Hill 11111 null
Simon Philip 222222 2009-12-11
Simon Philip 222222 2009-12-12
Sam Wheat 33333 nulll
Sam Wheat 33333 null

In this table i want to fetch a duplicate set which has
one record with a valid timestamp and other timestamp
as null – something like

Firstname Lastname memberno timestamp
Jack Hill 11111 2009-12-11
Jack Hill 11111 null

select max(timestamp) from member group by firstname,lastname having count(*)>1

How do i modify my display duplicate statement to have the resultset which includes duplicates with and without
a valid timestamp

158.
on March 4, 2009 at 12:28 pm | Reply Luke

Hi Pinal

With reference to the above question
Can anyone help me out with a query which can display only duplicates with or witout a timestamp, they should
not include duplicates members all having timestamps

159.
on March 4, 2009 at 5:06 pm | Reply tejasnshah

Hi Luke,

You can use ROW_Number() to identify Duplicate Rows

Here You need to do as:

;with CTE AS(
select Row_NUmber() (OVER PARTITION BY FirstName, LastName, MemberNo Order BY TimeStamp ASC) AS RowID
FROM table
)
select * from cte where rowID=1

this will give you Records with NULL. If you want data with Dates then you need to change “ORDER BY” to “Order BY TimeStamp DESC”

For further details, you can read it on my blog: http://tejasnshah.wordpress.com/2008/12/31/find-duplicate-rows-with-row_number-sql-server-2005/

Thanks,

Tejas

160.
on March 13, 2009 at 6:16 pm | Reply Krish

Hi Pinal,

Your articles have always helped me. Just curious if I could rationalize data in the example below using a variant of this example (I have an option to use SSIS Fuzzy lookup, but its way off my application scope. So it is ruled out.)

Cheers,
Krish

161.
on April 23, 2009 at 12:47 pm | Reply VIKAS

HI PINAL,

I HAVE READ YOUR ARTICLES AND RESPONSES FOR THE QUERIES AND I FOUND IT GOOD.
I TOO HAVE ONE QUERY.

THERE IS TABLE1 AND TABLE2
TABLE 1 CONTAINS 100 RECORDS.
TABLE 2 CONTAINS 150 RECORDS.

TABLE 2 CONTAINS SOME RECORDS WHICH ARE ALREADY PRESENT IN TABLE 1.

NOW WHEN I MERGE BOTH IT ADDS ALL THE RECORDS GIVING TOTAL NO. TO 250.

FROM THESE I WANT TO REMOVE THE DUPLICATE DATA WHICH I GOT FROM TABLE 2 AND ARE ALREADY PRESET IN TABLE 1.

I have tried above methods but did not yield required results.

Be Helpful…

162.
on April 23, 2009 at 4:49 pm | Reply VIKAS

I got the solution of above problem

FRIEND in a NEED

Another Question.

i am joining TABLE1 (older) And TABLE2 (newer).

i found some records which are common in both table.

i want to remove found common data from TABLE2(newer)

How can i perform that operation?

GENERAL QUESTION
How can we perform the Update,Delete operation on query of join

–common records in two tables using inner join
select *
from TABLE1 as e
inner join TABLE2 as c
on e.respondentid = c.respondentid

163.
on April 23, 2009 at 5:33 pm | Reply Brian Tkatch

Vikas, will UNION work for you?

SELECT … FROM Table_1
UNION
SELECT … FROM Table_2

164.
on April 23, 2009 at 6:36 pm | Reply Brian Tkatch

Vikas, for the second question:

DELETE FROM Table2 WHERE EXISTS(SELECT * FROM Table1 WHERE Table1.respondentid = Table2.respondentid)

165.
on April 24, 2009 at 2:47 pm | Reply VIKAS

thnx buddy tkatch.

166.
on April 28, 2009 at 2:42 pm | Reply VIKAS

i have table TABLE1 with data inside.
i want to add one column name as “Serial Number”.
inside this i want to give Automatic Serial No.
Say there are 5000 records then the Serial Number Column should contain1,2,3,4…..5000 at last record.

Be Beedful

167.
on April 28, 2009 at 3:02 pm | Reply RAJ

i got the solution of above query.

“alter table table1 add ID INT IDENTITY(1,1)”

But this inserts the column at the end.

i want this column to be inserted at the start.

i.e. First Column should be of identity Column

168.
on April 29, 2009 at 2:02 pm | Reply Nayan

i have a table TABLE1 of Five Column.
I want to insert one more column in this table after First Column.
How can i insert the New Column from SQL Editor in between the Columns Present not at the end?

169.
on April 29, 2009 at 2:24 pm | Reply Sandy

Hi Pinal,

I HAVE READ YOUR ARTICLES AND RESPONSES FOR THE QUERIES AND I FOUND IT GOOD.
Actully i m new in SQL SERVER2005.

I stucked in a query, if find the solution:-

Table1 Table2

col1|col2 col1|col2
10 100 10 100
11 150 15 200
15 200 13 210
21 210 23 100

Output should be:

col1|col2
10 100
11 150
21 210
13 210
23 100

170.
on April 29, 2009 at 2:26 pm | Reply Sandy

Hi Pinal,

i have one more query:

Delete the duplicates rows from a table if there is not any primary key.

Please find the solution and Thanks in advance.

171.
on May 1, 2009 at 12:30 pm | Reply Gates

To aLL,

I have one table TABLE! , which contains one column named “EMPLOYEE NAME”.
There are various values inside it.
i want to check every field of EMPLOYEE NAME and find out if there is any Numeric Value inside.

EG.
EMPLOYEE NAME
Billgates
George
A1dam –Here 1 is present in between
Gem9ini –Here 9 is present in between

i want to identify this types of name which contains the numeric value in the field.

172.
on May 6, 2009 at 9:30 pm | Reply Domenic

using the instructions that you posted about adding an id column:

“Good Question. I have been asking this question in interview many times to new candidates. Answer is:
1) Add Identity Col and perform the operation displayed in this blog and drop Identity Col.
2) User Cursor and save your values in temp table. If you receive the same row which you put in temp table delete it.
3) You can use while loop as well do the same as cursor”

i get the error message from sql telling me it cannot insert the value NULL into column “id” column does not allow nulls (i have got this after trying to set the primary key on the id field)

173.
on May 7, 2009 at 8:36 am | Reply Imran Mohammed

@Domenic

I don’t think, I understood your question completely,

I think you asked, why do you get this error:
error message from sql telling me it cannot insert the value NULL into column “id” column does not allow nulls (i have got this after trying to set the primary key on the id field)

Try this,
– Step 1: Create a table
CREATE TABLE example1 ( id INT NOT NULL , ename VARCHAR(10))

– Step 2 : Insert some data in this table, few duplicate records
INSERT INTO example1 VALUES( 1, ‘imran’)
INSERT INTO example1 VALUES( 1, ‘imran’)
INSERT INTO example1 VALUES( 2, ‘imran’)
INSERT INTO example1 VALUES( 2, ‘imran’)
INSERT INTO example1 VALUES( 3, ‘imran’)

– Step 3: lets add Identity Column
ALTER TABLE example1
ADD Iden INT IDENTITY

– Step 4: Lets remove duplicate records.
DELETE
FROM example1
WHERE Iden NOT IN
(
SELECT MAX(Iden)
FROM example1
GROUP BY id, ename)

– Check if duplicate records still exists ?
SELECT * FROM example1

– Step 6: drop Identity column, which we added earlier.
ALTER TABLE example1
DROP COLUMN Iden

– Step 7: Lets make id column Primary key.
ALTER TABLE example1
ADD CONSTRAINT PK_Example1 PRIMARY KEY (id)

This works well with no issues.

if you replace step 1: with below script
– step1 (a) : Create a table

CREATE TABLE example1 ( id INT , ename VARCHAR(10))

if you repeat from step 2 – Step 7, step7 will fail, because id column in example1 table allows Null as you can see, I did not create that column with NOT NULL constraint, that is why you cannot make that column as primary key.

Does this answer your question ? If not please be clear what you actually want to ask ?

~ IM.

174.
on May 12, 2009 at 11:56 am | Reply Mostafa

hello

thanks for this script

175.
on May 12, 2009 at 3:46 pm | Reply Jerry

Table1 Table2

ID Sim_No Sim_No Plate
1 2203080 2203080 AD2345
2 2203081
3 2203082

How I delete The Sim No From Table1(Same Record In Table2 ) ( If I Take any No From Table1 & put In To Table 2)

176.
on May 12, 2009 at 9:46 pm | Reply Brian Tkatch

@Jerry

Whatever routine is used to INSERT the data INTO Table2, can also DELETE from Table1.

Or, for a general solution (which may not be very quick) DELETE any records in Table1 that EXIST in Table2:

DELETE FROM Table1 WHERE EXISTS (SELECT * FROM Table2 WHERE Table2.Sim_No = Table1.Sim_No)

177.
on May 19, 2009 at 3:42 pm | Reply tamilselvan

ItemTable

ItemNo BatchNo Qty
11 101 100
11 102 500
12 101 100
12 101 500
13 105 1000
11 101 600

Now I want like this
ItemNo BatchNo Qty
11 101 700
11 102 500
12 101 600
13 105 1000

how to write query for this
the above formate has Sample record in our project it have 35000 record

178.
on May 19, 2009 at 7:06 pm | Reply Brian Tkatch

@tamilselvan

Do you want a regular GROUP BY?

SELECT ItemNo, BatchNo, SUM(Qty)
GROUP BY ItemNo, BatchNo;

179.
on May 19, 2009 at 7:51 pm | Reply Imran Mohammed

create table example1 (ItemNo int ,BatchNo int ,Qty int)

insert into example1 values (11 ,101, 100)
insert into example1 values (11, 102, 500)
insert into example1 values (12, 101, 100)
insert into example1 values (12, 101, 500)
insert into example1 values (13, 105, 1000)
insert into example1 values (11, 101, 600)

select ItemNo
,BatchNo
,Sum(Qty) Qty
from example1
group by ItemNo
,BatchNo

180.
on June 8, 2009 at 5:52 pm | Reply gpshiburaj

I have a few records that are duplicates, all I need to do is mark them as duplicates.

select field01 from table01
field01
AA
AB
AC
AB
AA
AD

All that I require is
select field01, mark01 from table01 to return
field01 ; mark01
AA; 1
AB; 1
AC; 1
AB; 2
AA; 2
AD; 1

181.
on June 8, 2009 at 9:45 pm | Reply Brian Tkatch

@gpshiburaj

ROW_NUMBER() does that:

WITH
Data(field01)
AS
(
SELECT ‘AA’ UNION ALL
SELECT ‘AB’ UNION ALL
SELECT ‘AC’ UNION ALL
SELECT ‘AB’ UNION ALL
SELECT ‘AA’ UNION ALL
SELECT ‘AD’
)
SELECT
field01,
ROW_NUMBER() OVER(PARTITION BY field01 ORDER BY field01)
FROM
Data;

182.
on June 9, 2009 at 3:25 pm | Reply gpshiburaj

Thanks Brian,
I was thinking that we would have to use a cursor for this, using row_number is easier.

183.
on June 16, 2009 at 4:09 pm | Reply amit jain

Hello Pinal,

Is there any procedure to restore deleted records from particular table in database?

Thank you
Amit Jain

184.
on June 23, 2009 at 2:39 pm | Reply amit

Hello sir,

You r genious, I love your articles.

thanks

185.
on July 3, 2009 at 3:26 pm | Reply Lucky Ahuja

Hey Guys thats all fine . U can store the values in the Temp table remove the duplicate records and empty the original table . Then move all the unique records to the original table . But its a lengthy process suppose i am having 2 million records in a table and from that table if i want to ren=move the duplicate records then it would really affect the performance . Sorry Since i am not having so much of exp and if i had hurted someone . Pinal Sir do let me know if i am wrong

186.
on July 13, 2009 at 11:56 am | Reply aasim abdullah

excellent post, Senthilnathan comments really helped me to delete duplicate rows with most appropriate way.

187.
on July 29, 2009 at 10:24 am | Reply Neto

Thanks for your help man!!!

It helps me a lot!!!

Greats from Lima, Perú.

188.
on August 3, 2009 at 11:29 am | Reply ARNAB SARKAR

TABLE- MASTER,W_CODE IS COLUMN NAME.

DELETE FROM MASTER WHERE W_ID IN(SELECT MAX(W_ID) FROM MASTER WHERE W_CODE IN(SELECT W_CODE FROM MASTER GROUP BY W_CODE HAVING COUNT(W_CODE)>1))

189.
on August 4, 2009 at 5:22 pm | Reply siddique ahmad

Thanks a lot this really help me in removing the duplicate invoice added due to my code

190.
on August 7, 2009 at 7:54 pm | Reply Zoltan

Hello Pinal,

Could you please help me? ..When I ran my script it removed both of the records. Could you please advice me why?
I do not have index in that table.

Thank you!

drop table dupcalref;
create temporary table dupcalref
SELECT MAX(calownerid) as id FROM calref
group by calownerid, calenderid having count(calownerid)>1;

alter table dupcalref add index(id);

delete FROM calref where calownerid in
(select id from dupcalref);

191.
on August 7, 2009 at 10:59 pm | Reply Brian Tkatch

@Zoltan

Remove “calownerid” from the GROUP BY clause.

192.
on August 10, 2009 at 6:39 pm | Reply Zoltan

Brian,

Thank you for your help! Worked like a charm!

Thanks!

193.
on August 10, 2009 at 9:42 pm | Reply Rich

You could use CTE to delete the duplicates if you have SQL 2005 or above.

Create the duplicate table: ———————————–
IF OBJECT_ID(‘SalesHistory’) IS NOT NULLDROP TABLE SalesHistoryCREATE TABLE [dbo].[SalesHistory] ( [Product] [varchar](10) NULL, [SaleDate] [datetime] NULL, [SalePrice] [money] NULL ) GOINSERT INTO SalesHistory(Product, SaleDate, SalePrice)SELECT ‘Computer’,'1919-03-18 00:00:00.000′,1008.00UNION ALLSELECT ‘BigScreen’,'1927-03-18 00:00:00.000′,91.00UNION ALLSELECT ‘PoolTable’,'1927-04-01 00:00:00.000′,139.00UNION ALLSELECT ‘Computer’,'1919-03-18 00:00:00.000′,1008.00UNION ALLSELECT ‘BigScreen’,'1927-03-25 00:00:00.000′,92.00UNION ALLSELECT ‘PoolTable’,'1927-03-25 00:00:00.000′,108.00UNION ALLSELECT ‘Computer’,'1919-04-01 00:00:00.000′,150.00UNION ALLSELECT ‘BigScreen’,'1927-04-01 00:00:00.000′, 123.00UNION ALLSELECT ‘PoolTable’,'1927-04-01 00:00:00.000′, 139.00UNION ALLSELECT ‘Computer’,'1919-04-08 00:00:00.000′, 168.00
=======================================

Remove the duplicates —————————————

;WITH SalesCTE(Product, SaleDate, SalePrice, Ranking)AS(SELECT Product, SaleDate, SalePrice,Ranking = DENSE_RANK() OVER(PARTITION BY Product, SaleDate, SalePrice ORDER BY NEWID() ASC)FROM SalesHistory)DELETE FROM SalesCTEWHERE Ranking > 1

194.
on August 17, 2009 at 7:34 pm | Reply ryan

Pinal,

Thanks a ton, I spent a day experimenting with how to do this before I came across your solution, I added a key to my table and this worked like a charm.

195.
on August 24, 2009 at 11:07 pm | Reply Manoj

I want to delete the duplicate record ( i.e i have 10 duplicate record , i want to delete a particular record leaving all the remaining 9 records means , how can i doit , plz reply me )

*
on January 22, 2010 at 7:39 pm | Reply Mukesh oracle

please clarify your question which particular record u want to delete from 10 record. then i will give u answer.
if u want delete only single record out of ten record then fallow this: ( with respect to oracle database)

1)
delete from table_name
where Rowid in (select max (Rowid) from from table_name
group by all_duplicate_column;

(e.g here table_name: Emp , all_duplicate_column: Emp_id)

note: u can use min() fun at max place result ud be same

2)
delete from table_name x
where Rowid in (select max (Rowid) from from table_name y
where x.rowid = y.rowid);

Note: both quary delete only 1 record and return 9 dublicate record

196.
on August 27, 2009 at 6:32 am | Reply Avineet

HI Manoj, you can do so by setting row count 1 and put the delete statement, this way you will delete only 1 record of the 10 duplicate records. Avineet

197.
on October 1, 2009 at 4:28 pm | Reply Great Man

Very nice and I am proud of you. I am very excited.

198.
on November 5, 2009 at 12:13 pm | Reply Joe

Hi Sir
i use Microsoft access 2003
how to make the same record not to calculate the total or can i count the same record and devided by the count so that i can get original sum account
because duplicate record sum it all up
which i only want to have only one record

199.
on November 10, 2009 at 2:00 pm | Reply Amit Dhall

delete from tablename where NOT IN ( Select distinct * from tablename)

200.
on December 11, 2009 at 3:08 pm | Reply Deepak

Hi Pinal
How are you , You have all the good articale
My question is that
I have table look like
ID Name
1 Deepak
1 Deepak
1 Deepak
1 Deepak
1 Deepak

how to write a single query
Remain one Row eccept All row has been Deleted

201.
on December 19, 2009 at 5:03 am | Reply Sandeep Nallabelli

You can use the following query which has been published in the above posts:-

CREATE TABLE #Table3 (col1 int, col2 int)
INSERT INTO #Table3 VALUES (1, 1000)
INSERT INTO #Table3 VALUES (1, 1000)
INSERT INTO #Table3 VALUES (1, 1000)
INSERT INTO #Table3 VALUES (1, 1000)
INSERT INTO #Table3 VALUES (1, 1000)
INSERT INTO #Table3 VALUES (1, 1000)

SELECT * FROM #Table3;

WITH T1 AS (SELECT (ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col1)) AS RNum FROM #Table3)
DELETE FROM T1 WHERE RNum IN (SELECT a.RNum FROM T1 AS a, T1 AS b WHERE a.RNum > b.RNum GROUP BY a.RNum)

SELECT * FROM #Table3

202.
on December 19, 2009 at 10:57 am | Reply kirankharat

Hi I want to get duplicate records from the table but condition is,
suppose there r 3 columns A,B,C & in that
A column has data like…kiran is working.. & in B column data is …ramesh is working…..

I want to delete particular record i.e delete only common data like ” is working” from both A & B column…

Any body help me the same….

Thanks in Advance….

kiran…

203.
on January 21, 2010 at 5:30 pm | Reply malleswarareddy_m

hi,

Pinal can u please tell me about to getting olny duplicate records.
iam able to get duplicate decords from my table and but there was an unique identifier so we are getting only on record the duplicate record will not be get retrived

204.
on January 21, 2010 at 8:03 pm | Reply Deep

hi
i applied query in my database as

delete from table where id not in (select max(id) from table group by duplicate column name………)

but its not working in sql server 2000 .

this query is only useful for oracle where we can replace id as row id .

so please help me if any one have ans of this question?

Deep

205.
on January 22, 2010 at 7:25 pm | Reply Deep

DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn2)

This qyery is not working can any one tell me why we are using id inthis qury and in oracle it is rowid. so please give me same query to resolve the issue.

*
on January 25, 2010 at 8:13 pm | Reply Brian Tkatch

@Deep

As the article states, this is when there is an identity available for use as the PK.

If there is no identity, use ROW_NUMBER:

http://blog.sqlauthority.com/2009/06/23/sql-server-2005-2008-delete-duplicate-rows/

206.
on January 28, 2010 at 1:29 pm | Reply kewlsql

Hi,

I have 2 tables. I would like to merge the two tables and remove any duplicates of email address.
Table 1: Purchaser (Coy id, coy name & email add)
Table 2: Member Table (Coy id, coy name & email add)

Tried using the UNION sql doesnt work and group by as well.

207.
on January 28, 2010 at 10:17 pm | Reply Pinal Dave

Hi,

WE can get the distinct from the union of these two tables as below:

SELECT DISTINCT * FROM
(SELECT * FROM Purchaser
UNION
SELECT * FROM Member)

Another new method in SQL Server 2008 is using MERGE clause.

Regards,
Pinal Dave

208.
on January 29, 2010 at 12:45 pm | Reply kewlsql

Hi Pinal,

Thanks for the fast respond :)
Tried the one you advise as well. Still unsuccessful
—————————————————————————
SELECT DISTINCT * FROM

(SELECT [Test_Acctron Mbr Email].company_id,[Test_Acctron Mbr Email].company_name, [Test_Acctron Mbr Email].contact_email1
FROM [Test_Acctron Mbr Email]

UNION

SELECT [Publication Purchasers List].CoyID, [Publication Purchasers List].CoyName ,[Publication Purchasers List].Email1
FROM [Publication Purchasers List])
——————————————————————

My Original : – Seems “syntax error in FROM Clause”
=============================
SELECT P.CoyID,P.CoyName,P.Email

FROM [SELECT Max([Test_Acctron Mbr Email].company_id) AS CoyID, Max([Test_Acctron Mbr Email].company_name) AS CoyName, [Test_Acctron Mbr Email].contact_email1 AS Email
FROM [Test_Acctron Mbr Email]
UNION
SELECT Max([Publication Purchasers List].CoyID) AS CoyID, Max([Publication Purchasers List].CoyName) AS CoyName, [Publication Purchasers List.Email] AS Email
FROM [Publication Purchasers List]]. AS P
GROUP BY P.email;
=============================

209.
on January 29, 2010 at 8:52 pm | Reply Pinal Dave

Hi,

We are missing the derived table name. So use the below query:

SELECT DISTINCT * FROM
(SELECT [Test_Acctron Mbr Email].company_id,[Test_Acctron Mbr Email].company_name, [Test_Acctron Mbr Email].contact_email1
FROM [Test_Acctron Mbr Email]
UNION
SELECT [Publication Purchasers List].CoyID, [Publication Purchasers List].CoyName ,[Publication Purchasers List].Email1
FROM [Publication Purchasers List]) as tmp

Regards,
Pinal Dave

210.
on February 1, 2010 at 12:53 pm | Reply kewlsql

Apply the same theory and it works : ) Thanks Pinal.

211.
on February 2, 2010 at 12:50 am | Reply Rathin

hi,

The problem is i am having the columns of products_model, products_id(primary key), products_last_modified_date……etc

Here i have duplicate records of products_model, so i needs to remove old duplicate records and retain the new records based on products_last_modified_date column
kindly replay me

Thanks

212.
on February 2, 2010 at 10:13 am | Reply Imran Mohammed

@Rathin

Check if this helps…

– ** WORKS ONLY IN SQL SERVER 2005 **

declare @Example1 table
( Products_id int –constraint PK_Products_id primary key
,Products_model varchar(50) NULL
, Products_last_modified_date datetime NULL)

– Nothing Loaded
select * from @Example1

insert into @Example1 values (1, ‘ABC’, ‘1/1/2000′)
insert into @Example1 values (23, ‘ABC’, ‘2/1/2000′)
insert into @Example1 values (34, ‘ABC’, ‘3/1/2000′) — This is Latest Record
insert into @Example1 values (46, ‘DEF’, ‘1/1/2000′)
insert into @Example1 values (51, ‘DEF’, ‘3/1/2000′)– This is Latest Record
insert into @Example1 values (63, ‘DEF’, ‘2/1/2000′)

– Everything Loaded
select * from @Example1

– Deleting duplicates, given condition: dont delete latest record.
– Logic: Picking Max of dates and comparing max date with all dates for that specific record, if max date is given date that record will be ignored otherwise it will be deleted.

Delete D From
(select Products_id
,Products_model
,Products_last_modified_date
,MAX(Products_last_modified_date) OVER (PARTITION BY Products_Model )MAX_DATETIME_STAMP
from @Example1) D
Where Products_last_modified_date MAX_DATETIME_STAMP

– ** IMPORTANT ** —
– Please add your complete logic to OVER PARTITION BY

– After Delete
select * from @Example1

~ IM.

213.
on February 2, 2010 at 10:17 am | Reply Ashish Gilhotra

@Rathin

You can do something like that

Select * from
(SELECT *,row_number() Over(partition by products_model order by products_last_modified desc) as RN from youttable)t
where t.RN=1

214.
on February 2, 2010 at 6:46 pm | Reply Gulshan Bareja

Its really good…I was looking for delete duplicate record query finally I got this here.

Thanks
Gulshan Bareja

215.
on February 2, 2010 at 8:50 pm | Reply rathin

hi,

Sorry i am mentioned that having the problem with Mysql DB

(The problem is i am having the columns of products_model, products_id(primary key), products_last_modified_date……etc

Here i have duplicate records of products_model, so i needs to remove old duplicate records and retain the new records based on products_last_modified_date column
kindly replay me)

Is it possible , can we solve this by using Group by & having function?

216.
on February 5, 2010 at 5:46 pm | Reply Souvik

Hi Pinal,

Thanks for your post. I have another idea.
It can also be done in this way and it will work even if the Table does not have any identity column.

Query:

WITH xTab AS
(
SELECT
DuplicateColumn1,
DuplicateColumn2,
ROW_NUMBER() OVER (PARTITION BY DuplicateColumn3 ORDER BY Column3) AS [Count]
FROM MyTable
)
DELETE FROM xTab WHERE [Count] > 1

*
on February 5, 2010 at 10:25 pm | Reply Pinal Dave

Hello Souvik,

Thank you for sharing the method of getting duplicate using CTE. I appreciate this method as it is easy and better.
But the PARTITION BY clause should include all columns of table while you have used only one column.

Regards,
Pinal Dave

217.
on February 5, 2010 at 5:48 pm | Reply Souvik

Hi,

Please replace the sixth line of the query with this:

ROW_NUMBER() OVER (PARTITION BY DuplicateColumn3 ORDER BY DuplicateColumn3 )

218.
on February 23, 2010 at 10:28 am | Reply Jyoti

Hi,

I have one doubt. In this site for the query for fetching the Nth highest salary, we could see that the subquery contains Order By clause.

But when I see the properties of a sub query, I could see the statement that “A subquery cannot contain a ORDER-BY clause.”

Could you please clarify this.

Regards,
Jyoti

No comments:

Post a Comment