Friday, March 12, 2010

Row_Number() function in SQL Server 2005

As we all know, SQL Server 2005 has new features when compared to SQL Server 2000. One of the features that we are going to discuss in this article is the Row_Number() function. SQL Server Database administrators and developers have been longing for this function for a long time--now the wait is over.

Traditionally developers and Database administrators used temporary tables and co-related sub-queries to generate calculated row numbers in a query. Now SQL Server 2005 provides a function, which replaces all of the additional resources we used to generate row numbers.

Let us assume that we have the following database [EMPLOYEE TEST] and the following table [EMPLOYEE] in the database. You can use the below query to create the database, table and all the corresponding rows.

USE [MASTER]
GO
IF EXISTS
(SELECT NAME FROM SYS.DATABASES WHERE NAME = N'EMPLOYEE TEST')
DROP DATABASE [EMPLOYEE TEST]
GO
CREATE DATABASE [EMPLOYEE TEST]
GO
USE [EMPLOYEE TEST]
GO
IF EXISTS
(SELECT * FROM SYS.OBJECTS
WHERE OBJECT_ID =
OBJECT_ID(N'[DBO].[EMPLOYEE]') AND TYPE IN (N'U'))
DROP TABLE [DBO].[EMPLOYEE]
GO
CREATE TABLE EMPLOYEE (EMPID INT, FNAME VARCHAR(50),
LNAME VARCHAR(50))
GO
INSERT INTO EMPLOYEE (EMPID, FNAME, LNAME)
VALUES (2021110, 'MICHAEL', 'POLAND')
GO
INSERT INTO EMPLOYEE (EMPID, FNAME, LNAME)
VALUES (2021110, 'MICHAEL', 'POLAND')
GO
INSERT INTO EMPLOYEE (EMPID, FNAME, LNAME)
VALUES (2021115, 'JIM', 'KENNEDY')
GO
INSERT INTO EMPLOYEE (EMPID, FNAME, LNAME)
VALUES (2121000, 'JAMES', 'SMITH')
GO
INSERT INTO EMPLOYEE (EMPID, FNAME, LNAME)
VALUES (2011111, 'ADAM', 'ACKERMAN')
GO
INSERT INTO EMPLOYEE (EMPID, FNAME, LNAME)
VALUES (3015670, 'MARTHA', 'LEDERER')
GO
INSERT INTO EMPLOYEE (EMPID, FNAME, LNAME)
VALUES (1021710, 'MARIAH', 'MANDEZ')
GO

Let us browse the table Employee by using the following SQL Query.

SELECT EMPID, FNAME, LNAME FROM EMPLOYEE

The results of the above query look like illustration 1.0.

2021110


MICHAEL


POLAND

2021110


MICHAEL


POLAND

2021115


JIM


KENNEDY

2121000


JAMES


SMITH

2011111


ADAM


ACKERMAN

3015670


MARTHA


LEDERER

1021710


MARIAH


MANDEZ

Illustration 1.0

Traditionally in SQL Server 2000, in order to create row numbers based on the rows available in a table, we used to use the following query.

SELECT ROWID=IDENTITY(int,1,1) , EMPID, FNAME, LNAME
INTO EMPLOYEE2 FROM EMPLOYEE ORDER BY EMPID

This query created a new table using the identity function in order to generate RowId.

Let us query the table by using the following query.

SELECT RowID, EMPID, FNAME, LNAME FROM EMPLOYEE2

The results of the above query would look like illustration 1.1.

1


1021710


MARIAH


MANDEZ

2


2011111


ADAM


ACKERMAN

3


2021110


MICHAEL


POLAND

4


2021110


MICHAEL


POLAND

5


2021115


JIM


KENNEDY

6


2121000


JAMES


SMITH

7


3015670


MARTHA


LEDERER

Illustration 1.1

In this illustration it is clear that the table has a duplicate row with EMPID = 2021110.

To delete the duplicate row with EMPID = 2021110, we have to delete the row in employee2 table and I cannot delete the duplicate row directly from the Employee table.

SQL Server 2005 provides a new function, Row_Number(), for generating row numbers. In order to delete the duplicate row from the original table we can use the features, Common Table Expression and Row_Number() together.

Let us generate the ROWID using the Row_Number() function based on EMPID.

SELECT ROW_NUMBER() OVER (ORDER BY EMPID ASC) AS ROWID, * FROM EMPLOYEE

The results of the above query would look like illustration 1.2.

1


1021710


MARIAH


MANDEZ

2


2011111


ADAM


ACKERMAN

3


2021110


MICHAEL


POLAND

4


2021110


MICHAEL


POLAND

5


2021115


JIM


KENNEDY

6


2121000


JAMES


SMITH

7


3015670


MARTHA


LEDERER

Illustration 1.2

In this result set, we can identify the duplicate row for the EMPID 2021110.

Let us display the duplicate row using the Common Table expression and Row_Number() function by using the following query.

WITH [EMPLOYEE ORDERED BY ROWID] AS
(SELECT ROW_NUMBER() OVER (ORDER BY EMPID ASC) AS ROWID, * FROM EMPLOYEE)
SELECT * FROM [EMPLOYEE ORDERED BY ROWID] WHERE ROWID =4

The results of the above query would look like illustration 1.3.

4


2021110


MICHAEL


POLAND

Illustration 1.3

This duplicate row can be deleted using the Common Table expression and Row_Number() function by using the following query.

WITH [EMPLOYEE ORDERED BY ROWID] AS
(SELECT ROW_NUMBER() OVER (ORDER BY EMPID ASC) AS ROWID, * FROM EMPLOYEE)
DELETE FROM [EMPLOYEE ORDERED BY ROWID] WHERE ROWID =4

Let us query the table using the following query.

SELECT * FROM EMPLOYEE

The results of the above query would look like illustration 1.4.

2021110


MICHAEL


POLAND

2021115


JIM


KENNEDY

2121000


JAMES


SMITH

2011111


ADAM


ACKERMAN

3015670


MARTHA


LEDERER

1021710


MARIAH


MANDEZ

Illustration 1.4

In this illustration, we can see that the duplicate row has been deleted.
Conclusion

In this article, we have discussed the new features of Row_Number() function and Common Table Expression and used both the features together to delete duplicate rows.

No comments:

Post a Comment