.NET Journey: SQL Ranking functions

Ranking functions return a ranking value for each rows in your result set. Many tasks, like generating sequential numbers, finding ranks, and so on, which in pre-2005 SQL Server versions requires many lines of code, now can be implemented much easier and faster.

Look at the following example:

SELECT FirstName, LastName, PostalCode, 
       ,ROW_NUMBER() OVER (ORDER BY PostalCode) AS 'Row Number'
       ,RANK() OVER (ORDER BY PostalCode) AS 'Rank'
       ,DENSE_RANK() OVER (ORDER BY PostalCode) AS 'Dense Rank'
       ,NTILE(2) OVER (ORDER BY PostalCode) AS 'Bi-tile'
FROM   Persons

Result Set:

FirstName	LastName	PostalCode	Row Number	Rank	Dense Rank	Bi-tile
Joe	Harley	3007	1	1	1	1
David	Bigel	3007	2	1	1	1
Joel	Friedlaender	3007	3	1	1	1
Kenny	Lam	3128	4	4	2	2
Nirmal	Parrera	3128	5	4	2	2
Gurnam	Madan	3128	6	4	2	3
Troy	Parker	3150	7	7	3	3
Melanie	????	3150	8	7	3	4
Jeremy	Pickhaver	3150	9	7	3	4

Lets look at each of these ranking functions:

1. ROW_NUMBER()

ROW_NUMBER ()     OVER ( [ <partition_by_clause> ] <order_by_clause> )

This function returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition. The ORDER BY clause determines the sequence in which the rows are assigned their unique ROW_NUMBER within a specified partition. In the above example we have not specified any partition columns, so the ROW_NUMBER is assigned based on the ordering of postal code column for each row.

2. RANK()

RANK ()    OVER ( [ < partition_by_clause > ] < order_by_clause > )

This function returns the rank of each row within the partition of a result set. The rank of a row is one plus the number of ranks that come before the row in question. You can also group the the rankings by using the PARTITION BY clause. In the above example the rankings are the same for rows having the same value in postal code column. But the ranking value keeps on incrementing for each row, so that when a new postal code value is encountered, ranking value on that new row will be one more than the number of proceeding rows.

3. DENSE_RANK()

DENSE_RANK ()    OVER ( [ < partition_by_clause > ] < order_by_clause > )

This function returns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question. In the above example rankings are incremented as the value in postal code column changes.

4. NTILE()

NTILE (integer_expression)    OVER ( [ <partition_by_clause> ] < order_by_clause > )

This function distributes the rows in an ordered partition into a specified number of groups. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs. The integer constant expression specifies the number of groups into which each partition must be divided. In the above examples the rows are divided in 2 groups. Because the total number of rows(9) is not divisible by the number of groups, the first group has three rows and the remaining groups have two rows each.

These functions can be really handy for complex sorting and generating sequential record set. These windowed functions can only be used in SELECT and ORDER BY statements. You can read more about them here.

.NET Journey

Pages

Thursday, November 11, 2010

SQL Ranking functions

No comments:

Post a Comment