How do you number SELECT query results using SQL?
Let’s say you have a team
table in your database that stores the name and
number of gold, silver, and bronze medals won for each team:
team | ||||
---|---|---|---|---|
id | name | gold | silver | bronze |
1 | Taiwan #1 | 4 | 2 | 1 |
2 | There was an attempt | 0 | 0 | 0 |
4 | Team 4 Plays | 3 | 2 | 2 |
5 | Five Gals | 3 | 2 | 2 |
6 | Sixcess guaranteed | 5 | 4 | 1 |
7 | We ate nine | 0 | 6 | 3 |
8 | AMERICA F*** YEAH | 5 | 4 | 0 |
As you can see, this table is currently ordered by id
. We can use the ORDER BY
clause to request a version of this data that shows teams with the most gold,
silver, and bronze medals first:
While this gives you all the information you need, it would be more convenient if the rows were also numbered. You could number them yourself, but why do that when you can do it just as easily using SQL?
There are two so-called window functions, ROW_NUMBER()
and RANK()
, that can
be used to number rows in SQL output.
The first window function, ROW_NUMBER()
, simply assigns a unique number from 1
to whatever the number of rows in the query result is. The snippet below shows
how ROW_NUMBER()
can be used to number teams. Note that the ORDER BY
clause
has been moved to ROW_NUMBER() OVER ()
.
This can be useful if the result is meant to represent a sequence or if you simply need a way to refer to each row that is not based on an actual ID.
The other window function, RANK()
, ranks your rows using the ordering criteria
that you define within the OVER ()
part. This is particularly useful when the
results represent a list from best to worst (as in this example) from largest
to smallest, smallest to largest, etc.
ROW_NUMBER()
should be widely supported by now. Nevertheless, there are still
many older databases that do not support it yet. It wouldn’t surprise me if most
of those are older versions of MySQL on shared hosting plans.
If that happens to be the case for you, there’s a workaround that you can use to
emulate the functionality of ROW_NUMBER()
.
The snippet below shows how you can do this. Note that it consists of two queries, which , in the correct order.
It works by first initialising a numeric variable, which you can then increment and add to each record that you retrieve from the database.
RANK()
is , but it’s still quite doable. The example
below shows how ranks can be generated by joining
the team
table with itself.
First, the original team
table is given an alias, team_result
. This where
most of the data in our query output will come from.
Then we join the original team
table (now called team_result
) with another
instance of team
, which we call better_team
. The “better” part is defined in
the JOIN
condition as any other team (team_result.id != better_team.id
)
that has won more gold, silver, and/or bronze medals (the entire part after the
AND
).
We can use GROUP BY
and COUNT(better_team.id)
to count how many better-performing
competitors each team has. This gets us very close to what we want. However,
because the best team obviously isn’t bested by any other team, this would make
the numbering begin at 0. We can fix this by incrementing the value of
COUNT(better_team.id)
by 1.
Finally, we sort the results using the newly computed rank
field to obtain the
same results that we saw earlier when we used RANK()
.
Easy! (Sort of.)