This article brings about the 178th question of LeetCode-SQL, which explains the ranking problem in MySQL. It is a very important and practical article. It is really recommended to collect and save:
- Topic introduction
- analysis of idea
- 3 different window functions
- Implement window function in MySQL5
I first came into contact with the ranking in SQL in a book written by a Japanese author MICK: "Advanced SQL Tutorial", if you are interested, you can read it carefully, which is very helpful for SQL improvement.
First introduce the specific topic: write a SQL query to achieve score ranking. If the two scores are the same, the ranks (Rank) of the two scores are the same. Please note that the next ranking after the split should be the next consecutive integer value. In other words, there should be no "gap" between the rankings.
+----+-------+ | Id | Score | +----+-------+ | 1 | 3.50 | | 2 | 3.65 | | 3 | 4.00 | | 4 | 3.85 | | 5 | 4.00 | | 6 | 3.65 | +----+-------+
For example, according to the Scores table given above, your query should return (sorted from highest to lowest score): the same score is taken in the same ranking, and there is no interval between the next rankings.
+-------+------+ | Score | Rank | +-------+------+ | 4.00 | 1 | | 4.00 | 1 | | 3.85 | 2 | | 3.65 | 3 | | 3.65 | 3 | | 3.50 | 4 | +-------+------+
This kind of thinking is a method I learned from Japanese author books.
select s1.Score -- 分数 ,(select count(distinct s2.Score) -- 大于等于此分数的分数值的不重复个数 from Scores s2 where s2.Score > s1.Score) + 1 as `Rank` -- 排名 from Scores s1 order by 2; -- 2表示第2个字段 -- 变换方式 select s1.score, (select count(*) + 1 -- 排名不能从1开始 from (select distinct score from Scores) s2 -- 1、找出表中有多少种不同的分数：有多少分数就有多少排名 where s2.score > s1.score) as Rank -- 2、筛选大于某个分数 from Scores s1 order by 2;
Idea 2 and Idea 1 have the same idea, the use of where sentence
select s1.Score ,count(distinct(s2.Score)) `Rank` from Scores s1, Scores s2 where s1.Score <= s2.Score group by s1.Id order by 2; -- 2表示第2个字段
The code is interpreted as:
1. Two identical tables are commanded as s1 and s2 respectively
2. Given s1.Score, find out how many scores meet: s2.Score >= s1.Score. For example, s1.Score=3.65, then: [4.00 ,4.00, 3.85, 3.65, 3.65] meets the requirements, but the same score has the same ranking, so the score is deduplicated: count(distinct s2.Score), then 3.65 Is ranked 3
3. Group by will group and rank the data of s1, otherwise only one piece of data will be returned
4. Ascending order of ranking
Whether it is idea 1 or idea 2, basically it is realized in two steps:
- The first part is the score in descending order
- The second part is the ranking corresponding to each score
1. Regarding the realization of the first part: direct ranking in descending order
select a.Score from Scores a order by a.Score DESC -- 直接根据分数降序实现
2. Regarding the realization of the second part: Suppose you are now given a score S, how do we calculate its rank?
We can first extract all score sets H greater than or equal to S, and the number of elements after deduplication of H is the ranking of S. For example, in a certain test, Xiao Ming scored 98 points, which is greater than Xiao Ming’s score [100,99,100,99,98]; after de-duplication, it is [100,99,98], with a total of 3 elements; therefore, Xiao Ming’s ranking is 3rd. .
First extract the set H that meets the requirements:
select b.Score from Scores b where b.Score >= S;
Then the number after deduplication of the set H is used as the ranking:
select count(distinct b.Score) -- 去重后的个数当做排名 from Scores b where b.Score >= S as Rank;
In fact, the S here corresponds to a.Score:
select a.Score as Score, (select count(distinct b.Score) from Scores b where b.Score >= a.Score) as Rank -- 将S替代成了a.Score from Scores a order by a.Score DESC;
Window functions, also called
OLAPfunctions (Online Anallytical Processing, online analysis and processing), can perform real-time analysis and processing of database data.
The basic syntax of window functions:
<窗口函数> over (partition by <用于分组的字段名> -- partition子句可省略，不指定分组 order by <用于排序的列名>)
<窗口函数>Two functions can be placed in the position:
- Dedicated window functions, such as rank, dense_rank, row_number, etc.
- Aggregate functions, such as sum, avg, count, max, min, etc.
- Has the function of grouping and sorting at the same time
- Do not change the number of rows in the original table
- In principle, window functions can only be written in
There are 3 dedicated window functions in MySQL 8.X or hive:
rank: Tied jump ranking
dense_rank: Tied for consecutive rankings
row_number: Consecutive ranking
An example is used to illustrate where the ranking differences of the three functions are reflected. Now given five results: 93, 93, 85, 80, 75, the results obtained by using 3 different windowing functions respectively are:
1. Using DENSE_RANK() for ranking will get: 1, 1, 2, 3, 4
2. Using RANK() to rank will get: 1, 1, 3, 4, 5
3. Using ROW_NUMBER() to rank will get: 1, 2, 3, 4, 5
Finally, a table is used to illustrate the difference: the figure below is the data to be sorted
Tables and differences after ranking by 3 functions:
select name,price, row_number() (order by price desc) as `row_number`, rank() over (order by price desc) as `rank`, dense_rank() (order by price desc) as `dense_rank` from products;
MySQL5 implements window functions
There are built-in window functions in MySQL8, but they are not available in MySQL. The following describes how to implement the functions of the above three window functions in MySQL5.
1. Realize row_number function: continuous ranking
The implementation process is not complicated: directly sort in descending order, just need to add a ranking and automatically increase the function of 1; refer to an English article for the realization of row_number; https://www.mysqltutorial.org/mysql-row_number/, pass a Variable to achieve
SET @row_number = 0; -- 设置变量 SELECT (@row_number:[email protected]_number + 1) AS num, -- 变量的自加1 firstName, lastName FROM employees ORDER BY firstName, lastName LIMIT 5;
set @rank := 0; -- 设置初始值为0 select name, price, (@rank := @rank + 1) as row_number from products order by price desc;
2. Realize the rank function: jump side by side
select p1.name,p1.price, (select count(p2.price) -- 价格不同去重 from products p2 where p2.price > p1.price) + 1 as rank_1 from products order by rank_1;
3. Realize dense_rank function: parallel and continuous ranking
select p1.name, p1.price, (select count(distinct p2.price) -- 价格去重处理 from products p2 where p2.price > p1.price) + 1 as dense_rank from products p1 order by dense_rank;
to sum up
The ranking question in SQL is a very important test site, which is often asked during interviews, especially the use of three windowing functions, which is a high-frequency test site. To summarize:
- Hive or MySQL8 already exists functions that can be implemented
- In MySQL5, you need to write script statistics according to different scenarios.
- The use of three window functions must be mastered