LeetCode-178-score ranking (required for interview)

LeetCode-178-Score Ranking

This article brings about the 178th question of LeetCode-SQL, which explains the ranking problem in MySQL. It is a very important and practical article. It is really recommended to collect and save:

  • Topic introduction
  • analysis of idea
  • 3 different window functions
  • Implement window function in MySQL5

I first came into contact with the ranking in SQL in a book written by a Japanese author MICK: "Advanced SQL Tutorial", if you are interested, you can read it carefully, which is very helpful for SQL improvement.

topic

First introduce the specific topic: write a SQL query to achieve score ranking. If the two scores are the same, the ranks (Rank) of the two scores are the same. Please note that the next ranking after the split should be the next consecutive integer value. In other words, there should be no "gap" between the rankings.

+----+-------+
| Id | Score |
+----+-------+
| 1  | 3.50  |
| 2  | 3.65  |
| 3  | 4.00  |
| 4  | 3.85  |
| 5  | 4.00  |
| 6  | 3.65  |
+----+-------+

For example, according to the Scores table given above, your query should return (sorted from highest to lowest score): the same score is taken in the same ranking, and there is no interval between the next rankings.

+-------+------+
| Score | Rank |
+-------+------+
| 4.00  | 1    |
| 4.00  | 1    |
| 3.85  | 2    |
| 3.65  | 3    |
| 3.65  | 3    |
| 3.50  | 4    |
+-------+------+

Ideas

Idea 1

This kind of thinking is a method I learned from Japanese author books.

select 
    s1.Score  -- 分数
    ,(select count(distinct s2.Score)  -- 大于等于此分数的分数值的不重复个数
      from Scores s2 
      where s2.Score > s1.Score) + 1 as `Rank`  -- 排名
from Scores s1
order by 2;  -- 2表示第2个字段

-- 变换方式
select 
	s1.score,
	(select count(*) + 1   -- 排名不能从1开始
   from (select distinct score from Scores) s2   -- 1、找出表中有多少种不同的分数:有多少分数就有多少排名
   where s2.score > s1.score) as Rank   -- 2、筛选大于某个分数
from Scores s1 
order by 2;

Idea 2

Idea 2 and Idea 1 have the same idea, the use of where sentence

select 
    s1.Score
    ,count(distinct(s2.Score)) `Rank`
from Scores s1, Scores s2
where s1.Score <= s2.Score
group by s1.Id
order by 2;  -- 2表示第2个字段

The code is interpreted as:

1. Two identical tables are commanded as s1 and s2 respectively

2. Given s1.Score, find out how many scores meet: s2.Score >= s1.Score. For example, s1.Score=3.65, then: [4.00 ,4.00, 3.85, 3.65, 3.65] meets the requirements, but the same score has the same ranking, so the score is deduplicated: count(distinct s2.Score), then 3.65 Is ranked 3

3. Group by will group and rank the data of s1, otherwise only one piece of data will be returned

4. Ascending order of ranking

Overall thought

Whether it is idea 1 or idea 2, basically it is realized in two steps:

  • The first part is the score in descending order
  • The second part is the ranking corresponding to each score

1. Regarding the realization of the first part: direct ranking in descending order

select a.Score   
from Scores a
order by a.Score DESC   -- 直接根据分数降序实现

2. Regarding the realization of the second part: Suppose you are now given a score S, how do we calculate its rank?

We can first extract all score sets H greater than or equal to S, and the number of elements after deduplication of H is the ranking of S. For example, in a certain test, Xiao Ming scored 98 points, which is greater than Xiao Ming’s score [100,99,100,99,98]; after de-duplication, it is [100,99,98], with a total of 3 elements; therefore, Xiao Ming’s ranking is 3rd. .

First extract the set H that meets the requirements:

select b.Score 
from Scores b 
where b.Score >= S;

Then the number after deduplication of the set H is used as the ranking:

select count(distinct b.Score)   -- 去重后的个数当做排名
from Scores b 
where b.Score >= S as Rank;

In fact, the S here corresponds to a.Score:

select 
	a.Score as Score,
	(select count(distinct b.Score) 
   from Scores b 
   where b.Score >= a.Score) as Rank  -- 将S替代成了a.Score
from Scores a
order by a.Score DESC;

Window function

Window functions, also called OLAPfunctions (Online Anallytical Processing, online analysis and processing), can perform real-time analysis and processing of database data.

grammar

The basic syntax of window functions:

<窗口函数> over (partition by <用于分组的字段名>  -- partition子句可省略,不指定分组
             order by <用于排序的列名>)

<窗口函数>Two functions can be placed in the position:

  • Dedicated window functions, such as rank, dense_rank, row_number, etc.
  • Aggregate functions, such as sum, avg, count, max, min, etc.

Features

  • Has the function of grouping and sorting at the same time
  • Do not change the number of rows in the original table
  • In principle, window functions can only be written in selectclauses

rank/dense_rank/row_number

There are 3 dedicated window functions in MySQL 8.X or hive:

  • rank: Tied jump ranking
  • dense_rank: Tied for consecutive rankings
  • row_number: Consecutive ranking

An example is used to illustrate where the ranking differences of the three functions are reflected. Now given five results: 93, 93, 85, 80, 75, the results obtained by using 3 different windowing functions respectively are:

1. Using DENSE_RANK() for ranking will get: 1, 1, 2, 3, 4

2. Using RANK() to rank will get: 1, 1, 3, 4, 5

3. Using ROW_NUMBER() to rank will get: 1, 2, 3, 4, 5

Finally, a table is used to illustrate the difference: the figure below is the data to be sorted

Tables and differences after ranking by 3 functions:

select
	name,price,
	row_number() (order by price desc) as `row_number`,
	rank() over (order by price desc) as `rank`,
	dense_rank() (order by price desc) as `dense_rank`
from products;

MySQL5 implements window functions

There are built-in window functions in MySQL8, but they are not available in MySQL. The following describes how to implement the functions of the above three window functions in MySQL5.

1. Realize row_number function: continuous ranking

The implementation process is not complicated: directly sort in descending order, just need to add a ranking and automatically increase the function of 1; refer to an English article for the realization of row_number; https://www.mysqltutorial.org/mysql-row_number/, pass a Variable to achieve

SET @row_number = 0;   -- 设置变量

SELECT 
    (@row_number:[email protected]_number + 1) AS num,  -- 变量的自加1
    firstName, 
    lastName
FROM
    employees
ORDER BY firstName, lastName    
LIMIT 5;
set @rank := 0;  -- 设置初始值为0
select 
	name, 
	price,
	(@rank := @rank + 1) as row_number 
from products
order by price desc;

2. Realize the rank function: jump side by side

select p1.name,p1.price,
			(select count(p2.price)  -- 价格不同去重
       from products p2
       where p2.price > p1.price) + 1   
       as rank_1
from products
order by rank_1;

3. Realize dense_rank function: parallel and continuous ranking

select p1.name, p1.price,
    (select count(distinct p2.price)   -- 价格去重处理
     from products p2 
     where p2.price > p1.price) + 1 as dense_rank
from products p1
order by dense_rank;

to sum up

The ranking question in SQL is a very important test site, which is often asked during interviews, especially the use of three windowing functions, which is a high-frequency test site. To summarize:

  • Hive or MySQL8 already exists functions that can be implemented
  • In MySQL5, you need to write script statistics according to different scenarios.
  • The use of three window functions must be mastered