MySQL8.0 Skip Scan Range skip scan

Today I have the honor to get the monthly report from the Wen Ali database. Found that there is a more practical function Loose Skip Scan in MySQL 8.0. MySQL has supported a new range scan method since version 8.0.13, this feature is contributed by Facebook.

That is to say, when combining indexes (f1, f2), the function of index scanning can be realized when there is only f2 in the query condition. Range scanning is more effective than full index scanning. The optimizer can perform multiple range scanning, each value corresponds to one f1, using a method called Skip Scan, similar to a loose index scan.

Let's take a look at the specific implementation:

Prepare test data according to official examples

CREATE TABLE t1 (f1 INT NOT NULL, f2 INT NOT NULL);INSERT INTO t1 VALUES   (1,1), (1,2), (1,3), (1,4), (1,5),  (2,1), (2,2), (2,3), (2,4), (2,5);INSERT INTO t1 SELECT f1, f2 + 5 FROM t1;INSERT INTO t1 SELECT f1, f2 + 10 FROM t1;INSERT INTO t1 SELECT f1, f2 + 20 FROM t1;INSERT INTO t1 SELECT f1, f2 + 40 FROM t1;ANALYZE TABLE t1; mysql> SELECT COUNT(*) FROM t1;+----------+| COUNT(*) |+----------+|      160 |+----------+

Add a composite primary key

mysql> ALTER TABLE t1 ADD PRIMARY KEY(f1, f2);mysql> EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                  |+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+|  1 | SIMPLE      | t1    | NULL       | range | PRIMARY       | PRIMARY | 8       | NULL |   53 |   100.00 | Using where; Using index for skip scan |+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+------------------------------

Here is indeed the index PRIMARY, rows 53

Add ordinary composite index

mysql> ALTER TABLE t1 DROP PRIMARY KEY;mysql> ALTER TABLE t1 ADD KEY(f1, f2);Query OK, 0 rows affected (0.01 sec)Records: 0  Duplicates: 0  Warnings: 0 mysql> EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------------+| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                  |+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------------+|  1 | SIMPLE      | t1    | NULL       | range | f1            | f1   | 8       | NULL |   53 |   100.00 | Using where; Using index for skip scan |+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------------+1 row in set, 1 warning (0.00 sec)

Here is also the index key f1, rows 53 rows

Analysis by optimizer_trace

mysql> set optimizer_trace="enabled=on";Query OK, 0 rows affected (0.00 sec)mysql> show variables like '%optimizer_trace%';+------------------------------+----------------------------------------------------------------------------+| Variable_name                | Value                                                                      |+------------------------------+----------------------------------------------------------------------------+| optimizer_trace              | enabled=on,one_line=off                                                    || optimizer_trace_features     | greedy_search=on,range_optimizer=on,dynamic_range=on,repeated_subselect=on || optimizer_trace_limit        | 1                                                                          || optimizer_trace_max_mem_size | 1048576                                                                    || optimizer_trace_offset       | -1                                                                         |+------------------------------+----------------------------------------------------------------------------+mysql> SET OPTIMIZER_TRACE_MAX_MEM_SIZE=1000000 ; mysql> EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40; mysql> select * from information_schema.optimizer_trace\G;1 row in set, 1 warning (0.00 sec)                  "skip_scan_range": {                    "potential_skip_scan_indexes": [                      {                        "index": "PRIMARY",                        "tree_travel_cost": 0.4,                        "num_groups": 3,                        "rows": 53,                        "cost": 10.625                      }                    ]                  },                  "best_skip_scan_summary": {                    "type": "skip_scan",                    "index": "PRIMARY",                    "key_parts_used_for_access": [                      "f1",                      "f2"                    ],                    "range": [                      "40 < f2"                    ],                    "chosen": true                  },
  • Get the first different value of the first key part (f1 = 1).
  • Construct the range based on the first and second key components (f1 = 1 and f2> 40).
  • Perform a range scan.
  • Get the next different value of the first key part (f1 = 2).
  • Construct the range based on the first and second key parts (f1 = 2 and f2> 40).
  • Perform a range scan.
    From the above description, we can see that the use of skip-scan avoids the full index scan, thereby improving performance, especially when the index prefix column discrimination is relatively low.
optimizer_switch to control

Conditional skip scan can be controlled by Hint or optimizer_switch (skip_scan), which is enabled by default. According to the description of the worklog, for the following query:

mysql> show variables like 'optimizer_switch'\G;*************************** 1. row ***************************Variable_name: optimizer_switch        Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,use_invisible_indexes=off,skip_scan=on,hash_join=off1 row in set (0.01 sec)
Use restrictions

Using this strategy can reduce the number of rows accessed, because MySQL will skip rows that do not meet the scope of each build. This skip scan access method is suitable for the following situations:

  • Table T has at least one composite index ([A_1,..., A_k,] B_1,..., B_m, C [, D_1,...D_n]) that contains key parts of the form. The key parts A and D may be empty, but B and C must be non-empty.
  • The query refers to only one table.
  • The query does not use GROUP BY or DISTINCT.
  • The query only references the columns in the index.
  • The predicates on A_1,...A_k must be equality predicates, and they must be constants. This includes the IN() operator.
  • The query must be a concatenated query; (cond1(key_part1) or cond2(key_part1)) and (cond1(key_part2) or...) and...
  • There must be a range condition on C.
  • It is allowed to set conditions on column D. The conditions on D must be combined with the range conditions on C.
to sum up

Recall that in the actual MySQL environment, the maintenance cost of an excessive number of indexes, because the business needs to create indexes separately again, many problems have been solved through this function.