MySQL advanced-index failure

Scenarios where the index fails

Reference URL

Seven kinds of index cases of failure - liehen2046 - blog Park
MySQL index failed - Programming grapefruit - blog Park
MySQL index misses situation

Introduction

Failure scenario

Description/Example

like starts with% or _

As long as% and _ are not at the beginning, the index is valid.

Indexes are not used at the same time before and after the or statement

When only one of the query fields on the left and right of or is an index, the index is invalid, and it will take effect only when the query fields on the left and right of or are all indexes.

The joint index does not follow the leftmost prefix

See "Joint Index" above

Implicit conversion of data type

Varchar without single quotes may be automatically converted to int type, invalidating the index and generating a full table scan.

For example: the name field in the tb1 table is a string type

Index is valid: select * from tb1 where name = '123'

Index failure: select * from tb1 where name = 123

Use IS NULL or IS NOT NULL on index columns

Reason: The index cannot calculate the location to store it through a null value.

Solution:

Method 1: Convert the NULL value to a specific value, when searching in WHERE, use the specific value to find

Method 2: Create a composite index. For example: create index ind_a on table(col1,1); By specifying a non-empty constant value in the composite index, it is impossible to have all empty values ​​in the combination of the columns that constitute the index.

Use not, <>, != on the index field

The inequality operator will never use the index, so its processing will only produce a full table scan. Optimization method: Change key<>0 to key>0 or key<0.

Calculate the index field or use a function

Example:

select * from t where id*3=3000

select * from t where ABS(id)=3000

The reason for the index failure is that the index is a binary tree built for the original value. After the column value is calculated, the original binary tree is not available.

Wrong use of order by

See below

Full table scan speed is faster than index speed

At this time, mysql will use a full table scan, and the index is invalid.

E.g:

Tables with very little data

The data repetition rate of the index field in the table is too high

Use parameters in the where clause (may be invalid)

It may cause the index to become invalid, depending on the version.

Because SQL only parses local variables at runtime, the optimizer cannot defer the choice of access plan until runtime; it must choose at compile time. However, if the access plan is created at compile time, the value of the variable is still unknown and therefore cannot be used as an input item for index selection. This statement will perform a full table scan (possibly): select id from t where num = @num
can be changed to force the query to use an index: select id from t with(index(index name)) where num = @num

Joint index

Other URL

Joint index (multi-column index)_database_lm1060891265的博客-CSDN blog
mysql principles of creating indexes_database_boss_way的博客-CSDN blog
[Original] Interviewer: Tell me about your understanding of mysql joint index? -Lonely Smoke-Blog Garden

Introduction

  • Joint index: Index multiple columns. The joint index is also a B+ tree. The difference is that the number of key values ​​of the joint index is not 1, but greater than or equal to 2.
  • Joint Index: Follow the principle of the leftmost prefix (leftmost matching).
Assume that the joint index of the above figure is (a, b). The joint index is also a B+ tree, the difference is that the B+ tree sorts the index b on the basis of sorting the index a. So the data is arranged in the order of (1,1), (1,2), (2,1), (2,4), (2,4), (3,1), (3,2).
  • a is ordered: 1, 1, 2, 2, 3, 3
  • b is disordered: 1, 2, 1, 4, 1, 2
From a partial point of view, when the value of a is determined, b is ordered. For example, when a = 1, the value of b is 1, and 2 is an ordered state. When a=2, the value of b is 1, and 4 is also in an orderly state.
Therefore, if you execute a = 1 and b = 2, the a and b fields can be indexed. When you execute a> 1 and b = 2, the a field can use the index, but the b field does not use the index. Because the value of a is a range at this time, it is not fixed, and the value of b is not ordered in this range, so the b field does not need to be indexed.
In summary, the leftmost matching principle will stop matching when encountering a range query.

When the joint index is effective

Assumption: Create a joint index: index index_name (a, b);
commandDescription
selete * from table where a=XXWill use the index.
selete * from table where a=XX and b=YYObviously it is possible to use (a, b) joint index
select * from table where b=YY and a=XX

        This sentence does not comply with the leftmost matching principle. However, due to the existence of the query optimizer, the MySQL optimizer will automatically adjust the order of a and b after where to be consistent with the index order.
        Optimization: Put the most selective column at the top of the index in the joint index. For example, in a company, age and gender are indexed. Obviously, age should be placed first, because there are two choices for gender, male or female, and the choice is not as good as age.

The case of joint index failure

Assumption: Create a joint index: index index_name (a, b);
commandDescription
selete * from table where b=YYThe b value of the leaf node is 1, 2, 1, 4, 1, 2, and it is not ordered, so the (a, b) joint index cannot be used.
select * from table where a>XX and b=YY

Only the a index will be used here, not the b index.

For a composite index, it will continue to match to the right until it encounters a range query (>, <, between, like) and then stop matching.

For example: a = 1 and b = 2 and c> 3 and d = 4. If an (a, b, c, d) index is created, the index is not used for d.

The solution to make all indexes effective : put c at the end of the indexing statement, for example: create an index of (a, b, d, c) (the order of a, b, d can be adjusted arbitrarily), and execute sql in this way At that time, the optimizer will help us adjust the order of a, b, c, and d after where (put c at the end), let us use the index.

order by

Other URL

The use mechanism of Mysql index by order by-Programmer
Share
[Original] Interviewer: Tell me about your understanding of mysql joint index? -Lonely Smoke-Blog Garden

Introduction

There are two ways to implement ORDER BY:
  1. Automatically implement using ordered index. (Using the orderliness of the ordered index instead of doing another sorting (fast speed))
  2. That is, the Using index of the explain result
  3. Sort the results after selecting them. (Slow speed)
  4. That is, the Using filesort of the explain result
Query and sorting are both indexed
  1. The select field is exactly the same as the order by field, or the select field and the order by field form a joint index.
  2. Example: select col1 from tb1 order by col1;
  3. The where field is exactly the same as the order by field, or the where field and the order by field form a joint index.
  4. Example: select col1 from tb1 where col2=2 order by col2;
The query does not take the index
  1. The select field does not contain the order by field.
  2. Example: select col1 from tb1 order by col2
  3. The select field contains order by field + other fields.
  4. Example: select col1,col3 from tb1 order by col2
  5. The where field does not include the order by field.
  6. Example: select col1 from tb1 where col2=2 order by col3

Instance

drop table if exists test;create table test(id int primary key auto_increment,c1 varchar(10),c2 varchar(10),c3 varchar(10),c4 varchar(10),c5 varchar(10)) ENGINE=INNODB default CHARSET=utf8; insert into test(c1,c2,c3,c4,c5) values('a1','a2','a3','a4','a5');insert into test(c1,c2,c3,c4,c5) values('b1','b2','b3','b4','b5');insert into test(c1,c2,c3,c4,c5) values('c1','c2','c3','c4','c5');insert into test(c1,c2,c3,c4,c5) values('d1','d2','d3','d4','d5');insert into test(c1,c2,c3,c4,c5) values('e1','e2','e3','e4','e5'); CREATE INDEX idx_c1234 ON test(c1,c2,c3,c4);
result:

Equivalent query

Case 1: SELECT *, WHERE does not follow the leftmost prefix, ORDER BY does not follow the leftmost prefix, WHERE and ORDER BY do not follow the leftmost prefix

EXPLAIN SELECT * FROM test WHERE c2 = 'b2' AND c3 = 'b3' ORDER BY c4;
Result: query does not take index; sort does not take index

Case 2: SELECT *, WHERE does not follow the leftmost prefix, ORDER BY follows the leftmost prefix, WHERE and ORDER BY do not follow the leftmost prefix

EXPLAIN SELECT * FROM test WHERE c2 = 'b2' AND c3 = 'b3' ORDER BY c1;
Result: query does not take index; sort does not take index (same as above)

Case 3: SELECT and ORDER BY are the same, WHERE does not follow the leftmost prefix, ORDER BY follows the leftmost prefix, and WHERE and ORDER BY do not follow the leftmost prefix

EXPLAIN SELECT c1 FROM test WHERE c2 = 'b2' AND c3 = 'b3' ORDER BY c1;
Result: query does not take index; sorting takes index (covering index)

Case 4: SELECT *, WHERE follows the leftmost prefix, WHERE and ORDER BY do not follow the leftmost prefix

EXPLAIN SELECT * FROM test WHERE c1 = 'b1' AND c2 = 'b2' ORDER BY c3;
Result: query by index; sort by index

Range query

Case 1: SELECT *, and WHERE and ORDER BY are not the same

EXPLAIN SELECT * FROM test WHERE c2 > 'b1' ORDER BY c1;
Result: query does not take index; sort does not take index

Case 2: SELECT is not *, WHERE is not the same as ORDER BY, the leftmost prefix of ORDER BY

EXPLAIN SELECT c3 FROM test WHERE c2 > 'b1' ORDER BY c1;
Result: the query does not take the index, the sorting takes the index. This situation is equivalent to only sorting using the index
(Note: Replace c3 with c1, c2, etc., the results are the same)

Case 3: SELECT *, WHERE and ORDER BY are the same and follow the leftmost

EXPLAIN SELECT * FROM test WHERE c1 > 'b1' ORDER BY c1;
Result: the query is indexed; no operation is required for sorting, just use the result of the query directly

Case 4: SELECT *, WHERE and ORDER BY are not the same and follow the leftmost

EXPLAIN SELECT * FROM test WHERE c1 > 'b1' ORDER BY c2;
Result: the query takes the index; the sort does not take the index

like fuzzy query

Reference URL

MySQL - how to optimize fuzzy queries (like fuzzy query) _ database _Soinice the blog -CSDN blog
[turn] [MySQL] Mysql query like fuzzy speed optimization - Mengqi niche - blog Park
SQL query optimization using fuzzy like it - the book
index And optimize like query

Introduction

In MySQL, the fuzzy query must use the like keyword, and then add %, which represents the front fuzzy or the back fuzzy. When the amount of data is small, it is not easy to see the efficiency of the query, but when the amount of data reaches millions, tens of millions or even higher, the efficiency of the query is easy to show.
It is assumed here that the fields to be queried (where clause) have been indexed.

item

Whether the index is invalid

Solution

col_name like'ABC%'

Index is valid

 

col_name like'%ABC'

Index failure, use full table scan

Flip function + fuzzy query before like + build flip function index = take flip function index

col_name like'%ABC%'

Index failure

locate, position, instr, find_in_set and other methods

col_name like'%ABC'

This statement does not take the index:

select object_name from t1 where object_name like '%ABC';

Solution:

create index idx_t1_objectname2 on t1(reverse(object_name));select object_name from t1 where reverse(object_name) like reverse('%ABC');

Look at the execution plan: (SQL has gone index after rewriting.)

col_name like'%ABC%'

There are three situations:
1. The position of ABC in the string is not fixed, it can be optimized by rewriting SQL
2. ABC always appears from a fixed position at the beginning of the string, and a functional index can be created for optimization
3. ABC always starts from the end of the string Appears in a fixed position, you can create a function combination index for optimization

Situation 1

function

effect

Example/description

LOCATE(substr,str)

Returns the position of the first occurrence of substr in str. If substr does not exist in str, the return value is 0.

If str is a field, search for the content of the field; if str is not a field, use str as the library to be checked.

SELECT LOCATE('xbar',`foobar`); #return 0

#If `foobar` is a field, go to this field to query instead of the "foobar" string

 

SELECT LOCATE('bar',`foobarbar`); #Return 4

SELECT `column` FROM `table` WHERE LOCATE('keyword', `field`)>0

LOCATE(substr,str, pos)

Returns the position of the first occurrence of substr after the pos position in str. If substr does not exist in str, the return value is 0.

SELECT LOCATE('bar',`foobarbar`,5); #return 7

 

POSITION('substr' IN `field`)

Same function as locate

SELECT `col` FROM `tb1` WHERE POSITION('keyword' IN `filed`)

SELECT `column` FROM `table` WHERE POSITION('keyword' IN `filed`)

FIND_IN_SET(str,strlist)

The query field (strlist) contains the result of (str), and the record is returned (null is returned if no)

 

str string to be queried

strlist field name parameters are separated by "," such as (1,2,6,8,10,22)

SELECT FIND_IN_SET('b','a,b,c,d'); #Return 2

 

SELECT name,role FROM `user` WHERE find_in_set('2',role)

 

The difference between find_in_set() and like :

The main difference is that like is a broad fuzzy query, while find_in_set() is an exact match, and the field values ​​are separated by','.

 

SELECT userid, username, userrole role FROM `user` WHERE userrole LIKE'%2%';

 

SELECT userid,username,userrole role FROM `user` WHERE find_in_set('2',userrole)

Situation 2

Create the substr function index first, and then use like'ABC%'.
If ABC appears from the fifth position of the string:
create index idx_substr_t1_objname on t1 (substr(object_name,5,30));select object_id,object_type,object_name from t1where substr(object_name,5,30) like 'TAB%';

Situation 3:

Query records of %xx
select count(c.c_ply_no) as COUNT  from Policy_Data_All c, Item_Data_All i  where c.c_ply_no = i.c_ply_no  and i.C_LCN_NO like '%245'
During execution, the execution plan shows that the consumption value, io value, and cpu value are all very large. The reason is that the fuzzy query after the like causes the index to fail, and the full table scan is performed.
Solution: This kind of sql, which is only fuzzy before, can be modified as follows
select count(c.c_ply_no) as COUNT  from Policy_Data_All c, Item_Data_All i  where c.c_ply_no = i.c_ply_no  and reverse(i.C_LCN_NO) like reverse('%245')

Optimization principle

Use index full scan instead of table full scan. Because the cost of an index full scan is 1/N of the full table scan (that is, the ratio of the number of index blocks to the number of data blocks), the larger the table, the more obvious the optimization effect.
        After rewriting the SQL execution plan, the cost of returning to the table according to the index depends on the number of eligible records: if the number of records returned by the in subquery is small, then the optimization effect is equivalent to an increase in efficiency by N times; if in The query returns a large number of records, and the performance difference between the two SQLs is not obvious