Mysql: remember an online index failure

1. Cause

When looking at the service quality dashboard today, I found that one of my service quality is obviously low, and the availability is less than three 9. According to the monitoring panel to find a paging data query interface, it frequently fails, so look for the reason.

Second, the investigation process

By querying the error log, I found that the interface select count(*) from table where ... frequently timed out. I suspected that the index was not taken. I took out the sql at that time to explain it and found that it did not go. The pseudo sql is as follows:

-- 注:bus_id和create_time都单独建了索引select count(*) from table_awhere bus_id > 0    -- 一个业务id,bigint类型(数据库有极少部分是空值)and create_time >= '2021-05-01 00:00:00'and create_time < '2021-05-31 23:59:59'

In order to explain why I did not use the index, I will first describe several phenomena that I tried to sql, and analyze the problem from the phenomenon:

  1. The original SQL performed a full table scan;
  2. Delete bus_id>0, explain explanation and go to index, type is range;
  3. Delete the create_time condition and find that the index is not taken;
  4. Both conditions are retained, the time range of create_time is reduced to less than one day, and explain finds that the index is gone;

Three, analysis

In view of the above: 2-4 situations, a step-by-step analysis

Situation 2:

In this case, it is normal to find the total sql, and no analysis is done;

Situation 3:

After deleting create_time, it is found that bus_id>0 does not take the index

First of all, we have clarified such a knowledge point:

  • If you use not in, not exists, (<> is not equal to! =) these do not take the index;
  • <,>, <=, >= This is judged based on the actual query data. If the full scan speed is faster than the index speed, the index will not be used;

Because I have a book on the innoDB storage engine, I found an explanation for the second item in chapter 5.6.4:

Take the greater than sign as an example, bus_id>0 scans too much data, which causes the optimizer to automatically choose not to index. Other <, <=, >= results are the same .

Situation 4:

After knowing the situation 3 without indexing, I tried to narrow the create_time time range to within one day, but I was surprised to find that the index was gone. In response to this situation, I asked the company's DBA.

In fact, it is caused by the optimizer. The time range is too long. The optimizer thinks that the range search is not much different from bus_id>0, and the result is not indexed.

If the time range is relatively short, the time range index will be taken;

Of course, the specific implementation logic inside the optimizer must be very complicated, and if it is too specific, continue to study;

You can use force index to force create_time index;

-- 注:bus_id和create_time都单独建了索引select count(*) from table_aFORCE INDEX(idx_create_time)where bus_id > 0    -- 一个业务id,bigint类型(数据库有极少部分是空值)and create_time >= '2021-05-01 00:00:00'and create_time < '2021-05-31 23:59:59'