[MySQL] Optimize your SQL statements

1. Avoid SELECT * in queries; select only the fields you need

Correct usage:

SELECT id, name FROM student;

Counter-example: SELECT * FROM student;

  • Fetching only the fields you need saves resources
  • With SELECT *, the query is very unlikely to be served by a covering index, so MySQL has to go back to the table to read the full rows

2. If you know the query returns at most one row, use LIMIT 1

Correct usage:

SELECT id, name FROM student WHERE name = 'Zhang San' LIMIT 1;

Counter-example: SELECT id, name FROM student WHERE name = 'Zhang San';

  • With LIMIT 1, MySQL stops scanning as soon as it finds a matching record, which improves efficiency
  • If name has a unique index, LIMIT 1 can be omitted

3. Avoid using OR to connect conditions in the WHERE clause; consider UNION ALL instead

Correct usage:

# Find users whose id is 1 or whose age is 18
SELECT id, name, age FROM student WHERE id = 1
UNION ALL
SELECT id, name, age FROM student WHERE age = 18;

Counter-example: SELECT id, name, age FROM student WHERE id = 1 OR age = 18;

  • OR may prevent MySQL from using an index and lead to a full table scan
  • Note that UNION ALL returns a row twice if it satisfies both conditions; use UNION, or exclude the overlap in the second branch, when that matters

4. Optimize LIMIT paging; use ORDER BY on an indexed column

Correct usage:

SELECT id, name FROM student ORDER BY id LIMIT 10000, 10;

Counter-example: SELECT id, name FROM student LIMIT 10000, 10;

When the offset is large, the query becomes very inefficient.

MySQL does not jump straight to the offset and fetch the rows that follow. It reads offset + N rows, discards the first offset rows, and returns the remaining N.
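
One common way to avoid reading and discarding the skipped rows is to remember the last id of the previous page and seek past it instead of passing an offset. The following is a minimal sketch, assuming id is the primary key, pages are ordered by id, and the application has kept the last id it displayed (10000 here is only a placeholder value).

```sql
-- Keyset ("seek") pagination sketch.
-- Assumption: id is the primary key and the previous page ended at id = 10000.
SELECT id, name
FROM student
WHERE id > 10000      -- placeholder: last id seen on the previous page
ORDER BY id
LIMIT 10;
```

This form suits "next page" navigation. Jumping to an arbitrary page number still needs an offset or some other way to find the starting id.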

5. Optimize LIKE queries; avoid a leading wildcard

Correct usage:

SELECT id, name FROM student WHERE name LIKE 'Zhang%';

Counter-example: SELECT id, name FROM student WHERE name LIKE '%Zhang';

  • A pattern that starts with % cannot use the index on name

6. Use WHERE conditions to restrict the rows returned and avoid fetching redundant data

Correct usage:

# Check whether a given student is a VIP
SELECT id, name FROM student WHERE id = '1' AND isVip = '1';

Counter-example: SELECT id, name FROM student WHERE isVip = '1'; and then checking in application code whether the result contains id 1

  • Query only the data you need and avoid returning redundant rows

7. Avoid applying MySQL built-in functions to indexed columns

Correct usage:

# Find users who logged in within the last seven days (assuming loginTime is indexed)
SELECT id, loginTime FROM student WHERE loginTime >= DATE_ADD(NOW(), INTERVAL -7 DAY);

Counter-example: SELECT id, loginTime FROM student WHERE DATE_ADD(loginTime, INTERVAL 7 DAY) >= NOW();

  • Wrapping an indexed column in a built-in function prevents MySQL from using the index

8. Avoid performing expression operations on fields in the WHERE clause

Correct usage:

SELECT id, name FROM student WHERE age = 18;

Counter-example: SELECT id, name FROM student WHERE age - 1 = 17;

  • Performing arithmetic on an indexed field prevents index use and causes a full table scan

9. Among INNER JOIN, LEFT JOIN, and RIGHT JOIN, prefer INNER JOIN; if you use LEFT JOIN, keep the result set of the left table as small as possible

  • INNER JOIN: when two tables are joined, only the rows that match in both tables are kept
  • LEFT JOIN: when two tables are joined, all rows of the left table are returned, even if there is no matching record in the right table
  • RIGHT JOIN: when two tables are joined, all rows of the right table are returned, even if there is no matching record in the left table

Correct usage:

SELECT s.* FROM (SELECT * FROM student WHERE id > 2) s
LEFT JOIN teacher t ON s.id = t.teachId;

Counter-example: SELECT s.* FROM student s LEFT JOIN teacher t ON s.id = t.teachId WHERE s.id > 2;

  • Filtering the left table before the join keeps the joined row set small. The optimizer may apply the filter early in both forms, so compare the plans with EXPLAIN

10. Avoid the != and <> operators in the WHERE clause

Correct usage:

SELECT id, name FROM student WHERE age > 18;
SELECT id, name FROM student WHERE age < 18;

Counter-example: SELECT id, name FROM student WHERE age <> 18;

  • Using != or <> is likely to prevent index use

11. When using a composite index, pay attention to the column order and follow the leftmost-prefix rule

Table structure (a composite index idx_userid_age, with userId first and age second):

CREATE TABLE `user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `userId` int(11) NOT NULL,
  `age` int(11) DEFAULT NULL,
  `name` varchar(255) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_userid_age` (`userId`, `age`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;

Correct usage:

SELECT id, name FROM `user` WHERE userId = 1 AND age = 18;

Counter-example: SELECT id, name FROM `user` WHERE age = 18;

  • Creating a composite index such as (k1, k2, k3) is equivalent to creating the three indexes (k1), (k1, k2), and (k1, k2, k3); this is the leftmost-prefix rule
  • If a query does not match a leftmost prefix of the composite index, the index generally cannot be used, although the outcome also depends on the MySQL optimizer
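
Because the outcome depends on the optimizer, it helps to check the plan rather than guess. The statements below are a small sketch, assuming the `user` table and the idx_userid_age index from the table structure in this tip; compare the key column of each EXPLAIN result.

```sql
-- Assumption: the `user` table with KEY idx_userid_age (userId, age) exists.

-- Filters on the leading column (userId), so the composite index is a candidate.
EXPLAIN SELECT id, name FROM `user` WHERE userId = 1;

-- Filters on both columns in index order; both parts of the index can be used.
EXPLAIN SELECT id, name FROM `user` WHERE userId = 1 AND age = 18;

-- Skips the leading column; check whether the key column of the plan is NULL.
EXPLAIN SELECT id, name FROM `user` WHERE age = 18;
```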

12. To optimize queries, consider creating indexes on the columns used in WHERE and ORDER BY, and avoid full table scans

Correct usage:

# Add an index
ALTER TABLE student ADD INDEX idx_address_age (address, age);

SELECT id, name FROM student WHERE address = 'Shandong' ORDER BY age;

Counter-example: running SELECT id, name FROM student WHERE address = 'Shandong' ORDER BY age; directly, without adding the index

13. When inserting a large amount of data, use bulk inserts

For example, a MyBatis <foreach> element can generate a single multi-row INSERT statement.
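
At the SQL level, a bulk insert is a single INSERT statement that carries several row tuples. The following is a minimal sketch, assuming the student table has at least name and age columns; the values are placeholders.

```sql
-- Assumption: student has at least the columns (name, age); values are placeholders.

-- Instead of one statement per row (one round trip and one parse each):
--   INSERT INTO student (name, age) VALUES ('Student A', 18);
--   INSERT INTO student (name, age) VALUES ('Student B', 19);

-- send several rows in a single statement:
INSERT INTO student (name, age) VALUES
  ('Student A', 18),
  ('Student B', 19),
  ('Student C', 20);
```

For very large loads, splitting the rows into several moderately sized statements keeps each statement within the server's packet size limit.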

14. Use covering indexes where appropriate

With a covering index, a SQL statement does not need to go back to the table; it gets all the data it needs from the index alone.
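
The sketch below shows one way to see this with EXPLAIN. It assumes an InnoDB student table whose primary key is id and which has name and age columns; idx_name_age is a hypothetical index name.

```sql
-- Assumption: InnoDB table student(id PRIMARY KEY, name, age, ...);
-- idx_name_age is a hypothetical index name.
ALTER TABLE student ADD INDEX idx_name_age (name, age);

-- Every referenced column is in idx_name_age (InnoDB secondary indexes also
-- carry the primary key), so the index alone can answer the query.
EXPLAIN SELECT id, name, age FROM student WHERE name = 'Zhang San';

-- SELECT * typically asks for columns outside the index, so MySQL has to
-- read the matching rows from the table as well.
EXPLAIN SELECT * FROM student WHERE name = 'Zhang San';
```

When the index covers the query, the Extra column of the plan is expected to contain "Using index".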

15. Use the DISTINCT keyword with caution

A statement with DISTINCT uses more CPU time and takes longer to run than the same statement without it.

When many fields are queried with DISTINCT, the database engine has to compare the rows and filter out duplicates, and this comparison and filtering consumes system resources and CPU time.

16. Remove redundant and duplicate indexes

For example, if one index is KEY idx_userId (userId) and another is KEY idx_userid_age (userId, age), the standalone userId index can be dropped, because the composite index already serves lookups by userId.
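
A short sketch of the cleanup, assuming both indexes exist on the `user` table from tip 11 (idx_userId is not part of that table definition and is assumed here for illustration):

```sql
-- Assumption: `user` has both idx_userId (userId) and idx_userid_age (userId, age).

-- List the existing indexes to confirm which ones overlap.
SHOW INDEX FROM `user`;

-- Drop the single-column index whose column is the leading column of the composite index.
ALTER TABLE `user` DROP INDEX idx_userId;
```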

17. When the amount of data is large, optimize UPDATE and DELETE statements

Operate in batches. Modifying or deleting too many rows in one statement can lead to a "Lock wait timeout exceeded" error, so split the work into smaller batches.
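
One simple way to batch a large delete is to cap each statement with LIMIT and repeat it until nothing is left. This is a sketch, assuming the student table has a create_time column; the cutoff date and the batch size of 1000 are placeholders.

```sql
-- Assumption: student has a create_time column; cutoff and batch size are placeholders.

-- Delete at most 1000 matching rows per statement.
DELETE FROM student
WHERE create_time < '2020-01-01'
LIMIT 1000;

-- Rows removed by the previous statement; rerun the DELETE until this returns 0.
SELECT ROW_COUNT();
```

The repetition is usually driven from application code or a scheduled job, with a short pause between batches so other transactions can proceed.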

18. Consider using default values instead of NULL for fields used in the WHERE clause

When designing the table, you can give the field a default value (for example, set the default value of age to 0).

Correct usage:

SELECT * FROM student WHERE age > 0;

Counter-example: SELECT * FROM student WHERE age IS NOT NULL;

19. Do not join more than 5 tables

If you must join many tables to get the data, it usually points to a poor design.

20. Use EXISTS and IN appropriately

Correct usage:

# Find all employees in all departments of a company
SELECT * FROM employee e WHERE EXISTS (SELECT 1 FROM department d WHERE e.deptId = d.id);

Counter-example: SELECT * FROM employee WHERE deptId IN (SELECT id FROM department);

  • With EXISTS: the main query runs first and fetches its rows (one query); each row is then checked against the subquery (one query), and the result (true or false) determines whether the row is kept
  • With IN: the department table is queried first (one query), and the employee table is then queried by each department id obtained (N queries)

21. Prefer UNION ALL over UNION

If the result sets contain no duplicate records, use UNION ALL instead of UNION.

  • UNION always tries to merge duplicate rows, whether or not any exist, and that deduplication step costs extra work before the final result is returned

22. Do not create too many indexes; generally no more than 5

More indexes are not always better. Indexes speed up queries, but they also slow down inserts and updates.

23. Use numeric fields where possible; do not store purely numeric information in character columns

Compared with numeric fields, character fields slow down queries and joins and increase storage overhead.

24. Indexes are not suitable for fields with many repeated values, such as gender

The MySQL optimizer chooses a plan based on statistics about the table's data. If the indexed column contains many duplicate values and the optimizer estimates that using the index costs more than scanning the table, it will likely ignore the index.

25. Avoid returning too much data to the client

Correct usage:

# Find students who registered within the last year
SELECT * FROM student WHERE create_time >= DATE_SUB(NOW(), INTERVAL 1 YEAR) LIMIT 0, 200;
# For example, show the first 200 rows and fetch more when the user pages forward

Counter-example: SELECT * FROM student WHERE create_time >= DATE_SUB(NOW(), INTERVAL 1 YEAR);

26. When joining multiple tables in a SQL statement, use table aliases
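
Prefixing every column with a short alias makes it clear which table it comes from and avoids ambiguous-column errors when two tables share a column name. A brief sketch, assuming the student and teacher tables used in the join examples of this article:

```sql
-- Assumption: student(id, name) and teacher(teachId) as used in tip 9.
SELECT s.id, s.name, t.teachId
FROM student s
INNER JOIN teacher t ON s.id = t.teachId;
```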

27. Use varchar / nvarchar instead of char / nchar as much as possible

28. To make GROUP BY more efficient, filter out unnecessary records with WHERE before the grouping is executed

Correct usage:

SELECT sex, AVG(age) FROM student WHERE sex = 'male'
GROUP BY sex;

Counter-example: SELECT sex, AVG(age) FROM student GROUP BY sex HAVING sex = 'male';

  • WHERE discards rows before they are grouped, whereas HAVING is applied only after every group has been built

29. If the column is a string type, enclose the value in quotation marks in the WHERE clause; otherwise the index will not be used

Correct usage:

SELECT id, name FROM student WHERE phone = '12345';

Counter-example: SELECT id, name FROM student WHERE phone = 12345;

  • Without the single quotation marks, the query compares a string with a number, and the types do not match. MySQL performs an implicit type conversion, converting both to floating-point numbers before comparing them, so the index on phone cannot be used
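
The difference can be observed by comparing the two plans. This is a sketch, assuming phone is a VARCHAR column on student; idx_phone is a hypothetical index name.

```sql
-- Assumption: phone is a VARCHAR column; idx_phone is a hypothetical index name.
ALTER TABLE student ADD INDEX idx_phone (phone);

-- String literal: same type as the column, so idx_phone can be used for the lookup.
EXPLAIN SELECT id, name FROM student WHERE phone = '12345';

-- Numeric literal: each phone value is converted to a number for the comparison.
EXPLAIN SELECT id, name FROM student WHERE phone = 12345;
```

Compare the type, key, and rows columns of the two results to see whether the index lookup was kept.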