MySQL index

1. What is an index, why should an index be created, how to create an index, what are the general principles of index creation, what kind of columns cannot be indexed, and what are the advantages and disadvantages of indexes?

2. How are indexes classified, how is each type created, what are the characteristics of each type, and how do they differ?

3. What is a clustered index and what is a non-clustered index?

4. What is a back-to-table query, and how can it be avoided?

5. What is the underlying data structure of an index, and why was this data organization chosen?

6. What are the differences between the various kinds of trees?

1. Introduction

MySQL currently has the following index types:
1. Ordinary index
2. Unique index
3. Primary key index
4. Composite index
5. Full-text index

2. Statement syntax

CREATE TABLE table_name ([col_name data_type] [UNIQUE | FULLTEXT] [INDEX | KEY] [index_name] (col_name[(length)] [ASC | DESC]))

1. UNIQUE and FULLTEXT are optional keywords that request a unique index or a full-text index, respectively.
2. INDEX and KEY are synonyms; either one specifies that an index is being created.
3. col_name is the column to be indexed; it must be one of the columns defined in the table.
4. index_name is the optional name of the index. If it is omitted, MySQL uses col_name as the index name.
5. length is optional and gives the index (prefix) length; it can only be specified for string-type columns.
6. ASC or DESC specifies whether index values are stored in ascending or descending order.

3. Index types

1. Ordinary index
This is the most basic index type and has no restrictions. It can be created in the following ways:
(1) Create index directly

CREATE INDEX index_name ON table(column(length))

(2) Add indexes by modifying the table structure

ALTER TABLE table_name ADD INDEX index_name (column(length))

(3) Create an index while creating a table

CREATE TABLE `table` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `title` char(255) NOT NULL,
    `content` text NULL,
    `time` int(10) NULL DEFAULT NULL,
    PRIMARY KEY (`id`),
    INDEX index_name (title(length))
);

(4) Delete the index

DROP INDEX index_name ON table

2. Unique index
This is similar to the ordinary index, with one difference: the values in the indexed column must be unique, although NULL values are allowed. For a composite index, the combination of column values must be unique. It can be created in the following ways:
(1) Create a unique index

CREATE UNIQUE INDEX indexName ON table(column(length))

(2) Modify the table structure

ALTER TABLE table_name ADD UNIQUE indexName (column(length))

(3) Specify directly when creating a table

CREATE TABLE `table` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `title` char(255) NOT NULL,
    `content` text NULL,
    `time` int(10) NULL DEFAULT NULL,
    PRIMARY KEY (`id`),
    UNIQUE indexName (title(length))
);

3. Primary key index
This is a special unique index: a table can have only one primary key, and NULL values are not allowed. The primary key index is usually created together with the table:

CREATE TABLE `table` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `title` char(255) NOT NULL,
    PRIMARY KEY (`id`)
);

4. Composite index
This is an index created on multiple columns. In general, the index is only used when the query conditions include the first column listed when the index was created. Follow the leftmost-prefix rule when using composite indexes:

ALTER TABLE `table` ADD INDEX name_city_age (name,city,age); 
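The leftmost-prefix behaviour can be illustrated with a minimal Python sketch. It models the (name, city, age) index as a sorted list of tuples, which is only an approximation of how MySQL stores it; the sample rows and the helper `lookup_by_prefix` are hypothetical and exist only for this illustration.

```python
import bisect

# Hypothetical rows; the composite index is modelled as a list of
# (name, city, age) tuples sorted by name, then city, then age.
rows = [
    ("alice", "paris", 30),
    ("bob", "berlin", 25),
    ("alice", "berlin", 41),
    ("carol", "paris", 25),
    ("bob", "paris", 33),
]
index = sorted(rows)


def lookup_by_prefix(index, prefix):
    """Return entries whose leading columns equal `prefix`.

    Binary search finds the first candidate; matching entries are
    contiguous, so we read forward until the prefix stops matching.
    """
    start = bisect.bisect_left(index, prefix)
    end = start
    while end < len(index) and index[end][:len(prefix)] == prefix:
        end += 1
    return index[start:end]


# Conditions on (name) or (name, city) map to one contiguous slice.
by_name = lookup_by_prefix(index, ("alice",))
by_name_city = lookup_by_prefix(index, ("bob", "paris"))

# A condition on city alone skips the leading column: matching entries
# are scattered through the sorted list, so every entry must be checked.
paris_positions = [i for i, entry in enumerate(index) if entry[1] == "paris"]
```

Because the list is ordered by name first, a filter on city alone has no single range to jump to, which is the intuition behind requiring the first indexed column in the query conditions.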

5. Full-text index
This is mainly used to find keywords in text, rather than comparing directly against the values in the index. A FULLTEXT index is very different from the other index types: it works more like a search engine than like simple parameter matching in a WHERE clause. It is used together with the MATCH ... AGAINST operation, not with an ordinary WHERE ... LIKE. It can be defined in CREATE TABLE, ALTER TABLE, and CREATE INDEX, but currently only CHAR, VARCHAR, and TEXT columns can have full-text indexes. It is worth noting that for large amounts of data, loading the data into a table that has no full-text index and then creating the FULLTEXT index with CREATE INDEX is much faster than creating the FULLTEXT index first and then writing the data.
(1) Create a table suitable for adding a full-text index

CREATE TABLE `table` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `title` char(255) NOT NULL,
    `content` text NULL,
    `time` int(10) NULL DEFAULT NULL,
    PRIMARY KEY (`id`),
    FULLTEXT (content)
);

(2) Modify the table structure to add a full-text index

ALTER TABLE article ADD FULLTEXT index_content(content)

(3) Create index directly

CREATE FULLTEXT INDEX index_content ON article(content)

4. Disadvantages

1. Although an index greatly speeds up queries, it slows down writes to the table such as INSERT, UPDATE, and DELETE, because MySQL must update the index as well as the data on every write.
2. Indexes take up disk space. Usually this is not a serious problem, but if you create several composite indexes on a large table, the index files grow very quickly.
Indexes are only one factor in improving efficiency. For tables with a large amount of data, you need to spend time working out the best indexes and optimizing the query statements.

5. Notes on usage

When using an index, there are some tips and precautions:
1. Prefer NOT NULL for indexed columns.
MySQL can index columns that contain NULL values, but declaring columns NOT NULL lets it make better use of indexes and removes the overhead of testing whether each value is NULL. So when designing the database, avoid NULL as the default value of a field where possible.
2. Use short indexes
When indexing a string column, specify a prefix length if possible. For example, for a char(255) column in which most values are unique within the first 10 or 20 characters, do not index the entire column. A short index not only improves query speed but also saves disk space and I/O operations.
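One way to choose a prefix length is to measure how many distinct prefixes each length produces. The Python sketch below does this on a small, made-up list of titles; the data and the helper names `prefix_selectivity` and `shortest_prefix` are assumptions for illustration. In MySQL the same ratio can be measured on real data with a query along the lines of `SELECT COUNT(DISTINCT LEFT(title, 10)) / COUNT(*) FROM table_name`.

```python
def prefix_selectivity(values, length):
    """Fraction of distinct prefixes of `length` characters among all values."""
    return len({value[:length] for value in values}) / len(values)


def shortest_prefix(values, target=1.0):
    """Smallest prefix length whose selectivity reaches `target`."""
    max_len = max(len(value) for value in values)
    for length in range(1, max_len + 1):
        if prefix_selectivity(values, length) >= target:
            return length
    return max_len


# Hypothetical column values.
titles = [
    "mysql index basics",
    "mysql index tuning",
    "mysql joins",
    "postgres index",
    "postgres vacuum",
    "redis cache",
]

short = prefix_selectivity(titles, 5)    # few distinct prefixes
longer = prefix_selectivity(titles, 14)  # every prefix is distinct
best = shortest_prefix(titles)
```

A selectivity close to 1 means the prefix distinguishes rows almost as well as the full column, so a longer prefix would only cost extra space.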
3. Sorting on indexed columns
A query generally uses only one index per table, so if an index is already used for the WHERE clause, the columns in ORDER BY will usually not use an index. Therefore, do not add a sort operation when the database's default ordering already meets the requirement; try to avoid sorting on multiple columns, and if it is necessary, create a composite index on those columns.
4. LIKE operations
In general, LIKE is not recommended. If it must be used, how it is written matters: LIKE "%aaa%" cannot use the index, whereas LIKE "aaa%" can.
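The difference between the two patterns follows from the index being sorted. The Python sketch below uses a sorted list as a stand-in for the index; the sample values and the functions `like_prefix` and `like_contains` are hypothetical and only emulate the two patterns.

```python
import bisect

# A sorted list stands in for the index on a string column.
index = sorted(["aaa", "aaab", "aab", "baaa", "caaab", "aaazz", "abc"])


def like_prefix(index, prefix):
    """Emulate LIKE 'prefix%'.

    Binary search jumps to the first candidate, then entries are read
    in order until one no longer starts with the prefix.
    Returns (matches, number of entries examined).
    """
    start = bisect.bisect_left(index, prefix)
    matches = []
    examined = 0
    for value in index[start:]:
        examined += 1
        if not value.startswith(prefix):
            break
        matches.append(value)
    return matches, examined


def like_contains(index, needle):
    """Emulate LIKE '%needle%': sort order does not help, so all entries are examined."""
    matches = [value for value in index if needle in value]
    return matches, len(index)


prefix_matches, prefix_examined = like_prefix(index, "aaa")
contains_matches, contains_examined = like_contains(index, "aaa")
```

With a fixed prefix the matching values sit next to each other in sorted order; with a leading wildcard they can appear anywhere, so nothing short of a full pass finds them all.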
5. Do not apply functions or calculations to indexed columns.
Doing so prevents the index from being used and causes a full table scan, for example:

SELECT * FROM table_name WHERE YEAR(column_name)<2017;

6. Avoid NOT IN and <> conditions where possible.

The underlying data structure of the index

B+ tree

Features:
1. It is an n-ary tree: each node can have many children, which reduces the height of the tree.
2. Non-leaf nodes store only keys, not data.
3. Leaf nodes store keys and data.
4. Within the leaf nodes, entries are kept sorted by index key, which supports range searches.
5. The leaf nodes are connected to each other through a doubly linked list.
6. Even when the index cannot be used for a lookup, the leaf nodes can be traversed in order.
7. Each time a B+ tree node is created, it is allocated a full page of space, so the size of a node equals one page.
8. In this way, each node can be fully loaded with a single I/O.
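How sorted leaf entries and linked leaves serve a range search can be sketched in Python. This is a simplified model, not MySQL's implementation: only the forward link of the doubly linked list is modelled, the scan starts at the first leaf instead of descending the tree to the starting leaf, and the names `Leaf`, `link_leaves`, and `range_scan` are made up for the example.

```python
class Leaf:
    """One leaf page: sorted keys plus a link to the next leaf."""

    def __init__(self, keys):
        self.keys = keys
        self.next = None  # forward link only; the backward link is omitted


def link_leaves(leaves):
    """Chain the leaves in order and return the first one."""
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return leaves[0]


def range_scan(first_leaf, low, high):
    """Collect keys with low <= key <= high by walking the leaf chain."""
    result = []
    leaf = first_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result  # keys are sorted, so nothing later can match
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


head = link_leaves([Leaf([1, 2]), Leaf([3, 5]), Leaf([6, 8]), Leaf([9, 12])])
keys_in_range = range_scan(head, 4, 9)
```

The scan never goes back up to the non-leaf nodes: once it is on the leaf level, following the links is enough to read the keys in order.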

Searching an index incurs disk I/O, and disk I/O is very slow.
The index should therefore be organized so that a search accesses the disk as few times as possible; reducing the number of disk I/Os greatly improves MySQL performance.
Pages
When data is read from the disk, the system sends a logical address to the disk, and the disk translates it into a physical address, that is, which track and which sector. The magnetic head moves mechanically, first to the corresponding track and then to the corresponding sector on that track. The sector is the smallest storage unit of the disk.
Main memory and disk exchange data in units of pages, usually 4 KB in size.
Suppose we need to find the data with key 6:

1. Load disk block 0 into memory (the first I/O), and use binary search in memory to determine that 6 lies between 3 and 9.
2. Follow pointer P2 to its disk address and load disk block 2 into memory (the second I/O), then use binary search in memory to find 6, and the search ends.
3. Only two I/Os were needed here. In practice each page is 4 KB, and a 3-level B+ tree can hold millions of rows, so a lookup needs at most 3 I/Os. This is why the performance gain from an index is so large.
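The lookup above can be replayed in a small Python sketch that counts one I/O per block loaded. The block contents are an assumed layout chosen to match the description (root keys 3 and 9, pointer P2 leading to block 2), and the entry sizes used in `capacity` are illustrative assumptions, not MySQL's real on-disk sizes.

```python
import bisect

# Assumed two-level layout: block 0 is the root, blocks 1 to 3 are leaves.
blocks = {
    0: {"keys": [3, 9], "children": [1, 2, 3]},  # pointers P1, P2, P3
    1: {"keys": [1, 2], "children": None},
    2: {"keys": [3, 6, 8], "children": None},
    3: {"keys": [9, 12, 15], "children": None},
}


def search(blocks, key, root=0):
    """Return (found, io_count); each block access counts as one I/O."""
    io_count = 0
    block_id = root
    while True:
        block = blocks[block_id]  # stands in for loading one disk block
        io_count += 1
        keys = block["keys"]
        if block["children"] is None:
            pos = bisect.bisect_left(keys, key)  # binary search in memory
            return pos < len(keys) and keys[pos] == key, io_count
        # The number of separator keys <= key selects the child pointer.
        block_id = block["children"][bisect.bisect_right(keys, key)]


def capacity(levels, page_size=4096, entry_size=16, row_size=64):
    """Rows reachable in a tree of `levels` levels, under assumed sizes.

    entry_size: bytes per key-plus-pointer entry in a non-leaf page.
    row_size: bytes per row stored in a leaf page.
    """
    fanout = page_size // entry_size
    rows_per_leaf = page_size // row_size
    return fanout ** (levels - 1) * rows_per_leaf


found, io_count = search(blocks, 6)
rows_three_levels = capacity(3)
```

Under these assumed sizes, each extra level multiplies the capacity by the fanout, which is why a tree only three levels deep already covers millions of rows.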
Why not use a hash index? (Because it hashes the keys, so entries are not stored in order.)
1. A hash index does not store its entries in index-value order, so it cannot be used for sorting and does not support range queries such as WHERE price > 100. (Such a query would have to check every entry, which is almost a scan of the entire table, so performance is very low and the index gives practically no improvement.)
2. When a hash collision occurs, the storage engine must traverse all the row pointers in the linked list and compare row by row until it has found every matching row.
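The contrast with an ordered index can be sketched in Python, using a dict as a stand-in for a hash index and a sorted list as a stand-in for a B+ tree. The price data and the function names are hypothetical, and a Python dict only approximates how a storage engine's hash index behaves.

```python
import bisect

# Hypothetical table: row id -> price.
prices = {101: 80, 102: 120, 103: 100, 104: 250, 105: 99}

# Hash index stand-in: price -> row ids (rows with equal price share a bucket).
hash_index = {}
for row_id, price in prices.items():
    hash_index.setdefault(price, []).append(row_id)

# Ordered index stand-in: (price, row id) pairs kept in sorted order.
ordered_index = sorted((price, row_id) for row_id, price in prices.items())


def hash_equal(price):
    """Equality lookup: a single probe into the hash index."""
    return hash_index.get(price, [])


def hash_greater_than(limit):
    """Range query on the hash index: every entry must be inspected."""
    return sorted(
        row_id
        for price, row_ids in hash_index.items()
        if price > limit
        for row_id in row_ids
    )


def ordered_greater_than(limit):
    """Range query on the ordered index: binary search, then read the tail."""
    start = bisect.bisect_right(ordered_index, (limit, float("inf")))
    return [row_id for _, row_id in ordered_index[start:]]


equal_rows = hash_equal(100)
range_rows_hash = hash_greater_than(100)
range_rows_ordered = ordered_greater_than(100)
```

Both range functions return the same rows, but only the ordered version can skip straight to the first qualifying entry; the hash version has to look at everything.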