MySQL (12)-Planning Index

Index introduction

Rows of data stored in the data page in

Heap is the integration of all data pages of a table

Each page includes 8KB data information, referred to as a page eight neighboring extents area

Storing data line is out of order , the data page is disorder of

Data pages are not connected through a linked list

When a row is inserted into a full page, the data page is split

Methods of storage and access

Table scan

Scan the data pages in the table

Start at the beginning of the table

Scan the table page by page

Extract rows that meet the query criteria

Index to the data on the page

Traverse the index tree structure to find the column requested by the query

Only extract columns that meet the criteria

Under what circumstances

Reasons for index creation and precautions

It can usually speed up queries with join tables, sorting or grouping operations

If a unique index is created, the uniqueness of the value can be enforced

Index creation and maintenance are in ascending or descending order

Indexes are best created on highly selective columns, that is, most of the values ​​of the column or column combination are unique

Reasons and precautions for not creating indexes

Modify data on the indexed column, SQL Server automatically updates the associated index

Maintaining the index requires time and resources. Therefore, do not create an index for columns that are used less

It is almost useless to create an index on a column with a lot of duplicate data

Index structure

Use heap

Use clustered index-fast

Each table can only have one clustered index (general value primary key)

The physical position of the row and the position of the row in the index are the same

The uniqueness of the key value is maintained explicitly through the UNIQUE keyword, or implicitly maintained through an internal unique identifier that is invisible to the user

When a row is deleted, the space is reclaimed for use by a new row

The leaf node of the clustered index is the actual data page (the order of the index table and the order of the work table are one-to-one correspondence)

Data is physically stored in ascending order in the data page

The value of the index page is in ascending order of

Clustered index is very useful for columns that are frequently searched by key range or accessed in sorted order

Use non-clustered index-the default index type

Built on pile

When a nonclustered index is built on the heap, SQL Server uses the row identifier in the index page to point to the data page where the row is located. Row identifier stores data location information

Built on the clustered index

When a nonclustered index is built on a clustered index, SQL Server uses the clustered index key in the index page to point to the clustered index. The clustered index key stores data location information

Features:

If the index type is not specified, the default is a non-clustered index

The order of leaf node pages is different from the physical storage order of the table . Leaf node is ascending the

Uniqueness is maintained at leaf nodes through clustered index keys or row identifiers

Each table can have up to 249 non-clustered indexes

Non-clustered indexes are best created on columns with high selectivity

Create a clustered index before the non-clustered index is created (take the order of the clustered index as the first, and then consider the order of the non-clustered index)

Search

Find rows without index

Use a non-clustered index to find rows in the heap

The index is organized into a B-tree structure

Find the row in the clustered index

The clustered index is like a phone book, people with the same surname gather in the same place on the phone book

The leaf nodes of the clustered index B-tree structure are the data pages of the index

The data rows of the clustered index are sorted and stored in the order of their clustered index keys

The key value of the clustered index should be as small as possible, so that an index page can fit more index rows, thereby improving performance

maintain

Update operation

Does not affect the structure of the data row

Update operations may be recorded as a delete operation followed by an insert operation

Delete row

Rows deleted from the leaf-level nodes of the index are not removed immediately, but are marked as invalid, called "ghost records"

Delete rows in the index: space can be used by adjacent rows

Delete rows in the heap: space can be used by newly inserted rows

Decide which columns need to be indexed

data

Logical and physical design

Data characteristics

How the data is used

Type of query executed

Frequency of typical queries

Guidelines for using indexes

Primary key

Foreign keys or columns frequently used in table join operations

Often in the same group of columns in the aggregation operation

Columns that are often accessed in sorted form

Often in the same group of columns in the aggregation operation

Choose the right clustered index

Tables undergoing a large number of update operations

The establishment of a clustered index on the primary key identification column can make the inserted data concentrated at the end of the table, increasing the speed. Frequent access keeps these pages in memory

Sort

For data that is often sorted, grouped, and range searched, it can improve performance because it has been pre-sorted when the clustered index is created.

Use indexes to support queries

Specify the WHERE clause in the query

The search parameters limit the search to two or more items that are an exact match, a range of values, or connected by the AND operator

Avoid using leading wildcards

Decide on selectivity

Determine the density

For high-density, perform a table scan

For low density, use index access

Determine data distribution