Notes on MySQL MVCC implementation and UNDO

MVCC (multi-version concurrency control) is implemented in essentially every traditional database. Recently I asked myself how MySQL actually implements it, and found that although I thought I understood it, I could not explain it. So these notes are my own review of how MySQL accomplishes MVCC.

The following is a brief explanation of MVCC:

Multi-version concurrency control (MVCC) achieves concurrency control by saving a snapshot of the data as of a certain point in time. No matter how long a transaction runs, the data it sees is not affected by other transactions. Depending on when each transaction started, different transactions may see different data in the same table at the same moment.

In other words, the main job of MVCC is to preserve the data as it was at a point in time. However long a transaction runs, the data it sees is not affected by other transactions, and each transaction works with the data state determined by its own start time.

In short, the idea of multi-version concurrency control is to keep historical versions of the data and to implement concurrency control by managing multiple versions of each row. Whether a given version is visible to a reader is decided by comparing version numbers, so reads do not need to take locks, and transaction isolation is still preserved.


Put differently, MVCC keeps historical versions of the data so that a transaction can read without locking, and the isolation level determines which version of a row each transaction is allowed to read.

________________________________________________________________

The main problem MVCC solves is blocking between reads and writes: reads can run in parallel with other reads, and reads can run in parallel with writes. As a result, deadlocks generally arise only between writes. MVCC also gives each reader a consistent version of the data.

MySQL's multi-version control is built on the undo logs that InnoDB keeps in UNDO tablespaces. For a consistent read, InnoDB looks only at row versions whose transaction version is lower than that of the reading transaction, which ensures that the rows read were stored before this transaction began. When the current row is too new for the reader, the older version is rebuilt from the undo log.

An INSERT stamps the new row with the version number of the inserting transaction. A DELETE does not remove the row immediately; it records the deleting transaction's version number on the row as a delete mark. In this multi-version scheme an UPDATE is essentially carried out as a delete of the old row version plus an insert of the new one.
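
The version rules above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not InnoDB's actual storage format: the names RowVersion, visible, update and read are hypothetical, transaction ids are assumed to increase monotonically, and every writer is assumed to have committed already.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    value: str
    created_by: int                    # id of the transaction that inserted this version
    deleted_by: Optional[int] = None   # id of the transaction that delete-marked it


def visible(version: RowVersion, reader_trx_id: int) -> bool:
    """Visible if created at or before the reader and not delete-marked by then."""
    created_before = version.created_by <= reader_trx_id
    not_deleted = version.deleted_by is None or version.deleted_by > reader_trx_id
    return created_before and not_deleted


def update(versions: List[RowVersion], new_value: str, trx_id: int) -> None:
    """Model UPDATE as a delete mark on the live version plus an insert."""
    for v in versions:
        if v.deleted_by is None:
            v.deleted_by = trx_id
    versions.append(RowVersion(new_value, created_by=trx_id))


def read(versions: List[RowVersion], reader_trx_id: int) -> Optional[str]:
    """Return the value of the version this reader is allowed to see, if any."""
    for v in versions:
        if visible(v, reader_trx_id):
            return v.value
    return None


row = [RowVersion("a", created_by=10)]   # transaction 10 inserts the row
update(row, "b", trx_id=20)              # transaction 20 updates it
print(read(row, 15))   # a  (reader older than the update sees the old version)
print(read(row, 25))   # b  (reader newer than the update sees the new version)
```

Neither reader takes a lock: each one simply picks the version whose stamps match its own transaction id.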

MySQL's MVCC applies to the READ COMMITTED and REPEATABLE READ isolation levels. The other two levels do not fit it: READ UNCOMMITTED always reads the newest version of a row, committed or not, while SERIALIZABLE locks every row it reads.
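
The two MVCC levels differ in when the snapshot is taken: under REPEATABLE READ the first consistent read fixes the snapshot for the whole transaction, while under READ COMMITTED every consistent read takes a fresh one. The Python sketch below illustrates only that difference. Store and Reader are hypothetical names, and a simple commit counter stands in for InnoDB's real read view, which this model does not reproduce.

```python
class Store:
    """Toy store of committed versions: key -> list of (commit_no, value)."""

    def __init__(self):
        self.commit_no = 0
        self.history = {}

    def commit_write(self, key, value):
        # Each committed write gets the next commit number.
        self.commit_no += 1
        self.history.setdefault(key, []).append((self.commit_no, value))

    def read_as_of(self, key, snapshot):
        # Newest version committed at or before the snapshot.
        seen = [v for c, v in self.history.get(key, []) if c <= snapshot]
        return seen[-1] if seen else None


class Reader:
    """A reading transaction at one of the two MVCC isolation levels."""

    def __init__(self, store, isolation):
        self.store = store
        self.isolation = isolation   # "READ COMMITTED" or "REPEATABLE READ"
        self.snapshot = None

    def read(self, key):
        # READ COMMITTED refreshes the snapshot on every read;
        # REPEATABLE READ keeps the one taken by its first read.
        if self.isolation == "READ COMMITTED" or self.snapshot is None:
            self.snapshot = self.store.commit_no
        return self.store.read_as_of(key, self.snapshot)


store = Store()
store.commit_write("x", 1)
rr = Reader(store, "REPEATABLE READ")
rc = Reader(store, "READ COMMITTED")
rr.read("x"); rc.read("x")           # both see 1
store.commit_write("x", 2)           # another transaction commits a change
print(rr.read("x"), rc.read("x"))    # 1 2
```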

Undo logs come in two kinds, insert undo logs and update undo logs, and they serve different purposes. An insert undo log exists only to roll back an INSERT, so its contents are discarded as soon as the inserting transaction commits. An update undo log is used not only for rollback but also for consistent reads, which is what lets it supply older row versions to MVCC readers.

A large transaction makes the undo log grow, and the UNDO space cannot be released until the transaction commits. Even after commit, the UNDO tablespace is not cleaned up immediately; that work is done by the purge thread. If large transactions keep arriving back to back, purge cannot keep up, and UNDO tablespace usage becomes tight.

How many concurrent read-write transactions can MySQL's undo rollback segments support? If each transaction performs either an INSERT or an UPDATE or DELETE, so that it needs a single undo slot, the number is given by the following formula.

(innodb_page_size / 16) * innodb_rollback_segments * number of undo tablespaces

This formula shows that the number of concurrent transactions a MySQL server can support increases as undo tablespaces are added.

We should also know the limit when each transaction performs both an INSERT and an UPDATE or DELETE. Such a transaction needs two undo logs, so the capacity is halved.

(innodb_page_size / 16 / 2) * innodb_rollback_segments * number of undo tablespaces

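For example, with three undo tablespaces, the default 16 KB page size (16384 / 16 / 2 = 512) and the default 128 rollback segments, this gives 512 * 128 * 3 = 196,608, which means the database can support up to 196,608 such transactions running concurrently.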
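
To make the arithmetic easy to repeat for other settings, here is a small Python sketch of the two formulas. The function name max_concurrent_rw_transactions and its parameter names are hypothetical. The values plugged in assume the default 16 KB innodb_page_size and 128 innodb_rollback_segments, with three undo tablespaces.

```python
def max_concurrent_rw_transactions(page_size, rollback_segments,
                                   undo_tablespaces, insert_and_update=False):
    """Evaluate the undo capacity formulas quoted above.

    insert_and_update=False: each transaction does either an INSERT
    or an UPDATE/DELETE.
    insert_and_update=True: each transaction does both, so the number of
    slots per rollback segment is divided by two.
    """
    slots_per_segment = page_size // 16
    if insert_and_update:
        slots_per_segment //= 2
    return slots_per_segment * rollback_segments * undo_tablespaces


# Assumed settings: 16 KB pages, 128 rollback segments, 3 undo tablespaces.
print(max_concurrent_rw_transactions(16384, 128, 3))                          # 393216
print(max_concurrent_rw_transactions(16384, 128, 3, insert_and_update=True))  # 196608
```

On a real server the inputs can be read from the innodb_page_size and innodb_rollback_segments system variables rather than hard-coded.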

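As of MySQL 8.0.23, an undo tablespace is normally created with an initial size of 16 MiB; when a truncate operation recreates an undo tablespace, it may instead be created at 25% of the innodb_max_undo_log_size value. By default there are two undo tablespaces, undo_001 and undo_002.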

Starting with MySQL 8.0.14, you can create an additional UNDO tablespace and specify its data file name:

CREATE UNDO TABLESPACE undo_name ADD DATAFILE 'file_name.ibu';

Several factors affect the performance of the UNDO tablespace:

1. The number of UNDO tablespaces and the number of UNDO files
2. The performance of the disk that serves the UNDO tablespace
3. The size of the transactions issued by the application
4. Transaction concurrency, and the settings and working condition of the UNDO purge threads

As of MySQL 8.0.16, you can use the following statement to list the InnoDB metrics counters for monitoring UNDO log truncation performance:

SELECT NAME, SUBSYSTEM, COMMENT
  FROM INFORMATION_SCHEMA.INNODB_METRICS
  WHERE NAME LIKE '%truncate%';