MySQL technical insider reading notes VI. MySQL transaction

First image

Article Directory

One, MySQL transaction

The transactions in the InnoDB storage engine fully comply with the characteristics of ACID. ACID is an abbreviation of the following 4 words:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

1. Know the affairs

1.1 Overview

A transaction can be composed of a very simple SQL statement or a group of complex SQL statements. A transaction is a program execution unit that accesses and updates various data items in the database. The operations in the transaction are either modified or not done . This is the purpose of the transaction and one of the important characteristics of the transaction model that distinguishes it from the file system.

For the InnoDB storage engine, the default transaction isolation level is READ REPEATABLE, which fully complies with and satisfies the ACID characteristics of transactions.

A (Atomicity), atomicity means that the entire database transaction is an indivisible unit of work . Only when all database operations in the transaction are executed successfully can the entire transaction be considered as successful. If any SQL statement in the transaction fails to execute, the SQL statement that has been successfully executed must also be cancelled, and the database state should return to the state before the transaction was executed.

C (consistency). Consistency refers to the transformation of the database from one state to the next consistent state by a transaction . Before the start of the transaction and after the end of the transaction, the integrity constraints of the database have not been destroyed. For example, in the table, there is a field named name, which is a unique constraint, that is, the name cannot be repeated in the table. If a transaction modifies the name field, but after the transaction is committed or the transaction operation is rolled back, the name in the table becomes non-unique, which destroys the consistency requirements of the transaction, that is, the transaction changes the database from a state Become an inconsistent state.

I (isolation), isolation has other names, such as concurrency control, serializability, locking, etc. The isolation of the transaction requires that the object of each read-write transaction can be separated from the operating objects of other transactions, that is, the transaction is invisible to other transactions before the transaction is committed. This is usually achieved by using locks.

D (durability), durability. Once the transaction is committed, the result is permanent. Even if a failure such as a downtime occurs, the database can recover data . It should be noted that the permanence of the result can only be guaranteed from the perspective of the transaction itself. For example, after the transaction is committed, all changes are permanent. Even when the database needs to be restored due to a crash, it can be guaranteed that the submitted data will not be lost after the restoration. However, if it is not the failure of the database itself, but some external reasons, such as damage to the RAID card, natural disasters, etc., that cause the database to have problems, then all submitted data may be lost.

1.2 Classification

From the perspective of transaction theory, transactions can be divided into the following types:

  • Flat Transactions (Flat Transactions)
  • Flat transactions with savepoints (Flat Transactions with Savepoints)
  • Chained Transactions
  • Nested Transactions
  • Distributed Transactions

Flat transaction (Flat Transaction) is the simplest type of transaction, but in the actual production environment, this may be the most frequently used transaction. In a flat transaction, all operations are at the same level. They start with BEGIN WORK and end with COMMIT WORK or ROLLBACK WORK. The operations during this period are atomic, either all are executed or all are rolled back. Therefore, flat transactions are the basic building blocks for applications to become atomic operations .


Flat transactions with savepoint (Flat Transactions with Savepoint), in addition to supporting operations supported by flat transactions, allow rollback to an earlier state in the same transaction during the execution of the transaction . This is because errors that may occur during the execution of certain transactions will not cause all operations to be invalid, and it is not satisfactory to give up the entire transaction, and the overhead is too large. Savepoint is used to inform the system that the current state of the transaction should be remembered so that when an error occurs later, the transaction can return to the state at the savepoint.

For flat transactions, it implicitly sets a save point. However, in the entire transaction, there is only this save point, therefore, rollback can only be rolled back to the state at the beginning of the transaction.


Chained Transaction can be regarded as a variant of the savepoint model. For a flat transaction with savepoints, when a system crash occurs, all savepoints will disappear because the savepoints are volatile rather than persistent. This means that when recovering, the transaction needs to be re-executed from the beginning, and cannot be continued from the most recent save point.

The idea of ​​the chain transaction is: when a transaction is submitted, the unnecessary data objects are released, and the necessary processing context is implicitly passed to the next transaction to be started. Note that the commit transaction operation and the start of the next transaction operation will be merged into one atomic operation.


Distributed Transactions are usually flat transactions that run in a distributed environment, so different nodes in the network need to be accessed according to the location of the data.

Suppose a user performs a bank transfer operation on an ATM, for example, a cardholder transfers 10,000 yuan from a savings card of China Merchants Bank to a savings card of ICBC. In this case, the ATM machine can be regarded as node A, the back-end database of China Merchants Bank as node B, and the back-end database of ICBC as C. This transfer operation can be decomposed into the following steps:

  • 1) Node A issues a transfer command.
  • 2) Node B subtracts 10 000 from the balance value in the savings card.
  • 3) Node C adds 10 000 to the balance value in the savings card.
  • 4) Node A notifies the user that the operation is complete or node A notifies the user that the operation has failed.

Need to use distributed transactions, because node A can not complete the task by calling a database. It needs to access the databases of two nodes in the network, and the transaction operations performed in the database of each node are flat. For distributed transactions, it also needs to meet the ACID characteristics, either all happen or all fail.

2. The realization of the transaction

Transaction isolation is achieved by locks . Atomicity, consistency, and durability are accomplished through the redo log and undo log of the database. The redo log is called the redo log and is used to ensure the atomicity and durability of the transaction . The undo log is used to ensure the consistency of the transaction .

The role of redo and undo can be regarded as a recovery operation, redo resumes the page operation that commits transaction modification, and undo rollback records to a specific version . Therefore, the contents recorded by the two are different. Redo is usually a physical log, which records the physical modification operations of the page . Undo is a logical log, which is recorded according to each line of record .

2.1 redo

1. Basic concepts

Redo log is used to achieve transaction durability, that is, D in transaction ACID. It consists of two parts:

  • One is the redo log buffer in memory, which is volatile;
  • The second is the redo log file, which is persistent.

InnoDB transactional storage engine, which is achieved through a transaction Force Log at Commit mechanism persistence, that is, when the transaction commits (COMMIT), you must first write the log all transactions to redo log files persist until the transaction The COMMIT operation is completed.

In the InnoDB storage engine, it consists of two parts, namely redo log and undo log. The redo log is used to ensure the durability of the transaction , and the undo log is used to help the transaction rollback and MVCC functions . The redo log is basically written sequentially, and there is no need to read the redo log file when the database is running. The undo log needs to be read and written randomly.

In order to ensure that each log is written to the redo log file, the InnoDB storage engine needs to call an fsync operation after each redo log buffer is written to the redo log file . Since the redo log file is opened and the O_DIRECT option is not used, the redo log buffer is written to the file system cache first. In order to ensure that the redo log is written to disk, an fsync operation must be performed. Since the efficiency of fsync depends on the performance of the disk, the performance of the disk determines the performance of transaction submission, that is, the performance of the database.

The parameter innodb_flush_log_at_trx_commit is used to control the strategy of flushing the redo log to disk . The default value of this parameter is 1, which means that an fsync operation must be called when the transaction is committed. You can also set the value of this parameter to 0 and 2. 0 means that the redo log operation will not be written when the transaction is committed. This operation is only completed in the master thread, and the redo log file will be performed every 1 second in the master thread The fsync operation. 2 means that the redo log is written to the redo log file when the transaction is committed, but only in the cache of the file system, and no fsync operation is performed.

CREATE TABLE test_load(
    a INT,
    b CHAR(80)
WHILE s<=count DO
SET s=s+1;

The function of the stored procedure p_load is to continuously insert data into the table test_load, and an explicit COMMIT operation is performed every time an entry is inserted. In the default setting, that is, when the parameter innodb_flush_log_at_trx_commit is 1, the InnoDB storage engine will write the log in the redo log buffer to the file and call an fsync operation. If the command CALL p_load (500 000) is executed, 500,000 rows of records will be inserted into the table, and 500,000 fsync operations will be performed

mysql>CALL p_load(500000);
Query OK,0 rows affected(1 min 53.11 sec)

The reason for the longer time is the time required for the fsync operation. Then look at the situation where the parameter innodb_flush_log_at_trx_commit is set to 0:

mysql>SHOW VARIABLES LIKE'innodb_flush_log_at_trx_commit'\G
1 row in set(0.00 sec)
mysql>CALL p_load(500000);
Query OK,0 rows affected(13.90 sec)

It can be seen that after the parameter innodb_flush_log_at_trx_commit is set to 0, the time to insert 500,000 rows of records is shortened to 13.90 seconds, which is almost 12% of the previous. The main reason for this phenomenon is that the latter greatly reduces the number of fsyncs, thereby improving the performance of database execution.


Although users can improve the performance of transaction commit by setting the parameter innodb_flush_log_at_trx_commit to 0 or 2, it should be kept in mind that this setting method loses the ACID characteristics of the transaction. In order to improve the commit performance of the transaction, a COMMIT operation should be performed after 500,000 rows of records are inserted into the table, rather than a COMMIT operation after each record is inserted . The advantage of this is that the transaction method can also be rolled back to a certain state at the beginning of the transaction when it is rolled back.

2.log block

In the InnoDB storage engine, redo logs are stored in 512 bytes . This means that the redo log cache and redo log files are stored in blocks, called redo log blocks, each of which has a size of 512 bytes.

If the number of redo logs generated in a page is greater than 512 bytes, it needs to be divided into multiple redo log blocks for storage. In addition, since the size of the redo log block is the same as the disk sector size, both are 512 bytes, the redo log writing can ensure atomicity and does not require doublewrite technology.


The redo log cache is composed of log blocks each 512 bytes in size. The log block consists of three parts, which are log block header, log body, and log block tailer.

3.log group

Log group is a logical concept, and there is no physical file actually stored to represent log group information. The log group is composed of multiple redo log files, and the size of the log files in each log group is the same.

The redo log file stores the log block previously saved in the log buffer, so it also manages the physical storage according to the block method. The size of each block is the same as the log block, which is also 512 bytes. During the operation of the InnoDB storage engine, the log buffer flushes the log block in the memory to the disk according to certain rules. Specifically, this rule is:

  • When the transaction is committed
  • When half of the memory space in the log buffer has been used
  • log checkpoint

4. Redo log format

Different database operations will have corresponding redo log formats. In addition, since the storage management of the InnoDB storage engine is page-based, its redo log format is also page-based. Although there are different redo log formats, they have a common header format.


The general header format consists of the following 3 parts:

  • redo_log_type: The type of redo log.
  • space: ID of the table space.
  • page_no: The offset of the page.

The redo log body part has different storage contents depending on the type of redo log.


2.2 undo

1. Basic concepts

The redo log records the behavior of the transaction, which can be used to "redo" the page. But the transaction sometimes needs to be rolled back, and then undo is needed. Therefore, when the database is modified, the InnoDB storage engine will not only generate redo, but also a certain amount of undo . In this way, if the transaction or statement executed by the user fails for some reason, or the user requests a rollback with a ROLLBACK statement, the undo information can be used to roll back the data to the way it was before the modification.

Undo is a logical log, so it just logically restores the database to its original state. All modifications are logically cancelled, but the data structure and the page itself may be quite different after the rollback . For example, a transaction is modifying a few records in the current page, while other transactions are modifying other records in the same page. Therefore, it is not possible to roll back a page to the beginning of the transaction, because this will affect the ongoing work of other transactions.

In addition to the rollback operation, another role of undo is MVCC , that is , the realization of MVCC in the InnoDB storage engine is done through undo. When the user reads a row of records, if the record is already occupied by other transactions, the current transaction can read the previous row version information through undo to achieve non-locking reading.

2.undo storage management

The InnoDB storage engine also uses the segment method for undo management. But this paragraph is different from the previous paragraph. First of all, the InnoDB storage engine has a rollback segment, each rollback segment records 1024 undo log segments, and an undo page application is made in each undo log segment segment. The page (0, 5) of the shared tablespace offset of 5 records the page where all the rollback segment headers are located. The type of this page is FIL_PAGE_TYPE_SYS.


It is important to note that the process of transaction allocating pages in the undo log segment and writing to the undo log also needs to be written to the redo log. When the transaction is committed, the InnoDB storage engine will do the following two things:

  • Put the undo log in the list for subsequent purge operations
  • Determine whether the page where the undo log is located can be reused, and if it can be allocated to the next transaction

After the transaction is committed, the undo log and the page where the undo log is located cannot be deleted immediately. This is because there may be other transactions that need to use undo log to get the previous version (MVCC) of the row record . When the story is submitted, the undo log is put into a linked list, and whether the undo log and the page where the undo log is located can be finally deleted by the purge thread to judge.

In addition, if a separate undo page is allocated for each transaction, it will waste storage space, especially for OLTP application types. Therefore, undo pages can be reused in the design of the InnoDB storage engine. Specifically, when the transaction is committed, the undo log is first put into the linked list, and then it is judged whether the usage space of the undo page is less than 3/4, if it is, it means that the undo page can be reused, and then the new undo log is recorded in the current undo Behind the log. Since the undo log list is organized by records, and the undo page may store the undo logs of different transactions, the purge operation requires discrete read operations involving disks, which is a relatively slow process.

You can use the command SHOW ENGINE INNODB STATUS to view the number of undo logs in the linked list, such as:

Trx id counter 3000
Purge done for trx's n:o<2C03 undo n:o<0
History list length 12
---TRANSACTION 0,not started
MySQL thread id 1,OS thread handle 0x1500f1000,query id 4 localhost root
show engine innodb status

3.undo log format

In the InnoDB storage engine, undo log is divided into:

  • insert undo log
  • update undo log

Insert undo log refers to the undo log generated in the insert operation. Because the record of the insert operation is only visible to the transaction itself and not to other transactions (this is a requirement of transaction isolation), the undo log can be deleted directly after the transaction is committed. No purge operation is required.


The pdate undo log records the undo log generated for delete and update operations. The undo log may need to provide an MVCC mechanism, so it cannot be deleted when the transaction is committed. Put it into the undo log linked list when submitting, and wait for the purge thread to perform the final deletion.


2.3 purge

The delete and update operations may not directly delete the original data.


Table t has a clustered index on column a and an auxiliary index on column b. For the above delete operation, it is known from the previous introduction about undo log that only the delete flag of the record whose primary key column is equal to 1 is set to 1, and the record is not deleted, that is, the record still exists in the B+ tree . Secondly, the records with a equal to 1 and b equal to 1 on the auxiliary index are also not processed , and no undo log is even generated. The operation of actually deleting this row of records is actually "delayed" and is finally completed in the purge operation.

Purge is used to finally complete delete and update operations. This design is because the InnoDB storage engine supports MVCC, so records cannot be processed immediately when the transaction is committed . At this time, other things may be referencing this line, so the InnoDB storage engine needs to save the previous version of the record. Whether the record can be deleted is judged by purge. If the row record has not been referenced by any other transaction, then the real delete operation can be performed.

In order to save storage space, the undo log design of the InnoDB storage engine is like this: the undo log of multiple transactions is allowed to exist on a page . Although this does not represent the order in which transactions are committed in the global process, the undo log generated by subsequent transactions is always at the end. In addition, the InnoDB storage engine also has a history list, which links the undo log according to the order of transaction submission.


In the design of the InnoDB storage engine, the first committed transaction is always at the end. The undo page stores the undo log. Because it can be reused, an undo page may store the undo logs of multiple different transactions. The gray shading of trx5 indicates that the undo log is also referenced by other transactions.

The global dynamic parameter innodb_purge_batch_size is used to set the number of undo pages that need to be cleaned up for each purge operation.

2.4 group commit

If the transaction is a non-read-only transaction, an fsync operation is required every time the transaction commits to ensure that the redo log has been written to disk . When the database is down, it can be recovered through the redo log. Although the appearance of solid-state drives has improved the performance of disks, the fsync performance of disks is limited. In order to improve the efficiency of disk fsync, the current database provides the function of group commit, that is, one fsync can be refreshed to ensure that multiple transaction logs are written to the file . For the InnoDB storage engine, two phases of operations are performed when a transaction is committed:

  • 1) Modify the information corresponding to the transaction in the memory, and write the log to the redo log buffer.
  • 2) Calling fsync will ensure that all logs are written to disk from the redo log buffer.

Step 2) is a slower process than step 1) because the storage engine needs to deal with disks. However, when there is a transaction in this process, other transactions can perform step 1). After the commit operation is completed, the redo log of multiple transactions can be flushed to disk through one fsync. , This greatly reduces the pressure on the disk, thereby improving the overall performance of the database.

3. Transaction control statement

Under the default settings of the MySQL command line, transactions are automatically committed (auto commit) , that is , the COMMIT operation will be executed immediately after the SQL statement is executed. Therefore, to explicitly open a transaction, you need to use the command BEGIN, START TRANSACTION, or execute the command SET AUTOCOMMIT=0 to disable the automatic commit of the current session.

  • START TRANSACTION|BEGIN: Explicitly start a transaction.
  • COMMIT: To use the simplest form of this statement, just issue COMMIT. It can also be more detailed, written as COMMIT WORK, but the two are almost equivalent. COMMIT commits the transaction and makes all changes made to the database permanent.
  • ROLLBACK: To use the simplest form of this statement, just issue a ROLLBACK. Similarly, it can also be written as ROLLBACK WORK, but the two are almost equivalent. The rollback will end the user's transaction and undo all uncommitted changes in progress.
  • SAVEPOINT identifier: SAVEPOINT allows to create a savepoint in a transaction, and there can be multiple SAVEPOINTs in a transaction.
  • RELEASE SAVEPOINT identifier: Delete the savepoint of a transaction. When there is no savepoint to execute this sentence, an exception will be thrown.
  • ROLLBACK TO[SAVEPOINT]identifier: This statement is used with the SAVEPOINT command. You can roll back the transaction to the marked point without rolling back any work before the marked point. For example, you can issue two UPDATE statements, followed by a SAVEPOINT, and then two DELETE statements. If an exception occurs during the execution of the DELETE statement, and the exception is caught, and the ROLLBACK TO SAVEPOINT command is issued at the same time, the transaction will be rolled back to the specified SAVEPOINT, and all the work completed by the DELETE will be undone, but the work completed by the UPDATE statement will not Affected.
  • SET TRANSACTION: This statement is used to set the isolation level of the transaction. The transaction isolation levels provided by the InnoDB storage engine are: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE.

4. The isolation level of the transaction

The four isolation levels defined by the SQL standard are:


READ UNCOMMITTED is called browse access, and is only for transactions. READ COMMITTED is called cursor stability. REPEATABLE READ is 2.9999° isolation, without phantom read protection. SERIALIZABLE is called isolation, or 3° isolation. The default transaction isolation level for SQL and SQL2 standards is SERIALIZABLE.

The InnoDB storage engine uses the Next-Key Lock algorithm at the REPEATABLE READ transaction isolation level, thus avoiding the generation of phantom reads . Therefore, the InnoDB storage engine can fully guarantee the isolation requirements of transactions under the default REPEATABLE READ transaction isolation level , that is, reach the SQL standard SERIALIZABLE isolation level.

In the InnoDB storage engine, you can use the following commands to set the current session or global transaction isolation level:


If you want to set the default isolation level of transactions when the MySQL database is started, you need to modify the MySQL configuration file and add the following line in [mysqld]:


To view the transaction isolation level of the current session, you can use:

mysql>[email protected]@tx_isolation\G;
1 row in set(0.01 sec)

At the SERIALIABLE transaction isolation level, the InnoDB storage engine will automatically add LOCK IN SHARE MODE after each SELECT statement , that is, add a shared lock for each read operation. Therefore, under this transaction isolation level, reads occupy locks, and consistent non-locking reads are no longer supported. At this time, the transaction isolation level SERIALIZABLE meets the theoretical requirements of the database.

Because the InnoDB storage engine can achieve 3° isolation under the REPEATABLE READ isolation level, the SERIALIABLE isolation level is generally not used in local transactions. The transaction isolation level of SERIALIABLE is mainly used for distributed transactions of the InnoDB storage engine.

5. Distributed transactions

5.1 MySQL database distributed transaction

The InnoDB storage engine provides support for XA transactions, and supports the implementation of distributed transactions through XA transactions . Distributed transaction refers to allowing multiple independent transactional resources to participate in a global transaction. Transaction resources are usually relational database systems, but they can also be other types of resources. The global transaction requires that all participating transactions are either committed or rolled back.

When using distributed transactions, the transaction isolation level of the InnoDB storage engine must be set to SERIALIZABLE.

XA transactions allow distributed transactions between different databases . For example, one server is a MySQL database, the other is an Oracle database, and there may be another server that is a SQL Server database, as long as each node participates in the global transaction Both support XA transactions. Distributed transactions may be more common in bank system transfers. For example, user David needs to transfer 10,000 yuan from Shanghai to the bank card of user Mariah in Beijing:

#[email protected]:
UPDATE account SET money=money-10000 WHERE user='David';
#[email protected]
UPDATE account SET money=money+10000 WHERE user='Mariah';

XA transaction consists of one or more resource managers (Resource Managers), a transaction manager (Transaction Manager) and an application program (Application Program).

  • Resource Manager: Provides methods to access transaction resources. Usually a database is a resource manager.
  • Transaction manager: Coordinating the various transactions participating in the global transaction. Need to communicate with all resource managers participating in the global transaction.
  • Application: Define the boundary of the transaction and specify the operations in the global transaction.

Distributed transactions use two-phase commit (two-phase commit):

  • In the first phase, all nodes participating in the global transaction begin to prepare (PREPARE), telling the transaction manager that they are ready to commit;
  • In the second stage, the transaction manager tells the resource manager to perform ROLLBACK or COMMIT. If any node shows that it cannot be submitted, all nodes are told to roll back.

The SQL syntax of MySQL database XA transaction is as follows:


You can check whether XA transaction support is enabled through the parameter innodb_support_xa (default is ON):

mysql>SHOW VARIABLES LIKE'innodb_support_xa'\G;
1 row in set(0.01 sec)

5.2 Internal XA transactions

The distributed transaction discussed earlier is an external transaction, that is, the resource manager is the MySQL database itself. There is another distributed transaction in the MySQL database, which is called internal XA transaction between the storage engine and the plug-in, or between the storage engine and the storage engine.

The most common internal XA transaction exists between binlog and the InnoDB storage engine.

Due to the need for replication, most of the current databases have the binlog function enabled. When the transaction is committed, the binary log is written first, and then the redo log of the InnoDB storage engine. The requirements for the above two operations are also atomic, that is, the binary log and the redo log must be written at the same time . If the binary log is written first, and a downtime occurs when writing to the InnoDB storage engine, the slave may receive the binary log passed by the master and execute it, which will eventually lead to the inconsistency of the master and the slave.


If the MySQL database is down before step ③ after executing ① and ②, the master-slave inconsistency will occur. To solve this problem, MySQL database uses XA transactions between binlog and InnoDB storage engine. When the transaction is committed, the InnoDB storage engine will first perform a PREPARE operation, write the xid of the transaction, and then write the binary log.


6. Bad business habits

6.1 Submit in the loop

mysql>CALL load1(10000);
Query OK,0 rows affected(1 min 3.15 sec)
Query OK,0 rows affected(0.05 sec)
mysql>CALL load2(10000);
Query OK,1 row affected(1 min 1.69 sec)
Query OK,0 rows affected(0.05 sec)
mysql>CALL load3(10000);
Query OK,0 rows affected(0.63 sec)

The third method is much faster! This is because each commit must write a redo log. The stored procedures load1 and load2 actually write 10,000 redo log files, while for the stored procedure load3, only write It took 1 time.

6.2 Use automatic submission

The default setting of the MySQL database uses autocommit. You can use the following statement to change the current autocommit mode:

mysql>SET autocommit=0;
Query OK,0 rows affected(0.00 sec)

6.3 Using automatic rollback

7. Long business

Long-Lived Transactions, as the name suggests, are transactions that take a long time to execute. For example, for the database of the banking system, the interest of the corresponding account may need to be updated at each stage. If the number of corresponding accounts is very large, for example, for a table account with 100 million users, the following statement needs to be executed:

UPDATE account
SET account_total=account_total+(1+interest_rate)

During the execution process, when there is a problem with the database, operating system, hardware, etc., the cost of restarting the transaction becomes unacceptable. The database needs to roll back all changes that have occurred, and this process may take longer than the time to produce these changes . Therefore, the problem of long transactions can sometimes be handled by converting them into mini batches. When an error occurs in a transaction, only part of the data needs to be rolled back, and then the last completed transaction continues .