Let’s talk about business expansion

Four characteristics of transactions

ACID

  • Atomicity: The operations in the transaction either all succeed or all fail. Realized by undo log
  • Consistency: The database is in a correct state before and after the transaction is executed.
  • Isolation: During the execution of the transaction, you should not be disturbed by other transactions, and concurrent transactions should be isolated. Realized by lock, MVCC
  • Persistence: After the transaction is executed, the data will always be stored in the database. Even if there is an unexpected downtime, it should not have any impact on this part of the data. Realized by redo log

Four isolation levels of transactions

  • Read uncommitted: The modification of the transaction, even if it is not committed, is visible to other transactions. This phenomenon is called dirty read.
  • Read committed: Transaction reads committed data, the default isolation level of most databases. When a transaction is executed, the information read before and after is different, this situation is called non-repeatable read.
  • Repeatable read: The default isolation level of mysql. It solves the problems of dirty reads, non-repeatable reads, and the existence of phantom reads. Phantom read: When a transaction A reads a certain range of data, another transaction B inserts a row in this range, and when transaction A reads the data in this range again, a phantom read will occur.
  • Serializable: All transactions are executed one by one without interference with each other.

redo log & bin log

Data is stored in the disk. If disk IO operations are required to read and write data every time, the performance will be poor in concurrent scenarios. For this reason, Mysql introduces a buffer pool to alleviate the disk pressure of the database.

When reading data from the database, first read it from the cache, if it is not in the cache, read it from the disk and put it into the cache; when writing data to the database, write it to the cache first, and then the data page in the cache For data changes, this data page is called a dirty page; the data in the Buffer Pool is periodically flushed to the disk, and this process is called flushing dirty pages.

Mysql is down. If the dirty page flushing has not been completed, the update will be lost, and the durability of the transaction cannot be guaranteed.

In order to solve this problem, redo log was introduced

Redo log is a transaction log belonging to InnoDB. It is a physical log. It records the modification of each page in the database, not how a certain row or rows is modified. It can be used to restore the submitted physical data pages, and only Can be restored to the last submitted position.

The redo log uses the write ahead logging technology, that is, the log is written first, and then the data in the Buffer Pool is modified. As for when the cache is flushed, a background thread is used for asynchronous processing. If the writing order is reversed, there may be inconsistencies between the log and the data.

When you talk about redo log, you have to talk about bin log

The bin log is at the Mysql Server level and records all DDL and DML operations, used to back up data and synchronize data between master and slave. If you want to achieve the consistency of primary and backup data, you must ensure the consistency of redo log and binlog, so redo log writing uses two-phase commit. .

image

The problem of data inconsistency occurs at the two points of BC. After the server restarts, it finds the record in the prepare state in the redo log, and then checks whether the binlog contains the updated content of the redo log according to the transaction ID. If it does not, the redo log discards the change. If it does, the transaction will be committed.

undo log

The transaction log belonging to InnoDB is a logical log, and its rollback function is the key to guaranteeing transaction atomicity.

What is recorded is the state before the data is modified.

  • For example, when a transaction executes an update statement, it will first write a logical log of the opposite operation in the undo log.
  • The modification of the same record in the same transaction will not record multiple logs. The undo log only saves the original version of the data.

MVCC

MVCC, or multi-version concurrency control, is implemented in InnoDB mainly to improve database concurrency performance, and to deal with read-write conflicts in a better way, so that even if there is a read-write conflict, it can be locked without locking. Block concurrent reads.

Current read and snapshot read:

  • Current read: select for update, update, insert, delete these operations are all current reads, what is current read? The latest version of the record is read, and the read record will be locked to ensure that other concurrent transactions cannot modify the current record
  • Snapshot read: Snapshot read is a non-blocking read concurrency function implemented by Mysql for MVCC. The read may not be the latest record, but may also be a historical record.

Problems solved by MVCC

There are three database concurrency scenarios:

  • Reading: There are no problems, and no concurrency control is required
  • Read and write: There are thread safety issues, which may cause transaction isolation issues, and may encounter dirty reads, phantom reads, and non-repeatable reads
  • Write and write: there are thread safety issues, there may be missing updates

Problems solved by MVCC

  • When reading and writing to the database concurrently, it can be achieved that there is no need to block the write operation during the read operation, and the write operation does not need to block the read operation, which improves the performance of concurrent read and write of the database.
  • Solve transaction isolation problems such as dirty reads, phantom reads, and non-repeatable reads, but cannot solve the problem of update loss (locked solution)

The realization principle of MVCC

MVCC is implemented by three components: three implicit fields, undo log, and read view.

Three implicit fields: In addition to our custom fields, each row of records also has some implicit fields defined by the database

  • DB_TRX_ID: The transaction ID of the most recent modification, the transaction ID of the record that created this record or the last modification of the record
  • DB_ROLL_PTR: Rollback pointer, pointing to the previous data version in undolog
  • DB_ROW_JD: hidden primary key, if the data table does not have a primary key, then innodb will automatically generate a 6-byte row_id

undo log:

undolog will store all historical versions of a record in a linked list, with the latest version at the beginning of the chain and the oldest version at the end of the chain.

image
image
image

Read View:

Read View is a view generated when a transaction performs a snapshot read. At the moment the transaction executes a snapshot read, a current snapshot of the data system is generated, records and maintains the id of the current active transaction of the system, and the id value of the transaction is incremented.

The biggest role of Read View is to make visibility judgments. There are three global attributes:

  • trx_list: a list of values, storing active transaction IDs
  • up_limit_id: record the ID with the smallest transaction ID in the trx_list list
  • low_limit_id: Record the next transaction ID that has not been allocated by the system when the Read View is generated

The specific comparison rules are as follows:

  1. First compare DB_TRX_ID <up_limit_id, yes: the current transaction can see the record where DB_TRX_ID is located, no: go to the next judgment
  2. Judge DB_TRX_ID >= low_limit_id, yes: it means that the record where DB_TRX_ID appears only after the Read View is generated, and it is definitely not visible to the current transaction, no: go to the next step to judge
  3. Determine whether DB_TRX_ID is in an active transaction. Yes: the current transaction cannot see the active transaction. No: It means that this transaction has been submitted before the Read View is generated, and the result of the modification can be seen.

What is the difference between snapshot reading at RC and RR levels

  • Under the RC isolation level, each snapshot read will generate and get the latest Read View
  • Under the RR isolation level, the first snapshot read in the same transaction will create the Read View, and subsequent snapshot reads will obtain the same Read View.

How to solve the phantom reading problem

  • If snapshot reads are used in the transaction, then there will be no phantom read problem
  • A phantom read will occur when the snapshot read is used together with the current read
  • If they are all currently read, use gap lock to solve the phantom reading problem