[PostgreSQL Tutorial]· MVCC (Multi-Version Concurrency Control)

Article Directory

1. Brief description

MVCC ( Multi-Version Concurrency Control, multi-version concurrency control ), it is a technology for PostgreSQL database to achieve concurrency control, when multiple transactions are running in the database at the same time, it guarantees consistency and isolation , which is ACID Two of these attributes. In order to have a deeper understanding of the operating principles and mechanisms behind MVCC, we need to have a clear understanding of the concepts of database transactions, isolation levels, and concurrency control. If you learn MVCC on this basis, you will get twice the result with half the effort.

2. Affairs

2.1 What is a transaction

For transactions, there is the following definition:

A transaction is a set of operations performed by an application that transfers a database from one correct state to another correct state (consistency), provided that the transaction is completed (atomicity) and without interference from other transactions (isolation).

The short translation is: a transaction is a set of operations performed by an application that transfers the database from one correct state to another correct state (consistency), provided that the transaction has been completed (atomic) and there is no interference from other transactions (Isolation). In other words, a transaction is one or more statements packaged together. They either all succeed and are "committed" or are unsuccessful. In this case, any state changes will be rolled back to the beginning of the transaction. . The transaction includes a single statement transaction and multiple statement transactions wrapped in BEGIN and COMMIT commands.

Explaining affairs, the closest thing to life is the bank transfer scenario that you often encounter. If there are two tables in the PostgreSQL database, they are used to record the type, amount, account (just for example) and other information of the two card types of the customer. Now the customer goes to the counter to handle the business, hoping to transfer 1K amount from account A to account B. This transfer process requires three processes:

(1) Check whether the amount of account A has 1K.

(2) Subtract 1K from the balance of account A.

(3) 1K is added to the balance of account B.

During the operation, if any one of these three processes is abnormal, it will lead to catastrophic results. For example, in step (2), if the server suddenly crashes, the network fails, the program is abnormal, etc., then the customer will lose 1K for no reason. Therefore, the above three operations must be packaged together and executed, either all succeed or all fail.

The schematic diagram is as follows:

Insert picture description here

In order to make the above sequence diagram easier to understand, we create two data tables account_1 and account_2 respectively, which represent the user's account A and account B (for simplicity, the number and type of attribute fields in the accout_1 and account_2 tables are the same, and the name The field cannot be empty). As follows:

CREATE TABLE account_2(id SERIAL PRIMARY KEY, name VARCHAR NOT NULL, balance NUMERIC);

test=# \d+ account_2;
                                                      Table "public.account_2"
 Column  |       Type        | Collation | Nullable |                Default                | Storage  | Stats target | Description
---------+-------------------+-----------+----------+---------------------------------------+----------+--------------+-------------
 id      | integer           |           | not null | nextval('account_2_id_seq'::regclass) | plain    |              |
 name    | character varying |           | not null |                                       | extended |              |
 balance | numeric           |           |          |                                       | main     |              |
Indexes:
    "account_2_pkey" PRIMARY KEY, btree (id)

Now insert a record into the account_1 table, where the name field value is'A', and the balance value is 1000; insert a record into the account_2 table, the name is'B', but the balance is 0.

test=#
test=# INSERT INTO account_1(name, balance) VALUES ('A', 1000.0);
INSERT 0 1
test=# SELECT * FROM account_1;
 id | name | balance
----+------+---------
  2 | A    |  1000.0
(1 row)

test=#
test=# INSERT INTO account_2(name, balance) VALUES ('B', 0.0);
INSERT 0 1
test=# SELECT * FROM account_2;
 id | name | balance
----+------+---------
  2 | B    |     0.0
(1 row)

test=#

We open two terminals and simulate two transactions. One terminal performs the operation of transferring 1K from account A to account B without submitting the transaction; the other terminal cannot see the 1K increase in account B's balance.

As shown in the figure below, terminal 1 has completed four process operations, but has not yet committed the transaction (COMMIT).

Insert picture description here

Now open another terminal, you can see that the balance of account A in the account_1 table and account B in the account_2 table has not changed.

Insert picture description here

Now that the COMMIT command account is executed in terminal 1, the balance of account A in account_1 and account B in account_2 will both change.

// 终端1执行事务提交;
test=# COMMIT ;
COMMIT
test=#

//终端2查询两个表中的账户的余额变化
test=# SELECT * FROM account_1;
 id | name | balance
----+------+---------
  2 | A    |     0.0
(1 row)

test=# SELECT * FROM account_2;
 id | name | balance
----+------+---------
  2 | B    |  1000.0
(1 row)

2.2 Database ACID Features

Database transactions have four characteristics, which is the common abbreviation ACID. They are: Atomicity , Consistency , Isolation and Durability .

· Atomicity

The transaction is executed as a whole, and the operations on the database contained in it are either all executed or not executed; part of the execution is not allowed.

· Consistency

The transaction should ensure that the state of the database changes from one state to another. The meaning of consistency is that the data in the database should satisfy integrity constraints. Take the transfer in the above figure as an example. When step (5) is executed, the system crashes at this time, and the final account A decreases by 1K, while the account B balance does not increase by 1K.

· Isolation

When multiple transactions are executed concurrently, the execution of one transaction should not affect the execution of other transactions. For example, when terminal 1 does not COMMIT the transaction, the data seen by terminal 2 (another transaction) is still the original data, and the data operated by terminal 1's transaction is invisible to terminal 2's transaction.

· Persistence

Persistence means even when the device breaks or crashes. Modifications to the database by the committed transaction should be permanently stored in the database.

Everything in ACID is designed to maintain data integrity. If there is no atomicity, you may leave the data in a partial or invalid state, and everything that depends on the data will be interrupted.

2.3 transaction nesting

PostgreSQL does not fully support nested transactions. Full support for nested transactions means that if the parent transaction is rolled back (directly or indirectly), the successful child transaction will not be rolled back. PostgreSQL allows the use of savepoints, which allows you to reach a consistent state, but this means that if any transaction in the nested transaction sequence is rolled back, then everything in the transaction will be rolled back. If you use macros with logical transactions, PostgreSQL will automatically provide savepoints in the case of nesting. If you need to handle them manually, it also provides a macro with a save point.

If savepoints are not used, the results of nested transactions may be inconsistent. If an error occurs anywhere in the nest, some or all of the subtransactions may or may not be committed. The official PostgreSQL documentation states that " issuing begin within a transaction will cause a warning message. The state of the transaction is not affected. " Although this sounds like the sub-transaction is just treated as part of an external transaction, you cannot rely on this explanation.

2.4 Concurrency control technology


    
    
    
    
    
    
    
    
    
    
    

2.5 Isolation level


    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    

2. Disadvantages of MVCC