ACID properties of transactions
- **Atomicity**: the SQL statements executed in a transaction either all succeed or all fail; a partial success is impossible.
- **Consistency**: the data is in a valid state both before and after the transaction; no SQL statement in the transaction may take effect only partially. This matters mainly when a transaction is interrupted abnormally, for example when the server crashes: after the database restarts, the in-flight transaction must be either committed or rolled back in full.
- **Isolation**: concurrently executing transactions do not interfere with each other.
- **Durability**: once a transaction commits successfully, its changes are permanently saved on disk; before a successful commit, the changes are not guaranteed to be persisted. After an abnormal crash, the database recovers from its log: uncommitted work is rolled back and committed work is re-applied.
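As a minimal sketch of atomicity (the `accounts` table and its columns are illustrative, not from this article's schema), a transaction's changes either all take effect at `commit` or are all discarded by `rollback`:

```sql
-- Assumes a table: accounts(id BIGINT PRIMARY KEY, balance INT)
START TRANSACTION;
update accounts set balance = balance - 100 where id = 1;
update accounts set balance = balance + 100 where id = 2;
-- If anything above failed, undo both updates; neither takes effect
ROLLBACK;
-- Or, if both succeeded, persist both updates atomically:
-- COMMIT;
```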
The isolation levels of transactions (from loosest to strictest; efficiency decreases in this order)

- **Read uncommitted**: a transaction can read other transactions' uncommitted data, so dirty reads occur.
- **Read committed**: a transaction can read data that other transactions have already committed; non-repeatable reads and phantom reads can still occur.
- **Repeatable read**: within one transaction, the same filter condition returns the same data every time.
- **Serializable**: transactions execute serially, one after another.
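In MySQL the level can be inspected and changed per session. This is standard MySQL syntax; note the session variable is named `transaction_isolation` in MySQL 8.0 and `tx_isolation` in older versions:

```sql
-- Show the current session's isolation level (MySQL 8.0+)
SELECT @@transaction_isolation;

-- Change it for the current session only
set session transaction isolation level read committed;

-- Change the default for new sessions (requires privileges)
set global transaction isolation level repeatable read;
```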
Dirty reads, phantom reads and non-repeatable reads
We will use the following table to reproduce dirty reads, non-repeatable reads, and phantom reads:
```sql
create table testtable (
    id   bigint primary key auto_increment comment 'primary key',
    bus1 int comment 'business field 1',
    bus2 varchar(20) comment 'business field 2'
);
```
Under the read uncommitted isolation level, transaction B can read transaction A's uncommitted data. If transaction A is eventually rolled back, the data transaction B read was never persisted, so reading it counts as a dirty read.
```sql
-- Transaction A: insert one row without committing
-- Disable autocommit for the current session
set SESSION autocommit = 0;
START TRANSACTION;
insert into testtable values (NULL, 2, 'I am transaction A, 2');
```
Transaction B then reads the rows with bus1 = 2:
```sql
-- Transaction B
-- Set the current session's isolation level to read uncommitted
set session transaction isolation level read uncommitted;
select * FROM testtable where bus1 = 2;
```
You can see that transaction A's uncommitted data is returned. Now roll back transaction A and execute transaction B's query again:
This time transaction A's uncommitted data can no longer be found. That is a dirty read: transaction B read a row that was ultimately rolled back rather than committed, a row that never truly existed in the database.
Under the read committed isolation level, when transaction A performs a write (insert, update, or delete) on some row, the data transaction B reads before transaction A commits differs from the data it reads after transaction A commits.
Let's reproduce this scenario:
First, set transaction B's isolation level to read committed:
```sql
set session transaction isolation level read committed;
```
Then execute a query:
```sql
select * FROM testtable where bus1 = 1;
```
The result at this time is:
Then transaction A starts and commits:
```sql
-- Disable autocommit for the current session
set SESSION autocommit = 0;
START TRANSACTION;
update testtable set bus2 = 'After transaction A committed' where bus1 = 1;
commit;
```
Then transaction B reads again with the same condition and finds the data transaction A committed. In this sequence, transaction B first reads the row, transaction A then updates it and commits, and transaction B reads it again; the second read differs from the first within the same transaction. This is a non-repeatable read.
Under the repeatable read isolation level, a phantom read occurs when the same range is queried twice within one transaction and the second query returns rows that did not exist in the first.
Phantom reads differ from non-repeatable reads in two ways:
- A phantom read is about the number of rows returned, not about whether the content of individual rows changed.
- Phantom reads are discussed at the repeatable read isolation level, while non-repeatable reads belong to read committed.
Let’s reproduce this scenario:
To avoid interference, empty the table and insert one row:
```sql
TRUNCATE testtable;
START TRANSACTION;
insert into testtable values (NULL, 1, 'I am transaction A');
COMMIT;
```
Then run a range query twice in transaction B to compare the data before and after:
```sql
-- Transaction B
-- Set the current session's isolation level to repeatable read
set SESSION autocommit = 0;
set session transaction isolation level repeatable read;
START TRANSACTION;
select testtable.* FROM testtable where bus1 >= 1 and bus1 <= 3;
-- Sleep 30 seconds per matched row; this condition matches one row
select SLEEP(30), testtable.* from testtable where bus1 = 1;
select testtable.* FROM testtable where bus1 >= 1 and bus1 <= 3;
commit;
```
Once this runs, the second statement, `select SLEEP(30), testtable.* from testtable where bus1 = 1;`, blocks for 30 seconds. During that window, execute transaction A:
```sql
-- Disable autocommit for the current session
set SESSION autocommit = 0;
START TRANSACTION;
insert into testtable values (NULL, 2, 'Data inserted by transaction A while transaction B runs');
COMMIT;
```
After waiting 30 seconds, the result of transaction B's first select statement is:
The result of the second select statement is:
The result of the third select statement is:
Setting aside the second select, which exists only to sleep for 30 seconds, the first result is identical to the third. So where is the promised phantom read?
InnoDB's MVCC mechanism
If the storage engine is InnoDB, the phantom read described above does not occur at the repeatable read isolation level, because InnoDB uses MVCC (multi-version concurrency control). Within a transaction, the first select takes a snapshot of the tables it reads, and subsequent reads in that transaction are frozen at the moment the snapshot was taken: however the table changes afterwards, data committed after the snapshot cannot be seen. Under the InnoDB storage engine, therefore, this kind of phantom read does not exist.
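As a side note (this is standard MySQL syntax, not something the experiment above relies on), the snapshot can also be taken immediately at transaction start instead of at the first read:

```sql
-- Take the read snapshot right away, before any select runs
START TRANSACTION WITH CONSISTENT SNAPSHOT;
select * from testtable;
commit;
```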
Primary key conflict scenario
An opinion circulating online goes like this: transaction A queries and finds that a certain primary key does not exist; transaction B then inserts a row with that key; transaction A then also tries to insert that key and hits a conflict, and this counts as a phantom read. Let's reproduce that scenario.
- Empty the table testtable, then insert 4 rows:
```sql
insert into testtable values (1, 11, 'first row');
insert into testtable values (3, 22, 'second row');
insert into testtable values (4, 33, 'third row');
insert into testtable values (10, 44, 'fourth row');
```
- In transaction A, query all the data, then update the row with id = 3, then query all the data again, without committing. In transaction B, insert a new row with primary key id = 2 and commit. Then look at the result:
```sql
-- Transaction A
-- Set the current session's isolation level to repeatable read
set SESSION autocommit = 0;
set session transaction isolation level repeatable read;
START TRANSACTION;
select * from testtable;
-- select * from testtable t2;
-- insert into testtable values (2, '66', 'sixth row');
-- commit;
```
```sql
-- Transaction B
-- Set the current session's isolation level to repeatable read
set SESSION autocommit = 0;
set session transaction isolation level repeatable read;
START TRANSACTION;
insert into testtable values (2, '55', 'fifth row');
commit;
```
Execute transaction A first, then execute transaction B, and then execute the last 3 lines of transaction A:
```sql
-- Transaction A (continued)
-- Set the current session's isolation level to repeatable read
-- set SESSION autocommit = 0;
-- set session transaction isolation level repeatable read;
-- START TRANSACTION;
-- select * from testtable;
select * from testtable t2;
insert into testtable values (2, '66', 'sixth row');
commit;
```
Transaction A's query returns the same result as its previous query. However, when the insert executes, it reports a primary key conflict.
Whether this counts as a phantom read is debatable, but it does show that under the RR isolation level, because transaction A's first select happened before transaction B's insert, a snapshot was taken at that moment and transaction B's changes are invisible to it (the MVCC mechanism). If transaction A instead runs as follows, it can see the data transaction B committed:
```sql
-- Transaction A
-- Set the current session's isolation level to repeatable read
set SESSION autocommit = 0;
set session transaction isolation level repeatable read;
START TRANSACTION;
update testtable set bus1 = '77' where id = 3;
-- select * from testtable;
-- insert into testtable values (2, '66', 'sixth row');
-- commit;
```
The operation steps are the same as above. The only difference from the previous transaction A is that the first select is replaced with an update. The reason is that the transaction's read view is established only by its first real SQL statement; `START TRANSACTION;` alone does not establish it. After executing transaction B, run the commented-out statements in transaction A and you can see the data transaction B committed.
At this point it is obvious that executing the following insert reports a primary key conflict.
As for whether this counts as a phantom read: opinions differ.
But can this kind of primary key conflict be avoided? The answer is yes.
select ... lock in share mode and select ... for update
The `select ... lock in share mode` statement places a shared (S) lock on the qualifying rows, and it is a current read: it reads the latest data in the database, not the snapshot. While the shared lock is held, other transactions cannot update those rows.
- Clear data and reinsert
```sql
TRUNCATE table testtable;
insert into testtable values (1, 11, 'first row');
insert into testtable values (3, 22, 'second row');
insert into testtable values (4, 33, 'third row');
insert into testtable values (10, 44, 'fourth row');
```
- Compare snapshot read and current read
- This is a snapshot read, reading the data currently in the table:
```sql
-- Transaction A
set autocommit = 0;
start transaction;
select * from testtable t;
-- select * from testtable lock in share mode;
```
Then update the row from another session:
```sql
-- Transaction B
set autocommit = 0;
start transaction;
update testtable set bus2 = 'update update update' where id = 3;
commit;
```
Now go back to transaction A and execute the commented statement:
```sql
select * from testtable lock in share mode;
```
You can see that the latest data is read, unlike the earlier select without lock in share mode. If we then read the full table again in the same transaction, without lock in share mode, the result is the same as the first read: that is a snapshot read. This is the difference between a current read and a snapshot read.
At this point all the rows read with lock in share mode are locked. Let's try to update one of them: I modified transaction B's SQL and re-executed it, and the update was blocked. The modified SQL is as follows:
```sql
-- Transaction B
set autocommit = 0;
start transaction;
update testtable set bus2 = 'lock in share mode update test' where id = 3;
commit;
```
In this way, every row in the filter range can be locked. If transaction A runs for a long time, this significantly hurts update performance, although plain reads are unaffected. There is also `select ... for update`, which takes an exclusive (X) lock: other transactions can neither lock-read nor update the locked rows (ordinary snapshot selects still work). It behaves like a pessimistic lock, even stricter than lock in share mode.
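A minimal sketch of the exclusive variant, using the same testtable (the workflow mirrors the lock in share mode demo above):

```sql
-- Transaction A: take an exclusive lock on the matching row (current read)
set autocommit = 0;
start transaction;
select * from testtable where id = 3 for update;

-- In another session, both of these now block until transaction A ends:
-- update testtable set bus2 = 'blocked' where id = 3;
-- select * from testtable where id = 3 lock in share mode;

commit;
```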
So is there a way to solve this problem? Yes: the Next-Key Lock.
Row lock, gap lock, Next-Key Lock
Row lock: a lock placed directly on a row.
Gap lock: a lock placed on the empty space where no record exists, which may be between two index records, before the first index record, or after the last one.
Next-Key Lock: a row lock and a gap lock used together.
For example, with the data above, locking id = 3 adds gap locks on **(1,3)** and **(3,4)**; the gap lock itself does not lock the current row, which is covered by the row lock instead.
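Applying this to the earlier primary key conflict scenario (a sketch; the exact locks taken depend on the index and isolation level), a locking read over a range takes next-key locks, so the conflicting insert from the other session blocks instead of sneaking in:

```sql
-- Transaction A: locking read over a range at repeatable read
set session transaction isolation level repeatable read;
start transaction;
select * from testtable where id between 1 and 4 for update;

-- Transaction B (another session): this insert falls into the locked gap
-- and blocks until transaction A commits or rolls back:
-- insert into testtable values (2, '55', 'fifth row');

commit;
```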
In fact, at the repeatable read level InnoDB enables gap locking (and thus Next-Key Locks) by default; the deprecated switch `innodb_locks_unsafe_for_binlog` (OFF by default) can be set to ON to disable gap locks. You can check it with the following command:

```sql
show variables like 'innodb_locks_unsafe_for_binlog';
```
I personally think the reasons one might want to disable gap locking are as follows:
- Gap locks hurt performance. If a lock covers the largest id in the table, the gap from that id up to positive infinity is locked, which blocks other transactions from inserting data normally and degrades throughput.
- The pattern of first querying whether a primary key exists and then inserting it rarely needs solving inside the database. For one thing, tables designed around auto-increment primary keys generally do not insert into gaps; they let the database assign the id. For another, when this problem does occur it is usually solved with a distributed lock rather than at the database level, since gap locking is an operation that greatly reduces database throughput.