How to ensure distributed idempotence

This article explains how to guarantee idempotence in a distributed system.


Idempotence

Definition

The concept of idempotence comes from mathematics, where it means that applying a transformation N times gives the same result as applying it once. In engineering, an operation is idempotent if one request or several identical requests for the same operation leave the system in the same state, so repeated clicks cause no additional side effects.

  • Idempotence allows the first request to have a side effect on the resource; subsequent repeated requests must have no further side effects on it.
  • Idempotence is concerned with whether repeated requests have side effects on the resource, not with the result returned to the caller.
  • Handling problems such as network timeouts is outside the scope of idempotence itself.
  • Idempotence is a promise that a service makes to the outside world, not an implementation detail. As long as the interface has been called successfully, calling it again any number of times has the same effect on the system. A service that declares itself idempotent treats failed external calls as normal and expects callers to retry after a failure.

Scenarios

During business development, you may encounter retries that are triggered because a request or its response was lost to network jitter, or forms that are submitted repeatedly because of front-end jitter (for example, a double click). For example, in a transaction system, the user submits a purchase request that the server processes correctly, but the server's response is lost because of a network problem, so the client cannot know the outcome. On a web page, a poor design may lead the user to believe the operation failed and refresh the page, which calls the deduction twice and charges the account one extra time. This is where an idempotent interface is needed.

Take MySQL as an example. Of the three statements below, only the third requires the developer to use additional strategies to ensure idempotence:


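SELECT col1 FROM tab1 WHERE col2=2;
-- Does not change state no matter how many times it runs, so it is naturally idempotent.

UPDATE tab1 SET col1=1 WHERE col2=2;
-- The state is the same no matter how many times it succeeds, so it is also idempotent.

UPDATE tab1 SET col1=col1+1 WHERE col2=2;
-- The result changes on every execution, so it is not idempotent.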
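
The difference between the two UPDATE statements can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the helper name run_n_times is made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (col1 INTEGER, col2 INTEGER)")
conn.execute("INSERT INTO tab1 (col1, col2) VALUES (0, 2)")


def run_n_times(sql, n):
    """Reset col1 to 0, run the statement n times, and return the final col1."""
    conn.execute("UPDATE tab1 SET col1 = 0 WHERE col2 = 2")
    for _ in range(n):
        conn.execute(sql)
    return conn.execute("SELECT col1 FROM tab1 WHERE col2 = 2").fetchone()[0]


absolute = "UPDATE tab1 SET col1 = 1 WHERE col2 = 2"
relative = "UPDATE tab1 SET col1 = col1 + 1 WHERE col2 = 2"

print(run_n_times(absolute, 1), run_n_times(absolute, 3))  # 1 1
print(run_n_times(relative, 1), run_n_times(relative, 3))  # 1 3
```

Running the absolute assignment once or three times ends in the same state, while the relative increment ends in a different state for every extra execution.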

Repeated submission and idempotence are related but different:

  1. A repeated submission is the same operation performed again by the user after the first request has already succeeded. For a service that is not idempotent, this changes the state multiple times.
  2. The more typical use case for idempotence is when the outcome of the first request is unknown (for example, it timed out) or it failed, and the client sends the request again. The goal of the retries is to confirm that the first request succeeded, without the repeated requests causing multiple state changes.

Thinking about idempotence

Introducing idempotence makes the server-side logic more complicated. An idempotent service needs at least two things in its logic:

  1. First query the status of the previous execution; if there is none, treat the request as the first one.
  2. Before the business logic changes any state, apply the logic that prevents duplicate submission.
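
The two points above can be sketched roughly as follows. Everything here is hypothetical: the request ID, the in-memory dictionary that plays the role of the execution-status store, and the function name handle_once. The sketch is single-threaded; in a real service the lookup and the write would have to be made atomic, for example by a unique index or a lock.

```python
from typing import Callable, Dict

# Hypothetical store: request ID -> result of the earlier execution.
_results: Dict[str, str] = {}


def handle_once(request_id: str, action: Callable[[], str]) -> str:
    """Run action only for the first request carrying this request_id."""
    # Point 1: query the previous execution status.
    if request_id in _results:
        # Already executed: return the saved result without changing state again.
        return _results[request_id]
    # Point 2: no record found, so this is the first request. Execute the
    # business logic and record the outcome for later retries.
    result = action()
    _results[request_id] = result
    return result


calls = []


def charge() -> str:
    calls.append("charged")  # stands in for the real state change
    return "ok"


handle_once("req-1", charge)
handle_once("req-1", charge)  # a retry of the same request
print(len(calls))  # 1
```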

Idempotence simplifies client-side logic, but it adds logic and cost on the service provider's side, so whether to use it depends on the specific scenario. Unless the business specifically requires it, try not to provide idempotent interfaces, for two reasons:

  1. The extra logic that enforces idempotence complicates the business functions.
  2. It turns work that could run in parallel into serial execution, which reduces efficiency.

Idempotence solutions

Frontend settings

After the user clicks the submit button, we can disable or hide the button.


The front-end restriction is simple, but it has a fatal flaw: a knowledgeable user who replays the request by simulating the web page's request bypasses it entirely.


Unique index

The simplest and most direct way to prevent an order from being inserted more than once is to create a unique index. The insert statement differs slightly between the methods below, but the goal is the same: only one copy of a given record exists in the database.

  • Method 1: Add a unique index to the table. If a DuplicateKeyException is caught during execution, it was caused by a repeated insert, and business processing can continue.
  • Method 2: Use MySQL's INSERT ... ON DUPLICATE KEY UPDATE, which inserts the row if it does not exist and updates it if it does. It does not delete the original record.
  • Method 3: REPLACE INTO works like INSERT, except that on a key conflict it deletes the old row and then inserts the new one, so the index entries have to be removed and rebuilt. It only takes effect if the table has a primary key or a unique index; otherwise REPLACE INTO just inserts a new row every time.
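
Method 1 can be sketched as follows. The sketch uses Python's sqlite3 module instead of MySQL and Java, so sqlite3.IntegrityError plays the role of DuplicateKeyException. The table name orders, the column order_no, and the function create_order are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no TEXT, amount INTEGER)")
conn.execute("CREATE UNIQUE INDEX uk_order_no ON orders (order_no)")


def create_order(order_no: str, amount: int) -> bool:
    """Return True if a row was inserted, False if the order already existed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO orders (order_no, amount) VALUES (?, ?)",
                (order_no, amount),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the row: this is a repeated insert.
        return False


print(create_order("A1001", 50))  # True
print(create_order("A1001", 50))  # False
```

The database, not the application, decides which of two concurrent inserts wins, which is why this approach also works when several service instances share one database.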

Deduplication table

The deduplication table mechanism relies on MySQL's unique index. The general process is:

  1. The client sends a request, and the server first stores the request information in a MySQL deduplication table. This table needs a unique index or primary key on a field that uniquely identifies the request.
  2. Check whether the insert succeeded. If it did, continue with the subsequent business processing. If it failed, the current request has already been executed.
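
A minimal sketch of this flow, again with sqlite3 standing in for MySQL. The table names dedup and account, the request_id field, and the deduct function are hypothetical. One design choice in the sketch goes beyond the steps above: the deduplication insert and the business update share a transaction, so a failed business update does not leave a deduplication record behind.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dedup (request_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
conn.commit()


def deduct(request_id: str, account_id: int, amount: int) -> bool:
    """Deduct once per request_id; return False for a repeated request."""
    try:
        with conn:  # one transaction for the dedup record and the deduction
            # Step 1: record the request; the primary key rejects duplicates.
            conn.execute("INSERT INTO dedup (request_id) VALUES (?)", (request_id,))
            # Step 2: the insert succeeded, so run the business update.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The insert failed: this request has already been executed.
        return False


deduct("req-42", 1, 30)
deduct("req-42", 1, 30)  # retry of the same request, ignored
print(conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0])  # 70
```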

Pessimistic lock



Method 1: Use Java's built-in synchronized keyword or a Lock to achieve idempotence. The core idea is to switch the critical section from parallel to serial execution. The disadvantage is that such a lock does not work in a distributed scenario, because the requests are spread across JVMs. In that case a distributed lock is needed.
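
The idea of Method 1 can be sketched in Python, where threading.Lock plays the role that synchronized or Lock plays in Java. The names submit, _processed, and executions are made up for the example, and the same limitation applies: the lock only protects threads inside one process.

```python
import threading

_lock = threading.Lock()   # in-process lock, comparable to synchronized in Java
_processed = set()         # request IDs that have already been handled
executions = []            # records each time the business logic really runs


def submit(request_id: str) -> bool:
    """Run the business logic at most once per request_id within this process."""
    with _lock:  # the check and the execution run serially
        if request_id in _processed:
            return False
        _processed.add(request_id)
        executions.append(request_id)  # stands in for the business logic
        return True


# Eight threads submit the same request concurrently.
threads = [threading.Thread(target=submit, args=("order-1",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(executions))  # 1
```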



Method 2: Rely on MySQL's SELECT ... FOR UPDATE to serialize access to the database. The key is FOR UPDATE, which briefly works as follows:

  1. When thread A executes the FOR UPDATE statement, the database locks the current record. Other threads that execute the same statement wait until thread A releases the lock, then acquire it and continue.
  2. When the transaction commits, the lock acquired by FOR UPDATE is released automatically.

The disadvantage of this mode is that if the business processing is time-consuming, then under concurrency the later threads wait a long time for the FOR UPDATE lock. The number of threads in a web service is generally limited, and having many of them blocked in unproductive waits hurts the system's concurrency.


Optimistic lock

Add a version field to each row of data. This is similar to the idea used in flash-sale (seckill) designs and relies on the fact that a MySQL UPDATE performs a current read. When updating data, first query the row to obtain its version number, then attempt the update with that version number in the WHERE clause. If the number of affected rows is 0, the request is a repeated submission.


select id,name,account,version from user where id = 1412; -- assume this returns version = 10

update user set account = account + 10,version = version + 1 
where id = 1412 and version = 10;
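
The check on the number of affected rows can be sketched like this, with sqlite3 standing in for MySQL. The table mirrors the statements above with the name column left out for brevity, and the function name add_to_account is an assumption.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user (id INTEGER PRIMARY KEY, account INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO user (id, account, version) VALUES (1412, 100, 10)")


def add_to_account(user_id: int, amount: int, expected_version: int) -> bool:
    """Apply the update only if the row still has the version that was read."""
    cur = conn.execute(
        "UPDATE user SET account = account + ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (amount, user_id, expected_version),
    )
    # One affected row: the update was applied. Zero: the version has moved on,
    # so this is a repeated (or stale) submission.
    return cur.rowcount == 1


version = conn.execute("SELECT version FROM user WHERE id = 1412").fetchone()[0]
print(add_to_account(1412, 10, version))  # True
print(add_to_account(1412, 10, version))  # False
```

The second call carries the version that was read before the first update, so its WHERE clause no longer matches and the balance is only increased once.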


Distributed lock

Use the SETNX operation in Redis as the idempotence barrier, in the same way it is used for a distributed lock. If SETNX succeeds, this is the first time the data is being inserted and the SQL statement can be executed. If SETNX fails, the request has already been executed.
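
The barrier logic can be sketched without a Redis server. The InMemoryKV class below is a hypothetical stand-in that only mimics the set-if-absent behaviour of SETNX; it is not a Redis client, and the key prefix idem: and the function process_once are made up. With a real Redis, one would typically also give the key an expiry (for example SET with the NX and EX options) so that keys do not accumulate forever.

```python
import threading
from typing import Callable, Dict


class InMemoryKV:
    """Hypothetical stand-in for Redis that only mimics SETNX semantics."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._guard = threading.Lock()

    def setnx(self, key: str, value: str) -> bool:
        """Set key only if it does not exist; return True if it was set."""
        with self._guard:
            if key in self._data:
                return False
            self._data[key] = value
            return True


kv = InMemoryKV()


def process_once(request_id: str, action: Callable[[], None]) -> bool:
    """Run action only if this request_id has not been seen before."""
    if not kv.setnx(f"idem:{request_id}", "1"):
        return False  # the key already exists: the request was already executed
    action()          # first request: safe to run the SQL / business logic
    return True


inserted = []
print(process_once("order-7", lambda: inserted.append("order-7")))  # True
print(process_once("order-7", lambda: inserted.append("order-7")))  # False
```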


Token scheme

This method has two stages: the token application stage and the payment stage.

  1. Stage one: before entering the order submission page, the order system requests a token from the payment system based on the user information. The payment system saves the token in the Redis cache for use in the second stage.
  2. Stage two: the order system sends the payment request together with the token. The payment system checks whether the token exists in Redis. If it does, this is the first payment request: the payment system deletes the token from the cache and then starts the payment logic. If the token does not exist in the cache, the request is treated as illegal.

In effect, the token is a one-time credential, and the payment system uses it to confirm that the insert is unique. The disadvantage of the token scheme is that it requires two interactions between the systems, so the process is more complicated than the methods above.
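
A rough sketch of the two stages follows. TokenStore is a hypothetical in-memory stand-in for the Redis cache, and the function names issue, consume, and pay are assumptions. The sketch merges "check whether the token exists" and "delete the token" into one atomic step, because two concurrent requests could otherwise both pass the check before either deletes the token. With Redis, the return value of DEL (the number of keys removed) could serve the same purpose.

```python
import secrets
import threading
from typing import Set


class TokenStore:
    """Hypothetical in-memory stand-in for the Redis cache."""

    def __init__(self) -> None:
        self._tokens: Set[str] = set()
        self._guard = threading.Lock()

    def issue(self) -> str:
        """Stage one: create a token and remember it."""
        token = secrets.token_hex(16)
        with self._guard:
            self._tokens.add(token)
        return token

    def consume(self, token: str) -> bool:
        """Stage two: atomically check and delete; True only for the first use."""
        with self._guard:
            if token in self._tokens:
                self._tokens.remove(token)
                return True
            return False


store = TokenStore()


def pay(token: str, amount: int) -> str:
    if not store.consume(token):
        return "rejected"      # token unknown or already used
    return f"paid {amount}"    # stands in for the real payment logic


token = store.issue()
print(pay(token, 99))  # paid 99
print(pay(token, 99))  # rejected
```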


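Summary

This article covered what idempotence means, when it is needed, and the common ways to achieve it in a distributed system: front-end restrictions, unique indexes, deduplication tables, pessimistic and optimistic locks, distributed locks, and tokens.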