How Redis implements master-slave replication

1. High reliability of Redis

High reliability in Redis has two aspects:

  • Lose as little data as possible. AOF and RDB handle this: AOF restores data by replaying its log, and RDB restores it by reloading the snapshot file.
  • Interrupt the service as little as possible. Redis handles this by adding replicas, keeping a copy of the same data on several instances at once. Even if one instance fails and takes a while to recover, the other instances can keep serving requests, so the business is not affected.

2. How does Redis keep data consistent across multiple instances?

Redis provides a master-slave mode to keep the data copies consistent. The master and slaves use read-write separation.
Read operation: both the master and the slaves can serve it.
Write operation: it is executed on the master first, and the master then synchronizes it to the slaves.
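
The read and write paths above can be sketched with a small in-memory model. This is only an illustration: `Instance` and `ReplicatedGroup` are hypothetical names, not part of Redis or any client library, and the sketch copies each write to the slaves immediately, whereas real Redis propagates writes asynchronously.

```python
class Instance:
    """Hypothetical in-memory stand-in for one Redis instance."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedGroup:
    """Hypothetical router: writes go to the master, reads go anywhere."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves
        self._next_reader = 0

    def write(self, key, value):
        # The write is executed on the master first...
        self.master.data[key] = value
        # ...and then synchronized to every slave (simplified: done inline).
        for slave in self.slaves:
            slave.data[key] = value

    def read(self, key):
        # Any instance may serve a read; rotate through them round-robin.
        instances = [self.master] + self.slaves
        instance = instances[self._next_reader % len(instances)]
        self._next_reader += 1
        return instance.name, instance.data.get(key)


group = ReplicatedGroup(Instance("master"), [Instance("slave-1"), Instance("slave-2")])
group.write("user:1", "alice")
reads = [group.read("user:1") for _ in range(3)]
```

Because every write passes through the single master, the three reads return the same value no matter which instance answers.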

3. Why do the master and slaves need read-write separation?

If both the master and the slaves accepted client writes, then without locking, successive writes to the same key could land on different instances and leave the copies inconsistent.
Adding locks to coordinate the instances would bring huge overhead.

With read-write separation, all data modifications happen only on the master, so there is no need to coordinate the instances. Once the master has the latest data, it synchronizes it to the slaves, which keeps the master and slaves consistent.

4. How is master-slave replication done?

When we start multiple Redis instances, we can set up the master-slave relationship between them with the replicaof command (slaveof before Redis 5.0). The first data synchronization then completes in three stages.

**The first stage:** the master and slave establish a connection and negotiate synchronization, mainly to prepare for full replication. The slave connects to the master and tells it that synchronization is about to begin. Once the master confirms, the two can start synchronizing.

Specifically, the slave sends the psync command to the master to request data synchronization, and the master starts replication according to the command's parameters. The psync command takes two parameters: the master's runID and the replication progress offset.

  • runID: a random ID generated automatically when each Redis instance starts, used to uniquely identify that instance. On the first replication the slave does not yet know the master's runID, so it sets runID to ?.
  • offset: set to -1 at this point, which indicates the first replication.

After the master receives the psync command, it replies with FULLRESYNC, which carries two parameters: the master's runID and the master's current replication offset. The slave records both parameters when it receives the reply.

Note that the FULLRESYNC reply means the first replication is a full replication: the master will copy all of its current data to the slave.
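
The negotiation in this first stage can be modeled roughly as follows. This is a simplified sketch, not the Redis implementation: the `Master` and `Slave` classes and their method names are hypothetical, replies are modeled as Python tuples rather than wire-protocol strings, and the `CONTINUE` branch is included only to show what a non-first sync would look like.

```python
import secrets


class Master:
    """Hypothetical model of the master side of the psync handshake."""

    def __init__(self):
        # A random ID generated at startup that identifies this instance.
        self.run_id = secrets.token_hex(20)
        self.repl_offset = 0

    def handle_psync(self, run_id, offset):
        # An unknown runID ("?") or offset -1 means a first replication,
        # so the master answers with FULLRESYNC plus its runID and offset.
        if run_id != self.run_id or offset == -1:
            return ("FULLRESYNC", self.run_id, self.repl_offset)
        return ("CONTINUE",)


class Slave:
    """Hypothetical model of the slave side of the psync handshake."""

    def __init__(self):
        self.master_run_id = "?"  # master's runID is not known yet
        self.repl_offset = -1     # -1 marks the first replication

    def start_sync(self, master):
        reply = master.handle_psync(self.master_run_id, self.repl_offset)
        if reply[0] == "FULLRESYNC":
            # Record the two parameters returned by the master.
            _, self.master_run_id, self.repl_offset = reply
        return reply[0]


master = Master()
slave = Slave()
first_reply = slave.start_sync(master)
```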

**The second stage:** the master synchronizes all of its data to the slave, and the slave loads the data locally once it has been received. This stage relies on the RDB file produced by a memory snapshot.

Specifically, the master executes the BGSAVE command to generate an RDB file and then sends the file to the slave. After receiving the RDB file, the slave first clears its current database and then loads the RDB file. This is because the slave may have held other data before it started synchronizing with the master via replicaof, and clearing the database keeps that old data from interfering.

While the master is synchronizing data to the slave, it is not blocked and keeps serving requests normally; otherwise the Redis service would be interrupted. However, the writes in those requests are not in the RDB file that was just generated. To keep the master and slave consistent, the master uses a dedicated replication buffer in memory to record every write it receives after the RDB file is generated.

**The third stage:** the master sends the slave the write commands it received while the second stage was running.

Specifically, once the master has finished sending the RDB file, it sends the write operations held in the replication buffer to the slave, and the slave replays them. At this point the master and slave are in sync.
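
The second and third stages can be sketched together with plain dictionaries. This is a toy model under stated assumptions: a deep copy of the dictionary stands in for the RDB file produced by BGSAVE, a Python list stands in for the replication buffer, and all class and method names are hypothetical.

```python
import copy


class Master:
    """Hypothetical master that buffers writes during a full sync."""

    def __init__(self):
        self.data = {}
        self.replication_buffer = None  # only active while a full sync runs

    def set(self, key, value):
        self.data[key] = value
        # Writes that arrive after the snapshot was taken are also buffered.
        if self.replication_buffer is not None:
            self.replication_buffer.append((key, value))

    def begin_full_sync(self):
        # Stage 2: take a snapshot (stand-in for the RDB file) and start buffering.
        self.replication_buffer = []
        return copy.deepcopy(self.data)

    def finish_full_sync(self):
        # Stage 3: hand over the buffered writes and stop buffering.
        pending, self.replication_buffer = self.replication_buffer, None
        return pending


class Slave:
    """Hypothetical slave that may hold unrelated data before syncing."""

    def __init__(self):
        self.data = {"stale": "value"}

    def load_snapshot(self, snapshot):
        self.data.clear()           # clear the current database first
        self.data.update(snapshot)  # then load the snapshot

    def replay(self, writes):
        for key, value in writes:
            self.data[key] = value


master = Master()
master.set("a", 1)
slave = Slave()

snapshot = master.begin_full_sync()
master.set("b", 2)  # arrives while the snapshot is still being transferred
slave.load_snapshot(snapshot)
slave.replay(master.finish_full_sync())
```

The write to `b` is missing from the snapshot but reaches the slave through the buffered replay, so both sides end up with the same data.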

5. Master-slave cascade mode

As shown above, a full replication requires two time-consuming operations on the master: generating the RDB file and transferring it.

If there are many slaves and they all need a full replication from the master, the master will be busy forking child processes to generate RDB files and run full synchronization. The fork operation blocks the main thread, so the master responds to application requests more slowly. Transferring the RDB files also consumes the master's network bandwidth, which puts further pressure on its resources.

The master-slave-slave cascade mode relieves this: it spreads the master's full-replication load across the slaves by cascading.

That is, when deploying the cluster, we can manually pick one slave (for example, one with more memory) to act as the upstream for other slaves.

We then pick some other slaves and run the replicaof command on them so that they establish a master-slave relationship with the slave we just selected.

These slaves then synchronize with the cascading slave instead of the master, which reduces the pressure on the master.
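
To see how cascading shifts the load, the sketch below counts how many slaves replicate directly from each upstream instance. The topologies, instance names, and the `full_syncs_per_upstream` helper are all made up for illustration.

```python
from collections import Counter


def full_syncs_per_upstream(topology):
    """Count direct downstream slaves per upstream instance.

    `topology` maps each slave's name to the instance it replicates from.
    """
    return Counter(topology.values())


# Flat layout: every slave replicates directly from the master.
flat = {f"slave-{i}": "master" for i in range(1, 5)}

# Cascaded layout: slave-3 and slave-4 replicate from slave-1 instead.
cascaded = {
    "slave-1": "master",
    "slave-2": "master",
    "slave-3": "slave-1",
    "slave-4": "slave-1",
}

flat_load = full_syncs_per_upstream(flat)
cascaded_load = full_syncs_per_upstream(cascaded)
```

In the flat layout the master has to serve four full replications, while in the cascaded layout it serves two and slave-1 takes over the other two.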

Once the master and slave have completed the full replication, they keep the network connection open, and the master sends each subsequent write command it receives to the slave over that connection. This process is called command propagation over a long-lived connection, and it avoids the overhead of repeatedly establishing connections.

6. What if the network connection between the master and slave is interrupted?

Before Redis 2.8, if the network between the master and slave dropped briefly during command propagation, the slave would perform a full replication with the master again, which is very expensive.

Starting with Redis 2.8, after a network interruption the master and slave resume synchronization using incremental replication.

Full replication: synchronizes all data.
Incremental replication: synchronizes to the slave only the commands the master received while the connection was down.

7. How does incremental replication ensure synchronization?

When the master and slave are disconnected, the master writes the write commands it receives during the disconnection into the replication buffer, and also writes them into the repl_backlog_buffer buffer.

repl_backlog_buffer is a ring buffer. The master records the position it has written to, and the slave records the position it has read to.

At the start, the master's write position and the slave's read position coincide, which is their common starting point. As the master keeps receiving new writes, its write position in the buffer moves away from the starting point. This distance is measured as an offset, and for the master it is master_repl_offset. The more new writes the master receives, the larger this value becomes.

Likewise, as the slave replays the replicated writes, its read position in the buffer moves away from the starting point, and its replicated offset, slave_repl_offset, keeps growing. Under normal conditions the two offsets are essentially equal.

After the connection is restored, the slave first sends the psync command to the master, passing along its current slave_repl_offset. The master then compares its master_repl_offset with the slave_repl_offset to determine the gap.

While the network was down, the master may have received new write commands, so master_repl_offset is generally larger than slave_repl_offset. The master then only needs to send the slave the commands that lie between slave_repl_offset and master_repl_offset.

Note:
repl_backlog_buffer is a ring buffer, so once it is full, further writes by the master overwrite the oldest entries. If the slave reads slowly, operations it has not yet read may be overwritten by the master's new writes, which leaves the master and slave data inconsistent.
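
The offsets and the overwrite problem can be illustrated with a tiny ring buffer. This is a sketch only: `ReplBacklog` and `read_from` are hypothetical names, the 16-byte size is chosen to make the wraparound visible, and the real Redis data structures differ. Returning `None` stands for the case where the slave can no longer resume incrementally (as far as I know, real Redis falls back to a full replication in that situation).

```python
class ReplBacklog:
    """Hypothetical fixed-size ring buffer keyed by replication offsets."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray(size)
        self.master_repl_offset = 0  # total number of bytes ever written

    def write(self, payload):
        for byte in payload:
            # Wrap around: once the buffer is full, old bytes are overwritten.
            self.buf[self.master_repl_offset % self.size] = byte
            self.master_repl_offset += 1

    def read_from(self, slave_repl_offset):
        """Return the bytes between the two offsets, or None if overwritten."""
        oldest_available = max(0, self.master_repl_offset - self.size)
        if not oldest_available <= slave_repl_offset <= self.master_repl_offset:
            return None
        return bytes(
            self.buf[i % self.size]
            for i in range(slave_repl_offset, self.master_repl_offset)
        )


backlog = ReplBacklog(16)
backlog.write(b"SET a 1;")                  # 8 bytes, offsets 0..7
slave_repl_offset = backlog.master_repl_offset  # slave is caught up here

# The connection drops; the master keeps writing.
backlog.write(b"SET b 2;")                  # buffer is now exactly full
missing = backlog.read_from(slave_repl_offset)

backlog.write(b"SET c 3;")                  # wraps around and overwrites "SET a 1;"
still_available = backlog.read_from(slave_repl_offset)
too_old = backlog.read_from(0)              # offset 0 has been overwritten
```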

Solution:
We can adjust the repl_backlog_size parameter (the repl-backlog-size directive in redis.conf).
Buffer size = master write rate * operation size - master-to-slave network transfer rate * operation size.
In practice, the buffer space should be doubled:
repl_backlog_size = buffer size * 2

For example, if the master writes 2000 operations per second, each operation is 2 KB, and the network can transfer 1000 operations per second, then 1000 operations have to be buffered, which needs at least 2 MB of buffer space. Otherwise newly written commands will overwrite old ones. To handle sudden bursts, we finally set repl_backlog_size to 4 MB.
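
The same arithmetic can be written as a small helper. The function name and parameters are made up for illustration, and 1 KB is taken as 1024 bytes, so the results come out slightly under the rounded 2 MB and 4 MB figures.

```python
def backlog_size_bytes(write_ops_per_sec, transfer_ops_per_sec, op_size_bytes,
                       safety_factor=2):
    """Estimate the buffer size and the repl_backlog_size value in bytes.

    buffer size = write rate * op size - transfer rate * op size
    repl_backlog_size = buffer size * safety_factor
    """
    backlog_ops = max(0, write_ops_per_sec - transfer_ops_per_sec)
    buffer_bytes = backlog_ops * op_size_bytes
    return buffer_bytes, buffer_bytes * safety_factor


KB = 1024
buffer_bytes, repl_backlog_size = backlog_size_bytes(
    write_ops_per_sec=2000,
    transfer_ops_per_sec=1000,
    op_size_bytes=2 * KB,
)
print(buffer_bytes, repl_backlog_size)  # 2048000 4096000
```

Passing `safety_factor=4` gives the more conservative sizing for workloads with heavy bursts.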

This reduces the risk of master-slave inconsistency during incremental replication. However, if the volume of concurrent requests is very large and even the doubled buffer cannot hold the new operations, the master and slave data may still become inconsistent.

In that case:

  1. Increase repl_backlog_size further, as far as the memory of the Redis server allows, for example to 4 times the buffer size.
  2. Consider using a sharded cluster to spread the request load of a single master.

Summary and thinking:
Although the master-slave mode uses read-write separation to avoid the inconsistency caused by writing to multiple instances at once, it still carries the risk that the master itself fails.
What if the master fails?
Can the data still be kept consistent?
Can Redis still provide service normally?

The sentinel mechanism addresses this by monitoring the status of the master.