Redis: slicing cluster

In real-world development, suppose a Redis instance runs on a cloud host and holds 50 million key-value pairs of about 512B each, roughly 25GB of data in total. A 32GB cloud host is a typical choice for this deployment: about 25GB stores the data, and the remaining 7GB keeps the system running normally. RDB persistence is enabled so that the data can be recovered from the RDB file after the Redis instance fails.
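The sizing above can be checked with quick arithmetic. This is only a rough sketch: it assumes decimal gigabytes and ignores the per-key overhead Redis adds, and the constant names are illustrative.

```python
# Rough capacity estimate for the scenario above.
# Assumptions: 1 GB = 10**9 bytes; per-key metadata overhead is ignored.
KEY_COUNT = 50_000_000      # 50 million key-value pairs
PAIR_SIZE_BYTES = 512       # each pair is about 512 B
HOST_MEMORY_GB = 32         # memory of the cloud host

data_gb = KEY_COUNT * PAIR_SIZE_BYTES / 10**9
headroom_gb = HOST_MEMORY_GB - data_gb

print(f"data: {data_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

The result lands close to the 25GB and 7GB figures quoted in the text.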

As the amount of data grows, however, you will find that Redis sometimes responds slowly. Use the INFO command to check the latest_fork_usec value, which is the time in microseconds taken by the most recent fork. The cause is that when Redis persists data with RDB, it forks a child process to do the work. The duration of the fork is positively correlated with the amount of data in Redis, and the fork blocks the main thread while it executes. The more data there is, the longer the fork blocks the main thread.
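As a small illustration, the raw text returned by INFO can be parsed to read latest_fork_usec. The helper names and the sample output below are made up for this sketch; in practice the text would come from a real Redis connection.

```python
def parse_info(raw: str) -> dict:
    """Parse the text returned by the Redis INFO command into a dict."""
    fields = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Stats".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def last_fork_ms(raw_info: str) -> float:
    """Return the duration of the most recent fork in milliseconds."""
    # latest_fork_usec is reported in microseconds.
    return int(parse_info(raw_info)["latest_fork_usec"]) / 1000


# Made-up sample output, for illustration only.
SAMPLE_INFO = """# Stats
total_connections_received:12
latest_fork_usec:315000
"""

print(last_fork_ms(SAMPLE_INFO), "ms")
```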

This raises a question: as the amount of data increases, should we keep adding memory to the cloud host, or add more instances?

Simply adding more memory to the cloud host does not solve the problem: the more data a single instance holds, the longer the fork for RDB persistence takes. Adding instances instead keeps each instance small and reduces the blocking of the Redis main thread.

Slice cluster

A slice cluster, also called a sharded cluster, consists of multiple Redis instances that together form a cluster. Incoming data is divided into multiple parts according to certain rules, and each part is stored on one instance. Returning to the earlier scenario, if the 25GB of data is divided into 5 equal parts (an even split is not required) and stored on 5 instances, each instance only needs to hold 5GB. As shown in the following figure:

Figure: Slice cluster

With only 5GB on each instance, far less data is involved when the RDB file is generated, so the fork blocks the main thread for a much shorter time. In real business scenarios, storing a large amount of data is often unavoidable, and a slice cluster is a very good solution for it.

How to save more data

Adding memory to the cloud host and building a slice cluster are the two ways Redis can handle more data. They correspond to vertical expansion (scale up) and horizontal expansion (scale out):

  • **Vertical expansion:** upgrade the configuration of the cloud host, for example by increasing memory capacity and disk capacity;
  • **Horizontal expansion:** increase the number of Redis instances. For example, where there was originally one instance with 8GB of memory and a 50GB disk, there are now three instances with the same configuration.

Figure: Scale out and scale up

Vertical expansion has the advantage of being simple and straightforward to implement, but it has two disadvantages:

  • As the amount of data grows, so does the required memory, and the main thread may be blocked for a long time when it forks the child process;
  • Vertical expansion is limited by hardware and cost.

Horizontal expansion is not bound by the hardware and cost limits of a single instance. When serving millions or tens of millions of users, a horizontally expanded Redis slice cluster is a very good choice. With a single instance, it is always clear where the data lives and where the client should access it. A slice cluster, however, involves the distributed management of multiple instances. To use one, the following two problems must be solved:

  • After slicing, how is the data distributed among multiple instances?
  • How does the client determine which instance holds the data it wants to access?

Distribution relationship between data slices and instances

In a slice cluster, data is distributed across different instances, so how do data and instances correspond to each other?

Redis officially provides the Redis Cluster solution for implementing slice clusters, and it specifies the rules that map data to instances.

Redis Cluster uses hash slots to handle the mapping between data and instances. A slice cluster has 16384 hash slots in total. Hash slots are similar to data partitions: each key-value pair is mapped to one hash slot according to its key.

The mapping takes two steps. First, a 16-bit value is computed from the key of the key-value pair with the CRC16 algorithm. Then this value is taken modulo 16384, which gives a number in the range 0 to 16383; each number is a hash slot number. When deploying Redis Cluster, you can create the cluster with the redis-cli --cluster create command, which automatically distributes the slots evenly across the instances. For example, with N instances in the cluster, each instance holds about 16384/N slots. Alternatively, you can use the cluster meet command to connect the instances manually into a cluster, and then use the cluster addslots command to specify which hash slots each instance holds. With manual assignment, all 16384 slots must be allocated; by default, the cluster does not work properly otherwise.
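The two-step mapping and the even split can be sketched in a few lines of Python. This is an illustration rather than Redis's own code: the function names are hypothetical, hash tags (not covered in this article) are ignored, and the ranges produced by even_slot_ranges are one possible even split, not necessarily the exact ranges redis-cli chooses.

```python
SLOT_COUNT = 16384


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to a hash slot in the range 0..16383 (hash tags ignored)."""
    return crc16_xmodem(key.encode()) % SLOT_COUNT


def even_slot_ranges(instance_count: int) -> list:
    """Split the slots into contiguous, near-equal ranges, one per instance."""
    base, extra = divmod(SLOT_COUNT, instance_count)
    ranges, start = [], 0
    for i in range(instance_count):
        # The first `extra` instances take one additional slot each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


print(hex(crc16_xmodem(b"123456789")))  # 0x31c3
print(key_slot("123456789"))            # 12739
print(even_slot_ranges(3))
```

The value 0x31C3 for the input "123456789" is the standard check value for this CRC16 variant, which makes it a convenient sanity test.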

Figure: The relationship between data slices and instances

How does the client find data?

To locate a key-value pair, the client can compute the hash slot itself, but it also needs to know which instance holds that slot. After the client establishes a connection with a cluster instance, the instance sends the hash slot allocation information to the client. When the cluster has just been created, however, each instance only knows which hash slots it has been allocated, not which slots the other instances own.

So why can the client obtain the information for all hash slots from any single instance?

Each Redis instance sends its own hash slot information to the other instances it is connected to, which spreads the allocation information through the cluster. Once the instances are connected to each other, every instance holds the mapping for all hash slots.

After the client receives the hash slot information, it caches it locally. When the client requests a key-value pair, it first computes the hash slot for the key and then sends the request to the instance that holds that slot.
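A client-side cache of this kind can be sketched as a simple table from slot number to instance address. The class, its methods, and the three addresses below are hypothetical; the sketch also assumes that Python's binascii.crc_hqx with an initial value of 0 matches the CRC16 variant Redis Cluster uses, and it ignores hash tags.

```python
import binascii

SLOT_COUNT = 16384


def key_slot(key: str) -> int:
    # Assumption: crc_hqx with initial value 0 matches Redis Cluster's CRC16.
    return binascii.crc_hqx(key.encode(), 0) % SLOT_COUNT


class SlotCache:
    """Client-side copy of the slot-to-instance mapping."""

    def __init__(self):
        self._owner = [None] * SLOT_COUNT

    def assign(self, first: int, last: int, address: str) -> None:
        """Record that slots first..last (inclusive) live on `address`."""
        for slot in range(first, last + 1):
            self._owner[slot] = address

    def node_for_slot(self, slot: int):
        return self._owner[slot]

    def node_for_key(self, key: str):
        """Compute the slot for the key, then look up its instance."""
        return self.node_for_slot(key_slot(key))


# Hypothetical three-instance layout received from the cluster.
cache = SlotCache()
cache.assign(0, 5460, "172.16.19.1:6379")
cache.assign(5461, 10922, "172.16.19.2:6379")
cache.assign(10923, 16383, "172.16.19.3:6379")

print(cache.node_for_key("123456789"))
```

A flat list of 16384 entries trades a little memory for constant-time lookups and makes it easy to overwrite the owner of a single slot later.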

In a cluster, the correspondence between instances and hash slots is not fixed. Two kinds of change are common:

  • When an instance is added to or removed from the cluster, Redis needs to reallocate the hash slots;
  • For load balancing, Redis needs to redistribute the hash slots across all instances.

Instances can pass messages to each other to obtain the latest hash slot allocation, but the client cannot detect these changes on its own. As a result, the allocation information it has cached can become inconsistent with the latest allocation.

To handle this, Redis Cluster provides a redirection mechanism: when the client sends a read or write operation to an instance that does not hold the corresponding data, the client has to send the command again to a new instance.

When the client sends a key-value operation to an instance that does not hold the hash slot the key maps to, the instance returns a MOVED response like the one below, which contains the address of the new instance.

GET hello:key
(error) MOVED 13320 172.16.19.3:6379

The MOVED response indicates that hash slot 13320, where the requested key-value pair is located, is actually on the instance at 172.16.19.3. In effect, it tells the client which instance now holds the hash slot, so the client can connect to 172.16.19.3 directly and send the request there.
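On the client side, handling this response amounts to parsing the message and overwriting the cached owner of the slot. The sketch below uses a plain dictionary as the cache; parse_redirect and the stale address 172.16.19.2 are illustrative assumptions, not part of any Redis client library.

```python
def parse_redirect(message: str):
    """Split 'MOVED 13320 172.16.19.3:6379' into (kind, slot, host, port)."""
    kind, slot, address = message.split()
    host, _, port = address.rpartition(":")
    return kind, int(slot), host, int(port)


# Hypothetical stale cache entry: the client still believes that
# slot 13320 lives on 172.16.19.2.
slot_cache = {13320: ("172.16.19.2", 6379)}

kind, slot, host, port = parse_redirect("MOVED 13320 172.16.19.3:6379")
if kind == "MOVED":
    # MOVED is a permanent redirect, so the cached owner is replaced.
    slot_cache[slot] = (host, port)

print(slot_cache[13320])  # ('172.16.19.3', 6379)
```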

For example, suppose Slot 3 has been migrated from instance 2 to instance 3, but the client's cache has not been updated, so the client keeps sending commands to instance 2. Instance 2 replies with MOVED, giving the client the new location of Slot 3 (instance 3). The client then sends the request to instance 3 and updates its local cache at the same time.

Note that the client may send its request to instance 2 while only part of the data in Slot 3 has been migrated to instance 3 and the rest has not. In this partially migrated state, the client receives an ASK error instead, as shown below:

GET hello:key
(error) ASK 13320 172.16.19.3:6379

The ASK response means two things. First, the slot is still being migrated. Second, it gives the client the address of the instance that now holds the requested data. The client then needs to send an ASKING command to instance 3 before sending the actual operation. ASKING tells the instance to allow the next command from this client to execute; the client then sends the GET command to that instance to read the data. Unlike MOVED, ASK does not update the hash slot allocation cached by the client.
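The ASK flow can be illustrated with a small simulation. FakeNode is a stand-in for a connection to a Redis instance, not a real client, and the canned replies and the value "world" are invented for the example. The point of the sketch is the order of operations: ASKING first, then the original command, with the slot cache left untouched.

```python
class FakeNode:
    """Stand-in for a connection to one Redis instance (not a real client)."""

    def __init__(self, replies):
        self.replies = replies   # command -> canned reply
        self.log = []            # commands received, in order

    def send(self, command: str) -> str:
        self.log.append(command)
        return self.replies.get(command, "OK")


def send_following_ask(nodes, slot_cache, slot, command):
    """Send a command to the cached owner and follow one ASK redirect."""
    reply = nodes[slot_cache[slot]].send(command)
    if reply.startswith("ASK "):
        _, _, target = reply.split()
        # One-off redirect: send ASKING first, then repeat the command.
        nodes[target].send("ASKING")
        reply = nodes[target].send(command)
        # slot_cache is deliberately unchanged: the migration is not finished.
    return reply


OLD, NEW = "172.16.19.2:6379", "172.16.19.3:6379"
nodes = {
    OLD: FakeNode({"GET hello:key": "ASK 13320 172.16.19.3:6379"}),
    NEW: FakeNode({"GET hello:key": "world"}),
}
slot_cache = {13320: OLD}

result = send_following_ask(nodes, slot_cache, 13320, "GET hello:key")
print(result, nodes[NEW].log)
```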

When the amount of data grows, vertical expansion by adding memory is simple and straightforward, but it leaves a single instance holding too much data in memory and slows it down. A Redis slice cluster expands horizontally instead: it uses multiple instances, each configured with a certain number of hash slots. Data is mapped to a hash slot by the hash value of its key, and through the hash slot it is stored on a particular instance. The advantage is good scalability: capacity grows by adding instances rather than by enlarging a single one.

In addition, adding or removing cluster instances, or redistributing data for load balancing, changes the mapping between hash slots and instances. When the client then sends a request, it receives an error response instead of a result. Once you understand MOVED and ASK, this type of error is no longer a headache.