Alibaba second-round interview: What should I do if the AOF file in Redis is too large?

Welcome to follow my WeChat public account [ Lao Zhouchao Architecture ], which covers the principles of the mainstream Java back-end technology stack, source code analysis, architecture, and Internet solutions for high concurrency, high performance, and high availability.

1. Introduction

This article was prompted by a message from one of my readers, who said that Alibaba asked this question during an interview. I have to say that Alibaba's interview questions are of high quality. Usually we only pay attention to the two persistence methods of Redis, RDB and AOF. Lao Zhou's guess is that the interviewer started with the persistence methods and then worked up to the question: what should I do if the AOF file is too large? To understand not only the what but also the why, Lao Zhou will walk you through Redis persistence thoroughly: the implementation principles of RDB and AOF, how each is triggered, and where each one applies.

Before entering the main text, let's first ask ourselves two questions: What is Redis persistence? Why do we need Redis persistence?

Q1: What is Redis persistence?

Persistence means saving the data in memory to the hard disk so that it survives a restart. Redis implements persistence in two ways: RDB and AOF.

Q2: Why do we need Redis persistence?

Redis is an in-memory database, so its data disappears when the server goes down.
To restore data quickly after Redis restarts, it must provide a persistence mechanism.

Okay, with these two questions answered, let's take a look at how Redis stores data on the hard disk so that the data still exists after Redis restarts.

2. RDB

RDB saves the in-memory data of Redis at a certain moment to a file on the hard disk. The default file name is dump.rdb, and when the Redis server starts, the data in the dump.rdb file is reloaded into memory.

2.1 Triggering RDB persistence

2.1.1 save command

127.0.0.1:6379> save
OK

The save command is a synchronous operation. When the client sends a save command to the server, the server blocks all other client requests that arrive after the save command until the data has been written out.

If the amount of data is large, writing it out takes a long time. During this period, the Redis server cannot handle other requests, which makes the service unavailable.

It is not recommended to use this command in an online Redis environment.

2.1.2 bgsave command

127.0.0.1:6379> bgsave 
Background saving started

Unlike the save command, the bgsave command is an asynchronous operation. When the client sends the bgsave command, the Redis server main process forks a child process, and the snapshot persistence is handled entirely by the child process. The parent process continues to process client requests, and the child process exits after the data has been saved to the rdb file.

Here is a question to think about: Redis has to process client requests and persist data at the same time, so the in-memory data structures keep changing while persistence is in progress. For example, a hash dictionary is being persisted when a request arrives and deletes it, but the persistence has not finished yet. Will this lead to inconsistencies in the persisted data?

Redis uses the multi-process COW (Copy On Write) mechanism of the operating system to implement snapshot persistence, which is our RDB persistence method here.

Redis uses the COW mechanism of the operating system to separate data segment pages. The data segment is made up of many operating system pages. When the parent process modifies the data in one of these pages, the shared page is first copied, and the modification is then applied to the copy. The corresponding page of the child process is unchanged, so the in-memory data the child sees is frozen at the moment the child process was created and never changes. This is why RDB persistence is called snapshot persistence.
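The frozen view that the child process keeps can be seen in a minimal Python sketch, assuming a Unix-like system where os.fork is available. The helper name snapshot_via_fork is hypothetical and JSON stands in for the RDB format; the sketch only illustrates that the child keeps seeing the data as it was at fork time, and it is not how Redis itself is implemented.

```python
import json
import os


def snapshot_via_fork(data, mutate):
    """Fork a child, mutate `data` in the parent, and return what the child saw.

    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a copy-on-write view frozen at fork time.
        os.close(read_fd)
        with os.fdopen(write_fd, "w") as out:
            json.dump(data, out)
        os._exit(0)
    # Parent: keep "serving requests" by changing the live data.
    os.close(write_fd)
    mutate(data)
    with os.fdopen(read_fd) as src:
        snapshot = json.load(src)
    os.waitpid(pid, 0)
    return snapshot
```

Calling snapshot_via_fork with a dictionary and a function that deletes a key returns the dictionary as it was before the deletion, while the parent's own copy reflects the deletion.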

2.1.3 The server configuration is automatically triggered periodically

Configure in redis.conf using the format save <seconds> <changes>, which means: take a snapshot if at least <changes> keys have changed within <seconds> seconds.

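# save ""          # disable automatic RDB snapshots
# save 900 1       # snapshot if at least 1 key changed within 15 minutes (900 seconds)
# save 300 10      # snapshot if at least 10 keys changed within 5 minutes (300 seconds)
# save 60 10000    # snapshot if at least 10000 keys changed within 1 minute (60 seconds)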

Triggering persistence through the configuration file works like the bgsave command: when a trigger condition is met, a child process is forked to save the data.
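The trigger check can be sketched in a few lines of Python. This is a simplified illustration under the assumption that the server tracks the time since the last successful save and the number of changes since then; the names SAVE_RULES and should_snapshot are hypothetical and are not part of Redis.

```python
# Hypothetical names; the rules mirror the three save lines in the configuration.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]  # (seconds, changes)


def should_snapshot(seconds_since_last_save, changes_since_last_save, rules=SAVE_RULES):
    """Return True if any rule's time window has elapsed with enough changes."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )
```

With these rules, a single change is only persisted after 15 minutes, while a burst of 10000 changes is persisted after about one minute.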

This method is not recommended for online Redis environments. If the trigger interval is set too short, rdb files are written frequently, which affects server performance; if it is set too long, more data can be lost.

2.2 RDB execution process (principle)

  • The Redis parent process first checks whether a save, bgsave, or bgrewriteaof (the AOF file rewriting command) operation is already running. If one is, the bgsave command returns immediately. (They are not allowed to run at the same time for performance reasons: two child processes both performing heavy disk writes would hurt performance.)
  • The parent process performs a fork (an OS call that copies the main process) to create a child process. While the fork call itself is running, the parent process is blocked and cannot execute client commands.
  • Once fork returns, the parent process is no longer blocked and continues to serve client requests.
  • The child process generates a temporary snapshot file from the memory snapshot of the parent process, and atomically replaces the original RDB file once it is complete. (So the RDB file on disk is always complete.)
  • The child process sends a signal to the parent process to indicate completion, and the parent process updates its statistics.
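The "temporary file, then atomic replace" step is what keeps the RDB file on disk complete at all times. A minimal Python sketch of that pattern follows. The function name save_snapshot and the temp- file prefix are assumptions made for illustration, and JSON stands in for the real RDB encoding.

```python
import json
import os
import tempfile


def save_snapshot(data, path):
    """Write `data` to a temporary file, then atomically replace `path`.

    Hypothetical helper; JSON is a stand-in for the RDB format.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        # Atomic swap: a reader sees the old file or the new one, never a partial one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies while the temporary file is being written, the previous snapshot is still intact under its original name.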

2.3 RDB file structure

An RDB file consists of the following parts, in order:

REDIS | db_version | databases | EOF | check_sum

2.3.1 REDIS

The RDB file begins with the REDIS part. This part is 5 bytes long and holds the five characters "REDIS". With these five characters, a program loading a file can quickly check whether it is an RDB file.

2.3.2 db_version

The length of db_version is 4 bytes, and the value is an integer represented by a string. This integer records the version number of the RDB file (not the Redis version number). For example, "0006" means that the version of the RDB file is the sixth version.
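As a small illustration of these two fields, the following Python sketch checks the 5-byte REDIS string and parses the 4-byte version that follows it. The function name read_rdb_version is hypothetical; only the first 9 bytes of the file are examined.

```python
def read_rdb_version(header: bytes) -> int:
    """Return the RDB version stored in the first 9 bytes of a file.

    Hypothetical helper: 5 bytes of "REDIS" followed by a 4-character version.
    """
    if len(header) < 9 or header[:5] != b"REDIS":
        raise ValueError("not an RDB file: missing REDIS magic string")
    version_text = header[5:9].decode("ascii")
    if not version_text.isdigit():
        raise ValueError("invalid db_version field: %r" % version_text)
    return int(version_text)
```

For example, a file starting with the bytes REDIS0006 is reported as RDB version 6.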

2.3.3 databases

The databases part of an RDB file can store any number of non-empty databases.

For example, if the server's databases No. 0 and No. 3 are not empty, the server creates an RDB file whose databases part holds database 0 followed by database 3. Here, database 0 represents all key-value pair data of database No. 0, and database 3 represents all key-value pair data of database No. 3.

Each non-empty database is saved in the RDB file as three parts, in this order: SELECTDB, db_number, and key_value_pairs.

The length of the SELECTDB constant is 1 byte. When the reading program encounters this value, it knows that it will read a database number next.

db_number stores a database number, and performs database switching according to the read database number, so that the key-value pairs read later can be loaded into the correct database.

key_value_pairs stores all the key-value pair data in the database. If the key-value pair has an expiration time, then the expiration time of the key-value pair will also be stored.

2.3.4 EOF

EOF is a constant that marks the end of the body of the RDB file.

2.3.5 check_sum

check_sum is a checksum used to detect whether the file has been damaged or modified.


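Finally, putting the parts together, a complete RDB file that holds databases No. 0 and No. 3 is laid out as follows:

REDIS | db_version | SELECTDB | 0 | key_value_pairs | SELECTDB | 3 | key_value_pairs | EOF | check_sum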

3. AOF

The AOF (Append Only File) persistence method records every write command that the client sends to the server and appends these commands, in the Redis protocol format, to the end of a file with the .aof suffix. When the Redis server restarts, it loads the AOF file and re-executes the commands in it to restore the data.

AOF records the process, RDB only cares about the results.

3.1 Enabling AOF persistence

Redis does not enable AOF persistence by default. We can turn it on in the redis.conf configuration file and configure it in more detail there.

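# Disabled by default; change to yes to enable AOF.
appendonly no
# Name of the AOF file.
appendfilename "appendonly.aof"
# The AOF file is saved in the same location as the RDB file; both are set by dir.
dir ./
# Write (fsync) policy: always, everysec, or no.
# appendfsync always
appendfsync everysec
# appendfsync no

# Whether to skip fsync while a background save or AOF rewrite is running.
# Default is no, meaning fsync continues as usual.
no-appendfsync-on-rewrite no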

3.2 AOF implementation principle

The AOF file stores redis commands. The entire process of synchronizing commands to AOF files can be divided into three stages:

  • Command propagation: Redis sends the executed command, its parameters, the number of parameters, and other information to the AOF program.
  • Cache append: The AOF program converts the received command data into the format of the network communication protocol, and then appends the protocol content to the AOF cache of the server.
  • File writing and saving: The content in the AOF cache is written to the end of the AOF file. If the configured AOF saving condition is met, the fsync or fdatasync function is called to actually save the written content to the disk.

3.2.1 Command propagation

When a Redis client needs to execute a command, it sends the protocol text to the Redis server through a network connection. After the server receives the client's request, it will select the appropriate command function according to the content of the protocol text, and convert each parameter from a string text to a Redis string object (StringObject). Whenever the command function is successfully executed, the command parameters will be propagated to the AOF program.

3.2.2 Cache Append

After the command is propagated to the AOF program, the program will convert the command from the string object back to the original protocol text according to the command and its parameters. After the protocol text is generated, it will be appended to the end of aof_buf in the redis.h/redisServer structure.

The redisServer structure maintains the state of the Redis server, and the aof_buf field holds all the protocol text (RESP) waiting to be written to the AOF file.
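To make the protocol text concrete, here is a minimal Python sketch that encodes a command as a RESP array of bulk strings and appends it to a buffer. The names encode_command and propagate_to_aof are hypothetical, and the module-level aof_buf only imitates the role of the aof_buf field described above; this is not the Redis implementation.

```python
def encode_command(*args):
    """Encode a command and its arguments as a RESP array of bulk strings."""
    out = bytearray(b"*%d\r\n" % len(args))
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        out += b"$%d\r\n" % len(data)
        out += data + b"\r\n"
    return bytes(out)


# Stand-in for the server's AOF cache (hypothetical, for illustration only).
aof_buf = bytearray()


def propagate_to_aof(*args):
    """Append the protocol text of one executed write command to the cache."""
    aof_buf.extend(encode_command(*args))
```

For instance, the command SET key value becomes the text *3, $3, SET, $3, key, $5, value, with each element terminated by \r\n.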

3.2.3 File writing and saving

Whenever the server routine task function is executed, or the event handler is executed, the aof.c/flushAppendOnlyFile function will be called. This function performs the following two tasks:

  • WRITE: According to the conditions, write the buffer in aof_buf to the AOF file.
  • SAVE: According to the conditions, call the fsync or fdatasync function to save the AOF file to the disk.

To improve the efficiency of file writing, in modern operating systems, when the user calls the write function to write data to a file, the operating system usually stores the data temporarily in a memory buffer. Only when the buffer is full or a specified time limit is exceeded is the data in the buffer actually written to the hard disk.

Although this improves efficiency, it also brings a data safety problem: if the machine crashes, the data in the memory buffer is lost. Therefore, the system also provides synchronization functions such as fsync and fdatasync, which force the operating system to write the data in the buffer to the hard disk immediately, thereby ensuring data safety.
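The difference between WRITE and SAVE maps directly onto the operating system's write and fsync calls. The following Python sketch shows both steps on an ordinary file; the function name append_to_file and the file name used in the demo are assumptions for illustration.

```python
import os
import tempfile


def append_to_file(path, payload, sync):
    """Append `payload` to `path`; optionally force it to disk with fsync.

    Hypothetical helper illustrating WRITE (os.write) and SAVE (os.fsync).
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        view = memoryview(payload)
        while view:
            # WRITE: hands the bytes to the OS, which may keep them in its buffer.
            written = os.write(fd, view)
            view = view[written:]
        if sync:
            # SAVE: ask the OS to flush its buffer for this file to the disk.
            os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
    append_to_file(demo_path, b"*1\r\n$4\r\nPING\r\n", sync=True)
```

Without the fsync step the call returns sooner, but the data may still be sitting in the operating system's buffer when the machine crashes.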

3.2.4 AOF save mode

The synchronization file strategy of the AOF buffer area is controlled by the parameter appendfsync, and the meaning of each value is as follows:

  • AOF_FSYNC_NO: Do not save.
  • AOF_FSYNC_EVERYSEC: Save once every second. (default)
  • AOF_FSYNC_ALWAYS: Save every time a command is executed. (Not recommended)

3.2.4.1 AOF_FSYNC_NO

In this mode, WRITE will be executed every time the flushAppendOnlyFile function is called, but SAVE will be skipped.

In this mode, SAVE will only be executed in any of the following situations:

  • Redis is shut down
  • AOF function is turned off
  • The write cache of the system is refreshed (maybe the cache is full, or the regular save operation is executed)

The SAVE operation in these three cases will cause the Redis main process to block.

3.2.4.2 AOF_FSYNC_EVERYSEC

In this mode, SAVE is in principle executed once every second. Because the SAVE operation is performed by a background thread rather than by the main thread, it does not block the server's main process.

3.2.4.3 AOF_FSYNC_ALWAYS

In this mode, after executing a command every time, WRITE and SAVE will be executed.

In addition, because SAVE is executed by the Redis main process, during the execution of SAVE, the main process will be blocked and cannot accept command requests.

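For the three AOF saving modes, SAVE affects the main server process as follows:

  • AOF_FSYNC_NO: SAVE blocks the main process.
  • AOF_FSYNC_EVERYSEC: SAVE does not block the main process.
  • AOF_FSYNC_ALWAYS: SAVE blocks the main process.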
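How the three modes decide between WRITE and SAVE can be sketched as a small simulation. This is a simplified model under stated assumptions: the class AofState and the function flush_append_only_file below are hypothetical Python stand-ins (not the real aof.c code), the disk is replaced by an in-memory buffer, and fsync is only counted rather than executed.

```python
class AofState:
    """Hypothetical, simplified stand-in for the AOF-related server state."""

    def __init__(self, policy):
        self.policy = policy        # "no", "everysec" or "always"
        self.aof_buf = bytearray()  # commands waiting to be written
        self.written = bytearray()  # data already handed to the OS (WRITE)
        self.fsync_calls = 0        # how many times SAVE was performed
        self.last_fsync = 0.0       # time of the last SAVE, in seconds


def flush_append_only_file(state, now):
    """WRITE the buffer every time; SAVE only when the policy asks for it."""
    state.written += state.aof_buf
    state.aof_buf.clear()

    if state.policy == "always":
        do_save = True
    elif state.policy == "everysec":
        do_save = now - state.last_fsync >= 1.0
    else:  # "no": leave the decision to the operating system
        do_save = False

    if do_save:
        state.fsync_calls += 1
        state.last_fsync = now
```

Calling the function several times per second shows the trade-off: always saves on every call, everysec saves roughly once per second, and no never saves explicitly.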

4. AOF rewrite

4.1 Principle of AOF file rewriting

After understanding the principles of the two persistence methods, RDB and AOF, let's return to the Alibaba interview question from my reader: What should I do if the AOF file in Redis is too large?

The answer is the AOF rewrite mechanism. AOF persistence records the state of the database by saving every executed write command, so as more commands are appended, the file grows larger and larger. If the size of the AOF file is not kept under control, it may affect the Redis server, and the larger the file, the longer it takes to restore data from it.

For example, suppose the client executes the following commands:

rpush list 1 2 // [1,2]
rpush list 3 // [1,2,3]
rpush list 4 5 6 // [1,2,3,4,5,6]
lpop list // [2,3,4,5,6]
lpop list // [3,4,5,6]
rpush list 7 // [3,4,5,6,7]

So just to record the state of the list key, the AOF file needs to save six commands. In real online applications, write commands are far more frequent and numerous, and the states of many keys have to be recorded.

To solve the problem of AOF file growth, Redis provides a file rewrite function. With this function, the Redis server creates a new AOF file to replace the old one. The new AOF file contains the minimum set of commands required to restore the current data set. "Rewriting" is actually a somewhat misleading name: AOF rewriting does not read or write the old AOF file at all. The new and old AOF files save the same database state, but the new file contains no redundant commands that waste space, so it is usually much smaller than the old one.

As shown above, recording the state of the list key takes six commands in the AOF file. If the server wants to record that state with as few commands as possible, the simplest and most efficient way is not to read and analyze the existing AOF file, but to read the value of the list key directly from the database and then use a single rpush list 3 4 5 6 7 command to replace the six commands saved in the AOF file. The number of commands needed to save the list key drops from six to one.

Besides the list key above, all other types of keys can be handled in the same way to reduce the number of commands in the AOF file: first read the current value of the key from the database, then record the key-value pair with one command instead of the multiple commands that recorded it before. This is the principle behind the AOF rewrite function.
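The idea of reading the current value instead of analyzing the old log can be sketched in Python. This toy model supports only rpush and lpop (without a count) on list keys; the function names replay and rewrite are hypothetical, and the real rewrite handles every data type and expiration times.

```python
def replay(commands):
    """Rebuild the in-memory state by executing a log of list commands."""
    db = {}
    for line in commands:
        name, key, *values = line.split()
        items = db.setdefault(key, [])
        if name == "rpush":
            items.extend(values)
        elif name == "lpop":
            if items:
                items.pop(0)
        else:
            raise ValueError("unsupported command in this sketch: " + name)
    return db


def rewrite(db):
    """Emit one rpush per non-empty list, based only on the current state."""
    return [f"rpush {key} {' '.join(items)}" for key, items in db.items() if items]
```

Replaying a six-command history like the one in the example and then calling rewrite on the resulting state yields the single command rpush list 3 4 5 6 7.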

4.2 AOF background rewrite

The AOF rewrite function above is implemented by the aof_rewrite function. However, this function performs a large number of write operations, so the thread that calls it is blocked for a long time. Because the Redis server uses a single thread to process command requests, if the server called aof_rewrite directly, it would be unable to process client command requests while the AOF file is being rewritten.

Redis does not want AOF rewriting to leave the server unable to process requests, so it runs the AOF rewrite in a background child process. This approach has two main advantages:

  • While the child process is performing AOF rewriting, the main process can continue to process command requests.
  • The child process has a copy of the data of the main process, and the use of the child process instead of the thread can ensure the security of the data while avoiding locks.

However, using a child process raises a problem: while the child process is performing the AOF rewrite, the main process continues to process commands, and new commands may modify existing data, which makes the current database state inconsistent with the data in the rewritten AOF file.

To solve this problem, Redis adds an AOF rewrite cache, which is put into use after the child process is forked. After receiving a new write command, the Redis main process appends the protocol content of the write command not only to the existing AOF file but also to this cache.

4.3 Analysis of the rewriting process

In the process of creating a new AOF file, Redis will continue to append commands to the existing AOF file. Even if there is a downtime during the rewriting process, the existing AOF file will not be lost. Once the new AOF file is created, Redis will switch from the old AOF file to the new AOF file and start appending the new AOF file.

When the child process is performing AOF rewriting, the main process needs to perform the following three tasks:

  • Process command requests
  • Append the write command to the existing AOF file
  • Append the write command to the AOF rewrite cache

In this way, you can guarantee:

  • Existing AOF functions continue to work; even if there is downtime during AOF rewriting, no data is lost.
  • All commands that modify the database are recorded in the AOF rewrite cache.

When the child process completes the AOF rewrite, it sends a completion signal to the parent process. After receiving the signal, the parent process calls a signal handler that completes the following tasks:

  • Write all the contents of the AOF rewrite cache to the new AOF file.
  • Rename the new AOF file to overwrite the original AOF file.

After the signal handler finishes, the main process can continue to accept command requests as usual. In the entire AOF background rewriting process, only the final write of the cache and the rename operation block the main process. At all other times, the background rewrite does not block the main process, which minimizes the impact of AOF rewriting on performance.
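The interplay between the existing AOF file, the rewrite cache, and the new file can be modeled with plain lists. The class below is a hypothetical single-process simulation (AofRewriteSketch and its method names are made up for illustration); the child's output is simply passed in as a list, whereas Redis uses a real child process and files.

```python
class AofRewriteSketch:
    """Hypothetical model of the main process during a background rewrite."""

    def __init__(self, old_aof):
        self.aof = list(old_aof)  # the AOF file currently in use
        self.rewrite_buf = None   # AOF rewrite cache; None when no rewrite runs

    def start_rewrite(self):
        # Corresponds to the moment right after the child has been forked.
        self.rewrite_buf = []

    def write_command(self, command):
        # Always append to the existing AOF file...
        self.aof.append(command)
        # ...and also to the rewrite cache while a rewrite is in progress.
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(command)

    def finish_rewrite(self, child_output):
        # Signal handler: append the cache to the new file, then switch over.
        new_aof = list(child_output) + self.rewrite_buf
        self.aof = new_aof        # stands in for the atomic rename
        self.rewrite_buf = None
```

Commands that arrive during the rewrite end up in both the old file and the new one, which is why a crash in the middle loses nothing and the new file is consistent once the switch happens.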

The above is the AOF background rewriting, which is the working principle of the BGREWRITEAOF command.

4.4 Trigger method

4.4.1 Configure trigger

Configure in redis.conf

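# Rewrite when the current AOF file has grown by this percentage relative to its size after the last rewrite. If no rewrite has happened yet, the size of the AOF file at startup is used as the base.
auto-aof-rewrite-percentage 100

# Minimum AOF file size that allows a rewrite; a file smaller than 64mb is not rewritten.
auto-aof-rewrite-min-size 64mb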
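The two settings combine into a single check, sketched here in Python. The function name should_auto_rewrite and its parameters are hypothetical; the defaults mirror the configuration values above, and the exact comparison operators are an assumption of this sketch.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether the AOF has grown enough to trigger an automatic rewrite.

    Hypothetical helper. `base_size` is the file size after the last rewrite
    (or at startup); a percentage of 0 disables automatic rewriting.
    """
    if percentage == 0 or current_size <= min_size:
        return False
    base = base_size if base_size else 1
    growth = current_size * 100 // base - 100
    return growth >= percentage
```

With the defaults, an AOF file that was 64mb after the last rewrite triggers a new rewrite once it reaches 128mb.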

4.4.2 Execute bgrewriteaof command

127.0.0.1:6379> bgrewriteaof
Background append only file rewriting started

5. Redis 4.0 hybrid persistence

When restarting Redis, we rarely use RDB to restore the memory state, because a large amount of data will be lost (all data changed since the last snapshot will be lost). We usually use AOF log replay, but replaying AOF logs is much slower than using RDB, so when the Redis instance is large, it takes a long time to start.

To solve this problem, Redis 4.0 introduced a new persistence method: hybrid persistence. The contents of the RDB file and the incremental AOF log are stored together. The AOF log here is no longer a full log, but an incremental log of the changes that occurred between the start and the end of persistence. Usually this part of the AOF log is very small.

If the mixed persistence is turned on, the content of the rdb is directly written to the beginning of the aof file during aof rewrite.

To turn on hybrid persistence, set the following in redis.conf:

aof-use-rdb-preamble yes

As a result, the AOF file consists of an RDB-format header followed by AOF-format content. When loading, Redis first checks whether the AOF file starts with the string REDIS. If it does, that part is loaded in RDB format, and after the RDB part is loaded, the rest is loaded in AOF format. In this way, replaying a full AOF file is no longer necessary, and restart efficiency is greatly improved.
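The format check at load time comes down to looking at the first five bytes. A minimal Python sketch, assuming the file is available as a binary stream; the function name has_rdb_preamble is hypothetical, and the byte strings in the demo are shortened placeholders rather than real file contents.

```python
import io


def has_rdb_preamble(aof_stream):
    """Return True if the stream starts with the 5-byte string REDIS.

    Hypothetical helper; the stream position is restored afterwards.
    """
    start = aof_stream.tell()
    magic = aof_stream.read(5)
    aof_stream.seek(start)
    return magic == b"REDIS"


if __name__ == "__main__":
    # Placeholder contents: a hybrid file and a plain AOF file.
    hybrid = io.BytesIO(b"REDIS0009" + b"*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n")
    plain = io.BytesIO(b"*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n")
    print(has_rdb_preamble(hybrid), has_rdb_preamble(plain))
```

A loader built on this check would read the RDB part first when the function returns True and then replay the remaining bytes as ordinary AOF commands.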

6. Comparison of RDB and AOF

6.1 RDB

Advantages:

  • Compared with the AOF method, recovering data through the rdb file is faster.
  • RDB is a compressed binary file that takes up little space and is convenient for transmission (for example, to a replica) and for data backup.
  • Because the main process forks a child process to do the persistence, backing up data through RDB has little impact on the performance of the Redis server.

Disadvantages:

  • If the server goes down, RDB loses the data written since the last snapshot. For example, if we configure a snapshot after 1000 writes within 5 minutes and the server crashes before that condition is reached, the data from that period is lost.
  • Using the save command blocks the server; subsequent requests are only accepted after the data has been written out.
  • When the bgsave command forks a child process, the fork itself blocks the main process, and with a large data set it can take noticeable time. In addition, the forked child process consumes extra memory.

6.2 AOF

Advantages:

  • If AOF is set to save once per second, at most 2 seconds of data will be lost. Data loss is less than RDB.
  • When a key expires, a DEL command for it is appended to the AOF file. When an AOF rewrite is executed, expired keys are skipped, so neither the key nor its DEL command appears in the new file.

Disadvantages:

  • The generated log file is usually larger than the equivalent RDB file, and it can remain large even after an AOF rewrite.
  • The speed of restoring data is slower than RDB.