[Redis] Interview Questions (Continuous Update...)

Article Directory

Why use Redis (why use cache)

Mainly look at this problem from two points: "high performance" and "high concurrency"

High performance :

Suppose the user accesses some data in the database for the first time. This process will be slower because it is read from the hard disk. The data accessed by the user is stored in the data cache, so that the data can be directly obtained from the cache when the data is accessed again next time. Operating the cache is to directly manipulate the memory, so the speed is quite fast. If the corresponding data in the database is changed, the corresponding data in the cache can be changed synchronously!

High concurrency

Direct operation of the cache can withstand requests far greater than direct access to the database, so we can consider transferring part of the data in the database to the cache, so that part of the user's requests will go directly to the cache without going through the database.

Why use Redis instead of map/guava for caching

Cache is divided into local cache and distributed cache .

Taking Java as an example, using the built-in map or guava to achieve local caching, the main feature is light weight and fast speed , the life cycle ends with the destruction of the JVM , and in the case of multiple instances, each instance is Need to keep a copy of the cache, the cache is not consistent .

Using Redis or Memcached is called distributed cache. In the case of multiple instances, each instance shares a copy of cached data, and the cache is consistent . The disadvantage is that it is necessary to maintain the high availability of Redis or Memcached services, and the entire program architecture is more complicated.

Why is Redis so fast

  • Based entirely on memory, most of the requests are pure memory operations, very fast. Data is stored in memory, similar to HashMap. The advantage of HashMap is that the time complexity of search and operation is both O (1) O (1) O ( 1 )
  • The data structure is simple, and the data operation is also simple. The data structure in Redis is specially designed (such as ziplist, jump list)
  • Single thread is used to avoid unnecessary context switching and race conditions. There is no switching caused by multiple processes or threads to consume CPU. There is no need to consider various lock issues. There is no lock release operation, and there is no possibility Performance consumption caused by deadlock
  • Use multiple I/O multiplexing model, non-blocking IO

Redis basic data types

Redis mainly has 5 data types, including String, List, Set, Zset, and Hash, which meet most of the requirements for use. Why do you need to know each specific data type?

Jump table at the bottom of Zset

Refer to 【Redis】Basic operation

Redis persistence mechanism and their advantages and disadvantages

Reference [Redis] Persistence

Redis expired key deletion strategy

We all know that Redis is a key-value database, and we can set the expiration time of the key cached in Redis. Redis's expiration strategy refers to how Redis handles when the key cached in Redis expires.

Passive cleanup

Only when a key is accessed, will it be judged whether the key has expired, and if it expires, it will be cleared and nil is returned. This strategy can maximize the saving of CPU resources, but it is very unfriendly to memory. In extreme cases, a large number of expired keys may not be accessed again, so they will not be cleared and occupy a lot of memory.

Clean up regularly

In Redis persistence, we know that Redis will periodically execute a function in order to maintain the stability and robustness of the system. In this process, the automatic persistence operation mentioned before will be carried out, and the active memory cleaning will also be carried out.
In the process of active memory cleaning, Redis uses a random algorithm to carry out this process: Simply put, Redis will randomly select N (default 100) keys with an expiration time set, and check the keys that have expired. Clear it. At the same time, if the expired keys exceed a certain percentage M (the default is 25), an active cleanup will continue until the percentage of expired keys drops below M in probability.

Trigger active cleanup when memory is insufficient

Active cleanup will also be triggered when Redis's memory is insufficient.

The configuration file maxmemory-policyis used to set the deletion policy at the memory overrun

Insert picture description here
  • volatile-lru: Use the LRU algorithm to remove keys, only for keys with an expiration time (least recently used)
  • allkeys-lru: In all set keys, use the LRU algorithm to remove keys
  • volatile-random: Remove random keys from the expiration set, only the keys with expiration time set
  • allkeys-random: Remove random keys from all set keys
  • volatile-ttl: Remove the keys with the smallest TTL value, that is, the keys that will expire recently
  • noeviction: Do not remove, only return error message for write operation

At the same time, the sampling number of these strategies can also be configured in the configuration file maxmemory-samples:

Insert picture description here
  • Both the LRU algorithm and the minimum TTL algorithm are not accurate algorithms, but estimated values, so you can set the sample size, Redis will check so many keys by default and select the one with LRU
  • Generally set a number from 3 to 7. The smaller the value, the less accurate the sample, but the lower the performance cost.

How does Redis set the expiration time

2. EXPIREAT and PEXPIREAT

EXPIREAT

Interface definition: EXPIREAT key timestamp
Interface description: Set a key to expire after timestamp (timestamp (seconds)). Return 1 means the setting is successful, return 0 means the key does not exist or the expiration time cannot be set.

PEXPIREAT

Interface definition: PEXPIREAT key milliseconds-timestamp
Interface description: Set a key to expire after milliseconds-timestamp (timestamp (milliseconds)). Return 1 means the setting is successful, return 0 means the key does not exist or the expiration time cannot be set.

3. TTL and PTTL

TTL

Interface definition: TTL key
Interface description: Get the expiration time of the key. If the key has an expiration time, return the remaining survival time (seconds); if the key is permanent, return -1; if the key does not exist or has expired, return -2.

PTTL

Interface definition: PTTL key
Interface description: Get the expiration time of the key. If the key has an expiration time, return the remaining survival time (milliseconds); if the key is permanent, return -1; if the key does not exist or has expired, return -2.

4. PERSIST

PERSIST

Interface definition: PERSIST key
Interface description: Remove the expiration time of the key and convert it to a permanent state. If it returns 1, it means the conversion was successful. If it returns 0, it means that the key does not exist or has been in a permanent state before.

5. SETEX

SETEX

Interface definition: SETEX key seconds value
Interface description: SETEX is logically equivalent to the combined operation of SET and EXPIRE. The difference is that SETEX is a command, and the execution of the command is atomic, so there will be no concurrency problems.

What happens when Redis runs out of memory

If the set upper limit is reached, Redis write commands will return an error message (but read commands can also return normally.) Or you can configure the memory elimination mechanism, when Redis reaches the memory limit, the old content will be washed away.

Redis thread model

Redis 6 adds multi-threading, but it is somewhat different from Memcached's implementation of multi-threading from IO processing to data access. The multi-threaded part of Redis is only used to process network data reading and writing and protocol analysis, and the execution of commands is still single-threaded. The reason for this design is that you don't want to be complicated by multithreading, you need to control the concurrency problems of key, lua, transaction, LPUSH, LPOP, etc. The overall design is roughly as follows:

Insert picture description here

In addition, multi-threaded IO is not turned on by default, and needs to be configured in the configuration file

io-threads-do-reads yes

io-threads 4

Redis transaction

Reference [Redis] Transaction and Lock

Redis master-slave architecture

Redis sentinel mode

Reference 【Redis】Master-slave replication

Redis cluster

Refer to 【Redis】Cluster

Consistent hash algorithm

Reference consistent hashing

Redis implements distributed locks

Refer to [Redis] Frequently Asked Questions

Cache exception

Including cache penetration , cache breakdown , cache avalanche, etc., refer to [Redis] FAQ

Bloom filter

Intuitively speaking, the bloom algorithm is similar to a hash set, which is used to determine whether a certain element (key) is in a certain set.

Different from the general hash set, this algorithm does not need to store the value of the key. For each key, only k bits are needed, and each one stores a flag to determine whether the key is in the set.

algorithm:

  1. First, we need k hash functions, each of which can hash the key into an integer
  2. When initializing, an array of length n bits is required, and each bit is initialized to 0
  3. When a key is added to the set, use k hash functions to calculate k hash values, and set the corresponding bit position in the array to 1
  4. When judging whether a key is in the set, use k hash functions to calculate k hash values ​​and query the corresponding bits in the array. If all bits are 1, it is considered to be in the set

advantage:

  • Compared with other data structures, Bloom filters have huge advantages in space and time. Bloom filter storage space and insert/query time are constant. In addition, Hash functions have no relationship with each other, which is convenient for parallel implementation by hardware. Bloom filter does not need to store the element itself, which has advantages in some occasions where confidentiality requirements are very strict.

Disadvantages:

As the number of stored elements increases, the miscalculation rate increases, but if the number of elements is too small, a hash table is sufficient.

In addition, in general, elements cannot be removed from the Bloom filter. It is easy for us to think of turning the bit array into an integer array, adding 1 to the counter corresponding to each insertion of an element, so that the counter is decremented when the element is deleted. However, to ensure safe deletion of elements is not so simple. First of all, we must ensure that the deleted element is indeed in the Bloom filter. This is not guaranteed by this filter alone. In addition, counter wrap can also cause problems.