Redis cache penetration, cache breakdown, and cache avalanche solutions

Normal cache operation

Normally, a request first queries Redis. If the data is in Redis, the result is returned directly. If it is not, the data is fetched from the MySQL database and loaded into Redis, so that later requests for the same data are served from the cache.

With the normal flow in mind, let's analyze the causes of and solutions for cache penetration, cache breakdown, and cache avalanche.
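The normal read path can be sketched as follows. This is a minimal illustration, not a production implementation: the dictionaries `cache` and `database` are in-memory stand-ins for Redis and MySQL, and `query_database`, `get`, and `db_queries` are hypothetical names introduced only for this example.

```python
# In-memory stand-ins (assumptions for illustration): `cache` plays the role
# of Redis and `database` plays the role of MySQL.
cache = {}
database = {"user:1": "Alice", "user:2": "Bob"}
db_queries = []  # records every key that reached the database


def query_database(key):
    """Hypothetical database lookup; returns None when the row is missing."""
    db_queries.append(key)
    return database.get(key)


def get(key):
    """Look in the cache first and fall back to the database on a miss."""
    value = cache.get(key)
    if value is not None:
        return value              # cache hit: return directly
    value = query_database(key)   # cache miss: query the database
    if value is not None:
        cache[key] = value        # load the result into the cache
    return value


print(get("user:1"))   # first call goes to the database
print(get("user:1"))   # second call is served from the cache
print(db_queries)      # only one database query was recorded
```

Note that a key missing from the database is never written to the cache here, so every request for it reaches the database again.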

Cache penetration (emphasis: a single piece of data that can never be found is requested repeatedly)

The concept of cache penetration

Cache penetration refers to querying a piece of data that does not exist. For example: the lookup misses in the Redis cache, so the MySQL database is queried. The database finds nothing either, so nothing is written to the cache. As a result, every request for this nonexistent data goes through to the database, which is cache penetration.


Solution ideas for cache penetration

  1. If the database query also returns nothing, store a default (placeholder) value in the cache, so that the next lookup is answered from the cache and does not reach the database again. Give the placeholder an expiration time, or replace it in the cache once real data exists.
  2. Filter keys before querying: reject keys that do not match the expected key format, or, more commonly, use a Bloom filter to reject keys that are known not to exist.
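Both ideas can be combined in one lookup function. The sketch below is only an illustration under stated assumptions: `cache` and `database` are in-memory stand-ins for Redis and MySQL, `BloomFilter` is a deliberately tiny hand-written filter (not a Redis feature), and the `NULL` sentinel, the TTL values, and the `now` parameter are hypothetical choices made for this example.

```python
import hashlib
import time

NULL = "<null>"            # sentinel stored in the cache for missing rows
NULL_TTL_SECONDS = 60      # short expiration for cached misses (assumed value)
VALUE_TTL_SECONDS = 3600   # expiration for real values (assumed value)


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive several bit positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


database = {"user:1": "Alice"}   # stand-in for MySQL
cache = {}                       # stand-in for Redis: key -> (value, expires_at)
db_queries = []                  # records every key that reached the database

bloom = BloomFilter()
for existing_key in database:
    bloom.add(existing_key)      # register every key that exists


def get(key, now=None):
    now = time.time() if now is None else now
    if not bloom.might_contain(key):
        return None                          # definitely not in the database
    entry = cache.get(key)
    if entry is not None and entry[1] > now:
        # Unexpired cache entry: a cached miss is reported as None.
        return None if entry[0] == NULL else entry[0]
    db_queries.append(key)
    value = database.get(key)
    if value is None:
        cache[key] = (NULL, now + NULL_TTL_SECONDS)    # cache the miss briefly
    else:
        cache[key] = (value, now + VALUE_TTL_SECONDS)
    return value
```

The Bloom filter rejects most keys that were never registered without touching the cache or the database. The cached placeholder covers the remaining cases, such as a false positive or a row that was deleted after being registered, and its short expiration limits how long a stale miss can be served.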

Cache breakdown (emphasis: a single piece of hot data expires while it is under heavy access)

The concept of cache breakdown

Redis periodically cleans up expired keys. Suppose a certain piece of hot data is accessed very frequently. At some point, Redis expires and removes this hot data just as a large number of query requests arrive for it. Redis no longer holds the data at that moment, so all of these requests go to the MySQL database, which may overwhelm it.

Solutions to cache breakdown

  1. Use a mutex lock. When the cache entry is missing, a request must acquire the lock before it queries the database, and the other requests block until the lock is released. The thread that acquires the lock queries the database and updates the cache. After that, all threads can read from Redis normally again.

As you can see, this is quite similar to the lazy-initialization singleton pattern with double-checked locking.

  2. Set hotspot data to never expire.

Cache avalanche (emphasis: the Redis cache as a whole becomes ineffective)

Cache avalanche, as the name implies, means that the Redis server goes down or restarts at some point, so that all requests fall through to the database and overwhelm it. In practice, such cases are not common in a production environment. A more common case is mass expiration: for example, all keys are refreshed at 12 noon with the same expiration time, so they all expire at the same moment. If a flash sale starts at midnight and a large number of users flood in just as the keys expire, Redis is of no help at that moment either.

Cache avalanche solutions (scatter the cache expiration times)

Add a random offset to the expiration time of cached data, so that a large amount of data does not expire at the same time.

If the cache is deployed in a distributed manner, spread the hot data evenly across the different cache nodes, which further reduces the chance of simultaneous cache invalidation.
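Randomizing the expiration time can be as simple as adding a random offset to a base value. The sketch below is an illustration only: the function name `ttl_with_jitter` and the values of `BASE_TTL_SECONDS` and `JITTER_SECONDS` are assumptions chosen for this example, not recommendations.

```python
import random

BASE_TTL_SECONDS = 3600   # nominal expiration time (assumed value)
JITTER_SECONDS = 600      # maximum random offset (assumed value)


def ttl_with_jitter(base=BASE_TTL_SECONDS, jitter=JITTER_SECONDS):
    """Return the base TTL plus a random offset in [0, jitter] seconds."""
    return base + random.randint(0, jitter)


# Keys written at the same moment now expire at different times.
ttls = [ttl_with_jitter() for _ in range(1000)]
print(min(ttls), max(ttls))   # both values fall within [3600, 4200]
```

The returned value would then be used as the expiration when the key is written to the cache. A wider jitter spreads the load on the database over a longer window, at the cost of less predictable freshness.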

Eh? So the avalanche caused by a Redis restart or downtime is still not solved? Of course it has to be solved, but that is part of Redis high availability. I