How to properly use database read-write separation


In the early stages of developing an application, we do not know how large it will eventually grow, so we do not design a complex system architecture up front. A complex architecture is time-consuming and labor-intensive to build and lengthens the development cycle, which does not match the needs of a young system. We therefore start with a simple architecture and optimize it as the business grows and the number of visits increases.

Architecture evolution

When the system was first built, our architecture was very simple. Its main goal was to keep the business running normally, as shown in the figure:

[Figure: the initial architecture]

As the number of visits increases, people expect higher reliability from the system. To avoid a single point of failure, we scale the application layer horizontally, as shown in the figure:

[Figure: the application layer scaled horizontally]

In this way, the high availability of the application layer is ensured. If one instance goes down or is being upgraded, the system remains available to the outside world. Moreover, when the number of visits increases, the load is shared across the application instances, so the pressure on each instance stays within a reasonable range.
However, as the number of visits keeps growing, all of that pressure ends up concentrated on the database layer. So how do we deal with the database layer? Can it be scaled out in the same way as the application layer? The answer is no, at least not so simply. Imagine that the database layer is also scaled horizontally like the application layer, as shown in the figure:

[Figure: the database layer scaled horizontally into DB1 and DB2]

Then, when the application layer generates a piece of data, should it be inserted into DB1 or DB2? Suppose it is inserted into DB1. When the data is read later, how does the application layer know which database to read it from? The problem quickly becomes complicated. Yet if the database is not scaled out, a single database cannot carry such a large number of visits. So what should we do?
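The question above can be made concrete with a minimal sketch. It assumes two independent, unsynchronized databases modeled as plain Python dictionaries; the function names and the sample product are hypothetical and exist only for illustration.

```python
# Two independent databases with no synchronization, modeled as plain dicts.
db1 = {}
db2 = {}

def save_product(product_id, name):
    # The application has to pick one database; here it happens to pick DB1.
    db1[product_id] = name

def load_product(product_id, db):
    # A read only succeeds if it reaches the database that holds the row.
    return db.get(product_id)

save_product(1, "keyboard")
found_in_db1 = load_product(1, db1)  # the row is here
found_in_db2 = load_product(1, db2)  # None: DB2 never received the write
```

Unless the application tracks where each row was written, a read that lands on DB2 simply does not see the data.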

Database read-write separation

There are always more solutions than problems. As Internet technology developed and generations of engineers studied it in depth, people found that most Internet applications read far more than they write. In the e-commerce system used in our course, for example, products are viewed far more often than orders are placed. The high load on the database is caused mainly by these read requests. So can we separate read operations from write operations, so that all read requests go to databases dedicated to reading and all write operations go to a database dedicated to writing? The data in the write database is synchronized to the read databases, so that every modification can be obtained from a read database. The system architecture is shown in the figure:

[Figure: read-write separation architecture, with one write database synchronizing to the read databases]

If the system has more read requests, several more read databases can be deployed, so that read requests are distributed evenly across them and the pressure on each one is reduced. When writing, however, the data must land in one definite write database. In the figure above there is only one write database. You can of course deploy multiple write databases, but then how to shard the data becomes a very important issue, which we will cover in a later course. For now we use a single write database as the example. When a merchant publishes a product, the product data is stored in the write database, and the write database synchronizes it to the two read databases. When a buyer browses the product on the website, the product data is read from a read database. Which read database serves the request depends on how that request is routed at the time.
In short, moving the large volume of read operations onto dedicated read databases greatly eases the access pressure on the database and also improves the response time of reads. So is there any downside to read-write separation? Is this architecture suitable for every scenario?
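The routing described here can be sketched as follows. This is a minimal illustration, not a production design: the `Database` and `ReadWriteRouter` classes are hypothetical in-memory stand-ins, round-robin is just one possible routing policy, and replication is simulated by copying each write to every read database immediately.

```python
import itertools

class Database:
    """In-memory stand-in for a database server (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.rows = {}

class ReadWriteRouter:
    """Sends writes to the single write database and spreads reads across read databases."""
    def __init__(self, writer, readers):
        self.writer = writer
        self.readers = readers
        self._next_reader = itertools.cycle(readers)  # round-robin over the readers

    def write(self, key, value):
        self.writer.rows[key] = value
        # Stand-in for synchronization: copy the change to every read database.
        for reader in self.readers:
            reader.rows[key] = value

    def read(self, key):
        reader = next(self._next_reader)
        return reader.name, reader.rows.get(key)

router = ReadWriteRouter(Database("write"), [Database("read-1"), Database("read-2")])
router.write("product:1", "keyboard")   # the merchant publishes a product
first = router.read("product:1")        # served by read-1
second = router.read("product:1")       # served by read-2
third = router.read("product:1")        # back to read-1
```

In a real deployment the synchronization is performed by the database itself and is not instantaneous, which is where the trade-offs of this architecture come from.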

Disadvantages of read-write separation

Read-write separation brings many benefits, but let us compare the original architecture with the read-write separated one. From the perspective of data flow, both write data to the database and later read it back. The difference is that the read-write separated architecture adds a synchronization step in between. Think about it: how long does the synchronization take? If the delay is too large, will it affect the system? What if the synchronization stops altogether?

Here is a case I experienced personally. The order list page in the user's personal center is a simple feature: fetch the order data and display it on the page. When it was built, the orders and all order-related data were read from the read database, including the payment status, which is a very sensitive field for users. One day we suddenly received a large number of complaints from users who had paid but whose orders were still shown as unpaid. I found this strange, so I asked for an order number and looked it up in the read database. The status there was unpaid, which matched the page, so nothing looked wrong. A little later, to be on the safe side, I checked the same order in the write database and found that its status was indeed paid. That was the problem: the write and read databases were inconsistent. I immediately notified the DBA and asked him to check the database. His feedback was that the synchronization was down.

This is the drawback of read-write separation. When synchronization stops, or when the synchronization delay is large, the data in the write database and the read database become inconsistent. Can users accept this inconsistency? For the order payment status, certainly not. Is it acceptable in other business scenarios? That requires a specific analysis of each scenario.
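The incident can be reproduced in miniature. The sketch below assumes a simplified, log-based model of synchronization; the `WriteDatabase` and `ReadDatabase` classes and the order number are hypothetical and are not how any particular database implements replication.

```python
class WriteDatabase:
    def __init__(self):
        self.rows = {}
        self.log = []  # ordered list of changes to ship to read databases

    def write(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))

class ReadDatabase:
    def __init__(self):
        self.rows = {}
        self.applied = 0  # number of log entries already applied

    def sync(self, source):
        # Apply every change that has not been applied yet.
        for key, value in source.log[self.applied:]:
            self.rows[key] = value
        self.applied = len(source.log)

    def lag(self, source):
        # Number of changes this read database is still missing.
        return len(source.log) - self.applied

writer, reader = WriteDatabase(), ReadDatabase()
writer.write("order:42", "unpaid")
reader.sync(writer)                # synchronization is working
writer.write("order:42", "paid")   # synchronization is down: sync() is never called

status_on_reader = reader.rows["order:42"]  # stale value shown to the user
status_on_writer = writer.rows["order:42"]  # the real state
pending = reader.lag(writer)                # changes waiting to be applied
```

Monitoring a lag figure like `pending` is one way to notice that synchronization has stopped before users do.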

How to use read-write separation correctly

For business scenarios that do not need strictly up-to-date data, consider using read-write separation. For scenarios that do, such as the order payment status, read-write separation is not recommended, or at the very least your code should simply read that data from the write database. I have also consulted organizations that specialize in data synchronization. Their advice is that, if you synchronize data, the network latency between the databases should be within 5 ms. This demands a very good network environment. You can ping the other machines on your network to see whether they meet this standard. If your network environment is good enough to meet the requirement, read-write separation works well: the data reaches the read databases almost in real time, with very little delay.
That is all for read-write separation. When you use it, start from the business and check whether your business is suited to it. Every technical architecture has its own advantages and disadvantages; what matters is whether it fits or not. Only an architecture that suits the business is a good architecture.
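One way to follow this advice in code is to let the caller mark reads that cannot tolerate stale data. The sketch below is only an illustration under that assumption: the `Router` class and the `must_be_fresh` parameter are hypothetical names, and both databases are modeled as dictionaries.

```python
class Router:
    """Routes reads to the read database unless the caller needs fresh data."""
    def __init__(self, writer_rows, reader_rows):
        self.writer_rows = writer_rows
        self.reader_rows = reader_rows

    def read(self, key, must_be_fresh=False):
        # Sensitive fields such as payment status go to the write database.
        source = self.writer_rows if must_be_fresh else self.reader_rows
        return source.get(key)

# The read database is lagging behind: it has not yet seen the payment.
writer_rows = {"order:42:status": "paid", "product:1:name": "keyboard"}
reader_rows = {"order:42:status": "unpaid", "product:1:name": "keyboard"}
router = Router(writer_rows, reader_rows)

product_name = router.read("product:1:name")                         # read database is fine
payment_status = router.read("order:42:status", must_be_fresh=True)  # write database
stale_status = router.read("order:42:status")                        # what to avoid
```

The cost of this choice is that sensitive reads add load to the write database, so it is worth reserving for the few fields where inconsistency is unacceptable.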