My mainframe memory is only 100G, and now I want to scan a 200G large table, will it use up the memory of the DB mainframe?
When doing logical backup, isn't it the whole library scan? If this will eat up the memory, isn't the logical backup already dead?
So a full table scan of a large table looks like it should be no problem. Why is this?
The impact of full table scan on the server layer
Suppose, we now want to perform a full table scan on a 200G InnoDB table db1.t. Of course, if you want to save the scan results on the client, you will use a command like this:
mysql -h$host -P$port -u$user -p$pwd -e "select * from db1.t" > $target_file
InnoDB data is stored on the primary key index, so the full table scan actually scans the primary key index of table t directly. Since this query statement has no other judgment conditions, each row found can be directly placed in the result set and then returned to the client.
So, where does this "result set" exist?
The server does not need to save a complete result set. The process of fetching and sending data is as follows:
- Get a line and write it to net_buffer . The size of this memory is defined by the parameter net_buffer_length , the default is 16k
- Get rows repeatedly, until net_buffer is full, call the network interface to send it out
- If the transmission is successful, clear the net_buffer , then continue to fetch the next line and write to the net_buffer
- If the send function returns EAGAIN or WSAEWOULDBLOCK , it means that the local network stack (socket send buffer) is full and enters the wait. Continue to send until the network stack is writable again
- Query result sending process
- In the process of sending a query, the maximum memory occupied by MySQL is as large as net_buffer_length , which will not reach 200G
- The socket send buffer cannot reach 200G (default definition /proc/sys/net/core/wmem_default). If the socket send buffer is full, the process of reading data will be suspended.
So MySQL is actually "reading and posting". This means that if the client is slow to receive, it will cause the MySQL server to fail to send the results, and the transaction will take longer to execute.
For example, the following state is the result seen on the server's show processlist when the client does not read the contents of the socket receive buffer .
- Server-side sending block
If you see that the State is always "Sending to client", it means that the server-side network stack is full.
If the client uses the -quick parameter, it will use the mysql_use_result method: read one line and process one line. Assuming that the logic of a certain business is complex, if the logic to be processed after each line of data is read is very slow, it will cause the client to take a long time to fetch a line of data, and the above result may appear.
Therefore, for normal online business, if a query does not return many results, it is recommended to use the mysql_store_result interface to directly save the query results to local memory.
Of course, the premise is that the query returns not many results. If there are too many, the client occupies nearly 20G of memory because of a large query. In this case, you need to use the mysql_use_result interface instead .
If you see many threads in "Sending to client" in the MySQL that you are responsible for maintaining, it means that you want business development students to optimize the query results and evaluate whether so many returned results are reasonable.
To quickly reduce the number of threads in this state, you can set net_buffer_length to be larger.
Sometimes, I see a lot of query statements on the instance that the status is "Sending data", but there is nothing wrong with checking the network. Why does Sending data take so long?
The state change of a query statement is like this:
- After the MySQL query statement enters the execution stage, first set the status to Sending data
- Then, send the column related information (meta data) of the execution result to the client
- Then continue the process of executing the statement
- After the execution is complete, set the status to an empty string.
That is, "Sending data" does not necessarily mean "sending data", but may be at any stage in the actuator process. For example, you can construct a lock waiting scene, and you can see the Sending data status.
Read the entire table is locked:
select * from t where id=1 for update
|select * from t lock in share mode |
- Sending data status
It can be seen that session2 is waiting for the lock, and the status is displayed as Sending data.
- Only when a thread is in the state of "waiting for the client to receive the result" will it display "Sending to client"
- If it displays as "Sending data", it means just "executing"
Therefore, the result of the query is sent to the client in segments, so the entire table is scanned, and the query returns a large amount of data, which will not burst the memory.
The above is the processing logic of the server layer. How is it handled in the InnoDB engine?
The impact of full table scan on InnoDB
One function of InnoDB memory is to save the updated results and cooperate with the redo log to avoid random disk writing.
The data pages of the memory are managed in the Buffer Pool (referred to as BP), and BP plays a role in accelerating the update in WAL.
BP can also speed up queries.
- Because of WAL, when the transaction is committed, the data page on the disk is old. If there is a query to read the data page immediately, should the redo log be applied to the data page immediately?
No need. Because at this time, the result of the memory data page is the latest, just read the memory page directly. At this time, the query does not need to read the disk, and the result is directly accessed from within, which is very fast. Therefore, Buffer Pool can speed up queries.
The acceleration effect of BP on the query depends on an important indicator, namely: the memory hit rate.
You can view the current BP hit rate of a system in the show engine innodb status results. Under normal circumstances, a stable service online system must ensure that the response time meets the requirements, and the memory hit rate must be above 99%.
Execute show engine innodb status, you can see the words "Buffer pool hit rate", which shows the current hit rate. For example, the hit rate in the figure below is 100%.
If all the data pages required by the query can be obtained directly from the memory, that is the best, corresponding to a hit rate of 100%.
The size of the InnoDB Buffer Pool is determined by the parameter innodb_buffer_pool_size . Generally, it is recommended to set it to 60%~80% of the available physical memory.
About ten years ago, the data volume of a single machine was hundreds of gigabytes, and the physical memory was a few gigabytes; now, although many servers can have 128G or higher memory, the data volume of a single machine has reached the T level.
Therefore, it is common for innodb_buffer_pool_size to be smaller than the amount of disk data. If a Buffer Pool is full and a data page needs to be read from the disk, then an old data page must be eliminated.
InnoDB memory management
The Least Recently Used (LRU) algorithm is used to eliminate the longest unused data.
- The basic LRU algorithm
InnoDB manages the LRU algorithm of BP, which is implemented with a linked list:
- state1, the head of the linked list is P1, indicating that P1 is the data page that has just been accessed recently
- At this time, a read request accesses P3, so it becomes state 2, and P3 is moved to the front
- State 3 indicates that the data page accessed this time does not exist in the linked list, so a new data page Px needs to be applied for in BP and added to the linked list head. But because the memory is full, new memory cannot be requested. Then clear the Pm data page memory at the end of the linked list, store the content of Px, and put it at the head of the linked list
In the end, the data page Pm that has not been accessed for the longest time is eliminated.
What happens if a full table scan is to be performed at this time? If you want to scan a 200G table, and this table is a historical data table, no business usually visits it.
Then, scanning according to this algorithm will eliminate all the data in the current BP and store it in the content of the data page accessed during the scanning process. In other words, BP mainly puts the data of this historical data table.
For a library that is doing business services, this won't work. You will see that the BP memory hit rate drops sharply, the disk pressure increases, and the SQL statement response slows down.
Therefore, InnoDB cannot directly use this LRU. In fact, InnoDB has improved the LRU algorithm.
- The improved LRU algorithm
InnoDB divides the LRU linked list into the young area and the old area in a ratio of 5:3. In the figure, LRU_old points to the first position of the old area, which is 5/8 of the entire linked list. That is, 5/8 near the head of the linked list is the young area, and 3/8 near the end of the linked list is the old area.
The improved LRU algorithm execution process:
- State 1, to access P3, because P3 is in the young area, just like the LRU before optimization, move it to the head of the linked list=》State 2
- After a visit to the new list does not exist in the current data page, then still eliminate data pages Pm, but the newly inserted data page Px, is on LRU_old at
- The data page in the old area must be judged as follows every time it is accessed:
- If the data page exists in the LRU linked list for more than 1s, move it to the head of the linked list
- If the data page exists in the LRU linked list for less than 1s, the position remains unchanged. 1s is controlled by the parameter innodb_old_blocks_time , the default value is 1000, the unit is ms.
This strategy is tailored to handle operations similar to full table scans. Or scan the 200G historical data table:
4. During the scanning process, the newly inserted data pages are placed in the old area.
5. There are multiple records in a data page. This data page will be accessed multiple times, but because it is Sequential scanning, the time interval between the first access and the last access of this data page will not exceed 1 second, so it will still be kept in the old area
6. Continue to scan the subsequent data, and the previous data page will not be Will be visited again, so there is never a chance to move to the head of the linked list (that is, the young area), and will soon be eliminated.
It can be seen that the biggest benefit of this strategy is that in the process of scanning this large table, although BP is also used, it has no effect on the young area, thus ensuring the hit rate of Buffer Pool in response to normal business queries.
MySQL uses the logic of calculating and sending at the same time, so for query results with a large amount of data, the complete result set will not be saved on the server side. Therefore, if the client does not read the results in time, it will block the MySQL query process, but it will not burst the memory.
For the InnoDB engine, due to the elimination strategy, large queries will not cause memory to skyrocket. Moreover, because InnoDB has made improvements to the LRU algorithm, the full table scan of cold data can also control the impact on the Buffer Pool.
The full table scan still consumes IO resources, so it is still not possible to directly perform a full table scan on the main database online during peak business periods.