System resource saturation for performance analysis

I. Introduction

When doing performance analysis, we inevitably have to judge whether resources are sufficient. Which resource is falling short? Why is it falling short? What is the evidence?

It is not easy to answer these questions.

Today, let's talk about how to measure operating system resource saturation.

Kubernetes is prevalent now, so here we will borrow Prometheus + Grafana deployed in k8s to look at the graphs visually.

II. CPU resources

First, let's look at a graph:

[Figure: Grafana panels showing CPU usage rate and CPU saturation]

One panel shows the CPU usage rate, and the other shows the CPU saturation.

How is saturation calculated?

Let's see what its Query looks like:

node:node_cpu_saturation_load1:{cluster="$cluster"} / scalar(sum(min(kube_pod_info{cluster="$cluster"}) by (node)))

That is, it is calculated from node:node_cpu_saturation_load1:. So what data is node:node_cpu_saturation_load1: itself based on? Here is its source, a Prometheus recording rule:

- expr: |
        sum by (node) (
          node_load1{job="node-exporter"}
        * on (namespace, pod) group_left(node)
          node_namespace_pod:kube_pod_info:
        )
        /
        node:node_num_cpu:sum
      record: 'node:node_cpu_saturation_load1:'

node_load1 is the 1-minute load average. node_exporter also exposes node_load5 and node_load15 at the same time, which correspond to the 1-minute, 5-minute, and 15-minute load averages we commonly see in Linux.
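
For reference, these three gauges can be queried directly in Prometheus. A minimal sketch, assuming the default node-exporter job label used elsewhere in this article:

# 1-, 5- and 15-minute load averages exposed by node-exporter's loadavg collector
node_load1{job="node-exporter"}
node_load5{job="node-exporter"}
node_load15{job="node-exporter"}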

Where does this load average come from? I described it in a previous article, where I also explained that this value is used to judge the limits of system load, so it will not be expanded on here.

Knowing the source of this CPU saturation, let's look at the figure above again. In other words, when judging whether the CPU is sufficient, we should look not only at the CPU usage rate but also at the CPU saturation.
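
To make that comparison concrete, here is a minimal sketch of two node-level queries that could sit side by side, assuming only the standard node-exporter metrics node_cpu_seconds_total and node_load1; this is an illustration, not the exact dashboard definition:

# CPU usage rate per node: 1 minus the idle fraction, averaged over all cores
1 - avg by (instance) (irate(node_cpu_seconds_total{job="node-exporter", mode="idle"}[1m]))

# CPU saturation per node: 1-minute load average divided by the number of cores
node_load1{job="node-exporter"}
  / on (instance)
    count by (instance) (node_cpu_seconds_total{job="node-exporter", mode="idle"})

Plotted together, a node with only moderate usage but a saturation value well above 1 is already queueing for CPU even though it does not look busy.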

III. Memory resources

As shown in the figure below:

[Figure: Grafana panels showing memory usage and memory saturation (Swap IO)]

The memory saturation here is labelled Swap IO. Anyone who knows memory knows that the word swap has a specific meaning: the swap partition. But anyone who has configured k8s also knows that swap is turned off there. So what does swap mean here?

Let's look at its Query statement again:

node:node_memory_swap_io_bytes:sum_rate{cluster="$cluster"}

It takes the value of node:node_memory_swap_io_bytes:sum_rate. Where does Prometheus get this value from?

- expr: |    
        1e3 * sum(    
          (rate(node_vmstat_pgpgin{job="node-exporter"}[1m])    
         + rate(node_vmstat_pgpgout{job="node-exporter"}[1m]))    
        )    
      record: :node_memory_swap_io_bytes:sum_rate

It takes the page in/out counters from vmstat. Now it is clear that swap here does not specifically refer to the swap partition: it is page swapping. It does not matter whether there is a swap partition; page swapping still has to be done.

Page swapping does not necessarily mean that memory is used up. A page-in means that the page about to be used cannot be found in memory and has to be read in; and when the memory defined for a variable or object in the code is not enough, pages get written out (page-out).

So when judging whether memory is enough, we look not only at whether memory is used up, but also at whether page in/out is being generated.
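
If you want to tell ordinary page I/O apart from real swap-partition activity, a minimal sketch follows; it assumes the default node-exporter vmstat collector, which also exposes node_vmstat_pswpin and node_vmstat_pswpout (these names are node-exporter defaults, not part of the dashboard above):

# Ordinary page I/O, the dashboard's "Swap IO": pgpgin/pgpgout are counted in KB,
# so multiplying by 1e3 gives bytes per second, as in the recording rule above
1e3 * (rate(node_vmstat_pgpgin{job="node-exporter"}[1m])
     + rate(node_vmstat_pgpgout{job="node-exporter"}[1m]))

# Pages that actually go through the swap partition; this stays at 0 when swap is off
rate(node_vmstat_pswpin{job="node-exporter"}[1m])
  + rate(node_vmstat_pswpout{job="node-exporter"}[1m])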

IV. Disk resources

[Figure: Grafana panels showing disk utilization and disk I/O saturation]

The I/O saturation of the disk is relatively easy to judge. Let's take a look at how it is calculated, again starting from its Query:

node:node_disk_saturation:avg_irate{cluster="$cluster"} / scalar(:kube_pod_info_node_count:{cluster="$cluster"})

It takes node:node_disk_saturation:avg_irate. Let's take a look at the source of this value.

- expr: |    
        avg by (node) (    
          irate(node_disk_io_time_weighted_seconds_total{job="node-exporter",device=~"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+"}[1m])    
        * on (namespace, pod) group_left(node)    
          node_namespace_pod:kube_pod_info:    
        )    
      record: node:node_disk_saturation:avg_irate

It takes node_disk_io_time_weighted_seconds_total, which is a weighted cumulative value. This value corresponds to avgqu-sz in iostat, which is the length of the I/O queue.
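
As a rough per-device sketch, the same weighted counter gives the queue length, and the unweighted node_disk_io_time_seconds_total (also a standard node-exporter metric, not part of the rule above) gives the familiar %util next to it; the device regex here is only an example:

# Average I/O queue length per device, i.e. avgqu-sz: rate of the weighted busy-time counter
irate(node_disk_io_time_weighted_seconds_total{job="node-exporter", device=~"sd.+|nvme.+"}[1m])

# Fraction of time each device was busy, i.e. %util as a 0-1 ratio
irate(node_disk_io_time_seconds_total{job="node-exporter", device=~"sd.+|nvme.+"}[1m])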

So now you know the source of this saturation value.

V. Network resources

[Figure: Grafana panels showing network utilization and network saturation (dropped packets)]

For judging network resources, a very direct word is used here: dropped. The intuitive understanding is packet loss. Look at its Query statement:

node:node_net_saturation:sum_irate{cluster="$cluster"}

The value of node:node_net_saturation:sum_irate is used here, and the name alone is not intuitive enough to tell what it actually measures.

Let's take a look at its source:

- expr: |    
        sum by (node) (    
          (irate(node_network_receive_drop_total{job="node-exporter",device!~"veth.+"}[1m]) +    
          irate(node_network_transmit_drop_total{job="node-exporter",device!~"veth.+"}[1m]))    
        * on (namespace, pod) group_left(node)    
          node_namespace_pod:kube_pod_info:    
        )    
      record: node:node_net_saturation:sum_irate  

From this rule it is clear that the value comes from the receive and transmit drop counters. A nicely intuitive name.

In fact, judging a network bottleneck is not only about whether packets are dropped; it also depends on the queues. Once packets are being dropped, the network quality is already poor.
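
As a complementary sketch, throughput can be compared against the link speed to see how full an interface is before drops ever appear. This assumes node-exporter's netclass metric node_network_speed_bytes and is only an illustration, not something taken from the dashboard above:

# Receive + transmit throughput as a fraction of each interface's advertised link speed
( irate(node_network_receive_bytes_total{job="node-exporter", device!~"veth.+|lo"}[1m])
+ irate(node_network_transmit_bytes_total{job="node-exporter", device!~"veth.+|lo"}[1m]) )
/ node_network_speed_bytes{job="node-exporter", device!~"veth.+|lo"}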

VI. Summary

In fact, no matter what tool we use to look at performance data, we need to know its source and the meaning of its value so that we can judge it accurately.

On many occasions, whether in projects or in talks, I have emphasized that in performance analysis we must first understand the meaning of each piece of data, and only then look at how it is displayed in the chosen tool.

Some people think that having a monitoring platform or APM in place already tells them where the bottleneck is. That idea is definitely problematic.

Laying a good foundation is still the focus of learning.