Performance analysis: why are the CPU percentages reported by ps and top inconsistent on Linux?

I. Introduction

Today someone in the 7DGroup group asked a question: why is the CPU percentage calculated by ps much lower than the one calculated by top?

II. Problem phenomenon

As shown below:

[Figures: screenshots of the top and ps output]

Adding up the values in the screenshots above shows that the difference really is large.

In top:

800 - 16.9 - 7.6 - 22.1 - 29.9 - 8.8 - 24.4 - 16.9 - 20.3 = 653.1

ps, by contrast, reports less than 300% in total.

III. Problem analysis

Why would this happen?

We must first understand the difference between ps and top:

top is a monitoring tool, whereas ps is a snapshot tool. This is the essential difference.

ps reads its data from the /proc/<pid> directory once, at the moment it runs. top fetches data continuously and recalculates on every refresh cycle.
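
To make the difference concrete, here is a minimal Python sketch. It assumes that a monitoring tool derives its percentage from the change in a process's CPU ticks between two refreshes, while a snapshot can only divide the total CPU time by the total lifetime. The function names and sample numbers are hypothetical, and the tick rate of 100 per second is an assumption (on a real system it comes from `os.sysconf("SC_CLK_TCK")`).

```python
CLK_TCK = 100  # assumed clock ticks per second; query os.sysconf("SC_CLK_TCK") on a real system


def interval_percent(ticks_before, ticks_after, interval_seconds, clk_tck=CLK_TCK):
    """Monitoring view: CPU used during one refresh interval."""
    cpu_seconds = (ticks_after - ticks_before) / clk_tck
    return cpu_seconds / interval_seconds * 100


def lifetime_percent(total_ticks, lifetime_seconds, clk_tck=CLK_TCK):
    """Snapshot view: CPU used, averaged over the whole lifetime."""
    return (total_ticks / clk_tck) / lifetime_seconds * 100


# Hypothetical process: alive for 1000 s, mostly idle, but fully busy
# during the last 3-second refresh interval.
busy_now = interval_percent(ticks_before=700, ticks_after=1000, interval_seconds=3)
on_average = lifetime_percent(total_ticks=1000, lifetime_seconds=1000)
print(f"interval: {busy_now:.1f}%  lifetime average: {on_average:.1f}%")
```

With these made-up numbers the interval view shows a fully busy process while the lifetime average stays near 1%, which is the kind of gap described in the question.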

How does ps calculate the CPU?

There are several parameters:

  • System uptime: the total length of time since the system was started.
  • Thread start time: the point in time, measured from system startup, at which the thread started.
  • Thread CPU time: the length of time the thread has used the CPU.
  • Thread time = system uptime - thread start time
  • Thread CPU usage = thread CPU time * 1000 / thread time
  • Displayed percentage of CPU usage = thread CPU usage / 10 (integer part) and thread CPU usage % 10 (first decimal place)

For example:

  • System uptime: 15456374.085712
  • Thread start time: 9470058.848042
  • Thread CPU time: 987163
  • Thread time = 15456374.085712 - 9470058.848042 = 5986315.23767
  • Thread CPU usage = 987163 * 1000 / 5986315.23767 = 164.9
  • Displayed percentage of CPU usage = 164.9 / 10 ≈ 16.5
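
The steps listed above can be reproduced with a short Python sketch. It uses floating-point arithmetic and takes the three inputs exactly as given in the example. The function name is hypothetical, and the real ps may well use integer arithmetic, so its last digit could differ.

```python
def ps_style_percent(uptime, start_time, cpu_time):
    """Follow the listed steps in floating point (a sketch, not the ps source)."""
    thread_time = uptime - start_time          # how long the thread has existed
    usage = cpu_time * 1000 / thread_time      # usage in tenths of a percent
    return usage, round(usage / 10, 1)         # raw value and displayed percentage


usage, percent = ps_style_percent(
    uptime=15456374.085712,
    start_time=9470058.848042,
    cpu_time=987163,
)
print(f"usage = {usage:.1f}, displayed = {percent}%")
```

The key point is the denominator: it is the whole time since the thread started, not a recent sampling window.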

ps takes the data for this calculation from the /proc/ directory. It reads the following times, measured in clock ticks, from /proc/<pid>/stat:

  • utime: CPU time consumed in user mode
  • stime: CPU time consumed in kernel mode
  • cutime: CPU time consumed in user mode by waited-for child processes
  • cstime: CPU time consumed in kernel mode by waited-for child processes
  • starttime: the point in time, after system boot, at which the thread started
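
As an illustration, the following Python sketch pulls these five fields out of one line of /proc/<pid>/stat. The field positions (14, 15, 16, 17 and 22) follow the proc(5) manual page. The function name and the sample line are hypothetical, and the sample is shortened to its first 22 fields.

```python
def parse_stat_times(stat_line):
    """Extract the CPU-time fields from one line of /proc/<pid>/stat."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so cut after the last ')' instead of splitting the whole line.
    rest = stat_line[stat_line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    fields = {"utime": 14, "stime": 15, "cutime": 16, "cstime": 17, "starttime": 22}
    return {name: int(rest[number - 3]) for name, number in fields.items()}


# Hypothetical stat line with made-up values.
sample = "2287 (my app) S 1 2287 2287 0 -1 4194560 500 0 0 0 120 30 5 2 20 0 1 0 947005884"
times = parse_stat_times(sample)
print(times)
```

On a Linux system the input line could be read with `open(f"/proc/{pid}/stat").read()`. All values are in clock ticks, so they need to be divided by the tick rate to obtain seconds.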

CPU time consumption is counted in CPU time slices (clock ticks), and these are the basis of the calculation. The CPU frequency, by contrast, is the number of cycles the CPU executes per second. Note that the same time slice does not represent the same amount of work for every workload; single-precision and double-precision floating-point operations, for example, differ in cost.

Below, we calculate only from the values that ps returns.

If you want to use ps to calculate the CPU usage over a certain period of time, as top does, you can do it as follows.

# ps -p 2287 -o %cpu,cputime,etime,etimes
%CPU     TIME     ELAPSED ELAPSED
 0.3 00:00:00       01:03      63

There are two parameters, etime and etimes (the headers of these two parameters are both ELAPSED). The two parameters are:

etime: The time elapsed since the thread was started, in the format [[DD-]hh:]mm:ss.

etimes: The duration of the thread since it was started, in seconds.
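
If only the etime column is available, it can be converted to seconds by hand. The helper below is a small Python sketch for the [[DD-]hh:]mm:ss format; the function name is hypothetical.

```python
def etime_to_seconds(etime):
    """Convert an ELAPSED string of the form [[DD-]hh:]mm:ss to seconds."""
    etime = etime.strip()
    days = 0
    if "-" in etime:                       # optional day prefix, e.g. "1-02:03:04"
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:                  # pad the missing hours with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


print(etime_to_seconds("01:03"))
print(etime_to_seconds("04:30"))
print(etime_to_seconds("1-02:03:04"))
```

For the ps output shown in this article, the result should match the etimes column.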

The output above is the first sample taken for process 2287.

To calculate how much CPU was consumed over a period of time, you need to fetch the data again later:

# ps -p 2287 -o %cpu,cputime,etime,etimes
%CPU     TIME     ELAPSED ELAPSED
 0.3 00:00:01       04:30     270

Calculate the time:

For the first sample:

The CPU time consumed is 0 seconds: 00*3600 + 00*60 + 00 = 0.

The elapsed time is 63 seconds: 1*60 + 3 = 63.

For the second sample:

The CPU time consumed is 1 second: 00*3600 + 00*60 + 01 = 1.

The elapsed time is 270 seconds: 4*60 + 30 = 270.

The CPU usage calculation is:

((1 - 0) / (270 - 63)) * 100 ≈ 0.48

So this process used about 0.48% of the CPU during this interval (0.4% when truncated to one decimal place).
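
The same arithmetic can be written as a small Python sketch that takes two ps samples, each a TIME string plus the etimes value. The function names are hypothetical, and the TIME parser assumes the plain hh:mm:ss form without a day prefix.

```python
def hms_to_seconds(text):
    """Convert an hh:mm:ss string such as the TIME column to seconds."""
    hours, minutes, seconds = (int(p) for p in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_percent_between(sample_a, sample_b):
    """Each sample is a (TIME string, elapsed seconds) pair from ps."""
    cpu_a, elapsed_a = hms_to_seconds(sample_a[0]), sample_a[1]
    cpu_b, elapsed_b = hms_to_seconds(sample_b[0]), sample_b[1]
    # CPU seconds consumed between the samples, divided by the wall-clock gap.
    return (cpu_b - cpu_a) / (elapsed_b - elapsed_a) * 100


first = ("00:00:00", 63)
second = ("00:00:01", 270)
percent = cpu_percent_between(first, second)
print(f"{percent:.2f}%")
```

Because TIME has a resolution of one second, short intervals give only a rough figure; a longer gap between the two samples makes the estimate more meaningful.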

Those who are interested can also take a look at the source code of ps:

[Figure: screenshot of the ps source code]

If you have doubts about the results top calculates, you can also download the source code of top and read it.

IV. Summary

With all of this logic in mind, should we look at ps or top when checking the resource utilization of a system?

The relationship between ps and top is as follows:

[Figure: relationship between ps and top]

top reports resource statistics over a time interval (its refresh cycle). ps reports the resource value at a single point in time.

If you want to monitor the overall resource usage of a system, it is recommended to use top to view it.

If you want to analyze the CPU resources used by a specific thread, use ps.