Linux性能优化从入门到实战：02 CPU篇：平均负载

Posted 2022-08-09 qccz123456

tags:

篇首语：本文由小常识网(cha138.com)小编为大家整理，主要介绍了Linux性能优化从入门到实战：02 CPU篇：平均负载相关的知识，希望对你有一定的参考价值。

每次发现系统变慢时，我们通常做的第一件事，就是执行 top 或 uptime 命令：

$ uptime
22:22:17 up 2 days, 20:14, 1 user, load average: 0.63, 0.83, 0.88
// 22:22:17 当前时间  up 2 days, 20:14 系统运行时间  1 user 正在登录用户数
// load average 过去 1 分钟、5 分钟、15 分钟的平均负载

??平均负载是指单位时间内，系统处于可运行状态和不可中断状态的平均进程数，也就是平均活跃进程数，它和 CPU 使用率并没有直接关系。
??可运行状态的进程，是指正在使用 CPU 或者正在等待 CPU 的进程，也就是我们常用 ps 命令看到的，处于 R 状态（Running 或 Runnable）的进程。
??不可中断状态的进程，是正处于内核态关键流程中的进程，并且这些流程是不可打断的，比如最常见的是等待硬件设备的 I/O 响应，也就是我们在 ps 命令中看到的 D 状态（Uninterruptible Sleep，也称为 Disk Sleep）的进程。不可中断状态实际上是系统对进程和硬件设备的一种保护机制。
??平均负载为多少时合理：在实际生产环境中，一般平均负载应该低于CPU数量的70%才是正常的，否则可能影响进程响应速度。但该值并不是绝对的，最好的方式是将系统的平均负载监控起来，通过更多的历史数据，判断负载的变化趋势。查看CPU个数：grep ‘model name‘ /proc/cpuinfo | wc -l，平均负载中有三个值，表示三个时间段，可以更全面的了解过去15分钟内的趋势变化。
??平均负载 VS CPU使用率：平均负载代表活跃进程数，平均负载高，不意味着CPU使用率高。平均负载是单位时间内，可运行状态和不可中断状态的进程数，包括了正在使用CPU的进程，还包括等到CPU和等待I/O的进程。而CPU使用率是单位时间内CPU繁忙情况的统计。例如：
????CPU密集型进程，大量使用CPU，导致CPU使用率和平均负载升高，是一致的；
????I/O密集型，等待I/O导致平均负载高，CPU使用率不一定高；
????等待CPU进程的调度，导致平均负载高，CPU使用率也高。
??
??CPU密集型进程例程

$ sudo apt install stress-ng sysstat
// stress-ng 系统压测工具    sysstat 性能监控和分析工具：mpstat 多核性能分析、pidstat进程性能分析
$ stress-ng --cpu 1 --timeout 600
// 模拟一个CPU使用率为100%

$ watch -d uptime
 11:25:06 up 13:04,  1 user,  load average: 0.87, 0.41, 0.18
 
$ mpstat -P ALL 5
// -P ALL 监控所有CPU，5 每5秒输出一组数据
11时24分50秒  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11时24分55秒  all   51.38    0.00    0.31    0.00    0.00    0.00    0.00    0.00    0.00   48.32
11时24分55秒    0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
11时24分55秒    1    1.04    0.00    0.62    0.00    0.00    0.00    0.00    0.00    0.00   98.34

$ pidstat -u 5 1
// 间隔5秒，总共输出1次数据
11时33分10秒   UID       PID    %usr %system  %guest    %CPU   CPU  Command
11时33分15秒  1000     32091   98.60    1.20    0.00   99.80     1  stress-ng-cpu

??
??I/O密集型进程例程

$ stress-ng -i 1 --timeout 600   或  stress-ng -i 1 --hdd 1 timeout 600
// 模拟I/O压力，即不停地执行sync

$ watch -d uptime
 11:41:17 up 13:20,  1 user,  load average: 0.99, 0.75, 0.44

$ mpstat -P ALL 5 1
11时41分59秒  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11时42分04秒  all    0.66    0.00    1.31   47.21    0.00    0.00    0.00    0.00    0.00   50.82
11时42分04秒    0    0.62    0.00    1.87   40.96    0.00    0.00    0.00    0.00    0.00   56.55
11时42分04秒    1    0.69    0.00    0.69   54.04    0.00    0.00    0.00    0.00    0.00   44.57

$ pidstat -u 5 1
// pidstat中没有%wait，需要更新成最新的版本，采用源码或者RPM安装。
// %wait 表示等待CPU的进程已经在CPU就绪队列中，处于运行态。
// 有别于 %iowait（wa），代表等待 I/O 的 CPU 时间，处于不可中断状态。
11时42分17秒   UID       PID    %usr %system  %guest    %CPU   CPU  Command
11时42分22秒  1000       315    0.00    4.00    0.00    4.00     1  stress-ng-iosyn

??
??大量进程的例程

$ stress-ng -c 8 --timeout 600
// 模拟8个进程

$ uptime
 11:53:18 up 13:32,  1 user,  load average: 8.02, 6.47, 3.46

$ mpstat -P ALL 5 1
11时48分59秒  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11时49分04秒  all   98.00    0.00    2.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
11时49分04秒    0   99.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
11时49分04秒    1   96.80    0.00    3.20    0.00    0.00    0.00    0.00    0.00    0.00    0.00

$ pidstat -u 5 1
11时49分03秒   UID       PID    %usr %system  %guest    %CPU   CPU  Command
11时49分08秒  1000       337   23.55    0.20    0.00   23.75     1  stress-ng-cpu
11时49分08秒  1000       338   24.55    0.00    0.00   24.55     0  stress-ng-cpu
11时49分08秒  1000       339   23.55    0.20    0.00   23.75     1  stress-ng-cpu
11时49分08秒  1000       340   23.35    0.20    0.00   23.55     0  stress-ng-cpu
11时49分08秒  1000       341   23.35    0.20    0.00   23.55     1  stress-ng-cpu
11时49分08秒  1000       342   24.55    0.00    0.00   24.55     0  stress-ng-cpu
11时49分08秒  1000       343   24.15    0.00    0.00   24.15     1  stress-ng-cpu
11时49分08秒  1000       344   24.55    0.20    0.00   24.75     0  stress-ng-cpu

??
??uptime、mpstat、pidstat
??
??
??
??
??

以上是关于Linux性能优化从入门到实战：02 CPU篇：平均负载的主要内容，如果未能解决你的问题，请参考以下文章