A first try at performance bottleneck analysis with perf

Posted: Linux学习之路

Documentation

https://perf.wiki.kernel.org/index.php/Tutorial#Live_analysis_with_perf_top

Kernel configuration

CONFIG_PERF_EVENTS=y
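
A quick way to check whether a given kernel configuration file enables this option might look like the sketch below. The helper name has_perf_events is made up for illustration, and the .config path in the example assumes you are in the kernel source tree being built.

```shell
# Return 0 if the given kernel config file enables perf events,
# non-zero otherwise.
# Usage: has_perf_events path/to/.config
has_perf_events() {
    grep -q '^CONFIG_PERF_EVENTS=y$' "$1"
}

# Example call (commented out): check the tree you are about to build.
# has_perf_events .config && echo "perf events enabled"
```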

Build

make -C tools/perf

Auto-detecting system features:
...                         dwarf: [ OFF ]
...            dwarf_getlocations: [ OFF ]
...                         glibc: [ on  ]
...                          gtk2: [ OFF ]
...                      libaudit: [ OFF ]
...                        libbfd: [ on  ]
...                        libelf: [ on  ]
...                       libnuma: [ OFF ]
...        numa_num_possible_cpus: [ OFF ]
...                       libperl: [ OFF ]
...                     libpython: [ OFF ]
...                      libslang: [ OFF ]
...                     libcrypto: [ OFF ]
...                     libunwind: [ on  ]
...            libdw-dwarf-unwind: [ OFF ]
...                          zlib: [ on  ]
...                          lzma: [ OFF ]
...                     get_cpuid: [ OFF ]
...                           bpf: [ on  ]

The target's root filesystem must include libunwind; otherwise perf cannot resolve function names and shows only raw addresses.
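
Before copying the library to the target, it may help to confirm that the build itself detected libunwind. The sketch below reads the feature table printed by the build and reports the state of one entry. The helper name feature_state and the file name build.log are assumptions for illustration, not part of perf.

```shell
# Print the detected state ("on" or "OFF") of one feature, reading the
# "Auto-detecting system features" table from stdin.
# Usage: feature_state libunwind < build.log
feature_state() {
    awk -v name="$1" '$1 == "..." && $2 == (name ":") { print $4 }'
}

# Example (commented out): save the build output, then query it.
# make -C tools/perf 2>&1 | tee build.log
# feature_state libunwind < build.log
```

For the table shown above, this prints on for libunwind and OFF for dwarf.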

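Example

An lvgl application's UI stutters. A quick pass with perf top shows which functions consume the most CPU time, pointing to where the application should be optimized.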

perf top

PerfTop:     630 irqs/sec  kernel:90.8%  exact:  0.0% [4000Hz cycles:ppp],  (all, 4 CPUs)
-------------------------------------------------------------------------------

    19.15%  lvgl_launcher  [.] lodepng_inflatev.isra.11
    12.19%  libc-2.23.so   [.] memcpy
     7.77%  lvgl_launcher  [.] fbdev_flush
     6.41%  [kernel]       [k] cpuidle_enter_state
     3.74%  [kernel]       [k] _raw_spin_unlock_irqrestore
     3.73%  lvgl_launcher  [.] _lv_blend_map
     3.49%  lvgl_launcher  [.] lodepng_zlib_decompress
     3.46%  lvgl_launcher  [.] decodeGeneric
     3.45%  lvgl_launcher  [.] unfilter
     3.38%  [kernel]       [k] __slab_alloc.isra.19.constprop.23
     2.77%  lvgl_launcher  [.] _lv_blend_fill
     2.54%  lvgl_launcher  [.] lv_refr_vdb_flush
     2.37%  [kernel]       [k] _raw_spin_unlock_irq
     1.86%  lvgl_launcher  [.] lv_draw_label
     1.85%  lvgl_launcher  [.] lv_draw_map
     1.35%  lvgl_launcher  [.] decoder_open
     1.28%  lvgl_launcher  [.] fread@plt
     1.08%  lvgl_launcher  [.] lv_color_fill
     1.00%  [kernel]       [k] __softirqentry_text_start
     0.92%  [kernel]       [k] tick_nohz_idle_enter

perf top -p PID profiles the lvgl_launcher process alone

   PerfTop:    3972 irqs/sec  kernel: 3.8%  exact:  0.0% [4000Hz cycles:ppp],  (target_pid: 2766)
-------------------------------------------------------------------------------

    33.50%  lvgl_launcher  [.] lodepng_inflatev.isra.11
    12.89%  libc-2.23.so   [.] memcpy
     6.92%  lvgl_launcher  [.] fbdev_flush
     6.59%  lvgl_launcher  [.] _lv_blend_map
     6.12%  lvgl_launcher  [.] lodepng_zlib_decompress
     6.12%  lvgl_launcher  [.] unfilter
     6.05%  lvgl_launcher  [.] decodeGeneric
     3.15%  lvgl_launcher  [.] lv_draw_map
     2.81%  lvgl_launcher  [.] _lv_blend_fill
     2.33%  lvgl_launcher  [.] lv_refr_vdb_flush
     2.32%  lvgl_launcher  [.] decoder_open
     1.90%  lvgl_launcher  [.] lv_draw_label
     1.30%  lvgl_launcher  [.] fread@plt
     1.27%  lvgl_launcher  [.] lv_color_fill
     0.68%  lvgl_launcher  [.] HuffmanTree_makeFromLengths2
     0.68%  [kernel]       [k] __slab_alloc.isra.19.constprop.23
     0.42%  [kernel]       [k] _raw_spin_unlock_irqrestore
     0.28%  [kernel]       [k] __arch_copy_to_user
     0.28%  [kernel]       [k] __softirqentry_text_start
     0.19%  [kernel]       [k] clear_page
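
Individual rows understate how much time one subsystem takes, so it can be useful to add up related symbols. The sketch below sums the overhead column for every row whose symbol matches a pattern. The helper name sum_overhead and the file top.txt are hypothetical, and treating lodepng_*, decodeGeneric, unfilter and HuffmanTree_* as one "PNG decoding" group is an assumption based on the symbol names.

```shell
# Sum the overhead column of perf top text output (read from stdin) for
# every row whose symbol name matches the extended regex given as $1.
sum_overhead() {
    awk -v pat="$1" '
        # Data rows start with a percentage; the symbol is the 4th field.
        # A field like "33.50%" is converted to the number 33.5 when added.
        $1 ~ /^[0-9.]+%$/ && $4 ~ pat { total += $1 }
        END { printf "%.2f\n", total }
    '
}

# Example (commented out): paste a listing into top.txt, then
# sum_overhead "lodepng|decodeGeneric|unfilter|HuffmanTree" < top.txt
```

Applied to the per-process listing above, that pattern adds up to roughly 52% of the samples.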

perf top -g -p PID prints the call stacks

    99.92%     0.00%  lvgl_launcher  [.] _start
            |
            ---_start
               __libc_start_main
               |          
                --99.91%--main
                          |          
                           --99.82%--lv_task_handler
                                     |          
                                      --99.64%--_lv_disp_refr_task
                                                |          
                                                 --99.62%--lv_refr_area_part
                                                           |          
                                                           |--71.73%--lv_refr_obj_and_children
                                                           |          lv_refr_obj.part.3
                                                           |          |          
                                                           |          |--70.50%--lv_refr_obj.part.3
                                                           |          |          |          
                                                           |          |          |--58.45%--lv_refr_obj.part.3
                                                           |          |          |          |          
                                                           |          |          |          |--40.61%--lv_img_design
                                                           |          |          |          |          |          
                                                           |          |          |          |           --40.53%--lv_draw_img
                                                           |          |          |          |                     |          
                                                           |          |          |          |                     |--36.02%--_lv_img_cache_open
                                                           |          |          |          |                     |          |          
                                                           |          |          |          |                     |           --35.86%--lv_img_decoder_open
                                                           |          |          |          |                     |                     |          
                                                           |          |          |          |                     |                      --35.56%--decoder_open
                                                           |          |          |          |                     |                                |          
                                                           |          |          |          |                     |                                 --33.89%--lodepng_decode_memory
                                                           |          |          |          |                     |                                           |          
                                                           |          |          |          |                     |                                            --33.75%--decodeGeneric
                                                           |          |          |          |                     |                                                      |          
                                                           |          |          |          |                     |                                                      |--27.36%--lodepng_zlib_decompress
                                                           |          |          |          |                     |                                                      |          |          
                                                           |          |          |          |                     |                                                      |          |--22.28%--lodepng_inflatev.isra.11
                                                           |          |          |          |                     |                                                      |          |          
                                                           |          |          |          |                     |                                                      |           --1.85%--memcpy
                                                           |          |          |          |                     |                                                      |          
                                                           |          |          |          |                     |                                                       --3.19%--postProcessScanlines
                                                           |          |          |          |                     |                                                                 unfilter
                                                           |          |          |          |                     |          
                                                           |          |          |          |                      --4.49%--lv_draw_map
                                                           |          |          |          |                                |          
                                                           |          |          |          |                                 --2.75%--_lv_blend_map
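
The live view disappears once perf top exits. To keep the same call-stack data for later inspection, one option is to record it with perf record and read it back with perf report. This is a sketch rather than what the original measurement used: the function names are made up, and the PERF variable is an assumption that lets you point at a cross-built binary such as the one under tools/perf.

```shell
# Record call graphs of one process for a fixed number of seconds.
# Usage: record_callgraph PID SECONDS
record_callgraph() {
    "${PERF:-perf}" record -g -p "$1" -o perf.data -- sleep "$2"
}

# Print the recorded call graph as plain text.
show_callgraph() {
    "${PERF:-perf}" report -i perf.data --stdio
}

# Example (commented out): sample PID 2766 for 10 seconds, then view it.
# record_callgraph 2766 10 && show_callgraph
```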
