Android ANR分析-笔记

Posted 且听真言

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Android ANR分析-笔记相关的知识,希望对你有一定的参考价值。

1.什么是ANR

ANR:Application Not Responding,即应用无响应

2.ANR的类型

1.KeyDispatchTimeout(5 seconds)。按键或触摸事件在5s内无响应。

2.BroadcastTimeout(10 seconds)。BroadcastReceiver在10s内无法完成操作。

3.ServiceTimeout(20 seconds) 。Service在20s内无法处理完成任务。

3. 产生ANR原因

         主线程耗时操做函数

         主线程被其它线程Block

         系统级响应阻塞

         系统或进程自身可用内存紧张

         CPU资源被抢占

4.ANR 日志分析

_exp_main.text

Backtrace: 
Process: com.xxxx.xxxx //包名
PID: 8541
UID: 10166
Flags: 0x28d83ec5
Package: com.xxxx.xxxx v2181 (V5.9.4.1) //包名,版本号
Foreground: Yes
Process-Runtime: 13711627
Activity: com.xxxx.xxxx/.repair.MainActivity  //发生ANR时,应用所在的Activity
Subject: Input dispatching timed out (6f8f132 com.xxxx.xxxx/comxxxx.xxxx.repair.MainActivity (server) is not responding. Waited 5000ms for KeyEvent) //ANR 类型
Build: TECNO/xxxx/xxxxx/xxxxx/xxx:user/release-keys

在_exp_main.text最上方的一段Log描述:应用进程、pid、uid、应用包名、版本号、ANR类型(触摸事件在5s内无响应)。

CPU usage from 246337ms to 0ms ago (2022-09-26 23:52:10.872 to 2022-09-26 23:56:17.209):
  15% 676/surfaceflinger: 9% user + 6.3% kernel / faults: 3411 minor
  13% 1308/system_server: 8% user + 5% kernel / faults: 31673 minor
  9.3% 8541/com.transsion.carlcare: 6.6% user + 2.6% kernel / faults: 136116 minor
  5.7% 1764/com.android.systemui: 3.4% user + 2.3% kernel / faults: 8443 minor
  5.7% 627/android.hardware.graphics.composer@2.1-service: 2.4% user + 3.2% kernel / faults: 8238 minor
  2.3% 2420/com.google.android.inputmethod.latin: 1.6% user + 0.6% kernel / faults: 30125 minor
  1.9% 943/vendor.mediatek.hardware.pq@2.2-service: 1% user + 0.8% kernel
  1.6% 4093/com.transsion.smartpanel: 1.2% user + 0.3% kernel / faults: 7857 minor
  1.5% 453/logd: 0.5% user + 1% kernel / faults: 18 minor
  1.4% 891/mobile_log_d: 0.3% user + 1% kernel / faults: 3578 minor
  1% 3033/com.google.android.gms: 0.7% user + 0.3% kernel / faults: 6351 minor
  0.9% 644/vendor.mediatek.hardware.mtkpower@1.0-service: 0.2% user + 0.7% kernel
  0.7% 6365/com.transsion.hilauncher:BranchSearch: 0.4% user + 0.2% kernel / faults: 995 minor
  0.6% 5541/kworker/u17:3: 0% user + 0.6% kernel
  0.6% 7858/com.hoffnung.mobile.service: 0.3% user + 0.2% kernel / faults: 4051 minor
  0.6% 797/adbd: 0.1% user + 0.5% kernel / faults: 14573 minor
  0.6% 10976/kworker/u16:0: 0% user + 0.6% kernel / faults: 274 minor
  0.5% 2585/com.google.android.gms.persistent: 0.3% user + 0.1% kernel / faults: 2747 minor
  0.4% 15641/kworker/u16:1: 0% user + 0.4% kernel / faults: 274 minor
  0.4% 200/pbm: 0% user + 0.4% kernel
  0.4% 23315/com.android.commands.monkey: 0.2% user + 0.1% kernel / faults: 444 minor
  0.3% 26534/kworker/u16:2: 0% user + 0.3% kernel / faults: 227 minor
  0.3% 25937/kworker/u17:0: 0% user + 0.3% kernel
  0.3% 1877/com.android.networkstack.process: 0.3% user + 0% kernel / faults: 4200 minor
  0.3% 26562/kworker/u16:3: 0% user + 0.3% kernel / faults: 172 minor
  0.3% 5467/kworker/u16:6: 0% user + 0.3% kernel / faults: 175 minor
  0.2% 327/ipi_cpu_dvfs_rt: 0% user + 0.2% kernel
  0.2% 8/rcu_preempt: 0% user + 0.2% kernel
  0.2% 1745/hif_thread: 0% user + 0.2% kernel
  0.2% 1743/main_thread: 0% user + 0.2% kernel
  0.2% 18676/com.transsion.carlcare:ccWebview: 0.1% user + 0% kernel / faults: 3164 minor
  0.2% 27300/kworker/u17:2: 0% user + 0.2% kernel
  0.1% 27240/kworker/0:2: 0% user + 0.1% kernel
  0.1% 611/statsd: 0% user + 0% kernel / faults: 339 minor
  0.1% 134/ion_mm_heap_for: 0% user + 0.1% kernel
  0.1% 455/servicemanager: 0% user + 0% kernel
  0.1% 19980/kworker/1:1: 0% user + 0.1% kernel
  0.1% 298/irq/39-5-0034: 0% user + 0.1% kernel
  0.1% 269/disp_idlemgr: 0% user + 0.1% kernel
  0.1% 268/present_fence_w: 0% user + 0.1% kernel
  0.1% 660/audioserver: 0% user + 0% kernel
  0.1% 2012/com.transsion.hilauncher: 0% user + 0% kernel / faults: 587 minor
  0.1% 27670/kworker/4:1: 0% user + 0.1% kernel
  0.1% 1050/emdlogger1: 0% user + 0% kernel / faults: 44 minor
  0.1% 5125/kworker/u16:5: 0% user + 0.1% kernel / faults: 74 minor
  0.1% 23352/kworker/6:0: 0% user + 0.1% kernel
  0.1% 266/disp_delay_trig: 0% user + 0.1% kernel
  0.1% 626/android.hardware.graphics.allocator@4.0-service-mediatek: 0% user + 0.1% kernel / faults: 4544 minor
  0.1% 25345/kworker/2:1: 0% user + 0.1% kernel
  0% 25427/kworker/3:0: 0% user + 0% kernel
  0% 906/android.hardware.sensors@2.0-service-mediatek: 0% user + 0% kernel
  0% 629/android.hardware.memtrack@1.0-service: 0% user + 0% kernel
  0% 352/charger_thread: 0% user + 0% kernel
  0% 1980/com.android.phone: 0% user + 0% kernel / faults: 794 minor
  0% 1746/rx_thread: 0% user + 0% kernel
  0% 31074/kworker/5:0: 0% user + 0% kernel
  0% 454/lmkd: 0% user + 0% kernel
  0% 1198/mtkfusionrild: 0% user + 0% kernel
  0% 216/chre_kthread: 0% user + 0% kernel
  0% 2516/iomanager: 0% user + 0% kernel
  0% 284/resource_monito: 0% user + 0% kernel
  0% 9/rcu_sched: 0% user + 0% kernel
  0% 5699/com.google.android.gms.unstable: 0% user + 0% kernel / faults: 12 minor
  0% 23267/kworker/7:0: 0% user + 0% kernel
  0% 129/dlpt_notify_thr: 0% user + 0% kernel
  0% 6548/com.transsion.plat.appupdate: 0% user + 0% kernel / faults: 592 minor
  0% 859/connsyslogger: 0% user + 0% kernel
  0% 899/wificond: 0% user + 0% kernel
  0% 11/migration/0: 0% user + 0% kernel
  0% 201/btif_rxd: 0% user + 0% kernel
  0% 261/disp_check: 0% user + 0% kernel
  0% 509/android.system.suspend@1.0-service: 0% user + 0% kernel
  0% 612/netd: 0% user + 0% kernel / faults: 40 minor
  0% 657/android.hardware.vibrator-service.mediatek: 0% user + 0% kernel
  0% 854/mtk_wmtd: 0% user + 0% kernel
  0% 914/thermal: 0% user + 0% kernel
  0% 14/migration/1: 0% user + 0% kernel
  0% 19/migration/2: 0% user + 0% kernel
  0% 136/ion_history: 0% user + 0% kernel
  0% 422/ueventd: 0% user + 0% kernel
  0% 684/fuelgauged: 0% user + 0% kernel
  0% 1051/gsm0710muxd: 0% user + 0% kernel
  0% 1739/android.hardware.wifi@1.0-service-lazy: 0% user + 0% kernel
  0% 8455/trace: 0% user + 0% kernel
  0% 7/ksoftirqd/0: 0% user + 0% kernel
  0% 15/ksoftirqd/1: 0% user + 0% kernel
  0% 24/migration/3: 0% user + 0% kernel
  0% 40/ksoftirqd/6: 0% user + 0% kernel
  0% 160/kswapd0: 0% user + 0% kernel
  0% 312/battery_thread: 0% user + 0% kernel
  0% 364/kworker/6:1H: 0% user + 0% kernel
  0% 852/mtk_stp_psm: 0% user + 0% kernel
  0% 880/media.metrics: 0% user + 0% kernel
  0% 2715/com.google.android.providers.media.module: 0% user + 0% kernel / faults: 6 minor
  0% 8896/android.process.acore: 0% user + 0% kernel / faults: 9 minor
  0% 20/ksoftirqd/2: 0% user + 0% kernel
  0% 25/ksoftirqd/3: 0% user + 0% kernel
  0% 29/migration/4: 0% user + 0% kernel
  0% 30/ksoftirqd/4: 0% user + 0% kernel
  0% 34/migration/5: 0% user + 0% kernel
  0% 45/ksoftirqd/7: 0% user + 0% kernel
  0% 222/ccci_poll1: 0% user + 0% kernel
  0% 254/teei_switch_thr: 0% user + 0% kernel
  0% 276/krtatm: 0% user + 0% kernel
  0% 357/wdtk-3: 0% user + 0% kernel
  0% 363/kworker/7:1H: 0% user + 0% kernel
  0% 367/kworker/1:1H: 0% user + 0% kernel
  0% 368/kworker/0:1H: 0% user + 0% kernel
  0% 416/kworker/3:1H: 0% user + 0% kernel
  0% 456/hwservicemanager: 0% user + 0% kernel
  0% 628/android.hardware.health@2.1-service: 0% user + 0% kernel
  0% 800/traced_probes: 0% user + 0% kernel
  0% 802/tne: 0% user + 0% kernel / faults: 48 minor
  0% 1812/com.transsnet.store: 0% user + 0% kernel / faults: 3 minor
  0% 2330/com.transsion.trancare:subservice: 0% user + 0% kernel / faults: 6 minor
  0% 2517/pnpmgr: 0% user + 0% kernel
  0% 2519/tranlog: 0% user + 0% kernel
  0% 2693/com.google.android.googlequicksearchbox:interactor: 0% user + 0% kernel / faults: 3 minor
  0% 2818/com.transsion.camera: 0% user + 0% kernel
 +0% 27811/kworker/6:2: 0% user + 0% kernel
 +0% 27970/kworker/2:2: 0% user + 0% kernel
 +0% 28296/kworker/0:1: 0% user + 0% kernel
8.8% TOTAL: 4.8% user + 3.9% kernel + 0% iowait + 0% softirq

 从上面的log可以发现:

第一行表明负载信息抓取在ANR发生之后的0~246337ms。同时也指明了ANR的时间段:2022-09-26 23:52:10.872 to 2022-09-26 23:56:17.209。

然后下面是各个进程占用的CPU的详细情况。

最后一行是各个进程合计占用的CPU信息。

涉及的一些重要名词:

ser:用户态,kernel:内核态
​faults:内存缺页,minor——轻微的,major——重度,需要从磁盘拿数据
​iowait:IO使用(等待)占比
​irq:硬中断,softirq:软中断

  • 进程创建到现在一共创建了多少对象
Total number of allocations 87509197
  • 进程创建到现在一共申请了多少内存
Total bytes allocated 5510MB
  •  Total bytes freed 52MB
Total bytes freed 5374MB
  • 不扩展堆的情况下可用的内存 
Free memory 23MB
  • GC前的可用内存 
Free memory until GC 23MB
  • OOM之前的可用内存
Free memory until OOME 376MB
  • 当前总内存(已用+可用) 
Total memory 159MB
  • 进程最多能申请的内存
Max memory 512MB

注意:Free memory until OOME 的值很小的时候,已经处于内存紧张状态。应用可能是占用了过多内存。

除了trace文件中有内存信息,普通的eventlog日志中,也有内存信息。

SYS_ANDROID_EVENT_LOG.txt

09-26 23:56:41.661  1308  7845 I am_meminfo: [3266568192,240267264,77824,245788672,910894080]

以上四个值分别指的是:Cached,Free,Zram,Kernel,Native。

Cached+Free的内存代表着当前整个手机的可用内存,如果值很小,意味着处于内存紧张状态。一般低内存的判定阈值为:4G 内存手机以下阀值:350MB,以上阀值则为:450MB。如果ANR时间点前后,日志里有打印onTrimMemory,也可以作为内存紧张的一个参考判断。

5.堆栈消息

堆栈信息是最重要的一个信息,展示了ANR发生的进程当前所有线程的状态。

线程名称:"main"

线程优先级:“prio”,此时优先级5。

JVM线程id: tid = 1

线程状态:Runnable

suspend all histogram:	Sum: 86.658ms 99% C.I. 5.132us-1021.385us Avg: 70.856us Max: 33536us
DALVIK THREADS (67):
"main" prio=5 tid=1 Runnable
  | group="main" sCount=0 dsCount=0 flags=0 obj=0x731f53f8 self=0xb400007b21471be0
  | sysTid=8541 nice=-10 cgrp=default sched=0/0 handle=0x7c484324f8
  | state=R schedstat=( 591843734006 11890654357 895335 ) utm=43825 stm=15359 core=7 HZ=100
  | stack=0x7ff99af000-0x7ff99b1000 stackSize=8192KB
  | held mutexes= "mutator lock"(shared held)
  at android.icu.impl.SoftCache.getInstance(SoftCache.java:51)
  at android.icu.util.ULocale.forLocale(ULocale.java:396)
  at android.text.TextUtils.getLayoutDirectionFromLocale(TextUtils.java:2066)
  at android.content.res.Configuration.setLayoutDirection(Configuration.java:2181)
  at android.content.res.Configuration.setLocales(Configuration.java:2127)
  at android.content.res.ResourcesImpl.calcConfigChanges(ResourcesImpl.java:533)
  at android.content.res.ResourcesImpl.updateConfiguration(ResourcesImpl.java:414)
  - locked <0x079019e2> (a java.lang.Object)
  at android.content.res.Resources.updateConfiguration(Resources.java:2020)
  at android.content.res.Resources.updateConfiguration(Resources.java:2012)
  at com.transsion.common.activity.SionBaseActivity.getResources(SourceFile:4)
  at android.view.LayoutInflater.inflate(LayoutInflater.java:520)
  at k.y.a.c.m(SourceFile:1)
  at k.y.a.c.onCreateViewHolder(SourceFile:1)
  at androidx.recyclerview.widget.RecyclerView$g.createViewHolder(SourceFile:2)
  at androidx.recyclerview.widget.RecyclerView$u.I(SourceFile:28)
  at androidx.recyclerview.widget.RecyclerView$u.p(SourceFile:1)
  at androidx.recyclerview.widget.RecyclerView$u.o(SourceFile:1)
  at androidx.recyclerview.widget.LinearLayoutManager$c.d(SourceFile:3)
  at androidx.recyclerview.widget.LinearLayoutManager.layoutChunk(SourceFile:1)
  at androidx.recyclerview.widget.LinearLayoutManager.fill(SourceFile:9)
  at androidx.recyclerview.widget.LinearLayoutManager.onLayoutChildren(SourceFile:71)
  at androidx.recyclerview.widget.RecyclerView.dispatchLayoutStep2(SourceFile:8)
  at androidx.recyclerview.widget.RecyclerView.dispatchLayout(SourceFile:9)
  at androidx.recyclerview.widget.RecyclerView.consumePendingUpdateOperations(SourceFile:21)
  at androidx.recyclerview.widget.RecyclerView$b0.run(SourceFile:5)
  at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1030)
  at android.view.Choreographer.doCallbacks(Choreographer.java:840)
  at android.view.Choreographer.doFrame(Choreographer.java:763)
  at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:1002)
  at android.os.Handler.handleCallback(Handler.java:938)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:268)
  at android.app.ActivityThread.main(ActivityThread.java:8000)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:627)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:997)

"OkHttp ConnectionPool" daemon prio=5 tid=42 TimedWaiting
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x12c81c20 self=0xb400007b214ab1b0
  | sysTid=8604 nice=0 cgrp=default sched=0/0 handle=0x792bca0cc0
  | state=S schedstat=( 256337769 38916225 1410 ) utm=13 stm=11 core=7 HZ=100
  | stack=0x792bb9d000-0x792bb9f000 stackSize=1043KB
  | held mutexes=
  kernel: __switch_to+0x134/0x150
  kernel: futex_wait_queue_me+0xc8/0x114
  kernel: futex_wait+0x11c/0x310
  kernel: do_futex+0x120/0x1b60
  kernel: SyS_futex+0x148/0x1c8
  kernel: __sys_trace+0x48/0x48
  at java.lang.Object.wait(Native method)
  - waiting on <0x014b83b7> (a com.android.okhttp.ConnectionPool)
  at com.android.okhttp.ConnectionPool$1.run(ConnectionPool.java:106)
  - locked <0x014b83b7> (a com.android.okhttp.ConnectionPool)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
  at java.lang.Thread.run(Thread.java:923)

在上面两行log中,一个是主线程main,它的状态是Runnable。另一个是OkHttp ConnectionPool,它的状态是TimeWaiting。

6.Thread状态分析

线程的状态是一个很重要的东西,因此thread dump中会显示这些状态,通过对这些状态的分析,能够得出线程的运行状况,进而发现可能存在的问题。

Java中的线程状态有6种:

  public static enum State 
        NEW,
        RUNNABLE,
        BLOCKED,
        WAITING,
        TIMED_WAITING,
        TERMINATED;

        private State() 
        
    
public enum State   
  
       /** 
        * Thread state for a thread which has not yet started. 
        */  
       NEW,  
         
       /** 
        * Thread state for a runnable thread.  A thread in the runnable 
        * state is executing in the Java virtual machine but it may 
        * be waiting for other resources from the operating system 
        * such as processor. 
        */  
       RUNNABLE,  
         
       /** 
        * Thread state for a thread blocked waiting for a monitor lock. 
        * A thread in the blocked state is waiting for a monitor lock 
        * to enter a synchronized block/method or  
        * reenter a synchronized block/method after calling 
        * @link Object#wait() Object.wait. 
        */  
       BLOCKED,  
     
       /** 
        * Thread state for a waiting thread. 
        * A thread is in the waiting state due to calling one of the  
        * following methods: 
        * <ul> 
        *   <li>@link Object#wait() Object.wait with no timeout</li> 
        *   <li>@link #join() Thread.join with no timeout</li> 
        *   <li>@link LockSupport#park() LockSupport.park</li> 
        * </ul> 
        *  
        * <p>A thread in the waiting state is waiting for another thread to 
        * perform a particular action.   
        * 
        * For example, a thread that has called <tt>Object.wait()</tt> 
        * on an object is waiting for another thread to call  
        * <tt>Object.notify()</tt> or <tt>Object.notifyAll()</tt> on  
        * that object. A thread that has called <tt>Thread.join()</tt>  
        * is waiting for a specified thread to terminate. 
        */  
       WAITING,  
         
       /** 
        * Thread state for a waiting thread with a specified waiting time. 
        * A thread is in the timed waiting state due to calling one of  
        * the following methods with a specified positive waiting time: 
        * <ul> 
        *   <li>@link #sleep Thread.sleep</li> 
        *   <li>@link Object#wait(long) Object.wait with timeout</li> 
        *   <li>@link #join(long) Thread.join with timeout</li> 
        *   <li>@link LockSupport#parkNanos LockSupport.parkNanos</li>  
        *   <li>@link LockSupport#parkUntil LockSupport.parkUntil</li> 
        * </ul> 
        */  
       TIMED_WAITING,  
  
       /** 
        * Thread state for a terminated thread. 
        * The thread has completed execution. 
        */  
       TERMINATED;  

状态解释
New线程创建,但是还没有start()。每一个线程,在堆内存中都有一个对应的Thread对象。Thread t = new Thread();当刚刚在堆内存中创建Thread对象,还没有调用t.start()方法之前,线程就处在NEW状态。在这个状态上,线程与普通的java对象没有什么区别,就仅仅是一个堆内存中的对象。
RUNNABLE线程执行或者在等待系统资源,例如CPU资源。该状态表示线程具备所有运行条件,在运行队列中准备操作系统的调度,或者正在运行。 这个状态的线程比较正常,但如果线程长时间停留在在这个状态就不正常了,这说明线程运行的时间很长(存在性能问题),或者是线程一直得不得执行的机会(存在线程饥饿的问题)。
BLOCKED线程在等待其它线程释放一个锁,则会进入 Blocked 状态。例如在等待一个同步代码块的锁,获得锁则进入 Runnable 状态。线程正在等待获取java对象的监视器(也叫内置锁),即线程正在等待进入由synchronized保护的方法或者代码块。synchronized用来保证原子性,任意时刻最多只能由一个线程进入该临界区域,其他线程只能排队等待。
WAITING

线程在等待一个其它线程唤醒它。可能是调用了 Object.wait(),Thread.join() ,LockSupport.part()。

处在该线程的状态,正在等待某个事件的发生,只有特定的条件满足,才能获得执行机会。而产生这个特定的事件,通常都是另一个线程。也就是说,如果不发生特定的事件,那么处在该状态的线程一直等待,不能获取执行的机会。比如:

A线程调用了obj对象的obj.wait()方法,如果没有线程调用obj.notify或obj.notifyAll,那么A线程就没有办法恢复运行;(WAITING)
如果A线程调用了LockSupport.park(),没有别的线程调用LockSupport.unpark(A),那么A没有办法恢复运行。(WAITING)

TIMED_WAITING同上,只不过方法换成了 Object.wait(timeOut) ,Thread.join(millis) ,LockSupport.part(deadline)。J.U.C中很多与线程相关类,都提供了限时版本和不限时版本的API。TIMED_WAITING意味着线程调用了限时版本的API,正在等待时间流逝。当等待时间过去后,线程一样可以恢复运行。如果线程进入了WAITING状态,一定要特定的事件发生才能恢复运行;而处在TIMED_WAITING的线程,如果特定的事件发生或者是时间流逝完毕,都会恢复运行。
TERMINATED线程销毁。线程执行完毕,执行完run方法正常返回,或者抛出了运行时异常而结束,线程都会停留在这个状态。这个时候线程只剩下Thread对象了,没有什么用了。

trace文件中的状态是是CPP代码中定义的状态(native状态)。

java thread状态C++ 层Thread 状态解释
NEWINITIALIZING新建,正在分配资源
NEWSTARING新建,正在启动
RUNNABLERUNING/RUNNABLE线程创建或者正在运行
RUNNABLENATIVE正在执行 JNI 方法
RUNNABLESUSPENDED现在挂起,一般是GC导致
BLOCKEDMOINITOR线程阻塞,等待其它线程释放对象锁
WAITINGWAIT执行了无超时参数的 wait() 等方法
WAITINGVMWAIT正在等待 VM 资源
TERMINATEDZOMBIE线程销毁

如果主线程是MOINITOR状态,那么 ANR 的原因就是主线程被阻塞。如果主线程是WAIT状态,那么 ANR 的原因就是等待子线程释放锁。

7.Java thread statck trace

suspend all histogram:	Sum: 86.658ms 99% C.I. 5.132us-1021.385us Avg: 70.856us Max: 33536us
DALVIK THREADS (67):
"main" prio=5 tid=1 Runnable
  | group="main" sCount=0 dsCount=0 flags=0 obj=0x731f53f8 self=0xb400007b21471be0
  | sysTid=8541 nice=-10 cgrp=default sched=0/0 handle=0x7c484324f8
  | state=R schedstat=( 591843734006 11890654357 895335 ) utm=43825 stm=15359 core=7 HZ=100
  | stack=0x7ff99af000-0x7ff99b1000 stackSize=8192KB
  | held mutexes= "mutator lock"(shared held)
  at android.icu.impl.SoftCache.getInstance(SoftCache.java:51)
  at android.icu.util.ULocale.forLocale(ULocale.java:396)
  at android.text.TextUtils.getLayoutDirectionFromLocale(TextUtils.java:2066)
  at android.content.res.Configuration.setLayoutDirection(Configuration.java:2181)
  at android.content.res.Configuration.setLocales(Configuration.java:2127)
  at android.content.res.ResourcesImpl.calcConfigChanges(ResourcesImpl.java:533)
  at android.content.res.ResourcesImpl.updateConfiguration(ResourcesImpl.java:414)
  - locked <0x079019e2> (a java.lang.Object)
  at android.content.res.Resources.updateConfiguration(Resources.java:2020)
  at android.content.res.Resources.updateConfiguration(Resources.java:2012)
  at com.transsion.common.activity.SionBaseActivity.getResources(SourceFile:4)
  at android.view.LayoutInflater.inflate(LayoutInflater.java:520)
  at k.y.a.c.m(SourceFile:1)
  at k.y.a.c.onCreateViewHolder(SourceFile:1)
  at androidx.recyclerview.widget.RecyclerView$g.createViewHolder(SourceFile:2)
  at androidx.recyclerview.widget.RecyclerView$u.I(SourceFile:28)
  at androidx.recyclerview.widget.RecyclerView$u.p(SourceFile:1)
  at androidx.recyclerview.widget.RecyclerView$u.o(SourceFile:1)
  at androidx.recyclerview.widget.LinearLayoutManager$c.d(SourceFile:3)
  at androidx.recyclerview.widget.LinearLayoutManager.layoutChunk(SourceFile:1)
  at androidx.recyclerview.widget.LinearLayoutManager.fill(SourceFile:9)
  at androidx.recyclerview.widget.LinearLayoutManager.onLayoutChildren(SourceFile:71)
  at androidx.recyclerview.widget.RecyclerView.dispatchLayoutStep2(SourceFile:8)
  at androidx.recyclerview.widget.RecyclerView.dispatchLayout(SourceFile:9)
  at androidx.recyclerview.widget.RecyclerView.consumePendingUpdateOperations(SourceFile:21)
  at androidx.recyclerview.widget.RecyclerView$b0.run(SourceFile:5)
  at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1030)
  at android.view.Choreographer.doCallbacks(Choreographer.java:840)
  at android.view.Choreographer.doFrame(Choreographer.java:763)
  at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:1002)
  at android.os.Handler.handleCallback(Handler.java:938)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:268)
  at android.app.ActivityThread.main(ActivityThread.java:8000)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:627)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:997)

Java stack trace提供了大部分信息来精确定位问题根源。

例子:

java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
-waiting on <0xb3885f60> (a java.util.TaskQueue)     # 继续wait 
at java.util.TimerThread.mainLoop(Timer.java:509)
-locked <0xb3885f60> (a java.util.TaskQueue)         # 已经locked
at java.util.TimerThread.run(Timer.java:462)

堆栈信息应该逆向解读:

- locked <0xb3885f60> (a java.util.ArrayList)
- waiting on <0xb3885f60> (a java.util.ArrayList) 

 对象先上锁,锁住对象0xb3885f60,然后释放该对象锁,进入waiting状态。对应代码如下:

synchronized(obj)   
   .........  
   obj.wait();  
   .........  

 先用 synchronized 获得了这个对象的 Monitor-------> locked <0xb3885f60>

当执行到 obj.wait(),线程即放弃了 Monitor的所有权,进入 “wait set”队列(-------> waiting on <0xb3885f60> )

8. 关键状态分析

1.当前线程等待另一个条件的发生,来把自己唤醒,或者当前线程调用了 sleep(n)。

此时线程处于的状态:

java.lang.Thread.State: WAITING (parking):一直等那个条件发生;
java.lang.Thread.State: TIMED_WAITING (parking或sleeping):定时的,那个条件不到来,也将定时唤醒自己。
 

2.Waiting for Monitor Entry 和 in Object.wait()。

Monitor是Java中用以实现线程之间的互斥与协作的主要手段,它可以看成是对象或者Class的锁。每一个对象都有,也仅有一个 Monitor。

synchronized(obj) 
   .........

一般CPU很忙时,则关注runnable的线程,CPU很闲时,则关注waiting for monitor entry的线程。

9.Android性能指标CPU关注点 

1.CPU总体使用率
2.应用程序CPU占用率

获取App CPU指标值的方式:

1.proc文件获取方式

/proc文件系统是一个伪文件系统,它只存在内存当中,而不占用外存空间。用户和应用程序可以通过/proc得到系统的信息,并可以改变内核的某些参数。安卓性能指标cpu总体使用率和应用程序cpu占用率主要与两个proc文件相关,分别是/proc/stat和/proc/进程id/stat文件.。
通过adb shell进入到手机内部shell模式,再通过cat /proc/stat 查看结果如下:

 前面几行cpu cpu0 ... cpu7是我们需要关注的重点,cpu0~cpu7表示当前CPU的核心(多核),cpu为总的Jiffies,这里引入了Jiffies(时间片)的概念,Jiffies的介绍如下:

Jiffies 为Linux核心变数,是一个unsigned long类型的变量,它被用来记录系统自开机以来,已经过了多少tick。每发生一次timer interrupt,Jiffies变数会被加1。

上面每一列的数值含义如下:
user :从系统启动开始累计到当前时刻,用户态的jiffies ,不包含nice值为负进程;
nice :从系统启动开始累计到当前时刻,nice值为负的进程所占用的jiffies;
system :从系统启动开始累计到当前时刻,系统态的jiffies;
idle :从系统启动开始累计到当前时刻,除硬盘IO等待时间以外其它等待的jiffies;
iowait : 从系统启动开始累计到当前时刻,硬盘IO等待的jiffies;
irq : 从系统启动开始累计到当前时刻,硬中断的jiffies;
softirq :从系统启动开始累计到当前时刻,软中断的jiffies。

总的Jiffies就是上面所有项加起来的总和。因此我们计算一段时间的CPU占用率的时候就可以使用:
total=user+system+nice+idle+iowait+irq+softirq
cpu usage=[(user_end +sys_end+nice_end) - (user_begin + sys_begin+nice_begin)]/(total_end - total_begin)*100

上述方法统计的是当前系统所有进程的CPU总和使用率。

如果我们要统计某个App的使用率可以进入到/proc/进程id/stat目录,其中就有包含某进程的CPU信息。

首先,我们需要查询到App的进程ID:

其中的进程ID为5451,再查询stat文件信息:

在第14行、15行有记录当前进程的Jiffies信息
utime=1565 该任务在用户态运行的时间,单位为jiffies
stime=455 该任务在核心态运行的时间,单位为jiffies
所以当前进程的Jiffies计算方式为utime+stime
通过shell脚本获取的方式:
cat /proc/进程id/stat | awk -F " " 'print $14,$15'
要计算出一段时间内该进程的CPU使用率信息即可通过:
utime+stime(当前时间点)-utime+stime(旧时间点)/ total cpu Jiffies

top命令获取CPU使用率
top命令是Linux下常用的性能分析工具,能够实时显示系统中各个进程的资源占用状况,类似于Windows的任务管理器。top是一个动态显示过程,即可以通过用户按键来不断刷新当前状态。如果在前台执行该命令,它将独占前台,直到用户终止该程序为止。top命令提供了实时的对系统处理器的状态监视。
使用方式:
可查看占用cpu最高的前10个程序(-t 显示进程名称,-s 按指定行排序,-n 在退出前刷新几次,-d 刷新间隔,-m 显示最大数量):top -m 10 -s cpu
如果你想筛选出你自己的应用的话可以用下面这一命令:
adb shell top -n 1| grep PackageName

10. 问题场景

1.CPU飙高,load高,响应很慢

一个请求过程中多次dump;
对比多次dump文件的runnable线程,如果执行的方法有比较大变化,说明比较正常。如果在执行同一个方法,就有一些问题了;

2.查找占用CPU最多的线程

使用命令:top -H -p pid(pid为被测系统的进程号),找到导致CPU高的线程ID,对应thread dump信息中线程的nid,只不过一个是十进制,一个是十六进制;
在thread dump中,根据top命令查找的线程id,查找对应的线程堆栈信息;

3.CPU使用率不高但是响应很慢

进行dump,查看是否有很多thread struck在了i/o、数据库等地方,定位瓶颈原因;

4.请求无法响应

多次dump,对比是否所有的runnable线程都一直在执行相同的方法,如果是的,恭喜你,锁住了!

5. 死锁

死锁经常表现为程序的停顿,或者不再响应用户的请求。从操作系统上观察,对应进程的CPU占用率为零,很快会从top或prstat的输出中消失。

比如在下面这个示例中,是个较为典型的死锁情况:

"Thread-1" prio=5 tid=0x00acc490 nid=0xe50 waiting for monitor entry [0x02d3f000 ..0x02d3fd68] 
at deadlockthreads.TestThread.run(TestThread.java:31) 
- waiting to lock <0x22c19f18> (a java.lang.Object) 
- locked <0x22c19f20> (a java.lang.Object) 

"Thread-0" prio=5 tid=0x00accdb0 nid=0xdec waiting for monitor entry [0x02cff000 ..0x02cff9e8] 
at deadlockthreads.TestThread.run(TestThread.java:31) 
- waiting to lock <0x22c19f20> (a java.lang.Object) 
- locked <0x22c19f18> (a java.lang.Object) 

在 JAVA 5中加强了对死锁的检测。线程 Dump中可以直接报告出 Java级别的死锁,如下所示:


Found one Java-level deadlock: 
============================= 
"Thread-1": 
waiting to lock monitor 0x0003f334 (object 0x22c19f18, a java.lang.Object), 
which is held by "Thread-0" 

"Thread-0": 
waiting to lock monitor 0x0003f314 (object 0x22c19f20, a java.lang.Object), 
which is held by "Thread-1" 

6.热锁

热锁,也往往是导致系统性能瓶颈的主要因素。其表现特征为:由于多个线程对临界区,或者锁的竞争,可能出现:

1.频繁的线程的上下文切换:从操作系统对线程的调度来看,当线程在等待资源而阻塞的时候,操作系统会将之切换出来,放到等待的队列,
当线程获得资源之后,调度算法会将这个线程切换进去,放到执行队列中。

2.大量的系统调用:因为线程的上下文切换,以及热锁的竞争,或者临界区的频繁的进出,都可能导致大量的系统调用。

3.大部分CPU开销用在“系统态”:线程上下文切换,和系统调用,都会导致 CPU在 “系统态 ”运行,换而言之,虽然系统很忙碌,但是CPU用在 “用户态 ”的比例较小,应用程序得不到充分的 CPU资源。

4.随着CPU数目的增多,系统的性能反而下降。因为CPU数目多,同时运行的线程就越多,可能就会造成更频繁的线程上下文切换和系统态的CPU开销,从而导致更糟糕的性能。


上面的描述,都是一个 scalability(可扩展性)很差的系统的表现。从整体的性能指标看,由于线程热锁的存在,程序的响应时间会变长,吞吐量会降低。

参考:

啃碎并发(四):Java线程Dump分析 - 掘金 (juejin.cn)

以上是关于Android ANR分析-笔记的主要内容,如果未能解决你的问题,请参考以下文章

android ANR traces.txt文件导出分析

Android ANR log trace日志文件分析

NFC问题分析之死锁引起的ANR

Android ANR优化 2

如何分析解决Android ANR

Android ANR 分析 /data/anr/trace.txt 该文件记录了产生ANR的函数堆栈可以帮助分析