20210519 使用jstack命令排查线程死锁问题
Posted 陈如水
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了20210519 使用jstack命令排查线程死锁问题相关的知识,希望对你有一定的参考价值。
问题: 针对线上多线程死锁、阻塞,跑着跑着就卡住了;查看线上线程池的状态;
jstack主要用于生成java虚拟机当前时刻的线程快照。线程快照是当前java虚拟机内每一条线程正在执行的方法堆栈的集合,生成线程快照的主要目的是定位线程出现长时间停顿的原因,如线程间死锁、死循环、请求外部资源导致的长时间等待等。 线程出现停顿的时候通过jstack来查看各个线程的调用堆栈,就可以知道没有响应的线程到底在后台做什么事情,或者等待什么资源。
jstack: java堆栈跟踪工具(线程快照和内存快照)
-
执行jstack命令,将得到进程的堆栈信息。我一般使用jstack -l pid来得到长列表,显示其详细信息。
-
有时线程挂起的时候,需要执行jstack -F pid来获取。
-
在实际运行中,往往一次 dump的信息,还不足以确认问题。建议产生三次 dump信息,如果每次 dump都指向同一个问题,我们才确定问题的典型性。
-
堆栈信息只是一种参考,一些正常RUNNING的线程,由于复杂网络环境和IO的影响,也是有问题的,用jstack就无法定位,需要结合对业务代码的理解。
常用命令
jps
top -H -p <pid>
jstack -l <pid>
top -H -p <pid>
1)在当前用户下,列出pid包含的所有线程。查询某个进程中所有的线程信息,cpu使用和内存。
2)线程id(十进制)和线程快照中nid相同(十六进制,操作系统级别的线程id)
3) 关注某个进程中,哪些线程的cpu和内存使用率最高,找到对应的线程快照,并且重点关注线程状态和业务代码。并且关注线程中,是否存在死锁。
1,死锁代码如下
/**
* 测试死锁问题
*/
public class DeadLockDemo {
public static void main(String[] args) {
testDeadLock();
}
private static void testDeadLock() {
ReentrantLock lock1 = new ReentrantLock();
ReentrantLock lock2 = new ReentrantLock();
Thread thread1 = new Thread(new Runnable() {
@Override
public void run() {
lock1.lock();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
lock2.lock();
}
});
Thread thread2 = new Thread(new Runnable() {
@Override
public void run() {
lock2.lock();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
lock1.lock();
}
});
thread1.setName("thread1");
thread2.setName("thread2");
System.out.println("启动线程");
thread1.start();
thread2.start();
}
}
2,线程快照如下:
命令:jps获取进程编号,jstack -l 82595 >> thread.txt
使用jstack查看服务进程的线程情况
2021-05-19 15:29:10
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.291-b10 mixed mode):
"Attach Listener" #14 daemon prio=9 os_prio=31 tid=0x00007f8b2c45c800 nid=0x9c03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"DestroyJavaVM" #13 prio=5 os_prio=31 tid=0x00007f8b2c015000 nid=0x2403 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"thread2" #12 prio=5 os_prio=31 tid=0x00007f8b1c016800 nid=0x9e03 waiting on condition [0x000000030b82d000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076b014f98> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at com.anjuke.aifang.aifang.building.operate.scf.utils.DeadLockDemo$2.run(DeadLockDemo.java:42)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- <0x000000076b014fc8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
"thread1" #11 prio=5 os_prio=31 tid=0x00007f8b2d55b000 nid=0x6493 waiting on condition [0x000000030b72a000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076b014fc8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at com.anjuke.aifang.aifang.building.operate.scf.utils.DeadLockDemo$1.run(DeadLockDemo.java:29)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- <0x000000076b014f98> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
"Service Thread" #10 daemon prio=9 os_prio=31 tid=0x00007f8afc023000 nid=0x5e0f runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"C1 CompilerThread3" #9 daemon prio=9 os_prio=31 tid=0x00007f8b2c401000 nid=0xa503 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"C2 CompilerThread2" #8 daemon prio=9 os_prio=31 tid=0x00007f8b2d8a0000 nid=0x5b03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"C2 CompilerThread1" #7 daemon prio=9 os_prio=31 tid=0x00007f8b2d812000 nid=0x5903 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"C2 CompilerThread0" #6 daemon prio=9 os_prio=31 tid=0x00007f8b2c3e0800 nid=0x5803 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Monitor Ctrl-Break" #5 daemon prio=5 os_prio=31 tid=0x00007f8b2c3dd800 nid=0x570b runnable [0x000000030b015000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
- locked <0x000000076b06aab8> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
- locked <0x000000076b06aab8> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:48)
Locked ownable synchronizers:
- None
"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007f8b2c808800 nid=0xa903 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007f8b2c06f000 nid=0x351f in Object.wait() [0x000000030ad00000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076ab08ee0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x000000076ab08ee0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
Locked ownable synchronizers:
- None
"Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007f8b2c06c800 nid=0x4723 in Object.wait() [0x000000030abfd000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076ab06c00> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x000000076ab06c00> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
Locked ownable synchronizers:
- None
"VM Thread" os_prio=31 tid=0x00007f8b2c065800 nid=0x4903 runnable
"GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007f8b2d817800 nid=0x2d4f runnable
"GC task thread#1 (ParallelGC)" os_prio=31 tid=0x00007f8b2d818000 nid=0x5303 runnable
"GC task thread#2 (ParallelGC)" os_prio=31 tid=0x00007f8b2d818800 nid=0x2f03 runnable
"GC task thread#3 (ParallelGC)" os_prio=31 tid=0x00007f8b2c026000 nid=0x5003 runnable
"GC task thread#4 (ParallelGC)" os_prio=31 tid=0x00007f8b2c026800 nid=0x4f03 runnable
"GC task thread#5 (ParallelGC)" os_prio=31 tid=0x00007f8b2c027000 nid=0x3103 runnable
"GC task thread#6 (ParallelGC)" os_prio=31 tid=0x00007f8b2c027800 nid=0x4d03 runnable
"GC task thread#7 (ParallelGC)" os_prio=31 tid=0x00007f8b2c028800 nid=0x4b03 runnable
"VM Periodic Task Thread" os_prio=31 tid=0x00007f8b2d55a000 nid=0x6163 waiting on condition
JNI global references: 15
Found one Java-level deadlock:
=============================
"thread2":
waiting for ownable synchronizer 0x000000076b014f98, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
which is held by "thread1"
"thread1":
waiting for ownable synchronizer 0x000000076b014fc8, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
which is held by "thread2"
Java stack information for the threads listed above:
===================================================
"thread2":
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076b014f98> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at com.anjuke.aifang.aifang.building.operate.scf.utils.DeadLockDemo$2.run(DeadLockDemo.java:42)
at java.lang.Thread.run(Thread.java:748)
"thread1":
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076b014fc8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at com.anjuke.aifang.aifang.building.operate.scf.utils.DeadLockDemo$1.run(DeadLockDemo.java:29)
at java.lang.Thread.run(Thread.java:748)
Found 1 deadlock.
线程快照查看:
1)taskExecutor- 为线程词的名称;
2)java.lang.Thread.State 线程的状态;
3)具体为线程的执行到什么动作,代码行数;
4)tid: java内的线程id;
5)nid: 操作系统级别线程的线程id;
6)prio: java内定义的线程的优先级;
7)os_prio:操作系统级别的优先级;
一个Thread对象可以有多个状态,在java.lang.Thread.State中,总共定义六种状态:
1、NEW
线程刚刚被创建,也就是已经new过了,但是还没有调用start()方法,jstack命令不会列出处于此状态的线程信息。
2、RUNNABLE #java.lang.Thread.State: RUNNABLE
RUNNABLE这个名字很具有欺骗性,很容易让人误以为处于这个状态的线程正在运行。事实上,这个状态只是表示,线程是可运行的。我们已经无数次提到过,一个单核CPU在同一时刻,只能运行一个线程。
3、BLOCKED # java.lang.Thread.State: BLOCKED (on object monitor)
线程处于阻塞状态,正在等待一个monitor lock。通常情况下,是因为本线程与其他线程公用了一个锁。其他在线程正在使用这个锁进入某个synchronized同步方法块或者方法,而本线程进入这个同步代码块也需要这个锁,最终导致本线程处于阻塞状态。
4、WAITING
等待状态,调用以下方法可能会导致一个线程处于等待状态:
Object.wait 不指定超时时间 # java.lang.Thread.State: WAITING (on object monitor)
Thread.join with no timeout
LockSupport.park #java.lang.Thread.State: WAITING (parking)
例如:对于wait()方法,一个线程处于等待状态,通常是在等待其他线程完成某个操作。本线程调用某个对象的wait()方法,其他线程处于完成之后,调用同一个对象的notify或者notifyAll()方法。Object.wait()方法只能够在同步代码块中调用。调用了wait()方法后,会释放锁。
5、TIMED_WAITING
线程等待指定的时间,对于以下方法的调用,可能会导致线程处于这个状态:
Thread.sleep #java.lang.Thread.State: TIMED_WAITING (sleeping)
Object.wait 指定超时时间 #java.lang.Thread.State: TIMED_WAITING (on object monitor)
Thread.join with timeout
LockSupport.parkNanos #java.lang.Thread.State: TIMED_WAITING (parking)
LockSupport.parkUntil #java.lang.Thread.State: TIMED_WAITING (parking)
6、TERMINATED 线程终止。
7、parking指线程处于挂起中。
Wait on condition 等待资源,或等待某个条件的发生。具体原因需结合 stacktrace来分析。
常见情况是该线程在 sleep,等待 sleep的时间到了时候,将被唤醒。
关键字:TIMED_WAITING,sleeping,parking。TIMED_WAITING可能是调用了有超时参数的wait所引起的。
parking指线程处于挂起中。
如果发现有大量的线程都在处在 Wait on condition, 可能出现了某中瓶颈。
如果堆栈信息明确是应用代码,则证明该线程正在等待资源。一般是大量读取某资源,且该资源采用了资源锁的情况下,线程进入等待状态,等待资源的读取。
Blocked
线程阻塞,是指当前线程执行过程中,所需要的资源长时间等待却一直未能获取到,被容器的线程管理器标识为阻塞状态,可以理解为等待资源超时的线程。如果线程处于Blocked状态,但是原因不清楚。可以使用jstack -m pid得到线程的mixed信息。
在dump 文件里,哪些需要重点关注:
- 死锁,Deadlock(重点关注)
- 执行中,Runnable
- 等待资源,Waiting on condition(重点关注)
- 等待获取监视器,Waiting on monitor entry(重点关注)
- 对象等待中,Object.wait() 或 TIMED_WAITING
- 暂停,Suspended
- 阻塞,Blocked(重点关注)
- 停止,Parked
3,分析
Waiting on monitor entry 意味着线程在等待进入一个临界区;Monitor是Java中用以实现线程之间的互斥与协作的主要手段,它可以看成是对象或者 Class的锁。每一个对象都有,也仅有一个 monitor。
thread2正在等待锁0x000000076b014f98,它被线程1所持有;
thread1正在等待锁0x000000076b014fc8,它被线程2所持有;这样就造成了死锁问题。
并且堆栈也记录了发生死锁的代码行数。
4,在线分析线程快照
把jstack
输出到文本中,然后使用分析工具查看(分析线程状态)。
https://fastthread.io/ 线程快照在线分析工具,上传线程快照后,查询是否有死锁问题。
以上是关于20210519 使用jstack命令排查线程死锁问题的主要内容,如果未能解决你的问题,请参考以下文章