一次简单的死锁分析

Posted 万剑齐发

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了一次简单的死锁分析相关的知识,希望对你有一定的参考价值。

数据库错误日志中监测到如下死锁:

deadlock-list
 deadlock victim=process16fc9dd498
  process-list
   process id=process16fc9dd498 taskpriority=0 logused=0 waitresource=KEY: 7:72057595333771264 (58bb110a434d) waittime=2239 ownerId=8413250494 transactionname=user_transaction lasttranstarted=2016-07-01T22:52:34.180 XDES=0x892f80d28 lockMode=U schedulerid=81 kpid=10732 status=suspended spid=242 sbid=0 ecid=0 priority=0 trancount=3 lastbatchstarted=2016-07-01T22:52:34.183 lastbatchcompleted=2016-07-01T22:52:34.180 lastattention=1900-01-01T00:00:00.180 clientapp=pytds hostname=adj52 hostpid=31708 loginname=app_rw isolationlevel=read committed (2) xactid=8413250494 currentdb=7 lockTimeout=4294967295 clientoption1=671219744 clientoption2=128056
    executionStack
     frame procname=xxx.dbo.gsp_InsertCustomerInfo_v2 line=32 stmtstart=2358 stmtend=2582 sqlhandle=0x030007004577437e8a39af005ba5000001000000000000000000000000000000000000000000000000000000
update D_CustomerInfo set CustomerName=@CustomerName, LastCalledAddress=@Address where cellPhone=@CellPhone     
     frame procname=adhoc line=1 stmtstart=316 sqlhandle=0x010007008d0ab52e709f6cd40100000000000000000000000000000000000000000000000000000000000000
exec gsp_InsertCustomerInfo_v2 
     frame procname=unknown line=1 sqlhandle=0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
unknown     
    inputbuf
   exec gsp_InsertCustomerInfo_v2      
   process id=process16fd5b1868 taskpriority=0 logused=40140 waitresource=KEY: 7:72057595738193920 (8111eec6ede3) waittime=2224 ownerId=8411580132 transactionname=user_transaction lasttranstarted=2016-07-01T22:40:56.053 XDES=0x5405ce3a8 lockMode=U schedulerid=52 kpid=9136 status=suspended spid=339 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2016-07-01T22:52:34.180 lastbatchcompleted=2016-07-01T22:40:56.053 lastattention=2016-07-01T22:40:56.053 clientapp=pytds hostname=adj52 hostpid=33727 loginname=app_rw isolationlevel=read committed (2) xactid=8411580132 currentdb=7 lockTimeout=5000 clientoption1=671219744 clientoption2=128056
    executionStack
     frame procname=xxx.dbo.zxf_InsertOrder_v9 line=1109 stmtstart=62672 stmtend=62860 sqlhandle=0x030007009ab5d3576d78a30036a6000001000000000000000000000000000000000000000000000000000000
UPDATE D_CustomerInfo set isFree = 2,isbaidu=1  where Cellphone = @CellPhone 
    --更新司机状态     
     frame procname=adhoc line=1 stmtstart=2308 sqlhandle=0x01000700bdbd0606b05ecde93100000000000000000000000000000000000000000000000000000000000000
exec zxf_InsertOrder_v9     
     frame procname=unknown line=1 sqlhandle=0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
unknown     
    inputbuf
(@CellPhone VARCHAR(MAX),@FromCellPhone VARCHAR(MAX).... 
  resource-list
   keylock hobtid=72057595333771264 dbid=7 objectname=xxx.dbo.D_CustomerInfo indexname=PK_D_CustomerInfo id=lock124ed8ee00 mode=X associatedObjectId=72057595333771264
    owner-list
     owner id=process16fd5b1868 mode=X
    waiter-list
     waiter id=process16fc9dd498 mode=U requestType=wait
   keylock hobtid=72057595738193920 dbid=7 objectname=aidaijiaNew.dbo.D_CustomerInfo indexname=idx_Cellphone__CustomerId_CustomerName_RecommendCode_Amount id=lock7bc3b8380 mode=U associatedObjectId=72057595738193920
    owner-list
     owner id=process16fc9dd498 mode=U
    waiter-list
     waiter id=process16fd5b1868 mode=U requestType=wait
View Code

process16fc9dd498

xxx.dbo.xxx_InsertCustomerInfo

的语句:update XXCustomerInfo set CustomerName=@CustomerName, LastCalledAddress=@Address where cellPhone=@CellPhone 

持有索引idx_Cellphone__xxxx页上7:72057595738193920 (8111eec6ede3)数据行的更新锁,等待获取主键PK_D_CustomerInfo上7:72057595333771264 (58bb110a434d)数据行的更新锁

 

process16fd5b1868

xxx.dbo.xxx_InsertOrder_v9

的语句:UPDATE XXCustomerInfo set isFree = 2,isbaidu=1  where Cellphone = @CellPhone

持有主键PK_D_CustomerInfo上7:72057595333771264 (58bb110a434d)数据行的排他锁,等待获取索引idx_Cellphone__xxxx页上7:72057595738193920 (8111eec6ede3)数据行的更新锁

最终process16fc9dd498成为了牺牲品。

--------------------------------------------------------

主键PK_XXCustomerInfo上锁定的内容可以这样查到:

SELECT %%lockres%% AS keyhashvalue,cellphone FROM  D_CustomerInfo
where 
%%lockres%% = \'(58bb110a434d)\'

那么索引idx_Cellphone__xxxx上锁定的内容7:72057595738193920 (8111eec6ede3)是不是也是同一个电话号码呢?我们可以试着从索引根节点往下找找看cellphone为13917XX6177的索引页上keyhashvalue是不是8111eec6ede3
首先找到索引的根节点:

Select dbo.f_get_page(I.root_page) root_page
FROM sys.partitions P
INNER JOIN SYS.allocation_units A
ON P.partition_id = A.container_id
INNER JOIN SYS.system_internals_allocation_units I
ON A.allocation_unit_id = I.allocation_unit_id
WHERE P.object_id = OBJECT_ID(\'XXCustomerInfo\')
AND P.index_id = 66

再看看根节点的内容:

 

再找13916XXXX的ChildPage8503545

....

再找到ChildPage8506395,证实索引页上被锁定的也是同一个电话号码13917XX6177(KeyHashValue为8111eec6ede3)

 

-----------------------------------

死锁的形成过程:

1.xxx_InsertOrder存储过程上开启了事务,并且先更新了一次D_CustomerInfo,但此更新不涉及idx_Cellphone__xxxx上的内容更新,因此仅持有了主键上13917XX6177这一行的排他锁。

2.xxx_InsertCustomerInfo存储过程请求更新XXCustomerInfo,查询条件为cellphone,因此先获取了idx_Cellphone__xxxx上13917XX6177这一行的更新锁,之后申请主键上13917XX6177这一行更新锁时被迫等待xxx_InsertOrder释放排他锁。

3.xxx_InsertOrder执行到后面再次需要更新XXCustomerInfo,更新查询条件为cellphone,因此申请获取idx_Cellphone__xxxx上13917XX6177这一行的更新锁,但由于更新锁被xxx_InsertCustomerInfo给占有了,被迫等待。

因此互相构成了死锁,xxx_InsertCustomerInfo_v2成为了牺牲品。

----------------------------------

死锁的解决方案:

解决方案1.xxx_InsertOrde中两次更新D_CustomerInfo改为集中到一次更新

解决方案2.xxx_InsertOrder中在判断客户是否存在时可以直接select *** from XXCustomerInfo with(updlock,rowlock) where cellphone = XXXX,那么之后的update 也不会释放idx_Cellphone__xxxx上的更新锁,当然这会使锁定资源变多,时间变长。

 

以上是关于一次简单的死锁分析的主要内容,如果未能解决你的问题,请参考以下文章

实现死锁的简单案例和分析是否产生死锁

记一次死锁问题的排查和解决

Mysql死锁如何排查:insert on duplicate死锁一次排查分析过程

一次Mysql下批量更新造成的死锁案例分析

MySQL一次死锁排查过程分析(双update)

DB2一次死锁问题的分析过程