Infinispan 集群同步问题

Posted

技术标签:

【中文标题】Infinispan 集群同步问题【英文标题】:Infinispan Cluster Synchronization Problems 【发布时间】:2019-03-28 13:20:12 【问题描述】:

我们使用具有 6 个实例的 JBoss 集群 (EAP 6.4.10),并大量使用捆绑的 Infinispan 5.2.11 来处理不同的内存网格用例。然而,它们中的大多数是分布式缓存(实际上是复制)。我们还将分布式事务和 JMS 绑定到混合中的事务边界。后端是 SQL Server 2016 Enterprise,即使在这里事务也可以跨越多个 SQL Server 实例和数据库 (DTC)。 Infinispan 的同步是通过使用 JGroups 3.2.13 的 UDP 多播完成的

有时,尤其是在重负载下或之后,我们会遇到 JBoss 工作线程堆积在看似永远不会被释放的特定 Infinispan 内部锁上的问题。因此,http 连接池饥饿,数据库上的打开事务不会被提交或回滚; JMS 消息丢失;我们在数据库上面临一堆锁,这些锁会影响连接到同一数据库的所有其他系统(它们是集群中的其他 JBoss 实例)。

目前我们观察 http 线程池,一旦线程开始堆积一定时间,实例就会从负载均衡器中移除并关闭。 有时这是不必要的,即集群会自行修复,并且罪魁祸首实例在没有人工干预的情况下再次开始正常运行。 但是,通常唯一的方法是手动重新启动实例并保持手指交叉以再次形成正确的 infinispan 集群。

来自挂起实例的堆栈跟踪总是显示几个有趣的特性:

    一组线程(http worker)堆积在 JGroup 的流控制实现深处的一个条件上;在这些情况下,应用程序的最后一次调用通常是对分布式缓存的一些操作(例如remove()):

    http-threads - 178 awaiting notification on [ 0x0000000374059958 ]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
        at org.jgroups.util.CreditMap.decrement(CreditMap.java:157)
        at org.jgroups.protocols.MFC.handleDownMessage(MFC.java:104)
        at org.jgroups.protocols.FlowControl.down(FlowControl.java:341)
        at org.jgroups.protocols.FRAG2.down(FRAG2.java:148)
        at org.jgroups.protocols.RSVP.down(RSVP.java:143)
        at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:1030)
        at org.jgroups.JChannel.down(JChannel.java:722)
        at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:618)
        at org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:174)
        at org.jgroups.blocks.GroupRequest.sendRequest(GroupRequest.java:360)
        at org.jgroups.blocks.GroupRequest.sendRequest(GroupRequest.java:103)
        at org.jgroups.blocks.Request.execute(Request.java:83)
        at org.jgroups.blocks.MessageDispatcher.cast(MessageDispatcher.java:337)
        at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:249)
        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processCalls(CommandAwareRpcDispatcher.java:333)
        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommands(CommandAwareRpcDispatcher.java:146)
        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.broadcastRemoteCommands(CommandAwareRpcDispatcher.java:197)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:498)
        at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:173)
        at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:194)
        at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:251)
        at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:238)
        at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:233)
        at org.infinispan.remoting.rpc.RpcManagerImpl.broadcastRpcCommand(RpcManagerImpl.java:212)
        at org.infinispan.remoting.rpc.RpcManagerImpl.broadcastRpcCommand(RpcManagerImpl.java:204)
        at org.infinispan.interceptors.ReplicationInterceptor.handleCrudMethod(ReplicationInterceptor.java:307)
        at org.infinispan.interceptors.ReplicationInterceptor.visitRemoveCommand(ReplicationInterceptor.java:269)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.EntryWrappingInterceptor.invokeNextAndApplyChanges(EntryWrappingInterceptor.java:302)
        at org.infinispan.interceptors.EntryWrappingInterceptor.visitRemoveCommand(EntryWrappingInterceptor.java:207)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor.visitRemoveCommand(NonTransactionalLockingInterceptor.java:124)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:134)
        at org.infinispan.commands.AbstractVisitor.visitRemoveCommand(AbstractVisitor.java:67)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:134)
        at org.infinispan.commands.AbstractVisitor.visitRemoveCommand(AbstractVisitor.java:67)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.statetransfer.StateTransferInterceptor.handleTopologyAffectedCommand(StateTransferInterceptor.java:284)
        at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:222)
        at org.infinispan.statetransfer.StateTransferInterceptor.visitRemoveCommand(StateTransferInterceptor.java:171)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.CacheMgmtInterceptor.visitRemoveCommand(CacheMgmtInterceptor.java:137)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:128)
        at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:92)
        at org.infinispan.commands.AbstractVisitor.visitRemoveCommand(AbstractVisitor.java:67)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:343)
        at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1186)
        at org.infinispan.CacheImpl.removeInternal(CacheImpl.java:314)
        at org.infinispan.CacheImpl.remove(CacheImpl.java:308)
        at org.infinispan.CacheImpl.remove(CacheImpl.java:302)
        at org.infinispan.AbstractDelegatingCache.remove(AbstractDelegatingCache.java:313)
        at com.company.project.information.AuthenticationService.processChallengeImpl(AuthenticationService.java:155)
        at com.company.project.information.AuthenticationService.processChallengeForLogin(AuthenticationService.java:132)
        at com.company.project.information.AuthenticationService.respondChallenge(AuthenticationService.java:418)
        at com.company.project.security.auth.ServerLoginModule.verify(ServerLoginModule.java:280)
        at com.company.project.security.auth.ServerLoginModule._login(ServerLoginModule.java:166)
        at com.company.project.security.auth.ServerLoginModule.login(ServerLoginModule.java:87)
        at sun.reflect.GeneratedMethodAccessor487.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
        at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
        at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
        at org.jboss.security.authentication.JBossCachedAuthenticationManager.defaultLogin(JBossCachedAuthenticationManager.java:399)
        at org.jboss.security.authentication.JBossCachedAuthenticationManager.proceedWithJaasLogin(JBossCachedAuthenticationManager.java:338)
        at org.jboss.security.authentication.JBossCachedAuthenticationManager.authenticate(JBossCachedAuthenticationManager.java:326)
        at org.jboss.security.authentication.JBossCachedAuthenticationManager.isValid(JBossCachedAuthenticationManager.java:142)
        at org.jboss.as.security.service.SimpleSecurityManager.authenticate(SimpleSecurityManager.java:418)
        at org.jboss.as.security.service.SimpleSecurityManager.authenticate(SimpleSecurityManager.java:377)
        at org.jboss.as.ejb3.security.SecurityContextInterceptor$1.run(SecurityContextInterceptor.java:54)
        at org.jboss.as.ejb3.security.SecurityContextInterceptor$1.run(SecurityContextInterceptor.java:48)
        at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:86)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
        at org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
        at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
        at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
        at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
        at org.jboss.as.ejb3.component.interceptors.AdditionalSetupInterceptor.processInvocation(AdditionalSetupInterceptor.java:55)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
        at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
        at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:189)
        at org.jboss.as.ejb3.remote.LocalEjbReceiver.processInvocation(LocalEjbReceiver.java:271)
        at org.jboss.ejb.client.EJBClientInvocationContext.sendRequest(EJBClientInvocationContext.java:184)
        at org.jboss.ejb.client.EJBObjectInterceptor.handleInvocation(EJBObjectInterceptor.java:58)
        at org.jboss.ejb.client.EJBClientInvocationContext.sendRequest(EJBClientInvocationContext.java:186)
        at org.jboss.ejb.client.EJBHomeInterceptor.handleInvocation(EJBHomeInterceptor.java:83)
        at org.jboss.ejb.client.EJBClientInvocationContext.sendRequest(EJBClientInvocationContext.java:186)
        at org.jboss.ejb.client.TransactionInterceptor.handleInvocation(TransactionInterceptor.java:42)
        at org.jboss.ejb.client.EJBClientInvocationContext.sendRequest(EJBClientInvocationContext.java:186)
        at org.jboss.ejb.client.ReceiverInterceptor.handleInvocation(ReceiverInterceptor.java:125)
        at org.jboss.ejb.client.EJBClientInvocationContext.sendRequest(EJBClientInvocationContext.java:186)
        at com.company.project.module.service.CallIdPropagator$1.handleInvocation(CallIdPropagator.java:60)
        at org.jboss.ejb.client.EJBClientInvocationContext.sendRequest(EJBClientInvocationContext.java:186)
        at org.jboss.ejb.client.EJBInvocationHandler.sendRequestWithPossibleRetries(EJBInvocationHandler.java:255)
        at org.jboss.ejb.client.EJBInvocationHandler.doInvoke(EJBInvocationHandler.java:200)
        at org.jboss.ejb.client.EJBInvocationHandler.doInvoke(EJBInvocationHandler.java:183)
        at org.jboss.ejb.client.EJBInvocationHandler.invoke(EJBInvocationHandler.java:146)
        at com.sun.proxy.$Proxy653.hasOneOfThisNetworkFeatures(
        Unknown Source)
        at sun.reflect.GeneratedMethodAccessor773.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.cgm.life.clientservices.Invoker.doPost(Invoker.java:129)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:754)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:295)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:231)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:149)
        at org.jboss.as.jpa.interceptor.WebNonTxEmCloserValve.invoke(WebNonTxEmCloserValve.java:50)
        at org.jboss.as.jpa.interceptor.WebNonTxEmCloserValve.invoke(WebNonTxEmCloserValve.java:50)
        at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:169)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:150)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:97)
        at org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:102)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:344)
        at org.apache.coyote.ajp.AjpAprProcessor.process(AjpAprProcessor.java:475)
        at org.apache.coyote.ajp.AjpAprProtocol$AjpConnectionHandler.process(AjpAprProtocol.java:454)
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:2562)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
        at org.jboss.threads.JBossThread.run(JBossThread.java:122)
    

    (完整的堆栈跟踪只是为了给你整个画面)。此示例中的条件 (0x0000000374059958) 阻塞了数百个具有不同堆栈跟踪的线程;不过,它们都共享对某个分布式缓存的调用。

    另一组线程(似乎是 UDP 协议处理程序,我们有一大堆)也是对象上的awaiting notification。上下文似乎与 Infinispan 集群中的状态转移有关:

    Incoming-8,shared=udp awaiting notification on [ 0x00000003810b5188 ]
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at org.infinispan.statetransfer.StateTransferLockImpl.waitForTransactionData(StateTransferLockImpl.java:100)
        at org.infinispan.statetransfer.StateTransferInterceptor.updateTopologyIdAndWaitForTransactionData(StateTransferInterceptor.java:311)
        at org.infinispan.statetransfer.StateTransferInterceptor.handleTopologyAffectedCommand(StateTransferInterceptor.java:281)
        at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:222)
        at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:156)
        at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:82)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:128)
        at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:92)
        at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:62)
        at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:82)
        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:343)
        at org.infinispan.commands.remote.BaseRpcInvokingCommand.processVisitableCommand(BaseRpcInvokingCommand.java:61)
        at org.infinispan.commands.remote.SingleRpcCommand.perform(SingleRpcCommand.java:70)
        at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:100)
        at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:121)
        at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:85)
        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:247)
        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:220)
        at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484)
        at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391)
        at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249)
        at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:600)
        at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130)
        at org.jgroups.JChannel.up(JChannel.java:707)
        at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1025)
        at org.jgroups.protocols.RSVP.up(RSVP.java:188)
        at org.jgroups.protocols.FRAG2.up(FRAG2.java:182)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
        at org.jgroups.protocols.pbcast.GMS.up(GMS.java:897)
        at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:247)
        at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:453)
        at org.jgroups.protocols.pbcast.NAKACK.handleMessage(NAKACK.java:793)
        at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:609)
        at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:147)
        at org.jgroups.protocols.FD.up(FD.java:253)
        at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288)
        at org.jgroups.protocols.MERGE3.up(MERGE3.java:290)
        at org.jgroups.protocols.Discovery.up(Discovery.java:359)
        at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2616)
        at org.jgroups.protocols.TP.passMessageUp(TP.java:1269)
        at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1831)
        at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1804)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    

    只有 4 个线程在等待这个通知。

    但是,还有更多的线程(例如,数百个)在不同的对象上等待,但在代码中的相同位置:

    OOB-4950,shared=udp awaiting notification on [ 0x000000038bccd2b0 ]
    OOB-4951,shared=udp awaiting notification on [ 0x00000003bacf4930 ]
    OOB-4952,shared=udp awaiting notification on [ 0x000000038bccd2b0 ]
    OOB-4953,shared=udp awaiting notification on [ 0x000000038bccd2b0 ]
    OOB-4954,shared=udp awaiting notification on [ 0x00000003bacf4930 ]
    OOB-4956,shared=udp awaiting notification on [ 0x000000038bccd2b0 ]
    OOB-4958,shared=udp awaiting notification on [ 0x000000038bccd2b0 ]
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at org.infinispan.statetransfer.StateTransferLockImpl.waitForTransactionData(StateTransferLockImpl.java:100)
        at org.infinispan.statetransfer.StateTransferInterceptor.updateTopologyIdAndWaitForTransactionData(StateTransferInterceptor.java:311)
        at org.infinispan.statetransfer.StateTransferInterceptor.handleTopologyAffectedCommand(StateTransferInterceptor.java:281)
        at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:222)
        at org.infinispan.statetransfer.StateTransferInterceptor.visitRemoveCommand(StateTransferInterceptor.java:171)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.CacheMgmtInterceptor.visitRemoveCommand(CacheMgmtInterceptor.java:137)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:128)
        at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:92)
        at org.infinispan.commands.AbstractVisitor.visitRemoveCommand(AbstractVisitor.java:67)
        at org.infinispan.commands.write.RemoveCommand.acceptVisitor(RemoveCommand.java:73)
        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:343)
        at org.infinispan.commands.remote.BaseRpcInvokingCommand.processVisitableCommand(BaseRpcInvokingCommand.java:61)
        at org.infinispan.commands.remote.SingleRpcCommand.perform(SingleRpcCommand.java:70)
        at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:100)
        at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:121)
        at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:85)
        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:247)
        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:220)
        at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484)
        at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391)
        at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249)
        at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:600)
        at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130)
        at org.jgroups.JChannel.up(JChannel.java:707)
        at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1025)
        at org.jgroups.protocols.RSVP.up(RSVP.java:188)
        at org.jgroups.protocols.FRAG2.up(FRAG2.java:182)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
        at org.jgroups.protocols.pbcast.GMS.up(GMS.java:897)
        at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:247)
        at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:453)
        at org.jgroups.protocols.pbcast.NAKACK.handleMessage(NAKACK.java:751)
        at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:609)
        at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:147)
        at org.jgroups.protocols.FD.up(FD.java:253)
        at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288)
        at org.jgroups.protocols.MERGE3.up(MERGE3.java:290)
        at org.jgroups.protocols.Discovery.up(Discovery.java:359)
        at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2616)
        at org.jgroups.protocols.TP.passMessageUp(TP.java:1269)
        at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1831)
        at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1804)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    

    还有线程在等待一个锁,该锁由等待 (1) 中的条件的线程持有:

    Transaction Reaper Worker 9 waiting to acquire [ 0x0000000507058c88 ]                                    
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.afterCompletion(TwoPhaseCoordinator.java:356
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.afterCompletion(TwoPhaseCoordinator.java:334
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.cancel(TwoPhaseCoordinator.java:120)        
        at com.arjuna.ats.arjuna.AtomicAction.cancel(AtomicAction.java:215)                                  
        at com.arjuna.ats.arjuna.coordinator.TransactionReaper.doCancellations(TransactionReaper.java:370)   
        at com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:78)
    

    其中一个 http 线程持有 0x0000000507058c88,而后者又等待 0x0000000374059958

关于我可以从系统中读取的内容,这就是它。里面严重卡了东西,但我不知道如何分析或排除故障,更不用说解决问题了。 (我们尝试更新到 EAP 6.4.20,但在此处可以观察到相同的行为,因此我们回滚了此更新,因为更新后出现了其他更新、更闪亮的问题。

我很乐意提供有助于诊断问题的更多信息(如特定配置参数)。

感谢任何人的帮助!

最好的, 马库斯

【问题讨论】:

不建议将捆绑的 Infinispan 实例用于您的用途。如果您在应用程序中捆绑 Infinispan 和 JGroups 版本,则可以更轻松地升级这些部分,而不依赖于基本 EAP/Wildfly 版本。我建议你转向这样的设置。最新的 Infinispan 版本是 9.4.0.Final。 @GalderZamarreño 感谢您的提示。目前,我们将大约 40 个工件(WAR 文件)部署到 EAP 实例,我担心在一台服务器上以某种方式运行 40 多个 infinispan 实例有点矫枉过正。另外,我们 AFAIK 无法使用 EAP 配置将缓存注入应用程序模块。另外,我了解red.ht/2kDmQ53 它实际上应该可以工作,或者作为设置没有那么牵强。除此之外,还有什么其他的吗?一个“受支持”的设置可能不会破坏我们使用 infinispan 的 40 多个模块? 一种选择是将状态推出 EAP 并将其放入 Infinispan 服务器实例。您可以在那里为每个 WAR 文件设置不同的缓存。 【参考方案1】:

事实证明,监控 GC 日志中的有趣事件非常有意义...x_X

当时的情况是,我们遇到了旧代 CMS 碎片,Java 虚拟机定期决定暂停应用程序服务器长达 45 秒以进行碎片整理。

我们切换到 G1,这些暂停以及 Infinispan 去同步也消失了。在过去的几个月里,这个设置现在运行得相当稳定。

【讨论】:

以上是关于Infinispan 集群同步问题的主要内容,如果未能解决你的问题,请参考以下文章

JBoss EAP 6.4 Infinispan 集群缓存网络问题

防止Infinispan集群化

在 Keycloak 集群的 Infinispan 缓存中存储用户会话的确切用途是啥?

Infinispan 集群锁性能不会随着节点的增加而提高吗?

Infinispan 中集群侦听器的可扩展性,用于具有许多键的缓存?

在 Wildfly 10.1 中禁用 Infinispan 集群