背景描述:
statis服务,接收mq消息,进行聚合数据,保存到ElaticSearch。
前天下午,运营管理人员反馈说,用户实际充值到账了,但是我们的后台延迟,甚至丢单。(后台订单展示都是通过es进行查询的,说明es的数据没有被保存进来)
下面是排查的步骤:
1、数据库是否充值成功,且es里的数据不存在。
--被证实是的。
2、使用kibana查看线上日志,某个服务节点一直报错如下:
send heart beat to broker exception org.apache.rocketmq.remoting.exception.RemotingTimeoutException: wait response on the channel <192.168.20.152:21001> timeout, 3000(ms)
at org.apache.rocketmq.remoting.netty.NettyRemotingAbstract.invokeSyncImpl(NettyRemotingAbstract.java:291)
at org.apache.rocketmq.remoting.netty.NettyRemotingClient.invokeSync(NettyRemotingClient.java:338)
at org.apache.rocketmq.client.impl.MQClientAPIImpl.sendHearbeat(MQClientAPIImpl.java:838)
at org.apache.rocketmq.client.impl.factory.MQClientInstance.sendHeartbeatToAllBroker(MQClientInstance.java:496)
at org.apache.rocketmq.client.impl.factory.MQClientInstance.sendHeartbeatToAllBrokerWithLock(MQClientInstance.java:410)
at org.apache.rocketmq.client.impl.factory.MQClientInstance$4.run(MQClientInstance.java:286)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
3、打开rocket mq console, 并检查broker运行正常。排除是mq消息中间件的故障。
4、top -H -p <pid> 查看应用节点的内部线程占用cpu情况。发现cpu飙升,有20来个线程大量占用cpu, 且都是java应用的,都居高不下。(pid,进程ID, 通过ps ef|grep java 或者 jps等都可以分析出)
5、初步怀疑是出现死锁,所以让运维jstack -l >/data/log/stack 打印出线程。
重点查这么几类情况:
1)wait on monitor entry: 被阻塞的,肯定有问题
"ConsumeMessageThread_19" #525 prio=5 os_prio=0 tid=0x00007f734c63e000 nid=0x222 waiting for monitor entry [0x00007f703c60d000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.hibernate.boot.registry.classloading.internal.ClassLoaderServiceImpl.classForName(ClassLoaderServiceImpl.java:226)
at org.hibernate.internal.util.ReflectHelper.getConstantValue(ReflectHelper.java:235)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl$JavaConstantConverter.handleDotStructure(QueryTranslatorImpl.java:615)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl$JavaConstantConverter.visit(QueryTranslatorImpl.java:610)
at org.hibernate.hql.internal.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:60)
at org.hibernate.hql.internal.ast.util.NodeTraverser.traverseDepthFirst(NodeTraverser.java:49)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.parse(QueryTranslatorImpl.java:284)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.doCompile(QueryTranslatorImpl.java:187)
- locked <0x0000000785d14ed0> (a org.hibernate.hql.internal.ast.QueryTranslatorImpl)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.compile(QueryTranslatorImpl.java:142)
at org.hibernate.engine.query.spi.HQLQueryPlan.<init>(HQLQueryPlan.java:115)
at org.hibernate.engine.query.spi.HQLQueryPlan.<init>(HQLQueryPlan.java:76)
at org.hibernate.engine.query.spi.QueryPlanCache.getHQLQueryPlan(QueryPlanCache.java:150)
at org.hibernate.internal.AbstractSessionImpl.getHQLQueryPlan(AbstractSessionImpl.java:302)
at org.hibernate.internal.AbstractSessionImpl.createQuery(AbstractSessionImpl.java:240)
at org.hibernate.internal.SessionImpl.createQuery(SessionImpl.java:1894)
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.createQuery(AbstractEntityManagerImpl.java:531)
at org.hibernate.jpa.criteria.CriteriaQueryImpl$1.buildCompiledQuery(CriteriaQueryImpl.java:316)
at org.hibernate.jpa.criteria.compile.CriteriaCompiler.compile(CriteriaCompiler.java:130)
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.createQuery(AbstractEntityManagerImpl.java:699)
at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.orm.jpa.ExtendedEntityManagerCreator$ExtendedEntityManagerInvocationHandler.invoke(ExtendedEntityManagerCreator.java:347)
at com.sun.proxy.$Proxy126.createQuery(Unknown Source)
at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.orm.jpa.SharedEntityManagerCreator$SharedEntityManagerInvocationHandler.invoke(SharedEntityManagerCreator.java:298)
at com.sun.proxy.$Proxy126.createQuery(Unknown Source)
at org.springframework.data.jpa.repository.support.SimpleJpaRepository.getQuery(SimpleJpaRepository.java:656)
at com.gws.utils.jpa.BaseSimpleJpaRepositoryEx.getQuery(BaseSimpleJpaRepositoryEx.java:292)
at org.springframework.data.jpa.repository.support.SimpleJpaRepository.getQuery(SimpleJpaRepository.java:633)
at com.gws.utils.jpa.BaseSimpleJpaRepositoryEx.getQuery(BaseSimpleJpaRepositoryEx.java:286)
at org.springframework.data.jpa.repository.support.SimpleJpaRepository.findOne(SimpleJpaRepository.java:387)
at com.gws.utils.jpa.BaseSimpleJpaRepositoryEx.findOne(BaseSimpleJpaRepositoryEx.java:189)
at com.gws.utils.jpa.BaseSimpleJpaRepositoryEx.findOne(BaseSimpleJpaRepositoryEx.java:122)
at sun.reflect.GeneratedMethodAccessor131.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.executeMethodOn(RepositoryFactorySupport.java:504)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:489)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:461)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:61)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:282)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:133)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.repository.core.support.SurroundingTransactionDetectorMethodInterceptor.invoke(SurroundingTransactionDetectorMethodInterceptor.java:57)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy141.findOne(Unknown Source)
at com.gws.services.user.impl.UserServiceImpl.getUserBaseInfo(UserServiceImpl.java:98)
at com.gws.services.user.impl.UserManagerServiceImpl.saveUserOperation(UserManagerServiceImpl.java:106)
at com.gws.handle.rockeqmq.RocketMqListenHandler.onUserOperation(RocketMqListenHandler.java:81)
at sun.reflect.GeneratedMethodAccessor136.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:216)
at com.yushi.core.MqConsumerBeanAutoConfiguration$2.consumeMessage(MqConsumerBeanAutoConfiguration.java:190)
at org.apache.rocketmq.client.impl.consumer.ConsumeMessageConcurrentlyService$ConsumeRequest.run(ConsumeMessageConcurrentlyService.java:419)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x00000007617e4e78> (a java.util.concurrent.ThreadPoolExecutor$Worker)
2)runnable : 注意IO线程
"ConsumeMessageThread_15" #470 prio=5 os_prio=0 tid=0x00007f727c086800 nid=0x1e9 runnable [0x00007f703e4c4000]
java.lang.Thread.State: RUNNABLE
at org.springframework.boot.loader.jar.Handler.getFileFromContext(Handler.java:176)
at org.springframework.boot.loader.jar.Handler.parseURL(Handler.java:152)
at java.net.URL.<init>(URL.java:622)
at java.net.URL.<init>(URL.java:490)
at sun.misc.URLClassPath$Loader.getResource(URLClassPath.java:692)
at sun.misc.URLClassPath.getResource(URLClassPath.java:212)
at java.net.URLClassLoader$1.run(URLClassLoader.java:365)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
- locked <0x000000074337d530> (a java.lang.Object)
at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:94)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.hibernate.boot.registry.classloading.internal.ClassLoaderServiceImpl$AggregatedClassLoader.findClass(ClassLoaderServiceImpl.java:209)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
- locked <0x00000006fb8ea7e0> (a org.hibernate.boot.registry.classloading.internal.ClassLoaderServiceImpl$AggregatedClassLoader)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.hibernate.boot.registry.classloading.internal.ClassLoaderServiceImpl.classForName(ClassLoaderServiceImpl.java:226)
at org.hibernate.internal.util.ReflectHelper.getConstantValue(ReflectHelper.java:235)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl$JavaConstantConverter.handleDotStructure(QueryTranslatorImpl.java:615)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl$JavaConstantConverter.visit(QueryTranslatorImpl.java:610)
at org.hibernate.hql.internal.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:60)
at org.hibernate.hql.internal.ast.util.NodeTraverser.traverseDepthFirst(NodeTraverser.java:49)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.parse(QueryTranslatorImpl.java:284)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.doCompile(QueryTranslatorImpl.java:187)
- locked <0x0000000785cff140> (a org.hibernate.hql.internal.ast.QueryTranslatorImpl)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.compile(QueryTranslatorImpl.java:142)
at org.hibernate.engine.query.spi.HQLQueryPlan.<init>(HQLQueryPlan.java:115)
at org.hibernate.engine.query.spi.HQLQueryPlan.<init>(HQLQueryPlan.java:76)
at org.hibernate.engine.query.spi.QueryPlanCache.getHQLQueryPlan(QueryPlanCache.java:150)
at org.hibernate.internal.AbstractSessionImpl.getHQLQueryPlan(AbstractSessionImpl.java:302)
at org.hibernate.internal.AbstractSessionImpl.createQuery(AbstractSessionImpl.java:240)
at org.hibernate.internal.SessionImpl.createQuery(SessionImpl.java:1894)
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.createQuery(AbstractEntityManagerImpl.java:531)
at org.hibernate.jpa.criteria.CriteriaQueryImpl$1.buildCompiledQuery(CriteriaQueryImpl.java:316)
at org.hibernate.jpa.criteria.compile.CriteriaCompiler.compile(CriteriaCompiler.java:130)
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.createQuery(AbstractEntityManagerImpl.java:699)
at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.orm.jpa.ExtendedEntityManagerCreator$ExtendedEntityManagerInvocationHandler.invoke(ExtendedEntityManagerCreator.java:347)
at com.sun.proxy.$Proxy126.createQuery(Unknown Source)
at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.orm.jpa.SharedEntityManagerCreator$SharedEntityManagerInvocationHandler.invoke(SharedEntityManagerCreator.java:298)
at com.sun.proxy.$Proxy126.createQuery(Unknown Source)
at org.springframework.data.jpa.repository.support.SimpleJpaRepository.getQuery(SimpleJpaRepository.java:656)
at com.gws.utils.jpa.BaseSimpleJpaRepositoryEx.getQuery(BaseSimpleJpaRepositoryEx.java:292)
at org.springframework.data.jpa.repository.support.SimpleJpaRepository.getQuery(SimpleJpaRepository.java:633)
at com.gws.utils.jpa.BaseSimpleJpaRepositoryEx.getQuery(BaseSimpleJpaRepositoryEx.java:286)
at org.springframework.data.jpa.repository.support.SimpleJpaRepository.findOne(SimpleJpaRepository.java:387)
at com.gws.utils.jpa.BaseSimpleJpaRepositoryEx.findOne(BaseSimpleJpaRepositoryEx.java:189)
at com.gws.utils.jpa.BaseSimpleJpaRepositoryEx.findOne(BaseSimpleJpaRepositoryEx.java:122)
at sun.reflect.GeneratedMethodAccessor131.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.executeMethodOn(RepositoryFactorySupport.java:504)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:489)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:461)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:61)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:282)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:133)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.repository.core.support.SurroundingTransactionDetectorMethodInterceptor.invoke(SurroundingTransactionDetectorMethodInterceptor.java:57)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy141.findOne(Unknown Source)
at com.gws.services.user.impl.UserServiceImpl.getUserBaseInfo(UserServiceImpl.java:98)
at com.gws.services.user.impl.UserManagerServiceImpl.saveUserOperation(UserManagerServiceImpl.java:106)
at com.gws.handle.rockeqmq.RocketMqListenHandler.onUserOperation(RocketMqListenHandler.java:81)
at sun.reflect.GeneratedMethodAccessor136.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:216)
at com.yushi.core.MqConsumerBeanAutoConfiguration$2.consumeMessage(MqConsumerBeanAutoConfiguration.java:190)
at org.apache.rocketmq.client.impl.consumer.ConsumeMessageConcurrentlyService$ConsumeRequest.run(ConsumeMessageConcurrentlyService.java:419)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x00000007462cbdd8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
RUNNABLE状态的线程还有一个见下:
"ConsumeMessageThread_19" #468 prio=5 os_prio=0 tid=0x00007f7274081000 nid=0x1e7 runnable [0x00007f703e5c9000]
java.lang.Thread.State: RUNNABLE
at redis.clients.jedis.BinaryClient.setnx(BinaryClient.java:196)
at redis.clients.jedis.BinaryJedis.setnx(BinaryJedis.java:598)
at org.springframework.data.redis.connection.jedis.JedisConnection.setNX(JedisConnection.java:1456)
at org.springframework.data.redis.core.DefaultValueOperations$12.doInRedis(DefaultValueOperations.java:219)
at org.springframework.data.redis.core.DefaultValueOperations$12.doInRedis(DefaultValueOperations.java:216)
at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:207)
at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:169)
at org.springframework.data.redis.core.AbstractOperations.execute(AbstractOperations.java:91)
at org.springframework.data.redis.core.DefaultValueOperations.setIfAbsent(DefaultValueOperations.java:216)
at com.gws.services.detail.impl.AggregationServiceImpl.saveUserActive(AggregationServiceImpl.java:192)
at com.gws.handle.rockeqmq.RocketMqListenHandler.onMerUserActive(RocketMqListenHandler.java:161)
at sun.reflect.GeneratedMethodAccessor129.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:216)
at com.yushi.core.MqConsumerBeanAutoConfiguration$2.consumeMessage(MqConsumerBeanAutoConfiguration.java:190)
at org.apache.rocketmq.client.impl.consumer.ConsumeMessageConcurrentlyService$ConsumeRequest.run(ConsumeMessageConcurrentlyService.java:419)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x00000007462d56b8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
3)in Object.wait(): 注意非线程池等待
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f74703d3800 nid=0x25 in Object.wait() [0x00007f7404ecd000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000006c01fe740> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
Locked ownable synchronizers:
- None
"OkHttp ConnectionPool" #344 daemon prio=5 os_prio=0 tid=0x00007f707c334000 nid=0x16b in Object.wait() [0x00007f70aa351000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:460)
at okhttp3.ConnectionPool$1.run(ConnectionPool.java:66)
- locked <0x00000007413778d8> (a okhttp3.ConnectionPool)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x0000000740560d60> (a java.util.concurrent.ThreadPoolExecutor$Worker)
最后注意看runnable的GC任务线程
"VM Thread" os_prio=0 tid=0x00007f74703c7000 nid=0x23 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f747001f000 nid=0xc runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f7470021000 nid=0xd runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f7470022800 nid=0xe runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f7470024800 nid=0xf runnable
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007f7470026800 nid=0x10 runnable
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007f7470028000 nid=0x11 runnable
"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007f747002a000 nid=0x12 runnable
"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007f747002c000 nid=0x13 runnable
"GC task thread#8 (ParallelGC)" os_prio=0 tid=0x00007f747002d800 nid=0x14 runnable
"GC task thread#9 (ParallelGC)" os_prio=0 tid=0x00007f747002f800 nid=0x15 runnable
"GC task thread#10 (ParallelGC)" os_prio=0 tid=0x00007f7470031800 nid=0x16 runnable
"GC task thread#11 (ParallelGC)" os_prio=0 tid=0x00007f7470033000 nid=0x17 runnable
"GC task thread#12 (ParallelGC)" os_prio=0 tid=0x00007f7470035000 nid=0x18 runnable
"GC task thread#13 (ParallelGC)" os_prio=0 tid=0x00007f7470036800 nid=0x19 runnable
"GC task thread#14 (ParallelGC)" os_prio=0 tid=0x00007f7470038800 nid=0x1a runnable
"GC task thread#15 (ParallelGC)" os_prio=0 tid=0x00007f747003a800 nid=0x1b runnable
"GC task thread#16 (ParallelGC)" os_prio=0 tid=0x00007f747003c000 nid=0x1c runnable
"GC task thread#17 (ParallelGC)" os_prio=0 tid=0x00007f747003e000 nid=0x1d runnable
"GC task thread#18 (ParallelGC)" os_prio=0 tid=0x00007f7470040000 nid=0x1e runnable
"GC task thread#19 (ParallelGC)" os_prio=0 tid=0x00007f7470041800 nid=0x1f runnable
"GC task thread#20 (ParallelGC)" os_prio=0 tid=0x00007f7470043800 nid=0x20 runnable
"GC task thread#21 (ParallelGC)" os_prio=0 tid=0x00007f7470045800 nid=0x21 runnable
"GC task thread#22 (ParallelGC)" os_prio=0 tid=0x00007f7470047000 nid=0x22 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f7470cb7800 nid=0x88 waiting on condition
JNI global references: 419
6、使用pinpoint也可以分析服务节点,出现频繁GC的现象,符合上面22个GC task thread runnable的情况。 一个服务启动起来,因为预发环境设置的堆大小是1G,在我们使用http接口补偿数据的时候,很容易复现出OOM,然后报mq连接超时的错误。
线上设置的是4G,而且jpa拼接查询条件的bug是偶现的,所以当时是采用服务扩容的方式,由4个节点扩展到8个节点。
因为确定了频繁full gc 引起cpu飙升,进而服务不可用。
第二天分析目标就是堆的分配及使用情况,需要找到具体导致OOM的线程。
7、需要借助于内存分析工具MAT进行问题跟踪。
(这里推荐一个mac可用http://101.96.10.44/mirror.rise.ph/eclipse/mat/1.7/rcp/MemoryAnalyzer-1.7.0.20170613-macosx.cocoa.x86_64.zip )
jmap -dump:live,format=b,file=heap.hprof <pid>
有4种Action来分析heap profile:
- Histogram:与命令 jmap -histo:live <pid> | head -7 类似,只是在MAT里面可以用GUI来展示应用系统各个类产生的实例。根据Shllow Heap排序。
2)Dominator Tree:用来显示大对象的占用率。
8、回到最初的报错业务代码,根据用户ID查询用户信息,怎么也想不出会很慢。事实是产生了大量的用户对象(用户表数量有几百万),让gc忙不过来了都~~
at com.gws.services.user.impl.UserServiceImpl.getUserBaseInfo(UserServiceImpl.java:98)
@Override
public UserBaseInfo getUserBaseInfo(Long uid) {
UserBaseInfoQuery query = new UserBaseInfoQuery();
query.setUid(uid);
return userBaseInfoSlave.findOne(query);
}
修改为
@Override
public UserBaseInfo getUserBaseInfo(Long uid) {
return userBaseInfoSlave.findOne(uid);
}
总结:query的uid条件,相当于是没生效,导致是全表查询。
网友评论