1. Problem description
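I was running simulated failure tests against a RocketMQ cluster. The test environment:
1. Two Linux servers, each with 64 GB of memory and a 32-core CPU. RocketMQ version 4.2.0.
2. Each server runs one name server (nameSvr), one master broker, and one slave broker (the two servers act as master and slave for each other).
When I simulated the failure of one RocketMQ server (by forcibly stopping every program process on that server), the RocketMQ producer program threw the following exception: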
See http://rocketmq.apache.org/docs/faq/ for further details.
org.apache.rocketmq.client.exception.MQClientException: No route info of this topic, xxx
See http://rocketmq.apache.org/docs/faq/ for further details.
at org.apache.rocketmq.client.impl.producer.DefaultMQProducerImpl.sendDefaultImpl(DefaultMQProducerImpl.java:564)
at org.apache.rocketmq.client.impl.producer.DefaultMQProducerImpl.send(DefaultMQProducerImpl.java:1069)
at org.apache.rocketmq.client.impl.producer.DefaultMQProducerImpl.send(DefaultMQProducerImpl.java:1023)
at org.apache.rocketmq.client.producer.DefaultMQProducer.send(DefaultMQProducer.java:214)
at com.cnc.livect.connserver.common.mq.MQProductor.sendDelayMsg(MQProductor.java:144)
at com.cnc.livect.connserver.common.mq.MQProductor.sendSessionDelayMsg(MQProductor.java:126)
at com.cnc.livect.connserver.common.session.SessionUtil.start(SessionUtil.java:83)
at com.cnc.livect.connserver.common.session.SessionUtil.checkAndStart(SessionUtil.java:105)
at com.cnc.livect.connserver.netty.connhandler.WebSocketFrameHandler.doPong(WebSocketFrameHandler.java:88)
at com.cnc.livect.connserver.netty.connhandler.WebSocketFrameHandler.channelRead0(WebSocketFrameHandler.java:55)
at com.cnc.livect.connserver.netty.connhandler.WebSocketFrameHandler.channelRead0(WebSocketFrameHandler.java:33)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.netty.handler.codec.http.websocketx.Utf8FrameValidator.channelRead(Utf8FrameValidator.java:77)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350)
at com.cnc.livect.connserver.netty.connhandler.WsWebSocketServerProtocolHandler$1.channelRead(WsWebSocketServerProtocolHandler.java:125)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.netty.handler.codec.http.websocketx.extensions.WebSocketServerExtensionHandler.channelRead(WebSocketServerExtensionHandler.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:610)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:551)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:465)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:437)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:745)
2. Analysis and solution
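For a clustered service, it is unreasonable that a single-node failure leaves producers unable to send messages, so I went looking for a solution. Many people online claim that setting autoCreateTopicEnable=true when starting the broker fixes the problem; I tested it and it made no difference. Taking the exception message literally, I guessed that once this broker went down, the route information for the topic disappeared with it. My hypothesis was therefore: when a RocketMQ client consumes from or sends to a topic that does not exist yet, the first request picks one broker at random, and the topic is created and stored only on that broker.
To verify the hypothesis, I used mqadmin to view the topic's route information:
The route information showed clearly that the topic existed only on testMQ1. Once that broker failed, the topic no longer existed anywhere in RocketMQ, and the producer did not automatically create it again on the other broker. (I suspect that re-initializing or restarting the producer would create the topic on the other broker. This is only a guess and I did not verify it, because it does not meet our application's requirements: having to re-initialize or restart the producer whenever an exception occurs is unacceptable. Anyone interested can check whether the guess is correct.)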
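Since the producer's automatic topic creation cannot meet this requirement, the topics to be used should be created manually instead. I used mqadmin to create the topic on the master brokers of both Linux servers, so that when one of them goes down the other still holds the topic and can keep serving. The creation command is as follows: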
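The exact command is not preserved in this text. As a rough sketch only: mqadmin has an `updateTopic` subcommand that takes the name server address (`-n`), a broker address (`-b`), and the topic name (`-t`). The Java snippet below merely assembles one such command line per master broker and prints it; it does not run anything. The class name `UpdateTopicCommands`, the addresses, and the topic name `demoTopic` are placeholders invented for illustration, not values from the test environment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative helper: builds mqadmin command lines, does not execute them. */
public class UpdateTopicCommands {

    /** Builds one "mqadmin updateTopic" argument list per master broker address. */
    public static List<List<String>> build(String namesrvAddr, String topic,
                                           List<String> masterBrokerAddrs) {
        List<List<String>> commands = new ArrayList<>();
        for (String brokerAddr : masterBrokerAddrs) {
            commands.add(Arrays.asList(
                "sh", "mqadmin", "updateTopic",
                "-n", namesrvAddr,  // name server to talk to
                "-b", brokerAddr,   // create the topic on this specific broker
                "-t", topic));      // topic name
        }
        return commands;
    }

    public static void main(String[] args) {
        // Placeholder addresses and topic name; substitute real values.
        List<List<String>> commands = build(
            "192.168.0.1:9876", "demoTopic",
            Arrays.asList("192.168.0.1:10911", "192.168.0.2:10911"));
        for (List<String> command : commands) {
            System.out.println(String.join(" ", command));
        }
    }
}
```

Issuing the command once per master broker address is what places the topic on both machines. Depending on the setup, `updateTopic` with a cluster name (`-c`) instead of `-b` may be a simpler alternative, but that variant was not what the test above used.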
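Checking the topic's route information again now shows that both master brokers hold the topic:
I then repeated the single-node failure test, and it passed cleanly with no exceptions. Problem solved!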
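The difference between the two route states can be illustrated with a toy in-memory model. This is not RocketMQ code and does not reflect how the name server is actually implemented; it only mirrors the reasoning above. The class `RouteTableSketch`, the topic name `demoTopic`, and the second broker name `testMQ2` are assumptions made for the example (only testMQ1 is named in the test above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of topic routing; not RocketMQ code. */
public class RouteTableSketch {
    // topic name -> names of the master brokers that hold the topic
    private final Map<String, Set<String>> topicToBrokers = new HashMap<>();

    /** Records that a topic was created on the given broker. */
    public void createTopic(String topic, String brokerName) {
        topicToBrokers.computeIfAbsent(topic, k -> new TreeSet<>()).add(brokerName);
    }

    /** Brokers that hold the topic and are still alive; an empty list means "no route". */
    public List<String> route(String topic, Set<String> aliveBrokers) {
        List<String> result = new ArrayList<>();
        for (String broker : topicToBrokers.getOrDefault(topic, Collections.emptySet())) {
            if (aliveBrokers.contains(broker)) {
                result.add(broker);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // testMQ1 has been killed; only the hypothetical testMQ2 is alive.
        Set<String> alive = new HashSet<>(Arrays.asList("testMQ2"));

        RouteTableSketch autoCreated = new RouteTableSketch();
        autoCreated.createTopic("demoTopic", "testMQ1"); // topic lives on one broker only
        System.out.println(autoCreated.route("demoTopic", alive)); // prints []

        RouteTableSketch manuallyCreated = new RouteTableSketch();
        manuallyCreated.createTopic("demoTopic", "testMQ1");
        manuallyCreated.createTopic("demoTopic", "testMQ2"); // topic lives on both brokers
        System.out.println(manuallyCreated.route("demoTopic", alive)); // prints [testMQ2]
    }
}
```

The empty list in the first case corresponds to the "No route info of this topic" exception; in the second case the surviving broker still provides a route, which matches the result of the repeated test.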
Copyright notice: this is an original article by the blog author; please credit the source when reposting. https://blog.csdn.net/guiliguiwang/article/details/79852556