Existing Environment
The pseudo-cluster on the local machine consists of three nodes (slave-9200, slave-9201, slave-9202), configured as follows:
# === slave-9200 node configuration ===
cluster.name: my-cluster
node.name: slave-9200
node.master: true
network.host: 0.0.0.0
http.port: 9200
transport.tcp.port: 9300
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300", "127.0.0.1:9301", "127.0.0.1:9302"]
discovery.zen.minimum_master_nodes: 2
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-credentials: true
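The three http.cors.* settings are what let the browser-based head plugin, which runs on a different origin, query the cluster. A quick way to confirm CORS is in effect (assuming head is served from http://localhost:9100, which is only an example origin) is to send a request with an Origin header and check that the response headers should include Access-Control-Allow-Origin:
# Dump the response headers for a request carrying an Origin header; discard the body
curl -s -D - -o /dev/null -H "Origin: http://localhost:9100" 'http://127.0.0.1:9200/'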
# === slave-9201 node configuration ===
cluster.name: my-cluster
node.name: slave-9201
network.host: 0.0.0.0
http.port: 9201
transport.tcp.port: 9301
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300", "127.0.0.1:9301", "127.0.0.1:9302"]
discovery.zen.minimum_master_nodes: 2
# === slave-9202 node configuration ===
cluster.name: my-cluster
node.name: slave-9202
network.host: 0.0.0.0
http.port: 9202
transport.tcp.port: 9302
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300", "127.0.0.1:9301", "127.0.0.1:9302"]
discovery.zen.minimum_master_nodes: 2
The cluster status viewed through the head plugin is shown in the figure below:

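If the head plugin view is not available, the same overview can be pulled from the _cat and cluster health APIs, for example:
# List the nodes in the cluster (the master column marks the elected master with *)
curl 'http://127.0.0.1:9200/_cat/nodes?v'
# Overall cluster health: status, node count, shard counts
curl 'http://127.0.0.1:9200/_cluster/health?pretty'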
Expansion Requirement
Add two new nodes, slave-9203 and slave-9204.
Expansion Procedure
1. Copy es_slave1 twice and name the copies es_slave3 and es_slave4. The Linux commands are as follows:
cp -rf es_slave1 es_slave3
cp -rf es_slave1 es_slave4
After copying, list the ES directories with ls -l:
drwxr-xr-x@ 13 jay-xqt staff 442 10 13 14:59 es_slave1
drwxr-xr-x@ 13 jay-xqt staff 442 10 13 14:59 es_slave2
drwxr-xr-x@ 13 jay-xqt staff 442 10 13 15:28 es_slave3
drwxr-xr-x@ 13 jay-xqt staff 442 10 13 15:41 es_slave4
2. Configure elasticsearch.yml for es_slave3 and es_slave4:
# === slave-9203 node configuration ===
cluster.name: my-cluster
node.name: slave-9203
network.host: 0.0.0.0
http.port: 9203
transport.tcp.port: 9303
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300", "127.0.0.1:9301", "127.0.0.1:9302"]
discovery.zen.minimum_master_nodes: 2
# === slave-9204 node configuration ===
cluster.name: my-cluster
node.name: slave-9204
network.host: 0.0.0.0
http.port: 9204
transport.tcp.port: 9304
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300", "127.0.0.1:9301", "127.0.0.1:9302"]
discovery.zen.minimum_master_nodes: 2
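One caveat worth noting: node.master defaults to true in the Elasticsearch versions that use zen discovery, so after the expansion there are five master-eligible nodes. By the usual quorum rule (master-eligible nodes / 2 + 1), discovery.zen.minimum_master_nodes should then be raised from 2 to 3 on every node to guard against split brain; a sketch of the adjusted line:
# With five master-eligible nodes, quorum = 5 / 2 + 1 = 3
discovery.zen.minimum_master_nodes: 3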
3. Start slave-9203 and slave-9204 one after the other. When slave-9203 was started, however, the following error was reported:
[2018-10-13T15:30:57,562][INFO ][o.e.d.z.ZenDiscovery ] [slave-9203] failed to send join request to master [{slave-9201}{EvW-pPzMQu2X4Wxj9tybew}{zaJboJqtS_SbCbzm9pxINQ}{192.168.0.101}{192.168.0.101:9301}], reason [RemoteTransportException[[slave-9201][192.168.0.101:9301][internal:discovery/zen/join]]; nested: IllegalArgumentException[can't add node {slave-9203}{EvW-pPzMQu2X4Wxj9tybew}{NICrtQeUQG-XuT9OyYRTsw}{192.168.0.101}{192.168.0.101:9303}, found existing node {slave-9201}{EvW-pPzMQu2X4Wxj9tybew}{zaJboJqtS_SbCbzm9pxINQ}{192.168.0.101}{192.168.0.101:9301} with the same id but is a different node instance]; ]
[2018-10-13T15:31:00,591][INFO ][o.e.d.z.ZenDiscovery ] [slave-9203] failed to send join request to master [{slave-9201}{EvW-pPzMQu2X4Wxj9tybew}{zaJboJqtS_SbCbzm9pxINQ}{192.168.0.101}{192.168.0.101:9301}], reason [RemoteTransportException[[slave-9201][192.168.0.101:9301][internal:discovery/zen/join]]; nested: IllegalArgumentException[can't add node {slave-9203}{EvW-pPzMQu2X4Wxj9tybew}{NICrtQeUQG-XuT9OyYRTsw}{192.168.0.101}{192.168.0.101:9303}, found existing node {slave-9201}{EvW-pPzMQu2X4Wxj9tybew}{zaJboJqtS_SbCbzm9pxINQ}{192.168.0.101}{192.168.0.101:9301} with the same id but is a different node instance]; ]
From the error message, found existing node {...}{192.168.0.101:9301} with the same id but is a different node instance, it is clear that es_slave3, copied from es_slave1, still carries the original node's data and therefore the same node id, which conflicts with a node already in the cluster. The new node needs an id of its own, so the copied node data in es_slave3 has to be removed first.
Solution
Delete the contents of the data directory inside the copied es_slave3 folder. The full path is /Users/jay-xqt/Downloads/myapp/es_slave/es_slave3/data/, which contains a nodes folder; deleting that folder is enough. Then restart slave-9203 and the problem is resolved.
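A minimal sketch of the cleanup, assuming the directory layout shown earlier and run from the parent directory that holds es_slave1 through es_slave4:
# Remove the node state copied over from es_slave1 so a fresh node id is generated on the next start
rm -rf es_slave3/data/nodes
# es_slave4 was produced by the same copy, so it presumably needs the same cleanup before its first start
rm -rf es_slave4/data/nodes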
[2018-10-13T15:36:56,360][INFO ][o.e.d.DiscoveryModule ] [slave-9203] using discovery type [zen]
[2018-10-13T15:36:57,145][INFO ][o.e.n.Node ] [slave-9203] initialized
[2018-10-13T15:36:57,146][INFO ][o.e.n.Node ] [slave-9203] starting ...
[2018-10-13T15:37:02,368][INFO ][o.e.t.TransportService ] [slave-9203] publish_address {192.168.0.101:9303}, bound_addresses {[::]:9303}
[2018-10-13T15:37:02,380][INFO ][o.e.b.BootstrapChecks ] [slave-9203] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2018-10-13T15:37:05,662][INFO ][o.e.c.s.ClusterService ] [slave-9203] detected_master {slave-9201}{EvW-pPzMQu2X4Wxj9tybew}{zaJboJqtS_SbCbzm9pxINQ}{192.168.0.101}{192.168.0.101:9301}, added {{slave-9200}{cT55VEW8Tz6vwksba2WDiQ}{9iAMeRhTQ16uo11O5bFk8Q}{192.168.0.101}{192.168.0.101:9300},{slave-9202}{zOJljPRWSpCRHJwtl_0QSQ}{Z8u7LzvIRP-zKRxGObw-7w}{192.168.0.101}{192.168.0.101:9302},{slave-9201}{EvW-pPzMQu2X4Wxj9tybew}{zaJboJqtS_SbCbzm9pxINQ}{192.168.0.101}{192.168.0.101:9301},}, reason: zen-disco-receive(from master [master {slave-9201}{EvW-pPzMQu2X4Wxj9tybew}{zaJboJqtS_SbCbzm9pxINQ}{192.168.0.101}{192.168.0.101:9301} committed version [114]])
[2018-10-13T15:37:06,027][INFO ][o.e.h.n.Netty4HttpServerTransport] [slave-9203] publish_address {192.168.0.101:9203}, bound_addresses {[::]:9203}
[2018-10-13T15:37:06,028][INFO ][o.e.n.Node ] [slave-9203] started
At the same time, the logs of the other nodes recorded that a new node had joined the cluster:
[2018-10-13T15:37:05,648][INFO ][o.e.c.s.ClusterService ] [slave-9200] added {{slave-9203}{DK8bum6BTQya8osXyQMv3A}{yF1-sxNLS1uBO_7FN1wZ-w}{192.168.0.101}{192.168.0.101:9303},}, reason: zen-disco-receive(from master [master {slave-9201}{EvW-pPzMQu2X4Wxj9tybew}{zaJboJqtS_SbCbzm9pxINQ}{192.168.0.101}{192.168.0.101:9301} committed version [114]])
Expansion Result


As the result shows, the index shards are reallocated across the nodes every time a new node is added.
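To inspect the resulting layout directly, the shard-to-node assignment can be listed with the _cat/shards API, for example:
# Show every shard, its state (STARTED, RELOCATING, ...), and the node that currently holds it
curl 'http://127.0.0.1:9200/_cat/shards?v'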