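The previous article walked through deploying an Elasticsearch search cluster. This one rounds out the search functionality by adding a Chinese analyzer (tokenizer) plugin, IK.
IK analyzer source code: https://github.com/medcl/elasticsearch-analysis-ik
3.1 Install the analyzer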
[root@elastic-redis-03 elasticsearch]# ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.1/elasticsearch-analysis-ik-6.2.1.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.1/elasticsearch-analysis-ik-6.2.1.zip
[=================================================] 100%
-> Installed analysis-ik
[root@elastic-redis-03 elasticsearch]# ls
bin config data lib LICENSE.txt logs modules NOTICE.txt plugins README.textile
[root@elastic-redis-03 elasticsearch]# ls -l plugins/
total 4
drwxr-xr-x 2 root root 4096 Jun 14 13:04 analysis-ik # this directory is created by the install step
[root@elastic-redis-03 elasticsearch]# ls -l plugins/analysis-ik/
total 1420
-rw-r--r-- 1 root root 263965 Jun 14 13:04 commons-codec-1.9.jar
-rw-r--r-- 1 root root 61829 Jun 14 13:04 commons-logging-1.2.jar
-rw-r--r-- 1 root root 51097 Jun 14 13:04 elasticsearch-analysis-ik-6.2.1.jar
-rw-r--r-- 1 root root 736658 Jun 14 13:04 httpclient-4.5.2.jar
-rw-r--r-- 1 root root 326724 Jun 14 13:04 httpcore-4.4.4.jar
-rw-r--r-- 1 root root 1805 Jun 14 13:04 plugin-descriptor.properties
[root@elastic-redis-03 elasticsearch]#
[root@elastic-redis-03 elasticsearch]# ls -l config/
total 68
drwxr-x--- 2 elastic elastic 4096 Jun 14 13:04 analysis-ik # this directory is created by the install step
-rw-rw---- 1 elastic elastic 1868 Jun 14 14:52 elasticsearch.yml
-rw-rw---- 1 elastic elastic 2767 Jun 14 12:39 jvm.options
-rw-rw---- 1 elastic elastic 5091 Feb 8 03:30 log4j2.properties
-rw------- 1 elastic elastic 41824 Jun 14 14:50 nohup.out
[root@elastic-redis-03 elasticsearch]#
3.2 Run the same installation on the other two nodes
3.3 Verify that the plugin is installed
[root@elastic-redis-03 analysis-ik]# curl -XGET 172.31.15.172:9200/_cat/plugins
node-1 analysis-ik 6.2.1
node-2 analysis-ik 6.2.1
node-3 analysis-ik 6.2.1
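3.4 Restart Elasticsearch on all three servers. The restart is also a good opportunity to watch the cluster state. The simplest way to do that is the web page provided by the head plugin, described in the previous article.
The figure below shows the cluster state so far. The star marks the master node and the circles mark data nodes.
3.png
Now shut down node 2 (node-2) and check the cluster state.
4.png
Next, start node 2 again and check. Then shut down node-3 and check.
5.png
6.png
I did not test node-1, because head is installed only on node-1.
The observed states match how an Elasticsearch cluster is expected to behave.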
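If head is not available, the same observations can be made from the command line. The sketch below wraps two standard Elasticsearch endpoints, _cluster/health and _cat/nodes, in small shell functions. The ES_HOST variable and the function names are my own, and the default address is simply the node used elsewhere in this article.

```shell
# Assumption: ES_HOST points at any reachable node of the cluster.
ES_HOST="${ES_HOST:-172.31.15.172:9200}"

# Overall cluster status (green / yellow / red) and the number of nodes.
cluster_health() {
  curl -s -XGET "http://${ES_HOST}/_cluster/health?pretty"
}

# One row per node; the row with * in the "master" column is the elected master.
cluster_nodes() {
  curl -s -XGET "http://${ES_HOST}/_cat/nodes?v"
}
```

Calling cluster_health and cluster_nodes before and after stopping a node should show the node count change and, if the master was stopped, a different node marked as master.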
Back to the main topic: the analyzer.
3.5 Create an index
[root@pro-3-b ~]# curl -XPUT 'http://172.31.15.172:9200/customer?pretty'
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "customer"
}
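An index created this way has no explicit mapping, so its text fields are analyzed with the default standard analyzer, not with IK. To have IK applied to a field, the analyzer has to be named in the mapping. Below is a minimal sketch for Elasticsearch 6.x. The index name customer_ik is hypothetical (chosen so it does not collide with the index above), the function names are my own, and ik_max_word for both analyzer and search_analyzer is just one reasonable choice.

```shell
# Assumptions: ES_HOST is a reachable node; IK_INDEX is a hypothetical index name.
ES_HOST="${ES_HOST:-172.31.15.172:9200}"
IK_INDEX="${IK_INDEX:-customer_ik}"

# Mapping for the "external" type: analyze the "name" field with IK
# both when indexing documents and when parsing search queries.
ik_mapping_body() {
  cat <<'EOF'
{
  "mappings": {
    "external": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        }
      }
    }
  }
}
EOF
}

# Create the index with the mapping above.
create_ik_index() {
  curl -s -XPUT -H "Content-Type: application/json" \
    "http://${ES_HOST}/${IK_INDEX}?pretty" -d "$(ik_mapping_body)"
}
```

Running create_ik_index once and then indexing documents into that index makes the name field go through IK. A mapping cannot be changed for an existing field, which is why the sketch uses a fresh index.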
- Index documents into the customer index, under the external type
[root@pro-3-b ~]# curl -XPOST -H "Content-Type: application/json" '172.31.15.172:9200/customer/external?pretty' -d '
{
"name": "安徽省长江流域"
}'
[root@pro-3-b ~]# curl -XPOST -H "Content-Type: application/json" '172.31.15.172:9200/customer/external?pretty' -d '
{
"name": "省长是中华人民共和国的部级官员"
}'
[root@pro-3-b ~]# curl -XPOST -H "Content-Type: application/json" '172.31.15.172:9200/customer/external?pretty' -d '
> {
> "name": "河南省郑州市"
> }'
{
"_index" : "customer",
"_type" : "external",
"_id" : "jB86m2MB0bX0KdnnB-Dc",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1
}
3.6 Keyword search test
# Search for the keyword "省长"
[root@pro-3-b ~]# curl -XGET '172.31.15.172:9200/_search' -H "Content-Type: application/json" -d '
{
"query": {
"match": {
"name": "省长"
}
}
}'
{"took":4,"timed_out":false,"_shards":{"total":15,"successful":15,"skipped":0,"failed":0},"hits":{"total":3,"max_score":0.74487394,"hits":[{"_index":"customer","_type":"external","_id":"ix-tmmMB0bX0KdnnhuCs","_score":0.74487394,"_source":
{
"name": "省长是中华人民共和国的部级官员"
}},{"_index":"customer","_type":"external","_id":"ih-pmmMB0bX0Kdnn0OAK","_score":0.5753642,"_source":
{
"name": "安徽省长江流域"
}},{"_index":"customer","_type":"external","_id":"jB86m2MB0bX0KdnnB-Dc","_score":0.22108285,"_source":
{
"name": "河南省郑州市"
}}]}}[root@pro-3-b ~]#
# Search for the keyword "中国"
[root@pro-3-b ~]# curl -XGET '172.31.15.172:9200/_search' -H "Content-Type: application/json" -d '
{
"query": {
"match": {
"name": "中国"
}
}
}'
{"took":7,"timed_out":false,"_shards":{"total":15,"successful":15,"skipped":0,"failed":0},"hits":{"total":1,"max_score":1.179499,"hits":[{"_index":"customer","_type":"external","_id":"ix-tmmMB0bX0KdnnhuCs","_score":1.179499,"_source":
{
"name": "省长是中华人民共和国的部级官员"
}}]}}[root@pro-3-b ~]#
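Both queries return hits, so the text is being tokenized and matched. Keep in mind that the index was created without an explicit mapping, so the name field here is analyzed by the default standard analyzer rather than by IK; that is also why "省长" matches "河南省郑州市" through the shared character "省". The same searches can also be run from head, as shown in the figure below: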
7.png
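The searches in this section call /_search without an index name, so they run against every index in the cluster, which is why the responses report 15 shards. A small sketch that limits a match query to one index follows. The function names and the ES_HOST variable are my own, and the body builder inserts its arguments verbatim, so it assumes they contain no double quotes or backslashes.

```shell
# Assumption: ES_HOST is a reachable node of the cluster.
ES_HOST="${ES_HOST:-172.31.15.172:9200}"

# Print a match query body. $1 = field name, $2 = keyword.
match_body() {
  printf '{"query": {"match": {"%s": "%s"}}}' "$1" "$2"
}

# Run the match query against a single index.
# $1 = index name, $2 = field name, $3 = keyword.
search_index() {
  curl -s -XGET -H "Content-Type: application/json" \
    "http://${ES_HOST}/$1/_search?pretty" -d "$(match_body "$2" "$3")"
}
```

For example, search_index customer name "省长" repeats the first search above against the customer index only.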
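Note:
ik_max_word: splits the text at the finest granularity. For example, "俄罗斯世界杯即将开幕" is split into "俄罗斯, 罗斯, 斯世, 世界杯, 世界, 杯, 即将, 开幕", exhausting every possible combination.
ik_smart: splits the text at the coarsest granularity. For example, "俄罗斯世界杯即将开幕" is split into "俄罗斯, 世界杯, 即将, 开幕".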
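The difference between the two analyzers can be checked directly with the _analyze API, which returns the tokens an analyzer produces for a given text without indexing anything. The sketch below is one way to call it. The function names and the ES_HOST variable are my own, and the text is inserted into the JSON verbatim, so it must not contain double quotes or backslashes.

```shell
# Assumption: ES_HOST is a reachable node with the IK plugin installed.
ES_HOST="${ES_HOST:-172.31.15.172:9200}"

# Print the JSON body for the _analyze API.
# $1 = analyzer name (ik_max_word or ik_smart), $2 = text to tokenize.
analyze_body() {
  printf '{"analyzer": "%s", "text": "%s"}' "$1" "$2"
}

# Send the body to the cluster and print the resulting token list.
ik_analyze() {
  curl -s -XPOST -H "Content-Type: application/json" \
    "http://${ES_HOST}/_analyze?pretty" -d "$(analyze_body "$1" "$2")"
}
```

Running ik_analyze ik_max_word "俄罗斯世界杯即将开幕" and then ik_analyze ik_smart "俄罗斯世界杯即将开幕" lets you compare the two token lists side by side.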