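These are notes from an earlier Elasticsearch (ES) technical evaluation, mostly a simple record of test results.
Chinese analysis plugin
Tokenization here uses the IK analyzer. Its results are quite good, and custom phrases are tokenized accurately once they are added to its configuration.
Download the IK package and extract it into the plugins directory. ES 5.5.1 loads it automatically, so no entry in the ES configuration file is needed.
The project's GitHub page has detailed instructions and the plugin builds that match each ES version.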
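The custom phrases mentioned above live in a plain dictionary file with one phrase per line. The following is a minimal sketch of maintaining such a file, not the plugin's own tooling. The variable IK_CONFIG_DIR, the file name my_phrases.dic, and the helper add_phrase are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: maintain a custom IK phrase dictionary (one phrase per line, UTF-8).
# IK_CONFIG_DIR is an assumption: it defaults to a scratch directory so the
# sketch is safe to run; point it at the plugin's real config directory.
IK_CONFIG_DIR="${IK_CONFIG_DIR:-$(mktemp -d)}"
DICT_FILE="$IK_CONFIG_DIR/custom/my_phrases.dic"   # hypothetical file name
mkdir -p "$(dirname "$DICT_FILE")"
touch "$DICT_FILE"

# Append a phrase unless that exact line is already present.
add_phrase() {
  grep -qxF -- "$1" "$DICT_FILE" || printf '%s\n' "$1" >> "$DICT_FILE"
}

add_phrase "战狼2"
add_phrase "三生三世"
add_phrase "战狼2"   # duplicate, ignored

cat "$DICT_FILE"
```

The dictionary still has to be registered in the plugin's IKAnalyzer.cfg.xml through the ext_dict entry, with a path relative to the config directory (here custom/my_phrases.dic). As far as I know, changes to a local dictionary only take effect after the node is restarted.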
Examples
-
Create the index
curl -XPUT http://localhost:9200/index
-
Create the mapping
curl -XPOST http://localhost:9200/index/fulltext/_mapping -d'
{
  "properties": {
    "content": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_max_word"
    }
  }
}'
-
Add data
curl -XPOST http://localhost:9200/index/fulltext/1 -d' {"content":"战狼2真是个好电影啊"} '
curl -XPOST http://localhost:9200/index/fulltext/2 -d' {"content":"战狼良心之作啊"} '
curl -XPOST http://localhost:9200/index/fulltext/3 -d' {"content":"三生三世锁场"} '
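The three sample documents can also be loaded in one request through the _bulk API. This is a sketch under assumptions, not what the original test did: ES_URL is a hypothetical environment variable, and the script only prints the request body unless ES_URL is set.

```shell
#!/bin/sh
# Sketch: build an NDJSON body for the _bulk API from the three sample documents.
# The content values are not JSON-escaped, so keep them free of quotes and backslashes.
BULK_FILE="$(mktemp)"
id=0
for content in "战狼2真是个好电影啊" "战狼良心之作啊" "三生三世锁场"; do
  id=$((id + 1))
  # Each document takes two lines: an action line, then the document source.
  printf '{"index":{"_id":"%s"}}\n' "$id" >> "$BULK_FILE"
  printf '{"content":"%s"}\n' "$content" >> "$BULK_FILE"
done
cat "$BULK_FILE"

# Send only when ES_URL is set (for example ES_URL=http://localhost:9200);
# otherwise this is a dry run.
if [ -n "${ES_URL:-}" ]; then
  curl -s -XPOST "$ES_URL/index/fulltext/_bulk?pretty" \
    -H 'Content-Type: application/x-ndjson' \
    --data-binary "@$BULK_FILE"
fi
```

--data-binary is used instead of -d because -d strips newlines, and the bulk format depends on them.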
-
Query with match
curl -XPOST http://localhost:9200/index/fulltext/_search -d'
{
  "query" : { "match" : { "content" : "战狼2" }},
  "highlight" : {
    "pre_tags" : ["<tag1>", "<tag2>"],
    "post_tags" : ["</tag1>", "</tag2>"],
    "fields" : { "content" : {} }
  }
}'
Result:
{
  "took": 36,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 0.854655,
    "hits": [
      {
        "_index": "index",
        "_type": "fulltext",
        "_id": "1",
        "_score": 0.854655,
        "_source": { "content": "战狼2真是个好电影啊" },
        "highlight": {
          "content": [ "<tag1>战</tag1><tag1>狼</tag1><tag1>2</tag1>真是个好电影啊" ]
        }
      },
      {
        "_index": "index",
        "_type": "fulltext",
        "_id": "2",
        "_score": 0.5716521,
        "_source": { "content": "战狼良心之作啊" },
        "highlight": {
          "content": [ "<tag1>战</tag1><tag1>狼</tag1>良心之作啊" ]
        }
      }
    ]
  }
}
-
Query with match_phrase
curl -XGET 'localhost:9200/index/fulltext/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_phrase" : { "content" : "战狼2" } }
}'
Result:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
  "hits" : {
    "total" : 1,
    "max_score" : 0.85465515,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "fulltext",
        "_id" : "1",
        "_score" : 0.85465515,
        "_source" : { "content" : "战狼2真是个好电影啊" }
      }
    ]
  }
}
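The two results differ because match only needs some of the analyzed terms to appear (document 2 contains 战 and 狼 but not 2), while match_phrase needs all of them in sequence, which only document 1 satisfies. The sketch below builds both request bodies so they can be compared side by side. The helper build_query and the ES_URL variable are assumptions for illustration, and without ES_URL the script only prints the bodies.

```shell
#!/bin/sh
# Build a search body for a given query type ("match" or "match_phrase").
# Arguments: query type, field name, query text.
# Values are not JSON-escaped, so keep them free of quotes and backslashes.
build_query() {
  printf '{"query":{"%s":{"%s":"%s"}}}' "$1" "$2" "$3"
}

for query_type in match match_phrase; do
  body="$(build_query "$query_type" content "战狼2")"
  echo "$query_type: $body"
  # Send only when ES_URL is set (for example ES_URL=http://localhost:9200);
  # otherwise this is a dry run.
  if [ -n "${ES_URL:-}" ]; then
    curl -s -XPOST "$ES_URL/index/fulltext/_search?pretty" \
      -H 'Content-Type: application/json' -d "$body"
  fi
done
```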
-
Inspect the analyzer output
Format: http://localhost:9200/your_index/_analyze?text=中华人民共和国MN&tokenizer=my_ik
Example: http://localhost:9200/index/_analyze?text=中华人民共和国MN&tokenizer=chinese
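The _analyze endpoint also accepts a JSON request body, which avoids URL-encoding the Chinese text. The sketch below is an assumption-laden variant of the check above, not the call the original test used: it builds the body for the plugin's two analyzers, ik_max_word and ik_smart, so their token output can be compared. The helper analyze_body and the ES_URL variable are hypothetical, and without ES_URL the script only prints the bodies.

```shell
#!/bin/sh
# Build an _analyze request body for a given analyzer and text.
# Values are not JSON-escaped, so keep them free of quotes and backslashes.
analyze_body() {
  printf '{"analyzer":"%s","text":"%s"}' "$1" "$2"
}

for analyzer in ik_max_word ik_smart; do
  body="$(analyze_body "$analyzer" "中华人民共和国MN")"
  echo "$analyzer: $body"
  # Send only when ES_URL is set (for example ES_URL=http://localhost:9200);
  # otherwise this is a dry run.
  if [ -n "${ES_URL:-}" ]; then
    curl -s -XPOST "$ES_URL/index/_analyze?pretty" \
      -H 'Content-Type: application/json' -d "$body"
  fi
done
```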