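This walkthrough uses Elasticsearch 7.2.
Creating an employee directory
Each employee document contains:
- Name (first_name, last_name)
- Age (age)
- About (about)
- Interests (interests)
Index one employee document with type employee in the megacorp index: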
curl -X PUT "localhost:9200/megacorp/employee/1?pretty" -H 'Content-Type: application/json' -d'
{
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests": [ "sports", "music" ]
}
'
The path /megacorp/employee/1 has the form /index name/type name/employee ID.
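Elasticsearch 7.x still accepts a custom type name such as employee in the path, but it logs a deprecation warning, and 8.x removes mapping types from the API. The sketch below shows the typeless form /{index}/_doc/{id}, assuming the same local node. The names doc_url, index_doc and ES_HOST are hypothetical helpers introduced here for illustration, not part of Elasticsearch or curl.

```shell
# Hypothetical helper: build the typeless URL {host}/{index}/_doc/{id}.
# ES_HOST is an assumed variable name; it defaults to the node used in this post.
doc_url() {
  printf '%s/%s/_doc/%s' "${ES_HOST:-localhost:9200}" "$1" "$2"
}

# Index a document through the typeless endpoint (needs a running node).
# Usage: index_doc megacorp 1 '{"first_name":"John"}'
index_doc() {
  curl -s -X PUT "$(doc_url "$1" "$2")?pretty" \
    -H 'Content-Type: application/json' -d "$3"
}

# Prints localhost:9200/megacorp/_doc/1 when ES_HOST is unset.
doc_url megacorp 1; echo
```

The rest of this post keeps the employee type in the path, as in the original commands.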
Index two more employee documents:
curl -X PUT "localhost:9200/megacorp/employee/2?pretty" -H 'Content-Type: application/json' -d'
{
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": ["music" ]
}
'
curl -X PUT "localhost:9200/megacorp/employee/3?pretty" -H 'Content-Type: application/json' -d'
{
"first_name" : "Douglas",
"last_name" : "Fir",
"age" : 35,
"about" : "I like to build cabinets",
"interests": ["forestry" ]
}
'
Search
Retrieve the employee document with ID 1 from type employee in the megacorp index:
curl -X GET "localhost:9200/megacorp/employee/1?pretty"
Retrieve all employees:
curl -X GET "localhost:9200/megacorp/employee/_search?pretty"
Find employees whose last_name is Smith:
# Query string form
curl -X GET "localhost:9200/megacorp/employee/_search?q=last_name:Smith&pretty"
# Query DSL form
curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"last_name":"Smith"
}
}
}
'
Find employees whose last_name is Smith and whose age is greater than 30. The age condition goes into a filter:
curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query":{
"bool" : {
"filter" : {
"range" : {
"age" : { "gt" : 30 }
}
},
"must" : {
"match" : {
"last_name" : "Smith"
}
}
}
}
}
'
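In a bool query, clauses under filter only restrict which documents match and do not contribute to _score, while clauses under must do both. As a rough sketch of how the request above could be parameterized, the hypothetical helper bool_query_body below prints the same body for a given last name and minimum age. It assumes its arguments need no JSON escaping.

```shell
# Hypothetical helper: print a bool query body.
# $1 = last name to match (scored), $2 = exclusive lower bound for age (not scored).
# Assumption: neither argument contains characters that need JSON escaping.
bool_query_body() {
  last_name=$1
  min_age=$2
  cat <<EOF
{
  "query": {
    "bool": {
      "must":   { "match": { "last_name": "$last_name" } },
      "filter": { "range": { "age": { "gt": $min_age } } }
    }
  }
}
EOF
}

# Print the body; to send it, pipe it into curl with -d @- (needs a running node):
#   bool_query_body Smith 30 | curl -s -X GET \
#     "localhost:9200/megacorp/employee/_search?pretty" \
#     -H 'Content-Type: application/json' -d @-
bool_query_body Smith 30
```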
Full-text search
Find employees whose about field matches "rock climbing":
curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query" : {
"match" : {
"about" : "rock climbing"
}
}
}
'
Result:
{
...
"hits" : {
...
"hits" : [
{
...
"_id" : "1",
"_score" : 1.4167402,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
},
{
...
"_id" : "2",
"_score" : 0.45895916,
"_source" : {
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests" : [
"music"
]
}
}
]
}
}
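Two documents match. Each hit carries a _score, its relevance score, and hits are sorted by it in descending order by default.
The first hit scores higher because its about field contains the full text "rock climbing", while the second contains only "rock".
Relevance is an important concept in ES. In a traditional database, a record either matches a query or it does not.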
Phrase search
In the search above, "rock climbing" is split into two terms that are matched independently. To match it as a single phrase, use match_phrase:
curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query" : {
"match_phrase" : {
"about" : "rock climbing"
}
}
}
'
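To make the difference concrete, the toy sketch below mimics both behaviours locally with grep on the three about strings from the sample documents. It is only an illustration under simplifying assumptions (case-insensitive whole-word matching, no analysis, no scoring), not how Elasticsearch analyzes or ranks text. The names term_match and phrase_match are hypothetical.

```shell
# The "about" values of the sample documents, one per line, prefixed by the ID.
abouts='1 I love to go rock climbing
2 I like to collect rock albums
3 I like to build cabinets'

# match-like: keep lines that contain at least one of the two words.
term_match() {
  printf '%s\n' "$abouts" | grep -i -w -e "$1" -e "$2" | cut -d' ' -f1
}

# match_phrase-like: keep lines that contain the two words adjacent and in order.
phrase_match() {
  printf '%s\n' "$abouts" | grep -i -w "$1 $2" | cut -d' ' -f1
}

term_match rock climbing     # prints IDs 1 and 2
phrase_match rock climbing   # prints ID 1 only
```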
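Highlighting search results
Adding a highlight section returns, for each hit, the matching fragments of the about field, with the matched terms wrapped in <em> tags by default: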
curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query" : {
"match_phrase" : {
"about" : "rock climbing"
}
},
"highlight" : {
"fields" : {
"about": {}
}
}
}
'
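Aggregations
ES has powerful aggregations that can produce sophisticated statistics over the data, similar to GROUP BY in SQL.
For example, find the most popular interests across all employees: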
curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"all_interests" : {
"terms" : { "field": "interests.keyword" }
}
}
}
'
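Note: aggregate on interests.keyword rather than interests. With the default dynamic mapping, interests is a text field with a keyword sub-field, and a terms aggregation on the text field itself fails with: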
Fielddata is disabled on text fields by default ...
Result:
{
...
"aggregations" : {
"all_interests" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "music",
"doc_count" : 2
},
{
"key" : "forestry",
"doc_count" : 1
},
{
"key" : "sports",
"doc_count" : 1
}
]
}
}
}
music is the most popular interest.
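For readers who think in shell pipelines or SQL GROUP BY, the bucket counts can be reproduced locally. The sketch below is only an analogy for what the terms aggregation computes, assuming the three sample documents and no repeated interest within a document. The name interest_counts is hypothetical.

```shell
# One "id interest" pair per element of each document's interests array.
pairs='1 sports
1 music
2 music
3 forestry'

# Count documents per interest (the doc_count of each bucket), then order
# by count descending and by key ascending for ties, like the output above.
interest_counts() {
  printf '%s\n' "$pairs" |
    awk '{ count[$2]++ } END { for (k in count) print k, count[k] }' |
    sort -k2,2nr -k1,1
}

interest_counts
```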
The terms aggregation shown earlier runs over all documents. To restrict it, add a query:
curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"last_name": "smith"
}
},
"aggs" : {
"all_interests" : {
"terms" : { "field": "interests.keyword" }
}
}
}
'
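The response contains both the matching hits and the aggregation results.
Aggregations can be nested. For example, compute the average age of the employees in each interest bucket: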
curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"all_interests" : {
"terms" : { "field": "interests.keyword" },
"aggs" : {
"avg_age" : {
"avg" : { "field": "age" }
}
}
}
}
}
'
Result:
...
"aggregations" : {
"all_interests" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "music",
"doc_count" : 2,
"avg_age" : {
"value" : 28.5
}
},
{
"key" : "forestry",
"doc_count" : 1,
"avg_age" : {
"value" : 35.0
}
},
{
"key" : "sports",
"doc_count" : 1,
"avg_age" : {
"value" : 25.0
}
}
]
}
}
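The averages can be checked by hand: music covers John (25) and Jane (32), so its average is (25 + 32) / 2 = 28.5. The awk sketch below reproduces the nested result locally as a sanity check, assuming the three sample documents. The name avg_age_by_interest is hypothetical, and the pipeline only imitates the aggregation output.

```shell
# "interest age" rows: each document contributes its age once per interest.
rows='sports 25
music 25
music 32
forestry 35'

# Per interest: number of documents and their average age, ordered by
# count descending and by key ascending for ties.
avg_age_by_interest() {
  printf '%s\n' "$rows" |
    awk '{ sum[$1] += $2; n[$1]++ }
         END { for (k in sum) printf "%s %d %.1f\n", k, n[k], sum[k] / n[k] }' |
    sort -k2,2nr -k1,1
}

avg_age_by_interest
```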