A Few Things About Search Templates

Author: 饿虎嗷呜 | Published 2020-04-22 23:39

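I have been preparing for the Elastic Certified Engineer exam recently, and Search Template is one of the topics it covers. I had assumed the exam would only touch on the basics, but after working through a real exam question I found out otherwise, so I decided to go through the official documentation properly.

Basic usage

Creating a search template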

POST _scripts/<templateid>
{
    "script": {
        "lang": "mustache",
        "source": {
            "query": {
                "match": {
                    "title": "{{query_string}}"
                }
            }
        }
    }
}

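As shown above, a search template is really a script written in the mustache language, and it is stored the same way as any other script. Its format is very close to the normal ES query DSL; the difference is that the values to query for appear as placeholders such as "{{query_string}}", which are filled in through the "params" section when the template is called.

One thing to watch: unlike a painless script, the content of "source" is the query section itself, either as a JSON object or as the same query flattened into a string. I once made the mistake of wrapping the query in """ """, and the search failed outright.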
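
As a small sketch of the two shapes "source" can take, the Python snippet below builds the request body both ways: once with the query as a nested object and once with the same query serialized into a string. The helper name build_template_body is made up for this illustration; only the body layout comes from the request above.

```python
import json


def build_template_body(query, as_string=False):
    """Build the body for POST _scripts/<templateid> (hypothetical helper)."""
    # Either embed the query as an object, or flatten it into a JSON string.
    source = json.dumps(query) if as_string else query
    return {"script": {"lang": "mustache", "source": source}}


query = {"query": {"match": {"title": "{{query_string}}"}}}

object_body = build_template_body(query)
string_body = build_template_body(query, as_string=True)

print(json.dumps(object_body))
print(json.dumps(string_body))
```

Printing both bodies makes the escaping in the string form visible, which is the form needed later once the template stops being valid JSON on its own.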

Deleting a search template

A search template is deleted the same way as a stored script.

DELETE _scripts/<templateid>

Viewing a search template

GET _scripts/<templateid>

Using a search template

A stored search template is used like this:

GET _search/template
{
    "id": "<templateid>",
    "params": {
        "query_string": "query_data"
    }
}

id is the template id set above, and the params block supplies values for the parameters in the template.
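
For calling the template from a script rather than the console, a minimal sketch using only the Python standard library might look like the following. The cluster address, the template id "my_template" and the helper name are assumptions for illustration; the request is built but deliberately not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local cluster without security


def search_template_request(template_id, params, index=None):
    """Build (but do not send) a search template request; hypothetical helper."""
    path = f"/{index}/_search/template" if index else "/_search/template"
    body = json.dumps({"id": template_id, "params": params}).encode("utf-8")
    return urllib.request.Request(
        ES_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = search_template_request("my_template", {"query_string": "query_data"})
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; left out so the sketch runs offline.
```

POST is used here because many HTTP clients do not send a body with GET; the endpoint is the same one shown in the console request above.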

Besides using a stored template, you can also pass source and params directly in the _search/template request.

GET _search/template
{
    "source": "{\"query\":{\"bool\":{\"must\": {{#toJson}}clauses{{/toJson}} }}}",
    "params": {
        "clauses": [
            { "term": { "user" : "foo" } },
            { "term": { "user" : "bar" } }
        ]
   }
}

Validating a template

The render API returns the rendered template in its response, so you can check the result.

GET _render/template
{
    "source": {...},
    "params": {...}
}

Some things to watch out for

Passing parameters as JSON

Search templates support passing parameters as JSON. When defining the template, mark such a parameter as {{#toJson}}params{{/toJson}}.

GET _search/template
{
    "source": "{\"query\":{\"bool\":{\"must\": {{#toJson}}clauses{{/toJson}} }}}",
    "params": {
        "clauses": [
            { "term": { "user" : "foo" } },
            { "term": { "user" : "bar" } }
        ]
   }
}

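Note that if any field is converted with toJson, the content of "source" must be written as a string, because the template text is not valid JSON until it has been rendered.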
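
To see what the toJson tag produces, the sketch below imitates it in plain Python: each {{#toJson}}name{{/toJson}} is replaced by the JSON serialization of that parameter. This is only an approximation of the rendering step for illustration, not the engine Elasticsearch uses, and render_to_json is a made-up name.

```python
import json
import re

TO_JSON = re.compile(r"\{\{#toJson\}\}(\w+)\{\{/toJson\}\}")


def render_to_json(source, params):
    """Replace every toJson tag with json.dumps of the named parameter."""
    return TO_JSON.sub(lambda m: json.dumps(params[m.group(1)]), source)


source = '{"query":{"bool":{"must": {{#toJson}}clauses{{/toJson}} }}}'
params = {"clauses": [{"term": {"user": "foo"}}, {"term": {"user": "bar"}}]}

rendered = render_to_json(source, params)
query = json.loads(rendered)  # the text only becomes valid JSON after rendering
print(rendered)
```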

Joining the elements of an array

{{#join}}params_array{{/join}}

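A search template can join the elements of an array into a single string, separated by "," by default.

For example, if the template contains {{#join}}emails{{/join}} and emails is passed as "emails": [ "username@email.com", "lastname@email.com" ], the rendered value is the single string "username@email.com,lastname@email.com".

Besides the default separator ",", a custom separator can be set with delimiter: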

{{#join delimiter='||'}}date.formats{{/join delimiter='||'}}

Note that the delimiter must be specified in both the opening and the closing tag.
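
The effect of join is easy to mirror in Python, which can help when checking what a rendered template should contain. The function below is a stand-in written for this note, and the date format values are sample data, not taken from the original example.

```python
def join(values, delimiter=","):
    """Mimic {{#join}}: concatenate array elements with a delimiter."""
    return delimiter.join(str(v) for v in values)


emails = ["username@email.com", "lastname@email.com"]
joined_default = join(emails)

# Sample values for the custom-delimiter case.
joined_custom = join(["yyyy-MM-dd", "epoch_millis"], delimiter="||")

print(joined_default)
print(joined_custom)
```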

Default values

A search template can give a parameter a default value, using the following format:

{{var}}{{^var}}default{{/var}}

var is the parameter name, and default is the value used when var is not passed.
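
The default-value pattern combines a normal placeholder with an inverted section, which renders only when the parameter is missing. A rough Python equivalent, covering only the missing-parameter case and using a hypothetical parameter named size, could be:

```python
def render_with_default(params, name, default):
    """Mimic {{name}}{{^name}}default{{/name}} for a missing parameter."""
    # {{name}} renders the value when it is supplied ...
    if name in params and params[name] is not None:
        return str(params[name])
    # ... and the inverted section {{^name}} renders only when it is not.
    return str(default)


print(render_with_default({"size": 20}, "size", 10))
print(render_with_default({}, "size", 10))
```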

URL encoding

The format {{#url}}value{{/url}} URL-encodes the string value.
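
As a rough picture of what URL encoding does to a value, the snippet below uses quote_plus from the Python standard library. Treat it as an approximation: the sample URL is invented, and the exact handling of spaces and a few special characters by the url tag is an assumption here, not something verified against Elasticsearch.

```python
from urllib.parse import quote_plus


def url_encode(value):
    """Percent-encode reserved characters, roughly what {{#url}} is for."""
    return quote_plus(value)


encoded = url_encode("https://www.elastic.co/learn?q=search template")
print(encoded)
```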

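Conditional blocks

The name is my own literal translation and may not be exact. The point of this feature is that some fields defined in a search template may not be supplied in actual use, and searching on those fields anyway would cause problems. So there is a tag syntax that makes the content of a block take effect only when the parameter exists. It looks like this: {{#var}} blocks {{/var}}. Note, however, that once this syntax is used, the script source must be given as a string, which is rather painful.

Here is a real exam question: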


# Create a search template for the above query, so that the template 
#   (i) is named "with_response_and_tag", 
#   (ii) has a parameter "with_min_response" to represent the lower bound of the 
#   `response` field, 
#   (iii) has a parameter "with_max_response" to 
#   represent the upper bound of the `response` field, 
#   (iv) has a 
#   parameter "with_tag" to represent a possible value of the `tags` 
#   field
# Test the "with_response_and_tag" search template by setting the 
#   parameters as follows: (i) "with_min_response": 400, (ii) 
#   "with_max_response": 500 (iii) "with_tag": "security"

# Update the "with_response_and_tag" search template, so that (i) if 
#   the "with_max_response" parameter is not set, then don't set an 
#   upper bound to the `response` value, and (ii) if the "with_tag" 
#   parameter is not set, then do not apply that filter at all

# Test the "with_response_and_tag" search template by setting only 
#   the "with_min_response" parameter to 500

# Test the "with_response_and_tag" search template by setting the 
#   parameters as follows: (i) "with_min_response": 500, (ii) 
#   "with_tag": "security"

The first part is fairly easy. A quick version looks like this:

PUT _scripts/with_response_and_tag
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "response": {
                  "gte": "{{with_min_response}}",
                  "lte": "{{with_max_response}}"
                }
              }
            },
            {
              "term": {
                "tags": "{{with_tag}}"
              }
            }
          ]
        }
      }
    }
  }
}

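The hard part is the next requirement:

If the "with_max_response" parameter is not set, do not set an upper bound on the "response" field.
If the "with_tag" parameter is not set, do not apply that filter.

With conditional blocks, the template might look like this: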

PUT _scripts/with_response_and_tag
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "response": {
                  "gte": "{{with_min_response}}"
                  {{#with_max_response}}
                  ,{{/with_max_response}}
                  {{#with_max_response}}
                  "lte": "{{with_max_response}}"
                  {{/with_max_response}}
                }
              }
            }
            {{#with_tag}},{{/with_tag}}
            {{#with_tag}}
            {
              "term": {
                "tags": "{{with_tag}}"
              }
            }
            {{/with_tag}}
          ]
        }
      }
    }
  }
}

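However, this version is wrong: once conditional blocks are used, the content of source must be written as a string.

The version that actually works is this: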

PUT _scripts/with_response_and_tag
{
  "script": {
    "lang": "mustache",
    "source": "{\"query\": {\"bool\": {\"filter\": [{\"range\": {\"response\": {\"gte\": \"{{with_min_response}}\"{{#with_max_response}},{{/with_max_response}}{{#with_max_response}}\"lte\": \"{{with_max_response}}\"{{/with_max_response}}}}}{{#with_tag}},{{/with_tag}}{{#with_tag}}{\"term\": {\"tags\": \"{{with_tag}}\"}}{{/with_tag}}]}}}"
  }
}

Pay special attention: punctuation must be wrapped in its own {{#var}} {{/var}} tags, as in this part of the template above:

            {{#with_tag}},{{/with_tag}}
            {{#with_tag}}
            {
              "term": {
                "tags": "{{with_tag}}"
              }
            }
            {{/with_tag}}

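Otherwise, when the parameter is missing, a stray comma is left behind, the rendered query is no longer valid JSON, and the request fails.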
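
The comma problem can be reproduced without a cluster. The sketch below uses a tiny stand-in renderer that understands only flat {{#var}}...{{/var}} sections and plain {{var}} placeholders (it is not the real Mustache engine), and a shortened version of the filter from the exam question. It renders the template with and without with_tag and checks whether the result parses as JSON.

```python
import json
import re

SECTION = re.compile(r"\{\{#(\w+)\}\}(.*?)\{\{/\1\}\}", re.DOTALL)
VARIABLE = re.compile(r"\{\{(\w+)\}\}")


def render(template, params):
    """Simplified stand-in for Mustache: flat sections and variables only."""
    # Keep a section's body only when its parameter was supplied.
    text = SECTION.sub(lambda m: m.group(2) if m.group(1) in params else "", template)
    # Substitute the remaining {{name}} placeholders.
    return VARIABLE.sub(lambda m: str(params[m.group(1)]), text)


def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


GOOD = ('{"filter": [{"range": {"response": {"gte": "{{with_min_response}}"}}}'
        '{{#with_tag}},{{/with_tag}}'
        '{{#with_tag}}{"term": {"tags": "{{with_tag}}"}}{{/with_tag}}]}')
# Same template, but the comma sits outside any conditional block.
BAD = GOOD.replace('{{#with_tag}},{{/with_tag}}', ',')

only_min = {"with_min_response": 500}
print(is_valid_json(render(GOOD, only_min)))  # comma disappears with the block
print(is_valid_json(render(BAD, only_min)))   # trailing comma stays behind
```

With with_tag supplied, both variants render to the same valid JSON; the difference only shows up when the parameter is left out.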

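There is one more question. What if one of the fields, say "tags", needs to be passed in as an array?

This case is easy to handle too: the original \"{{with_tag}}\" simply becomes {{#toJson}}with_tag{{/toJson}}, while the outer control tags remain {{#with_tag}} {{/with_tag}}. The example below only renders the template; to actually search with an array value, the query would also need to be terms rather than term. The result is as follows: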

GET _render/template
{
    "source": "{\"query\": {\"bool\": {\"filter\": [{\"range\": {\"response\": {\"gte\": \"{{with_min_response}}\"{{#with_max_response}},{{/with_max_response}}{{#with_max_response}}\"lte\": \"{{with_max_response}}\"{{/with_max_response}}}}}{{#with_tag}},{{/with_tag}}{{#with_tag}}{\"term\": {\"tags\": {{#toJson}}with_tag{{/toJson}}}}{{/with_tag}}]}}}",
    "params": {
      "with_min_response": 400,
      "with_tag": ["security"]
    }
  
}

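Some tips on converting the source to a string

Converting the template above to a string is a nightmare; if you write the string directly, the sheer number of braces is enough to make your head spin. Still, I have boiled it down to a few points.

Conversion steps:

First write the query block and confirm with a real search that it works, then replace the required values with placeholders.
Add the conditional blocks you need, remembering that punctuation needs its own conditional tags.
Save a copy of what you have so far, in case something goes wrong.
Wrap everything after "source": in a pair of "".
Escape every double quote inside with a backslash "\".
Remove all spaces and line breaks between tokens.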
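
The last three steps are mechanical, so they can be handed to a few lines of Python instead of being done by hand. This is a sketch under one assumption: no string value in the template spans several lines or depends on leading or trailing spaces. The function name is made up, and the sample template is a shortened version of the one above.

```python
import json


def template_to_source_string(template_text):
    """Collapse a multi-line template and return it as a JSON string literal."""
    # Drop indentation and line breaks between tokens.
    compact = "".join(line.strip() for line in template_text.splitlines())
    # json.dumps adds the surrounding quotes and escapes the inner double quotes.
    return json.dumps(compact)


template_text = """
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "response": { "gte": "{{with_min_response}}" } } }
        {{#with_tag}},{{/with_tag}}
        {{#with_tag}}
        { "term": { "tags": "{{with_tag}}" } }
        {{/with_tag}}
      ]
    }
  }
}
"""

source_literal = template_to_source_string(template_text)
print(source_literal)
```

The printed literal can be pasted as the value of "source" and then checked with the _render API before storing the template.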

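Overall, conditional blocks are the hardest part because trial and error is so costly: if any step goes wrong, you have to go back and redo it. After making changes you can debug with the _render API and see the rendered result. If there is a mistake somewhere, though, it is still hard to find. The error message tells you the offset within the string, but locating the problem by eye remains very difficult. So it is best to follow a fixed procedure to reduce the chance of errors.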