4.6 Single-String Multi-Field Queries: Multi-Match

Author: 落日彼岸 | Published 2020-04-02 15:08

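    Three Scenarios

    • Best Fields

      • Use when fields compete with each other yet are related, such as title and body. The score comes from the best-matching field.
    • Most Fields

      • When handling English content, a common technique is to stem terms and add synonyms in the main field (English Analyzer) so that more documents match, and to index the same text in a subfield (Standard Analyzer) for more exact matching. The other fields act as signals that raise the relevance of matching documents: the more fields match, the better.
    • Cross Fields

      • For some entities, such as a person's name, an address, or book information, the information has to be identified across several fields, and each field is only one part of the whole. The goal is to find as many of the query terms as possible in any of the listed fields.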

    Multi Match Query

    • best_fields is the default type and does not need to be specified

    • Parameters such as minimum_should_match are passed through to the generated queries

    POST blogs/_search
    {
      "query": {
        "multi_match": {
          "type": "best_fields",
          "query": "Quick pets",
          "fields": ["title","body"],
          "tie_breaker": 0.2,
          "minimum_should_match": "20%"
        }
      }
    }
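
As a rough sketch of how best_fields combines per-field scores: the query behaves like a dis_max over one match query per field, so the final score is the best field's score plus tie_breaker times the scores of the other matching fields. The per-field scores below are made-up numbers, and `dis_max_score` is a hypothetical helper for illustration, not part of any Elasticsearch client.

```python
def dis_max_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores the way dis_max does:
    best score + tie_breaker * (sum of the remaining scores)."""
    matching = [s for s in field_scores.values() if s > 0]
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)


# Hypothetical scores for one document; real values come from BM25.
scores = {"title": 0.5, "body": 1.5}
print(dis_max_score(scores))                   # tie_breaker 0: best field only
print(dis_max_score(scores, tie_breaker=0.2))  # other fields add a fraction
```

With tie_breaker at 0 only the best field counts; raising it toward 1 moves the result toward a plain sum of all matching fields.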
    

    A Query Example

    • The english analyzer stems terms, which lowers precision and loses tense information
    PUT /titles
    {
      "mappings": {
        "properties": {
          "title": {
            "type": "text",
            "analyzer": "english"
          }
        }
      }
    }
    
    POST titles/_bulk
    { "index": { "_id": 1 }}
    { "title": "My dog barks" }
    { "index": { "_id": 2 }}
    { "title": "I see a lot of barking dogs on the road " }
    
    
    GET titles/_search
    {
      "query": {
        "match": {
          "title": "barking dogs"
        }
      }
    }
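
To see why precision drops, here is a toy stand-in for the english analyzer. It is only an approximation (the real analyzer uses a proper stemmer and a longer stop-word list), and `toy_english_analyze` and its stop-word subset are assumptions made for this sketch. Both sample titles reduce to the same query terms, so the match query cannot tell "barking dogs" from "dog barks", and the shorter document 1 will typically rank first.

```python
# A subset of stop words, enough for the two sample titles (assumption).
STOP_WORDS = {"a", "of", "on", "the"}


def toy_english_analyze(text):
    """Very rough stand-in for the english analyzer:
    lowercase, drop stop words, strip a trailing "ing" or "s"."""
    tokens = []
    for word in text.lower().split():
        if word in STOP_WORDS:
            continue
        for suffix in ("ing", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 1:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return tokens


docs = {
    1: "My dog barks",
    2: "I see a lot of barking dogs on the road ",
}
query_terms = set(toy_english_analyze("barking dogs"))
for doc_id, title in docs.items():
    doc_terms = toy_english_analyze(title)
    # Prints the analyzed terms and whether every query term is present.
    print(doc_id, doc_terms, query_terms <= set(doc_terms))
```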
    

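    Solving It with Most Fields

    • Use the broad-matching field title to include as many documents as possible, which improves recall, and use the title.std field as a signal that pushes the more relevant documents to the top of the results.

    • The contribution of each field to the final score can be controlled with a custom boost value. For example, boosting title makes it more important and at the same time reduces the influence of the other signal fields.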

    DELETE /titles
    PUT /titles
    {
      "mappings": {
        "properties": {
          "title": {
            "type": "text",
            "analyzer": "english",
            "fields": {"std": {"type": "text","analyzer": "standard"}}
          }
        }
      }
    }
    
    POST titles/_bulk
    { "index": { "_id": 1 }}
    { "title": "My dog barks" }
    { "index": { "_id": 2 }}
    { "title": "I see a lot of barking dogs on the road " }
    
    GET /titles/_search
    {
       "query": {
            "multi_match": {
                "query":  "barking dogs",
                "type":   "most_fields",
                "fields": [ "title", "title.std" ]
            }
        }
    }
    
    GET /titles/_search
    {
       "query": {
            "multi_match": {
                "query":  "barking dogs",
                "type":   "most_fields",
                "fields": [ "title^10", "title.std" ]
            }
        }
    }
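
The effect of the two queries above can be sketched as follows, assuming that most_fields roughly adds the per-field scores together, as a bool query with should clauses would. The scores are made-up numbers and `most_fields_score` is a hypothetical helper. Document 1 matches only the stemmed title field, while document 2 also matches title.std, because the standard analyzer keeps "barking" and "dogs" unchanged.

```python
def most_fields_score(field_scores, boosts=None):
    """Add up the score of every matching field, applying optional boosts."""
    boosts = boosts or {}
    return sum(score * boosts.get(field, 1.0)
               for field, score in field_scores.items())


# Made-up per-field scores. Document 1 ("My dog barks") matches only the
# stemmed title field; document 2 also matches title.std exactly.
doc1 = {"title": 0.4, "title.std": 0.0}
doc2 = {"title": 0.3, "title.std": 0.5}

for boosts in (None, {"title": 10}):
    print(boosts, most_fields_score(doc1, boosts), most_fields_score(doc2, boosts))
```

Without a boost the extra title.std match lifts document 2 above document 1. With title^10 the stemmed field dominates and the title.std signal has far less influence on the ranking.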
    

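    Searching Across Fields

    • With most_fields, operator cannot be used to require all terms: "and" is applied to each field separately, so every term would have to appear in the same field

    • copy_to solves this, but it requires extra storage space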

    PUT address/_doc/1
    {
      "street": "5 Poland Street",
      "city": "London",
      "country": "United Kingdom",
      "postcode": "W1V 3Dg"
    }
    
    
    POST address/_search
    {
     "query": {
        "multi_match": {
          "query": "Poland Street W1V",
          "type": "most_fields",
          "fields": ["street", "city", "country", "postcode"]
        }
      }
    }
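
A minimal sketch of the problem, using a lowercase whitespace tokenizer as a simplification of the standard analyzer. `field_centric_and` and `copy_to_and` are hypothetical helpers: the first mimics applying operator "and" inside each field, the second mimics searching a single combined field of the kind copy_to would build.

```python
address = {
    "street": "5 Poland Street",
    "city": "London",
    "country": "United Kingdom",
    "postcode": "W1V 3Dg",
}


def tokens(text):
    """Lowercase whitespace tokenizer, a simplification of the standard analyzer."""
    return text.lower().split()


def field_centric_and(doc, query, fields):
    """operator "and" per field: one single field must contain every term."""
    terms = set(tokens(query))
    return any(terms <= set(tokens(doc[f])) for f in fields)


def copy_to_and(doc, query, fields):
    """Emulate copy_to: put all fields into one combined field, then match."""
    combined = " ".join(doc[f] for f in fields)
    return set(tokens(query)) <= set(tokens(combined))


fields = ["street", "city", "country", "postcode"]
print(field_centric_and(address, "Poland Street W1V", fields))  # no single field has all terms
print(copy_to_and(address, "Poland Street W1V", fields))        # the combined field does
```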
    

    Searching Across Fields [Solved with cross_fields]

    POST address/_search
    {
     "query": {
        "multi_match": {
          "query": "Poland Street W1V",
          "type": "cross_fields",
          "operator": "and", 
          "fields": ["street", "city", "country", "postcode"]
        }
      }
    }
    
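    • operator is supported: with "and", every term must appear in at least one of the fields

    • One advantage over copy_to is that individual fields can be boosted at search time.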
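
The term-centric matching behind cross_fields can be sketched as follows. This models only which documents match, not how cross_fields scores them, and it uses a lowercase whitespace tokenizer as a simplification of the standard analyzer. `cross_fields_and` is a hypothetical helper written for this illustration.

```python
address = {
    "street": "5 Poland Street",
    "city": "London",
    "country": "United Kingdom",
    "postcode": "W1V 3Dg",
}


def tokens(text):
    """Lowercase whitespace tokenizer, a simplification of the standard analyzer."""
    return text.lower().split()


def cross_fields_and(doc, query, fields):
    """Term-centric "and": each term must occur in at least one field.
    Returns the fields matching each term, or None if a term is missing."""
    doc_tokens = {f: set(tokens(doc[f])) for f in fields}
    hits = {}
    for term in tokens(query):
        matched = [f for f in fields if term in doc_tokens[f]]
        if not matched:
            return None
        hits[term] = matched
    return hits


fields = ["street", "city", "country", "postcode"]
print(cross_fields_and(address, "Poland Street W1V", fields))
print(cross_fields_and(address, "Poland Street Paris", fields))
```

The first call finds "poland" and "street" in the street field and "w1v" in the postcode field, so the document matches. The second call fails because "paris" appears in no field.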

    Section Review

    • Basic syntax of the Multi Match query

    • Query types

    • Best fields / most fields / cross fields

    • Boosting

    • Controlling precision

    • Scoring with subfields and most_fields

    • Using operator

    Course Demo

    POST blogs/_search
    {
        "query": {
            "dis_max": {
                "queries": [
                    { "match": { "title": "Quick pets" }},
                    { "match": { "body":  "Quick pets" }}
                ],
                "tie_breaker": 0.2
            }
        }
    }
    
    POST books/_search
    {
        "query": {
            "multi_match": {
                "query":  "Quick brown fox",
                "fields": "*_title"
            }
        }
    }
    
    
    POST books/_search
    {
        "query": {
            "multi_match": {
                "query":  "Quick brown fox",
                "fields": [ "*_title", "chapter_title^2" ]
            }
        }
    }
    
    
    
    DELETE /titles
    PUT /titles
    {
      "settings": {
        "number_of_replicas": 1
      },
      "mappings": {
        "properties": {
          "title": {
            "type": "text",
            "analyzer": "english",
            "fields": {
              "std": {
                "type": "text",
                "analyzer": "standard"
              }
            }
          }
        }
      }
    }
    

          Article link: https://www.haomeiwen.com/subject/idxzuhtx.html