Structured Search

Author: 滴流乱转的小胖子 | Published: 2020-07-16 06:34

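    1. Structured data

    1.1 Structured search means searching over structured data

    • Dates, booleans, and numbers are all structured

    1.2 Text can be structured too

    • Colored pens come in a discrete set of colors: red, green, blue
    • A blog post may carry tags such as distributed and search
    • Products on an e-commerce site have UPCs (Universal Product Codes) or some other unique identifier, and each must follow a strictly defined, structured format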

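    2. Structured search in ES

    2.1 Structured data such as booleans, times, dates, and numbers has a precise format, so we can apply logical operations to it.

    • For example, comparing ranges of numbers or times, or determining which of two values is larger

    2.2 Structured text can be matched exactly or partially

    • Term query / Prefix query

    2.3 A structured query has only two possible outcomes, "yes" or "no": a document either matches or it does not

    • Depending on the use case, you can decide whether a structured search needs to compute a relevance score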
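The choice between a scored and an unscored structured query comes down to whether the clause is wrapped in `constant_score`. The sketch below only builds the two request bodies as Python dictionaries; the helper name `as_filter` is an assumption made for illustration, and the `price` field is borrowed from the article's `products` examples.

```python
import json


def as_filter(clause):
    """Wrap a query clause in constant_score so it runs as a filter (no scoring)."""
    return {"query": {"constant_score": {"filter": clause}}}


# Scored form: the term clause sits directly under "query".
scored = {"query": {"term": {"price": 30}}}

# Unscored form: the same clause, wrapped by the hypothetical helper.
unscored = as_filter({"term": {"price": 30}})

print(json.dumps(unscored, indent=2))
```

Either dictionary could be sent as the body of a `POST products/_search` request; only the second skips relevance scoring.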

    3. Booleans

    For a boolean field, wrapping the term query in constant_score turns it into a filter, so no relevance score is computed.


    4. Numbers

    • Numeric field, term query on a single value
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "term": {
          "price": 30
        }
      }
    }
    
    • Numeric field, terms query on multiple values, analogous to the IN operator in SQL
    POST products/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "terms": {
              "price": [
                "20",
                "30"
              ]
            }
          }
        }
      }
    }
    
    • Numeric range: 20 <= price <= 30

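To make the bounds concrete, here is a plain-Python model of what a `range` filter with `gte` and `lte` selects. It does not call Elasticsearch; `range_filter` is a hypothetical helper, and the prices mirror the four sample documents indexed later in the article.

```python
# Prices of the article's four sample products, keyed by document id.
PRODUCTS = {1: 10, 2: 20, 3: 30, 4: 30}


def range_filter(docs, gte=None, gt=None, lte=None, lt=None):
    """Return the ids whose value satisfies every bound that was supplied."""
    hits = []
    for doc_id, value in docs.items():
        ok = (
            (gte is None or value >= gte)
            and (gt is None or value > gt)
            and (lte is None or value <= lte)
            and (lt is None or value < lt)
        )
        if ok:
            hits.append(doc_id)
    return hits


# 20 <= price <= 30: both bounds are inclusive.
matched = range_filter(PRODUCTS, gte=20, lte=30)
print(matched)  # [2, 3, 4]
```

Switching to the exclusive bounds `gt=20, lt=30` would match nothing in this data set, since no price lies strictly between 20 and 30.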

    5. Date ranges

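The date range example in this article uses the date-math expression `now-1y`, meaning "one year before the current time". The sketch below models that cutoff with calendar dates only. It is a simplification: Elasticsearch evaluates `now` as a timestamp, and the leap-day fallback here is this sketch's own choice. The function names are assumptions.

```python
from datetime import date

# Dates of the article's sample products; documents 3 and 4 have no date field.
DATES = {1: date(2018, 1, 1), 2: date(2019, 1, 1)}


def one_year_before(today):
    """Approximate 'now-1y' on calendar dates."""
    try:
        return today.replace(year=today.year - 1)
    except ValueError:
        # Feb 29 has no counterpart in the previous year; fall back to Feb 28.
        return today.replace(year=today.year - 1, day=28)


def recent_ids(docs, today):
    """Ids whose date is on or after the cutoff, like range with gte: now-1y."""
    cutoff = one_year_before(today)
    return [doc_id for doc_id, d in docs.items() if d >= cutoff]


print(recent_ids(DATES, date(2019, 6, 1)))  # [2]
```

Because the cutoff moves with the clock, the result depends on when the query runs: with the sample dates of 2018-01-01 and 2019-01-01, any run after 2020-01-01 matches no documents.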

    6. Handling missing values with exists

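As a rough mental model, `exists` matches a document when the field holds at least one indexed value. The plain-Python sketch below approximates that rule for JSON-like documents: a missing key, `null`, or an empty array counts as absent, while an empty string still counts as present. `field_exists` is a hypothetical helper, and mapping-level details such as `null_value` are ignored.

```python
# The article's sample products: only documents 1 and 2 carry a date.
PRODUCTS = {
    1: {"price": 10, "date": "2018-01-01"},
    2: {"price": 20, "date": "2019-01-01"},
    3: {"price": 30},
    4: {"price": 30},
}


def field_exists(doc, field):
    """Simplified model: True if the field has at least one non-null value."""
    if field not in doc:
        return False
    value = doc[field]
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)


# exists on "date", and its negation (bool + must_not + exists).
with_date = [i for i, doc in PRODUCTS.items() if field_exists(doc, "date")]
without_date = [i for i, doc in PRODUCTS.items() if not field_exists(doc, "date")]

print(with_date)     # [1, 2]
print(without_date)  # [3, 4]
```

The negated form corresponds to the `bool` query with `must_not` and `exists` at the end of the article's listing.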

    7. Querying multiple exact values with terms

    • Matching a single exact value
    # Numeric field, terms query with one value
    POST products/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "terms": {
              "price": [
                "20"
              ]
            }
          }
        }
      }
    }
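The SQL `IN` analogy can be modelled in a few lines: a `terms` filter keeps a document when its field value is a member of the supplied list. This is a plain-Python illustration, not an Elasticsearch call, and `terms_filter` is a hypothetical name.

```python
# Prices of the article's four sample products, keyed by document id.
PRODUCTS = {1: 10, 2: 20, 3: 30, 4: 30}


def terms_filter(docs, values):
    """Keep ids whose value is one of `values` (like SQL: price IN (...))."""
    wanted = set(values)
    return [doc_id for doc_id, value in docs.items() if value in wanted]


print(terms_filter(PRODUCTS, [20]))      # [2]
print(terms_filter(PRODUCTS, [20, 30]))  # [2, 3, 4]
```

With a single value in the list, the result is the same as a `term` filter on that value.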
    

    8. A term query tests containment, not exact equality; take particular care when querying multi-valued fields

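The containment behaviour is easy to see in a small model. In the sketch below each movie's `genre` is treated as a list of values, and `term_contains` mirrors what a `term` filter does. `term_equals_only` shows one common way to approximate exact equality, by also requiring that the field holds exactly one value; in Elasticsearch that would mean indexing a separate count field, which this article does not do, so treat it as an assumption.

```python
# Genres of the article's two sample movies, normalised to lists.
MOVIES = {1: ["Comedy"], 2: ["Comedy", "Romance"]}


def term_contains(docs, value):
    """Model of a term filter: match if the field contains the value."""
    return [doc_id for doc_id, values in docs.items() if value in values]


def term_equals_only(docs, value):
    """Hypothetical stricter match: contains the value and nothing else."""
    return [
        doc_id
        for doc_id, values in docs.items()
        if value in values and len(values) == 1
    ]


print(term_contains(MOVIES, "Comedy"))     # [1, 2]
print(term_equals_only(MOVIES, "Comedy"))  # [1]
```

Both movies match the plain term lookup, which is why the multi-valued query in the listing returns "Dave" as well.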
    # Structured search, exact matching
    DELETE products
    POST /products/_bulk
    { "index": { "_id": 1 }}
    { "price" : 10,"avaliable":true,"date":"2018-01-01", "productID" : "XHDK-A-1293-#fJ3" }
    { "index": { "_id": 2 }}
    { "price" : 20,"avaliable":true,"date":"2019-01-01", "productID" : "KDKE-B-9947-#kL5" }
    { "index": { "_id": 3 }}
    { "price" : 30,"avaliable":true, "productID" : "JODL-X-1937-#pV7" }
    { "index": { "_id": 4 }}
    { "price" : 30,"avaliable":false, "productID" : "QQPX-R-3956-#aD8" }
    
    GET products/_mapping
    
    
    
    # Boolean field, term query: a relevance score is computed
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "term": {
          "avaliable": true
        }
      }
    }
    
    
    
    # Boolean field, wrapped in constant_score to run as a filter: no score is computed
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "constant_score": {
          "filter": {
            "term": {
              "avaliable": true
            }
          }
        }
      }
    }
    
    
    # Numeric field, term query
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "term": {
          "price": 30
        }
      }
    }
    
    # Numeric field, terms query
    POST products/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "terms": {
              "price": [
                "20",
                "30"
              ]
            }
          }
        }
      }
    }
    
    # Numeric range query
    GET products/_search
    {
        "query" : {
            "constant_score" : {
                "filter" : {
                    "range" : {
                        "price" : {
                            "gte" : 20,
                            "lte"  : 30
                        }
                    }
                }
            }
        }
    }
    
    
    # Date range
    POST products/_search
    {
        "query" : {
            "constant_score" : {
                "filter" : {
                    "range" : {
                        "date" : {
                          "gte" : "now-1y"
                        }
                    }
                }
            }
        }
    }
    
    
    
    # exists query
    POST products/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "exists": {
              "field": "date"
            }
          }
        }
      }
    }
    
    # Multi-valued fields
    POST /movies/_bulk
    { "index": { "_id": 1 }}
    { "title" : "Father of the Bride Part II","year":1995, "genre":"Comedy"}
    { "index": { "_id": 2 }}
    { "title" : "Dave","year":1993,"genre":["Comedy","Romance"] }
    
    
    # Multi-valued field: a term query tests containment, not equality
    POST movies/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "term": {
              "genre.keyword": "Comedy"
            }
          }
        }
      }
    }
    
    
    # String field, terms query
    POST products/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "terms": {
              "productID.keyword": [
                "QQPX-R-3956-#aD8",
                "JODL-X-1937-#pV7"
              ]
            }
          }
        }
      }
    }
    
    
    
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "match": {
          "price": 30
        }
      }
    }
    
    
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "term": {
          "date": "2019-01-01"
        }
      }
    }
    
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "match": {
          "date": "2019-01-01"
        }
      }
    }
    
    
    
    
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "constant_score": {
          "filter": {
            "term": {
              "productID.keyword": "XHDK-A-1293-#fJ3"
            }
          }
        }
      }
    }
    
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "term": {
          "productID.keyword": "XHDK-A-1293-#fJ3"
        }
      }
    }
    
    # Boolean field, with the value given as a string
    POST products/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "term": {
              "avaliable": "false"
            }
          }
        }
      }
    }
    
    POST products/_search
    {
      "query": {
        "term": {
          "avaliable": {
            "value": "false"
          }
        }
      }
    }
    
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "term": {
          "price": {
            "value": "20"
          }
        }
      }
    }
    
    POST products/_search
    {
      "profile": "true",
      "explain": true,
      "query": {
        "match": {
          "price": "20"
        }
      }
    }
    
    
    POST products/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "bool": {
              "must_not": {
                "exists": {
                  "field": "date"
                }
              }
            }
          }
        }
      }
    }
    

    https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html
    https://www.elastic.co/guide/en/elasticsearch/reference/7.1/term-level-queries.html
