Elastic Certified Engineer – the "German PhD" Practice Questions


Author: caster | Published 2020-11-19 15:00

    Essential practice exercises for the Elasticsearch Certified Engineer exam; somewhat harder than the real exam.

    1. Installing and Configuring a Cluster

    Exercise1

    1. Define the cluster name
    2. Define names for the three nodes
    3. Make all nodes master-eligible
    4. Bind each node's network to an IP and port
    5. Configure cluster discovery so that node2 and node3 use node1 as their seed host
    6. Configure the cluster to avoid split brain
    7. Configure node1 as a data node but not an ingest node
    8. Configure node2 and node3 as data and ingest nodes
    9. Disable swapping on node1
    10. Set the JVM min/max heap to 1GB
    11. Change the log directory
    12. Set the log level for transport-related events to debug
    13. Configure the nodes to disallow deleting indices via wildcards

    Solution:

    http.port: 9200
    transport.port: 9300
    
    cluster.initial_master_nodes: ["node1","node2","node3"]
    
    # node1: master-eligible and data, but not ingest
    node.master: true
    node.data: true
    node.ingest: false
    # node2 / node3 additionally set node.ingest: true
    
    swapoff -a
    vim /etc/security/limits.conf
    * hard memlock unlimited
    * soft memlock unlimited
    bootstrap.memory_lock: true
    
    # seed addresses this node will contact in order to join the cluster
    discovery.seed_hosts: ["node1"]
    # initial master-eligible nodes, read only on the very first cluster bootstrap
    cluster.initial_master_nodes: ["node1"] 
    
    logger.org.elasticsearch.transport: debug
    
    action.destructive_requires_name: true
    
    PUT /_cluster/settings
    {
      "transient": {
        "logger.org.elasticsearch.transport": "debug"
      }
    }
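    
    Tasks 10 and 11 (heap size and log directory) are configured outside elasticsearch.yml; a minimal sketch, with placeholder paths:
    
    # config/jvm.options (or a file under jvm.options.d/): 1GB min and max heap
    -Xms1g
    -Xmx1g
    
    # elasticsearch.yml: custom log directory (placeholder path)
    path.logs: /var/log/es-custom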
    

    Exercise2

    1. Run the cluster with one Kibana instance
    2. Make sure there is no hamlet index in the cluster
    3. Enable X-Pack security on the cluster
    4. Set passwords for the elastic and kibana users
    5. Log in to Kibana as the elastic user
    6. Create the hamlet index and bulk-load data into it
    7. Create francisco_role in the native realm, with monitor privileges on the cluster and all privileges on the hamlet index
    8. Assign francisco_role to a new user francisco
    9. Verify the francisco user
    10. Create nardo_role in the native realm, with monitor privileges on the cluster and read-only privileges on the hamlet index, restricted to documents where BERNARDO is the speaker and to the text_entry field only
    11. Assign nardo_role to a new user nardo
    12. Verify the nardo user
    13. Change the nardo user's password and verify it

    Solution:

    Enable security and TLS on the transport layer, generate certificates, distribute them to the nodes, and add the certificate password to the keystore:

    xpack.security.enabled: true
    xpack.security.transport.ssl.enabled: true
    xpack.security.transport.ssl.verification_mode: certificate
    xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
    xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12
    

    Users added with elasticsearch-users live in the file realm and are only valid on that node; they are stored under the local config directory.
    Before version 7 you had to request a license and register a Platinum trial to enable field-level security.
    From version 7 on, a trial can be started with /_license/start_trial?acknowledge=true.
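    
    Setting the built-in user passwords (task 4) and starting the trial can be done as follows; a minimal sketch:
    
    # run once against the secured cluster; prompts for the elastic, kibana, ... passwords
    bin/elasticsearch-setup-passwords interactive
    
    # start a 30-day trial license (needed for field- and document-level security)
    POST /_license/start_trial?acknowledge=true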

    {
      "role1": {
        "cluster": [
          "monitor"
        ],
        "indices": [
          {
            "names": [
              "t1"
            ],
            "privileges": [
              "all"
            ],
            "allow_restricted_indices": false
          }
        ]
      }
    }
    
    {
      "role2": {
        "cluster": [
          "monitor"
        ],
        "indices": [
          {
            "names": [
              "t1"
            ],
            "privileges": [
              "read"
            ],
            "field_security": {
              "grant": [
                "f1"
              ]
            },
            "query": "{\"match\":{\"f2\":\"a\"}}",
            "allow_restricted_indices": false
          }
        ]
      }
    }
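    
    The role bodies above can be created in the native realm with the security API; a sketch for francisco_role and its user (the password value is a placeholder):
    
    PUT /_security/role/francisco_role
    {
      "cluster": ["monitor"],
      "indices": [
        {
          "names": ["hamlet"],
          "privileges": ["all"]
        }
      ]
    }
    
    POST /_security/user/francisco
    {
      "password": "francisco_password",
      "roles": ["francisco_role"]
    }
    
    //verify by authenticating as francisco
    GET /_security/_authenticate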
        
    POST /_security/user/user1/_password
    {
      "password" : "123456"
    }
    

    2. Managing the Cluster

    Exercise1

    1. Create an index with 2 primary shards and 1 replica
    2. Check the allocation status and the shard distribution
    3. Allocate both primary shards to node1
    4. Configure both primary shards so that they are never allocated to node3
    5. Remove the allocation policies above
    6. Define an attribute named zone; set it to z1 on node1 and node2, and to z2 on node3
    7. Configure cluster allocation awareness based on these two zones, persisted so that it survives a cluster restart
    8. Configure a hot/warm architecture with node1 as hot and node2/node3 as warm, and allocate all shards of index t1 to the warm nodes
    9. Remove the hot/warm policy from t1

    Solution:
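    
    Tasks 1 and 2 are not shown in the original solution; a minimal sketch using test1 as the index name (the same name the commands below use):
    
    PUT test1
    {
      "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
      }
    }
    GET _cluster/health/test1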

    _cat/shards/test1?v
    //enable/disable cluster-wide shard allocation:
    cluster.routing.allocation.enable: all / primaries / new_primaries / none
    //enable/disable shard rebalancing:
    cluster.routing.rebalance.enable: all / primaries / replicas / none
    //allocation filtering types:
    require / include / exclude, used together with adjusting the replica count
    PUT test1/_settings
    {
      "index.routing.allocation.require._name": "sinan02"
    }
    PUT test1/_settings
    {
      "index.routing.allocation.require._name": null
    }
    
    node.attr.zone: zone2
    node.attr.type: warm
    
    PUT _cluster/settings
    {
      "persistent": {//持久的  transient:暂时的
        "cluster.routing.allocation.awareness.attributes": "zone",//副本分配到zone值不同的节点
        "cluster.routing.allocation.awareness.force.zone.values": "zone1,zone2"//强制感知,不满足不分配副本
        //设置为null取消感知
      }
    }
    GET _cat/shards
    GET _cat/nodeattrs?v
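    
    For tasks 8 and 9, assuming node.attr.type is set to hot on node1 and warm on node2/node3 (as in the node.attr lines above), a minimal sketch of the index-level filter; setting it back to null removes the policy:
    
    PUT t1/_settings
    {
      "index.routing.allocation.require.type": "warm"
    }
    
    PUT t1/_settings
    {
      "index.routing.allocation.require.type": null
    }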
    

    Exercise2

    1. Configure a single-node cluster (node1) to store snapshots in a given directory
    2. Create a shared-file-system repository named hamlet_backup pointing to that directory
    3. Create a snapshot of hamlet named hamlet_snapshot_1 and store it in hamlet_backup
    4. Delete hamlet and restore it from hamlet_snapshot_1
    5. Start a second single-node cluster (node2) and create an index test1_pirate on it in the same way
    6. Configure cross-cluster search: the remote cluster name is original, the seed is node1's transport port, and the configuration is persisted
    7. Run a cross-cluster query

    Solution:

    path.repo: ["/home/caster/repo"]
    PUT /_snapshot/hamlet_backup
    {
      "type": "fs",
      "settings": {
        "location": "/home/caster/repo",
        "compress": true
      }
    }
    PUT /_snapshot/hamlet_backup/hamlet_snapshot_1?wait_for_completion=true
    {
      "indices": "hamlet",
      "ignore_unavailable": true,// 快照创建期间不存在的索引被忽略
      "include_global_state": false//防止集群全局状态作为快照一部分存储起来
    }
    GET /_snapshot/hamlet_backup/_all
    DELETE hamlet
    POST /_snapshot/hamlet_backup/hamlet_snapshot_1/_restore
    
    PUT _cluster/settings
    {
      "persistent": {
        "cluster": {
          "remote": {
            "cluster_two": {
              "seeds": [
                "sinan03:9300"
              ]
            },
            "cluster_one": {
              "seeds": [
                "sinan04:9300"
              ]
            }
          }
        }
      }
    }
    GET /cluster_one:test1/_search
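    
    To verify the restore (task 4) and the remote-cluster connections (task 6), these read-only checks can help; a sketch:
    
    GET hamlet/_count
    GET _remote/info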
    

    3. Loading Data

    Exercise1

    1. Create an index hamlet-raw with one shard and three replicas
    2. Insert a document with id 1, default type, and one field line with the value "To be, or not to be: that is the question"
    3. Update the document with id 1, adding a field line_number with the value 3.1.64
    4. Insert a document with an auto-generated id, default type, a field text_entry with the value "tis nobler in the mind to suffer", and a field line_number with the value 3.1.66
    5. Update the previous document, setting line_number to 3.1.65
    6. In one single request, update all documents, adding a field speaker with the value hamlet
    7. Update the document with id 1, renaming the field line to text_entry
    8. Create a script named set_is_hamlet stored in the cluster state; it adds a field is_hamlet to each document, set to true if the document's speaker is HAMLET and to false otherwise; run it over all documents of hamlet with _update_by_query
    9. Use _delete_by_query to delete the documents of hamlet whose speaker is KING CLAUDIUS or LAERTES

    Solution:

    PUT hamlet
    {
      "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 1
      }
    }
    
    POST hamlet/_doc/1
    {
      "line":"To be, or not to be: that is the question"
    }
    
    POST hamlet/_update/1
    {
      "doc": {
        "line_number": "3.1.64"
      }
    }
    
    POST hamlet/_update/1
    {
      "script" : "ctx._source.line_number = '3.1.64'"
    }
    
    POST hamlet/_doc
    {
      "text_entry":"tis nobler in the mind to suffer",
      "line_number":"3.1.66"
    }
    
    POST hamlet/_update_by_query
    {
      "script": {
        "source": "ctx._source.line_number='3.1.65'",
        "lang": "painless"
      },
      "query": {
        "match": {
          "line_number": "3.1.66"
        }
      }
    }
    
    PUT _ingest/pipeline/speaker-Hamlet
    {
      "description": "speaker-Hamlet",
      "processors": [
        {
          "set": {
            "field": "speaker",
            "value": "Hamlet"
          }
        }
      ]
    }
    
    POST hamlet/_update_by_query?pipeline=speaker-Hamlet
    
    POST hamlet/_update_by_query
    {
      "script": {
        "source": "ctx._source.speaker = 'hamlet'",
        "lang": "painless"
      },
      "query": {
        "match_all": {}
      }
    }
    
    PUT _ingest/pipeline/p1
    {
      "processors": [
        {
          "rename": {
            "field": "line",
            "target_field": "text_entry"
          }
        }
      ]
    }
    
    POST hamlet/_update_by_query?pipeline=p1
    {
      "query": {
        "term": {
          "_id": "1"
        }
      }
    }
    
    POST _scripts/s1
    {
      "script": {
        "source": "if (ctx._source.speaker=='hamlet') { ctx._source.is_hamlet='true' } else{ctx._source.is_hamlet='false'}",
        "lang": "painless"
      }
    }
    
    GET _scripts/s1
    
    POST hamlet/_update_by_query
    {
      "script": {
        "id": "s1"
      },
      "query": {
        "match_all": {}
      }
    }
    
    POST hamlet/_delete_by_query
    {
      "query": {
        "bool": {
          "should": [
            {
              "term": {
                "speaker.keyword": {
                  "value": "KING CLAUDIUS"
                }
              }
            },
            {
              "term": {
                "speaker.keyword": {
                  "value": "LAERTES"
                }
              }
            }
          ]
        }
      }
    }
    

    Exercise2

    1. Create a template named hamlet_template matching indices starting with hamlet_ or hamlet-, with 1 primary shard and 0 replicas
    2. Create hamlet2 and hamlet_test and verify that only hamlet_test picked up the template
    3. Update hamlet_template with a mapping for type _doc containing the fields speaker, line_number and text_entry (english analyzer)
    4. Verify that the template update was not applied to the indices that already exist
    5. Delete hamlet2 and hamlet_test
    6. Create hamlet-1, add data and verify that the template was applied
    7. Update hamlet_template so that writes of fields not defined in the mapping are rejected
    8. Verify that an undefined field cannot be written to hamlet-1
      (dynamic mapping and dynamic templates:)
    9. Update hamlet_template to allow dynamic mappings: fields starting with number_ are mapped as integer, string fields are mapped as text
    10. Create hamlet-2, add a document and verify

    Solution:

    PUT _template/hamlet_template
    {
      "index_patterns": [
        "hamlet_*",
        "hamlet-*"
      ],
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      }
    }
    PUT hamlet_1
    PUT hamlettest
    GET hamlet_1,hamlettest
    
    PUT _template/hamlet_template
    {
      "index_patterns": [
        "hamlet_*",
        "hamlet-*"
      ],
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      },
      "mappings": {
        "properties": {
          "speaker": {
            "type": "keyword"
          },
          "line_number": {
            "type": "keyword"
          },
          "text_entry": {
            "type": "text",
            "analyzer": "english"
          }
        }
      }
    }
    GET hamlet_1,hamlettest
    DELETE hamlet_1,hamlettest
    
    POST _bulk
    {"index":{"_index":"hamlet-1","_id":0}}
    {"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?"}
    GET hamlet-1
    
    PUT _template/hamlet_template
    {
      "index_patterns": [
        "hamlet_*",
        "hamlet-*"
      ],
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      },
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "speaker": {
            "type": "keyword"
          },
          "line_number": {
            "type": "keyword"
          },
          "text_entry": {
            "type": "text",
            "analyzer": "english"
          }
        }
      }
    }
    DELETE hamlet-1
    
    POST _bulk
    {"index":{"_index":"hamlet-1","_id":0}}
    {"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?"}
    GET hamlet-1
    
    POST _bulk
    {"index":{"_index":"hamlet-1","_id":1}}
    {"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?","zxc":"a"}
    
    POST hamlet-2/_doc/4
    {
      "text_entry": "With turbulent and dangerous lunacy?",
      "line_number": "3.1.4",
      "number_act": "3",
      "speaker": "KING CLAUDIUS"
    }
    
    PUT _template/hamlet_template
    {
      "index_patterns": [
        "hamlet_*",
        "hamlet-*"
      ],
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      },
      "mappings": {
        "dynamic_templates": [//顺序优先生效
          {
            "my": {
              //"match_mapping_type": "string",
              "match": "number_*",
              "mapping": {
                "type": "integer"
              }
            }
          },
          {
            "strings": {
              "match_mapping_type": "string",
              "mapping": {
                "type": "text"
              }
            }
          }
        ]
      }
    }
    POST hamlet-2/_doc/4
    {
     "text_entry": "With turbulent and dangerous lunacy?",
     "line_number": "3.1.4",
     "number_act":3,
     "speaker": "KING CLAUDIUS"
    }
    GET hamlet-2
    

    Exercise3

    1. Create hamlet-1 and hamlet-2 with 2 shards and 1 replica each, and load data into them
    2. Create an alias hamlet pointing to both indices, and check that the document count of hamlet equals the sum of the two
    3. Make hamlet-1 the write index of hamlet
    4. Add a document to hamlet with id 8, type _doc, text_entry "With turbulent and dangerous lunacy?", line_number 3.1.4 and speaker KING CLAUDIUS
    5. Create a script control_reindex_batch stored in the cluster state; it checks whether a document already has a reindexBatch field: if so, it increments it by the increment parameter, otherwise it adds the field with the value 1
    6. Create hamlet-new with 2 primary shards and 0 replicas; reindex the hamlet alias into hamlet-new, applying control_reindex_batch with increment set to 1, reindexing with two parallel slices
    7. Point the hamlet alias to hamlet-new and remove hamlet-1 and hamlet-2 from it
    8. Create a pipeline split_act_scene_line that splits line_number on "." into three parts stored in number_act, number_scene and number_line
    9. Test the pipeline with _simulate on the doc {"line_number": "1.2.3"}
    10. Update all documents of hamlet-new with this pipeline

    Solution:
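    
    Task 1 (and the hamlet-new index required by task 6) has no snippet in the original solution; a minimal sketch with one sample document per index:
    
    PUT hamlet-1
    {
      "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
      }
    }
    PUT hamlet-2
    {
      "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
      }
    }
    POST _bulk
    {"index":{"_index":"hamlet-1","_id":0}}
    {"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?"}
    {"index":{"_index":"hamlet-2","_id":0}}
    {"line_number":"3.1.4","speaker":"KING CLAUDIUS","text_entry":"With turbulent and dangerous lunacy?"}
    PUT hamlet-new
    {
      "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 0
      }
    }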

    POST /_aliases
    {
      "actions": [
        {
          "add": {
            "indices": [
              "hamlet-1",
              "hamlet-2"
            ],
            "alias": "hamlet"
          }
        }
      ]
    }
    POST hamlet/_count
    POST /_aliases
    {
      "actions": [
        {
          "add": {
            "index": "hamlet-1",
            "alias": "hamlet",
            "is_write_index": true
          }
        },
        {
          "add": {
            "index": "hamlet-2",
            "alias": "hamlet"
          }
        }
      ]
    }
    POST hamlet/_doc/8
    {
      "text_entry": "With turbulent and dangerous lunacy?",
      "line_number": "3.1.4",
      "speaker": "KING CLAUDIUS"
    }
    
    POST _scripts/control_reindex_batch
    {
      "script": {
        "source": """
        if (ctx._source['reindexBatch']!=null) {
          ctx._source['reindexBatch']+=params.increment 
        } else{
          ctx._source['reindexBatch']=1
        }""",
        "lang": "painless"
      }
    }
    POST _reindex?slices=2
    {
      "source": {
        "index": "hamlet"
      },
      "dest": {
        "index": "hamlet-new"
      },
      "script": {
        "id": "control_reindex_batch",
        "params": {
          "increment": 1
        }
      }
    }
    POST /_aliases
    {
      "actions": [
        {
          "remove": {
            "index": "hamlet-1",
            "alias": "hamlet"
          }
        },
        {
          "remove": {
            "index": "hamlet-2",
            "alias": "hamlet"
          }
        },
        {
          "add": {
            "index": "hamlet-new",
            "alias": "hamlet"
          }
        }
      ]
    }
    
    PUT _ingest/pipeline/split_act_scene_line
    {
      "processors": [
        {
          "split": {
            "field": "line_number",
            "separator": "\\."
          }
        },
        {
          "script": {
            "source": """
                ctx.number_act = ctx.line_number[0];//ctx.line_number.size() gives the array length
                ctx.number_scene = ctx.line_number[1];
                ctx.number_line = ctx.line_number[2];
            """
          }
        }
      ]
    }
    or:
    {
      "set": {
        "field": "number_line",
        "value": "{{line_number.2}}"
      }
    }
    
    POST _ingest/pipeline/split_act_scene_line/_simulate
    {
      "docs": [
        {
          "_source": {
            "line_number": "1.2.3"
          }
        }
      ]
    }
    
    POST hamlet-new/_update_by_query?pipeline=split_act_scene_line
    

    4. Mappings and Text Analysis

    Exercise1

    1. Create an index hamlet_1 with 1 shard and 0 replicas, a _doc type with the three fields speaker, line_number and text_entry; speaker and line_number are non-analyzed strings
    2. Update the mapping to disable aggregations on line_number (mappings cannot be updated for this, so the index has to be recreated)
    3. Create an index hamlet_2 with 1 shard and 0 replicas, copy the mapping of hamlet_1 into it and make speaker a multi-field with a sub-field tokens of the default analyzed string type
    4. Reindex hamlet_1 into hamlet_2
    5. Verify full-text search on speaker.tokens:
      Solution:
    PUT hamlet_1
    {
      "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 1
      },
      "mappings": {
        "properties": {
          "speaker": {
            "type": "keyword"
          },
          "line_number": {
            "type": "keyword",
             "doc_values": false
          },
          "text_entry": {
            "type": "text"
          }
        }
      }
    }
    POST /hamlet_1/_search?size=0
    {
      "aggs": {
        "t1": {
          "terms": {
            "field": "line_number"
          }
        }
      }
    }
    
    PUT hamlet_2
    {
      "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 1
      },
      "mappings": {
        "properties": {
          "speaker": {
            "type": "keyword",
            "fields": {
              "tokens": {
                "type": "text"
              }
            }
          },
          "line_number": {
            "type": "keyword",
            "doc_values": false
          },
          "text_entry": {
            "type": "text"
          }
        }
      }
    }
    POST _reindex
    {
      "source": {
        "index": "hamlet_1"
      },
      "dest": {
        "index": "hamlet_2"
      }
    }
    GET hamlet_2/_search
    {
      "query": {
        "match": {
          "speaker.tokens": "hamlet"
        }
      }
    }
    

    Exercise2

    1. Create hamlet_1, load relational data into it, run a query and notice that the results are not what you expect
    2. Create hamlet_2 with a mapping in which the relationship field can be searched correctly (nested)
    3. Create hamlet_3 with 1 shard and 0 replicas, copy the mapping of hamlet_2 into it and add a join field character_or_line, with character as parent and line as child
    4. Reindex hamlet_2 into hamlet_3
    5. Create a script init_lines with a characterId parameter; it adds a character_or_line object to the document, sets character_or_line.name to line and character_or_line.parent to the value of the characterId parameter
    6. Update the HAMLET character document (id C0), adding character_or_line with name set to character
    7. Use init_lines to update the documents whose speaker is HAMLET, setting characterId to C0 so that they point to the HAMLET character parent document
    8. Run a has_parent query to verify

    Solution:

    PUT hamlet_1
    {
      "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 1
      }
    }
    POST _bulk
    {"index":{"_index":"hamlet_1","_id":"C0"}}
    {"name":"HAMLET","relationship":[{"name":"HORATIO","type":"friend"},{"name":"GERTRUDE","type":"mother"}]}
    {"index":{"_index":"hamlet_1","_id":"C1"}}
    {"name":"KING CLAUDIUS","relationship":[{"name":"HAMLET","type":"nephew"}]}
    
    GET hamlet_1/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "relationship.name": "gertrude"
              }
            },
            {
              "match": {
                "relationship.type": "friend"
              }
            }
          ]
        }
      }
    }
    
    PUT hamlet_2
    {
      "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 1
      },
      "mappings": {
        "properties": {
          "name": {
            "type": "keyword"
          },
          "relationship": {
            "type": "nested",
            "properties": {
              "name": {
                "type": "keyword"
              },
              "type": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
    GET hamlet_2/_search
    {
      "query": {
        "nested": {
          "path": "relationship",
          "query": {
            "bool": {
              "must": [
                {
                  "match": {
                    "relationship.name": "GERTRUDE"
                  }
                },
                {
                  "match": {
                    "relationship.type": "mother"
                  }
                }
              ]
            }
          }
        }
      }
    }
    
    PUT hamlet_3
    {
      "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 1
      },
      "mappings": {
        "properties": {
          "name": {
            "type": "keyword"
          },
          "relationship": {
            "type": "nested",
            "properties": {
              "name": {
                "type": "keyword"
              },
              "type": {
                "type": "keyword"
              }
            }
          },
          "character_or_line": { 
            "type": "join",
            "relations": {
              "character": "line" 
            }
          }
        }
      }
    }
    POST _reindex
    {
      "source": {
        "index": "hamlet_2"
      },
      "dest": {
        "index": "hamlet_3"
      }
    }
    
    PUT _scripts/init_lines
    {
      "script": {
        "lang": "painless",
        "source": """
            HashMap map = new HashMap();
            map.name = 'line';
            map.parent = params.characterId;
            ctx._source.character_or_line = map;
        """
      }
    }
    PUT hamlet_3/_doc/C0?refresh
    {
      "character_or_line": "character" 
    }
    POST hamlet_3/_update_by_query?routing=C0
    {
      "script": {
        "id": "init_lines",
        "params": {
          "characterId": "C0"
        }
      },
      "query": {
        "match": {
          "speaker": "HAMLET"
        }
      }
    }   
    GET hamlet_3/_search
    {
      "query": {
        "has_parent": {
          "parent_type": "character",
          "query": {
            "match": {
              "name": "HAMLET"
            }
          }
        }
      }
    }
    

    Exercise3

    1. Create hamlet_1 with a mapping of three fields, speaker, line_number and text_entry, with text_entry using the english analyzer
    2. Create hamlet_2 with a custom analyzer shy_hamlet_analyzer made of: a char filter that replaces Hamlet with CENSORED; a tokenizer that splits tokens on whitespace and colons; a token filter that drops tokens shorter than 5 characters. Map the text_entry field of hamlet_2 to this analyzer
    3. Verify shy_hamlet_analyzer with the analyze API
    4. Reindex hamlet_1 into hamlet_2 and search for censored to verify that it worked

    Solution:
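    
    The original solution only covers hamlet_2; a minimal sketch for tasks 1 and 4 (run the reindex after hamlet_2 below has been created):
    
    PUT hamlet_1
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      },
      "mappings": {
        "properties": {
          "speaker": { "type": "keyword" },
          "line_number": { "type": "keyword" },
          "text_entry": { "type": "text", "analyzer": "english" }
        }
      }
    }
    
    POST _reindex
    {
      "source": { "index": "hamlet_1" },
      "dest": { "index": "hamlet_2" }
    }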

    PUT hamlet_2
    {
      "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 1,
        "analysis": {
          "analyzer": {
            "shy_hamlet_analyzer": {
              "type": "custom",
              "tokenizer": "my_tokenizer",
              "char_filter": [
                "my_char_filter"
              ],
              "filter": [
                "my_filter"
              ]
            }
          },
          "tokenizer": {
            "my_tokenizer": {
              "type": "char_group",
              "tokenize_on_chars": [
                "whitespace",
                "\n"
              ]
            }
          },
          "char_filter": {
            "my_char_filter": {
              "type": "mapping",
              "mappings": [
                "Hamlet => CENSORED"
              ]
            }
          },
          "filter": {
            "my_filter": {
              "type": "length",
              "min": "5"
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "speaker": {
            "type": "keyword"
          },
          "line_number": {
            "type": "keyword"
          },
          "text_entry": {
            "type": "text",
            "analyzer": "shy_hamlet_analyzer"
          }
        }
      }
    }
    POST hamlet_2/_analyze
    {
      "analyzer": "shy_hamlet_analyzer", 
      "text": "Though yet of Hamlet our dear brothers death"
    }
    POST hamlet_2/_search
    {
      "query": {
        "match": {
          "text_entry": "CENSORED"
        }
      }
    }
    

    5. Queries and Aggregations

    Exercise1

    1. Add the Kibana sample web logs and sample eCommerce data
    2. Search the logs index for documents whose message contains firefox; the case of firefox does not affect the result, because the standard analyzer lowercases all tokens before storing them
    3. Search for documents whose message contains firefox, returning 50 hits per page, then return the second page of 50
    4. Page through the same query using search_after
    5. Search for documents whose message contains firefox or kibana
    6. Search for documents whose message contains both firefox and kibana; then search for documents whose message contains at least two of firefox, kibana and 159.64.35.129
    7. Search for documents whose message contains firefox or kibana, highlighting the message field wrapped in {{ }}
    8. Search for documents whose message contains the phrase HTTP/1.1 200 51
    9. Search for documents whose message contains the phrase HTTP/1.1 200 51, sorting by machine.os desc and then by timestamp asc
    10. Search the eCommerce index for documents whose day_of_week is Monday, sorting by products.base_price desc and using the minimum value of the array for sorting

    Solution:

    POST kibana_sample_data_logs/_search
    {
      "query": {
        "match": {
          "message": "firefox"
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "from": 0,//设置为50返回第二页
      "size": 50, 
      "query": {
        "match": {
          "message": "firefox"
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "size": 5,
      "query": {
        "match": {
          "message": "firefox"
        }
      },
      "sort": [
        {
          "_id": "asc"
        }
      ]
    }
    
    POST kibana_sample_data_logs/_search
    {
      "size": 5,
      "query": {
        "match": {
          "message": "firefox"
        }
      },
      "search_after": [
        "zmHi7HQBoeWdfd48WTl-"//使用第一次返回的最后一个_id进行查询
      ],
      "sort": [
        {
          "_id": "asc"
        }
      ]
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "match": {
          "message": "firefox kibana"
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "match": {
          "message": {
            "query": "firefox kibana",
            "operator": "and"
          }
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "match": {
          "message": {
            "query": "firefox kibana 159.64.35.129",
            "operator": "or",
            "minimum_should_match": 2
          }
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "match": {
          "message": "firefox kibana"
        }
      },
      "highlight": {
        "fields": {
          "message": {
            "pre_tags": ["{{"],
            "post_tags": ["}}"]
          }
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "match_phrase": {
          "message": "HTTP/1.1 200 51"
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "match_phrase": {
          "message": "HTTP/1.1 200 51"
        }
      },
      "sort": [
        {
          "machine.os.keyword": {
            "order": "desc"
          },
          "timestamp": {
            "order": "asc"
          }
        }
      ]
    }
    
    POST kibana_sample_data_ecommerce/_search
    {
      "query": {
        "match": {
          "day_of_week": "Monday"
        }
      },
      "sort": [
        {
          "products.base_price": {
            "order": "desc",
            "mode": "min"
          }
        }
      ]
    }
    

    Exercise2

    1. Filter the logs index for documents whose response is ≥400 and <500, and whose referer is http://twitter.com/success/guion-bluford
    2. Filter for documents whose referer starts with http://twitter.com/success; filter for documents whose request starts with /people
    3. Filter for documents in which memory has a value; filter for documents in which memory has no value
    4. Search for documents whose agent contains windows, whose url contains name:john, and whose phpmemory is not empty
    5. Search for documents whose response is ≥400 or whose tags contain error
    6. Search for documents whose tags contain none of warning, error, info
    7. Filter for documents whose timestamp lies between one week ago and today
    8. Query the kibana_sample_data_flights index, filtering on documents whose OriginCityName or DestCityName matches Sydney, allowing inexact matches with a maximum Levenshtein edit distance of 2. Check that Sydney, Sidney and Sidnei return the same results

    Solution:

    POST kibana_sample_data_logs/_search
    {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "response": {
                  "gte": 400,
                  "lt": 500
                }
              }
            },
            {
              "term": {
                "referer": "http://twitter.com/success/guion-bluford"
              }
            }
          ]
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "bool": {
          "filter": {
            "prefix": {
              "referer": "http://twitter.com/succes"
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "bool": {
          "filter": {
            "match_phrase_prefix": {
              "request": {
                "query": "/people"
              }
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "bool": {
          "filter": {
             "exists": {
                "field": "memory"
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    { 
      "query": {
        "bool": {
          "must_not": {
             "exists": {
                "field": "memory"
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "_source": ["agent","url","phpmemory"], 
      "query": {
        "bool": {
          "filter": [
            {
              "term": {
                "agent": "windows"
              }
            },
            {
              "term": {
                "url": "name:john"
              }
            },
            {
              "exists": {
                "field": "phpmemory"
            }
            }
          ]
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "bool": {
          "should": [
            {
              "range": {
                "response": {
                  "gte": 400
                }
              }
            },
            {
              "term": {
                "tags": "error"
              }
            }
          ]
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "bool": {
          "must_not": [
            {
              "terms": {
                "tags": [
                  "warning",
                  "error",
                  "info"
                ]
              }
            }
          ]
        }
      }
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "timestamp": {
                "gte": "now-1w/d",
                "lte":"now/d"
              }
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "_source": [
        "DestCityName",
        "OriginCityName"
      ],
      "query": {
        "bool": {
          "should": [
            {
              "bool": {
                "filter": {
                  "fuzzy": {
                    "OriginCityName": {
                      "value": "Sydney",
                      " fuzziness ": 2
                    }
                  }
                }
              }
            },
            {
              "bool": {
                "filter": {
                  "fuzzy": {
                    "DestCityName": {
                      "value": "Sydney",
                      " fuzziness ": 2
                    }
                  }
                }
              }
            }
          ],
          "minimum_should_match": 1
        }
      }
    }
    

    Exercise3

    1. Use scroll to fetch the first 100 documents of all indices, keeping the search context alive for 2 minutes, then use the returned scroll id to fetch the next batch
    2. Query the kibana_sample_data_logs index, filtering on response ≥400
    3. Build a search template with_response_and_tag with a parameter with_min_response for the minimum response value, with_max_response for the maximum, and with_tag for a value that tags may contain
    4. Test with_response_and_tag with the parameters 400, 500 and security
    5. Update the template so that if with_max_response is not set, no upper bound is applied, and if with_tag is not set, that filter is skipped entirely
    6. Test the template with only with_min_response set to 500; then test it with min=500 and tags=security

    Solution:

    POST kibana_sample*/_search?scroll=2m
    {
      "size": 100,
      "query": {
        "match_all": {}
      }
    }
    
    POST /_search/scroll 
    {
      "scroll" : "2m", 
      "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAASGqFnJDN0NmdS0tUm5hU08wcFBHOWFPMGcAAAAAAAEhqxZyQzdDZnUtLVJuYVNPMHBQRzlhTzBnAAAAAAABIawWckM3Q2Z1LS1SbmFTTzBwUEc5YU8wZw==" 
    }
    
    POST kibana_sample_data_logs/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "response": {
                "gte": 400
              }
            }
          }
        }
      }
    }
    
    POST _scripts/with_response_and_tag
    {
      "script": {
        "lang": "mustache",
        "source": {
          "query": {
            "bool": {
              "filter": [
                {
                  "range": {
                    "response": {
                      "gte": "{{with_min_response}}",
                      "lte": "{{with_max_response}}"
                    }
                  }
                },
                {
                  "term": {
                    "tags": "{{with_tag}}"
                  }
                }
              ]
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_logs/_search/template
    {
      "id": "with_response_and_tag",
      "params": {
        "with_min_response": 400,
        "with_max_response": 500,
        "with_tag": "security"
      }
    }
    
    //for a detailed walkthrough of search templates see: https://www.jianshu.com/p/11412cc44c46
    POST _scripts/with_response_and_tag
    {
      "script": {
        "lang": "mustache",
        "source":"{\"query\":{\"bool\":{\"filter\":[{\"range\": {\"response\":{\"gte\":\"{{with_min_response}}\"{{#with_max_response}},{{/with_max_response}}{{#with_max_response}}\"lte\":\"{{with_max_response}}\"{{/with_max_response}}}}}{{#with_tag}},{{/with_tag}}{{#with_tag}}{\"term\": {\"tags\":\"{{with_tag}}\"}}{{/with_tag}}]}}}"
      }
    }
    
    POST kibana_sample_data_logs/_search/template
    {
      "id": "with_response_and_tag",
      "params": {
        "with_min_response": 400
      }
    }
    POST kibana_sample_data_logs/_search/template
    {
      "id": "with_response_and_tag",
      "params": {
        "with_min_response": 400,
        "with_max_response": 500,
        "with_tag": "security"
      }
    }
    

    Exercise4

    1. Create an aggregation max_distance that computes the maximum of the DistanceKilometers field
    2. Create an aggregation stats_flight_time that computes stats on the FlightTimeMin field
    3. Create two aggregations, cardinality_origin_cities and cardinality_dest_cities, that compute the number of distinct OriginCityName and DestCityName values
    4. Create an aggregation popular_origin_cities that counts flights grouped by OriginCityName, returning only the 5 buckets with the highest counts, sorted desc
    5. Create an aggregation avg_price_histogram that groups documents on the AvgTicketPrice field in intervals of 250
    6. Create an aggregation popular_carriers that counts flights grouped by Carrier; add a sub-aggregation carrier_stats_delay that computes stats on FlightDelayMin per carrier bucket, and another sub-aggregation carrier_max_delay that shows the maximum FlightDelayMin per carrier bucket
    7. Using the timestamp field, create an aggregation flights_every_10_days that groups flights in 10-day intervals
    8. Using the timestamp field, create an aggregation flights_by_day that groups flights by day; add a sub-aggregation destinations_by_day that further groups each day's bucket by DestCityName
    9. Add a sub-aggregation popular_destinations_by_day under destinations_by_day that returns the top three documents of each bucket (sorted by score)
    10. Update popular_destinations_by_day so that the top-hit objects show only the DestCityName field

    Solution:

    POST kibana_sample_data_flights/_search
    {
      "size": 0, 
      "aggs": {
        "max_distance": {
          "max": {
            "field": "DistanceKilometers"
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0, 
      "aggs": {
        "stats_flight_time": {
          "stats": {
            "field": "FlightTimeMin"
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0,
      "aggs": {
        "cardinality_origin_cities": {
          "cardinality": {
            "field": "OriginCityName"
          }
        },
        "cardinality_dest_cities": {
          "cardinality": {
            "field": "DestCityName"
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0, 
      "aggs": {
        "popular_origin_cities": {
          "terms": {
            "field": "OriginCityName",
            "order" : { "_count" : "desc" },
            "size" : 5
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0, 
      "aggs": {
        "avg_price_histogram": {
          "histogram": {
            "field": "AvgTicketPrice",
            "interval": 250
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0,
      "aggs": {
        "popular_carriers": {
          "terms": {
            "field": "Carrier"
          },
          "aggs": {
            "carrier_stats_delay": {
              "stats": {
                "field": "FlightDelayMin"
              }
            },
            "carrier_max_delay": {
              "max": {
                "field": "FlightDelayMin"
              }
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0, 
      "aggs": {
        "flights_every_10_days": {
          "date_histogram": {
            "field": "timestamp",
            "fixed_interval": "10d"
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0, 
      "aggs": {
        "flights_by_day": {
          "date_histogram": {
            "field": "timestamp",
            "calendar_interval": "day"
          },
          "aggs": {
            "destinations_by_day": {
              "terms": {
                "field": "DestCityName"
              }
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0,
      "aggs": {
        "flights_by_day": {
          "date_histogram": {
            "field": "timestamp",
            "calendar_interval": "day"
          },
          "aggs": {
            "destinations_by_day": {
              "terms": {
                "field": "DestCityName"
              },
              "aggs": {
                "popular_destinations_by_day": {
                  "top_hits": {
                    "sort": [
                      {
                        "_score": {
                          "order": "desc"
                        }
                      }
                    ],
                    "size": 3
                  }
                }
              }
            }
          }
        }
      }
    }
    
    POST kibana_sample_data_flights/_search
    {
      "size": 0,
      "aggs": {
        "flights_by_day": {
          "date_histogram": {
            "field": "timestamp",
            "calendar_interval": "day"
          },
          "aggs": {
            "destinations_by_day": {
              "terms": {
                "field": "DestCityName"
              },
              "aggs": {
                "popular_destinations_by_day": {
                  "top_hits": {
                    "sort": [
                      {
                        "_score": {
                          "order": "desc"
                        }
                      }
                    ],
                    "_source": "DestCityName", 
                    "size": 3
                  }
                }
              }
            }
          }
        }
      }
    }
    
    1. Remove the popular_destinations_by_day sub-aggregation from flights_by_day; add a pipeline aggregation most_popular_destination_of_the_day that identifies, per day, the destinations_by_day bucket with the most documents (the most popular destination of that day); add a pipeline aggregation day_with_most_flights that identifies the flights_by_day bucket with the most documents; add a pipeline aggregation day_with_the_most_popular_destination_over_all_days that identifies the maximum most_popular_destination_of_the_day value across the flights_by_day buckets

    Solution:

    1. Group the flights by day;
    2. within each day, group again by destination;
    3. find the most popular destination of each day;
    4. find the day with the most flights;
    5. find the maximum of step 3 across all days.

    POST kibana_sample_data_flights/_search
    {
      "size": 0,
      "aggs": {
        "flights_by_day": {
          "date_histogram": {
            "field": "timestamp",
            "fixed_interval": "1d"
          },
          "aggs": {
            "destinations_by_day": {
              "terms": {
                "field": "DestCityName"
              }
            },
            "most_popular_destination_of_the_day": {
              "max_bucket": {
                "buckets_path": "destinations_by_day>_count"
              }
            }
          }
        },
        "day_with_most_flights": {
          "max_bucket": {
            "buckets_path": "flights_by_day._count"
          }
        },
        "day_with_the_most_popular_destination_over_all_days":{
          "max_bucket": {
            "buckets_path": "flights_by_day.most_popular_destination_of_the_day"
          }
        }
      }
    }
    
