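Many beginners have legacy data made up of HTML fragments generated by a rich-text editor. If that data is imported into Elasticsearch unchanged, the HTML tags show up in highlighted search results, which makes for a poor experience. When syncing through Logstash, use a filter to strip the HTML first: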
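filter {
  # Rules run top to bottom. (?m) lets "." match newlines in Ruby regex,
  # so multi-line <script>, <iframe> and <style> blocks are removed whole.
  mutate {
    gsub => [ "content", "(?m)<script(.*?)</script>", "" ]
  }
  mutate {
    gsub => [ "content", "(?m)<iframe(.*?)</iframe>", "" ]
  }
  mutate {
    gsub => [ "content", "(?m)<style(.*?)</style>", "" ]
  }
  # Remove every remaining tag.
  mutate {
    gsub => [ "content", "(?m)<(.*?)>", "" ]
  }
  # Remove the space characters left in the text.
  mutate {
    gsub => [ "content", " ", "" ]
  }
}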
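Before putting the patterns into a pipeline, it can help to try them on a sample string. The following is a minimal Python sketch of the same rule chain, not part of the Logstash setup itself. The function name strip_html and the sample text are made up for illustration. Logstash patterns are Ruby regular expressions, where "." only matches a newline when (?m) is set; the sketch uses re.DOTALL for the same effect, so multi-line script and style blocks are removed as a whole.

```python
import re

# (pattern, replacement) pairs, in the same order as the gsub rules.
RULES = [
    (r"<script(.*?)</script>", ""),
    (r"<iframe(.*?)</iframe>", ""),
    (r"<style(.*?)</style>", ""),
    (r"<(.*?)>", ""),   # any remaining tag
    (r" ", ""),         # leftover space characters
]


def strip_html(content):
    """Apply each rule in order; re.DOTALL lets '.' span newlines."""
    for pattern, replacement in RULES:
        content = re.sub(pattern, replacement, content, flags=re.DOTALL)
    return content


sample = "<p>Hello <b>world</b></p><script>\nalert(1);\n</script>"
print(strip_html(sample))  # Helloworld
```

The order matters: if the generic tag rule ran first, it would delete the script tags but leave the script body behind as text.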
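Many fields need to be converted in MySQL first, especially time-type fields, and the format must also be specified when the index is created: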
"format"=>"yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis||strict_date_optional_time"
SELECT a.id, a.title, b.content, b.content AS content_old,
       CONCAT(a.addtime) AS addtime, CONCAT(a.autotime) AS autotime,
       a.views, a.zans, a.type_a, a.type_b,
       CONCAT(a.isshow) AS isshow, CONCAT(a.isdelete) AS isdelete,
       if(isnull(a.deletetime),0,a.deletetime) AS deletetime
FROM web_information a
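The "||" in the format string separates alternatives: a value is accepted if any one of the listed formats parses it. The Python sketch below is a rough way to check which alternative a value coming out of the query would fall under. It only approximates the Elasticsearch parsers (Python's strptime and fromisoformat are more lenient than the real ones), and the function name matched_format is hypothetical.

```python
from datetime import datetime

# Elasticsearch pattern name -> approximate Python strptime equivalent.
STRPTIME_FORMATS = [
    ("yyyy-MM-dd HH:mm:ss", "%Y-%m-%d %H:%M:%S"),
    ("yyyy-MM-dd", "%Y-%m-%d"),
]


def matched_format(value):
    """Return the name of the first format that accepts value, else None."""
    text = str(value).strip()
    for es_name, py_format in STRPTIME_FORMATS:
        try:
            datetime.strptime(text, py_format)
            return es_name
        except ValueError:
            continue
    # epoch_millis: a plain integer count of milliseconds.
    if text.lstrip("-").isdigit():
        return "epoch_millis"
    # strict_date_optional_time: approximated here with ISO 8601 parsing.
    try:
        datetime.fromisoformat(text)
        return "strict_date_optional_time"
    except ValueError:
        return None


print(matched_format("2024-01-02 03:04:05"))  # yyyy-MM-dd HH:mm:ss
print(matched_format(0))                      # epoch_millis
```

Note that the 0 which the query substitutes for a NULL deletetime falls under epoch_millis here, so such rows are not rejected by the date field.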