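1. ES use cases
- Adding search to a website or app
- Storing and analyzing data
- Managing, interacting with, and analyzing spatial information (using ES for GIS)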
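2. Introduction to ES
- Elasticsearch is an open-source, distributed full-text search engine built on Lucene and exposed through a RESTful interface.
- Elasticsearch is also a distributed document database.
- Elasticsearch can store and search large volumes of data in a very short time.
- Elasticsearch scales out horizontally very well.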
3. History of ES
4. ES architecture
5. Installing ES on Linux
See my other article: Installing Elasticsearch (ES) on Linux
- A test example
# Create the index songs_v1
PUT /songs_v1
# A type partitions the data logically
PUT /songs_v1/_mappings/popular
{
"properties": {
"songName": {"type": "text"},
"singer": {"type": "text"},
"lyrics": {"type": "text"}
}
}
# Index data into ES
# songName: take me to your heart
# singer: Michael Learns To Rock
# lyrics: Hiding from the rain and snow, Trying to forget but I won't let go, Looking at a crowded street, Listening to my own heart beat, So many people all around the world, Tell me where do I find someone like you girl, Take me to your heart, Take me to your soul, Give me your hand before I'm old, Show me what love is haven't got a clue, Show me that wonders can be true, They say nothing lasts forever, We're only here today, Love is now or never, Bring me far away, Take me to your heart, Take me to your soul, Give me your hand and hold me, Show me what love is be my guiding star, It's easy take me to your heart, Standing on a mountain high, Looking at the moon through a clear blue sky, I should go and see some friends, But they don't really comprehend, Don't need too much talking without saying anything, All I need is someone who makes me wanna sing, Take me to your heart, Take me to your soul, Give me your hand before I'm old, Show me what love is haven't got a clue, Show me that wonders can be true, They say nothing lasts forever, We're only here today, Love is now or never, Bring me far away, Take me to your heart, Take me to your soul, Give me your hand and hold me, Show me what love is be my guiding star, It's easy take me to your heart, Take me to your heart, Take me to your soul, Give me your hand and hold me, Show me what love is be my guiding star, It's easy take me to your heart.
# songName: you are beautiful
# singer: James Blunt
# lyrics: My life is brilliant, My love is pure, I saw an angel, Of that I'm sure, She smiled at me on the subway, She was with another man, But I won't lose no sleep on that, 'Cause I've got a plan, You're beautiful, You're beautiful, You're beautiful it's true, I saw your face in a crowded place, And I don't know what to do, 'Cause I'll never be with you, Yeah she caught my eye, As we walked on by, She could see from my face that I was flying high, And I don't think that I'll see her again, But we shared a moment that will last till the end, You're beautiful, You're beautiful, You're beautiful it's true, I saw your face in a crowded place, And I don't know what to do, 'Cause I'll never be with you, You're beautiful, You're beautiful, You're beautiful it's true, There must be an angel with a smile on her face, When she thought up that I should be with you, But it's time to face the truth, I will never be with you.
POST /songs_v1/popular
{
"songName":"take me to your heart",
"singer":"Michael Learns To Rock",
"lyrics":"Hiding from the rain and snow, Trying to forget but I won't let go, Looking at a crowded street, Listening to my own heart beat, So many people all around the world, Tell me where do I find someone like you girl, Take me to your heart, Take me to your soul, Give me your hand before I'm old, Show me what love is haven't got a clue, Show me that wonders can be true, They say nothing lasts forever, We're only here today, Love is now or never, Bring me far away, Take me to your heart, Take me to your soul, Give me your hand and hold me, Show me what love is be my guiding star, It's easy take me to your heart, Standing on a mountain high, Looking at the moon through a clear blue sky, I should go and see some friends, But they don't really comprehend, Don't need too much talking without saying anything, All I need is someone who makes me wanna sing, Take me to your heart, Take me to your soul, Give me your hand before I'm old, Show me what love is haven't got a clue, Show me that wonders can be true, They say nothing lasts forever, We're only here today, Love is now or never, Bring me far away, Take me to your heart, Take me to your soul, Give me your hand and hold me, Show me what love is be my guiding star, It's easy take me to your heart, Take me to your heart, Take me to your soul, Give me your hand and hold me, Show me what love is be my guiding star, It's easy take me to your heart."
}
POST /songs_v1/popular
{
"songName":"you are beautiful",
"singer":"James Blunt",
"lyrics":"My life is brilliant, My love is pure, I saw an angel, Of that I'm sure, She smiled at me on the subway, She was with another man, But I won't lose no sleep on that, 'Cause I've got a plan, You're beautiful, You're beautiful, You're beautiful it's true, I saw your face in a crowded place, And I don't know what to do, 'Cause I'll never be with you, Yeah she caught my eye, As we walked on by, She could see from my face that I was flying high, And I don't think that I'll see her again, But we shared a moment that will last till the end, You're beautiful, You're beautiful, You're beautiful it's true, I saw your face in a crowded place, And I don't know what to do, 'Cause I'll never be with you, You're beautiful, You're beautiful, You're beautiful it's true, There must be an angel with a smile on her face, When she thought up that I should be with you, But it's time to face the truth, I will never be with you."
}
# Search by singer
GET /songs_v1/_search?q=singer:james
# Search by lyrics
GET /songs_v1/_search?q=lyrics:My life is brilliant
# Delete a document by ID (this ID was auto-generated; use the one from your own index)
DELETE /songs_v1/popular/-PBoXIQBSpIcVAlSbEmr
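6. ES core concepts
In the original design of ES, an index was treated as roughly the level of a database and isolated data physically, a type corresponded to a table in a database and partitioned data logically, and a document was a single data record in ES.
This design began to change after ES 5.6: newer versions gradually phase out the concept of a type (an index created in 6.x may contain only one type) until 8.0 removes it.
7. ES Java Client
A simple Java client example. It is included only as an aid to understanding; this client (TransportClient) is generally no longer used:
- Edit pom.xml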
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>cn.lazyfennec</groupId>
<artifactId>es-study</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>es-study</name>
<description>Demo project for Spring Boot</description>
<properties>
<java.version>1.8</java.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<!--<version>2.7.5</version>-->
<version>1.5.1.RELEASE</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<dependencies>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>6.4.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>6.4.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.1.35</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
- Write a test example, SearchDemo
package cn.lazyfennec.es.search;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
/**
* @Author: Neco
* @Description:
* @Date: create in 2022/11/9 22:47
*/
public class SearchDemo {
    public static void main(String[] args) throws UnknownHostException {
        // Configuration
        Settings settings = Settings.builder()
                .put("cluster.name", "es-study") // cluster name
                .put("client.transport.sniff", true) // enable sniffing
                .build();
        /*
         * Sniffing: the client only needs the connection details of one ES node.
         * Once connected, and if sniffing is enabled, it pulls the details of the
         * other nodes from the server automatically, so we do not have to configure
         * a long list of server addresses.
         */
        // Build the client
        TransportClient client = new PreBuiltTransportClient(settings);
        // Specify the IP and port
        client.addTransportAddress(new TransportAddress(InetAddress.getByName("192.168.1.6"), 9300));
        try {
            // Build the request
            SearchRequest request = new SearchRequest();
            request.indices("songs_v1");
            request.types("popular");
            // Search definition
            SearchSourceBuilder search = new SearchSourceBuilder();
            search.query(QueryBuilders.matchQuery("songName", "you are beautiful"));
            search.timeout(new TimeValue(60, TimeUnit.SECONDS));
            // Attach the search definition to the request
            request.source(search);
            // Execute the request
            SearchResponse response = client.search(request).get();
            // Get the matching documents
            SearchHits hits = response.getHits();
            SearchHit[] hitArr = hits.getHits();
            System.out.println("Found " + hits.totalHits + " document(s)");
            // Process the documents
            for (SearchHit hit : hitArr) {
                // Print metadata
                System.out.println(hit.getType() + "," + hit.getScore());
                // Print the source document
                String sourceAsString = hit.getSourceAsString();
                System.out.println(sourceAsString);
            }
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        } finally {
            client.close();
        }
    }
}
- Output
Found 1 document(s)
popular,0.8630463
{
"songName":"you are beautiful",
"singer":"James Blunt",
"lyrics":"My life is brilliant, My love is pure, I saw an angel, Of that I'm sure, She smiled at me on the subway, She was with another man, But I won't lose no sleep on that, 'Cause I've got a plan, You're beautiful, You're beautiful, You're beautiful it's true, I saw your face in a crowded place, And I don't know what to do, 'Cause I'll never be with you, Yeah she caught my eye, As we walked on by, She could see from my face that I was flying high, And I don't think that I'll see her again, But we shared a moment that will last till the end, You're beautiful, You're beautiful, You're beautiful it's true, I saw your face in a crowded place, And I don't know what to do, 'Cause I'll never be with you, You're beautiful, You're beautiful, You're beautiful it's true, There must be an angel with a smile on her face, When she thought up that I should be with you, But it's time to face the truth, I will never be with you."
}
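9. Using index aliases
Giving an index an alias in Elasticsearch is a very elegant way to switch seamlessly between two indices.
- Example: data migration (an outline of the flow):
# Assign an alias to the index
PUT /songs_v1/_alias/songs
# Try querying the data through the alias
GET /songs/_search?q=songName:are
# Create the new index
PUT /songs_v2
# Sync the data from the DB into the new index
# Repoint the alias: make songs point to songs_v2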
POST /_aliases
{
"actions": [
{
"remove": {
"index": "songs_v1",
"alias": "songs"
}
},
{
"add": {
"index":"songs_v2",
"alias": "songs"
}
}
]
}
- Using an alias so that one search queries the data in several indices:
# One query that searches the contents of several indices
POST /_aliases
{
"actions": [
{
"add": {
"index": "index_v1",
"alias": "myindex"
}
},
{
"add": {
"index": "index_v2",
"alias": "myindex"
}
}
]
}
GET /myindex/_search?q=queryParam:params
- Using an alias as a filtered view
# Using an alias as a filtered view
POST /_aliases
{
"actions": [
{
"add": {
"index": "songs_v1",
"alias": "songs_james",
"filter": {
"match": {
"singer": "James"
}
}
}
}
]
}
GET /songs_james/_search
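The alias behaviour used in this section can be pictured with a small in-memory model. This is only a sketch under my own assumptions: the class `AliasSketch` and its methods are made up for illustration and have nothing to do with how ES stores aliases internally. It shows the two ideas that matter here: an alias resolves to one or more indices, and the remove/add pair is applied as one step, so a reader never sees the alias pointing at nothing.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model of index aliases (hypothetical names, not an ES API).
class AliasSketch {
    private final Map<String, Set<String>> aliases = new HashMap<>();

    // Point the alias at one more index, like a single "add" action.
    synchronized void add(String alias, String index) {
        Set<String> targets = new LinkedHashSet<>(resolve(alias));
        targets.add(index);
        aliases.put(alias, targets);
    }

    // Apply "remove" and "add" together, like one POST /_aliases request.
    synchronized void swap(String alias, String removeIndex, String addIndex) {
        Set<String> targets = new LinkedHashSet<>(resolve(alias));
        targets.remove(removeIndex);
        targets.add(addIndex);
        aliases.put(alias, targets);
    }

    // The indices a search against the alias would hit.
    synchronized Set<String> resolve(String alias) {
        Set<String> targets = aliases.get(alias);
        return targets == null ? Collections.<String>emptySet() : targets;
    }

    public static void main(String[] args) {
        AliasSketch es = new AliasSketch();
        es.add("songs", "songs_v1");
        es.swap("songs", "songs_v1", "songs_v2");
        System.out.println(es.resolve("songs"));

        es.add("myindex", "index_v1");
        es.add("myindex", "index_v2");
        System.out.println(es.resolve("myindex"));
    }
}
```

The application keeps querying `songs` the whole time; only the mapping behind the name changes.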
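10. Distributed sharding and redundant replicas
- Distributed sharding has two benefits:
1. A large data set can be split into smaller pieces for storage, which solves the problem of insufficient storage space
2. It makes searches more efficient
- The benefit of redundant replicas:
They solve the problem of the whole service becoming unavailable when a single machine goes down.
- Some examples of index-related operations:
#### Create and delete an index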
PUT /songs_v3
DELETE /songs_v3
# Create an index, specifying the number of shards and replicas in settings
PUT /songs_v4
{
"settings": {
"number_of_shards": 5,
"number_of_replicas": 1
}
}
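# Get the index settings
GET /songs_v4/_settings
# Modify the index settings
# Index settings fall into two categories:
# static: cannot be changed while the index is open, e.g. number_of_shards (fixed when the index is created) and index.shard.check_on_startup
# dynamic: can be changed while the index is operating normally, e.g. number_of_replicas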
PUT /songs_v4/_settings
{
"number_of_replicas": 2
}
# The setting below fails while the index is open; close the index first, then run it
PUT /songs_v4/_settings
{
"index.shard.check_on_startup":true
}
# Close the index
POST /songs_v4/_close
# Open the index
POST /songs_v4/_open
# Get the mapping types of an index
GET /songs_v1/_mappings
GET /songs_v4/_mappings
------------------------------------------------------
# Delete a mapping type (this fails, because a type cannot be deleted)
DELETE /songs_v1/_mappings/product
# Result of the request above
{
"error": "Incorrect HTTP method for uri [/songs_v1/_mappings/product?pretty] and method [DELETE], allowed: [GET, PUT, POST]",
"status": 405
}
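- Summary:
index operations: create/delete an index, open/close an index, add/view mappings, set/view settings
- document operations:
index/get/update/delete a document, search documents, run scripts
# How to index a document
# Specify the document ID explicitly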
PUT /songs_v4/popular/5
{
"songName":"could this be love",
"singer":"Victoria Acosta",
"lyrics":"Could This Be love, Woke Up This Morning Just Sat In My Bed, 8 a.m. First Thing In My Head, Is A Certain Someone Who's Always On My Mind, He Treats Me Like A Lady In Every Way, He Smiles And Warms Me Through Up The Day, Should I Tell Him I Love You Wish I Knew What To Say, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, The Way He Looks So Deep In My Eyes, Our Hearts Are So Warm I Just Wanna Cry, Then He's So Hardworking He Wants To Be Someone, Should I Tell Him That I Love You, What If He Doesn't Say It Too, I'm Feeling So Nervous What Should I Do, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Will This Be My Turn, Two Hearts Beating Together As One, No More Loneliness, Only Love Laughter And Fun, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel"
}
# Let ES generate a random document ID
POST /songs_v4/popular
{
"songName":"could this be love",
"singer":"Victoria Acosta",
"lyrics":"Could This Be love, Woke Up This Morning Just Sat In My Bed, 8 a.m. First Thing In My Head, Is A Certain Someone Who's Always On My Mind, He Treats Me Like A Lady In Every Way, He Smiles And Warms Me Through Up The Day, Should I Tell Him I Love You Wish I Knew What To Say, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, The Way He Looks So Deep In My Eyes, Our Hearts Are So Warm I Just Wanna Cry, Then He's So Hardworking He Wants To Be Someone, Should I Tell Him That I Love You, What If He Doesn't Say It Too, I'm Feeling So Nervous What Should I Do, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Will This Be My Turn, Two Hearts Beating Together As One, No More Loneliness, Only Love Laughter And Fun, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel"
}
# A document can be updated by ID
PUT /songs_v4/popular/5
{
"songName":"could this be love",
"singer":"James",
"lyrics":"Could This Be love, Woke Up This Morning Just Sat In My Bed, 8 a.m. First Thing In My Head, Is A Certain Someone Who's Always On My Mind, He Treats Me Like A Lady In Every Way, He Smiles And Warms Me Through Up The Day, Should I Tell Him I Love You Wish I Knew What To Say, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, The Way He Looks So Deep In My Eyes, Our Hearts Are So Warm I Just Wanna Cry, Then He's So Hardworking He Wants To Be Someone, Should I Tell Him That I Love You, What If He Doesn't Say It Too, I'm Feeling So Nervous What Should I Do, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Will This Be My Turn, Two Hearts Beating Together As One, No More Loneliness, Only Love Laughter And Fun, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel, So Strong So Deep And So Real, If I Lost You Would I Ever Heal, Could This Be Love That I Feel, Could This Be Love That I Feel"
}
# Get a document by ID
GET /songs_v4/popular/5
# Delete a document by ID
DELETE /songs_v4/popular/5
# Search for documents
GET /songs_v4/_search?q=singer:Victoria
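The sharding idea from the start of this section can be sketched in a few lines of Java. This is a simplified model, not the ES implementation: ES picks the shard from a Murmur3 hash of the routing value (the document ID by default) modulo the number of primary shards, whereas the sketch substitutes `String.hashCode()`. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of document-to-shard routing (hypothetical names).
class ShardRoutingSketch {

    // Pick a primary shard for a document ID.
    // Assumption: String.hashCode() stands in for the Murmur3 hash ES really uses.
    static int shardFor(String docId, int numberOfShards) {
        return Math.floorMod(docId.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        int numberOfShards = 5; // same value as number_of_shards for songs_v4
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < numberOfShards; i++) {
            shards.add(new ArrayList<String>());
        }
        // Spread a handful of document IDs over the shards.
        for (String id : new String[] {"1", "2", "3", "4", "5", "6", "7"}) {
            shards.get(shardFor(id, numberOfShards)).add(id);
        }
        for (int i = 0; i < numberOfShards; i++) {
            System.out.println("shard " + i + " -> " + shards.get(i));
        }
        // The same ID can land on a different shard once the shard count changes.
        System.out.println(shardFor("5", 5) + " vs " + shardFor("5", 3));
    }
}
```

Because the shard is derived from the shard count, changing the count would send lookups for existing documents to the wrong shard. That is one way to see why number_of_shards is a static setting while number_of_replicas can be changed at any time.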
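11. Mappings in detail
- What is a mapping?
A mapping defines structural information such as which fields an index has and what their types are. It is comparable to a table definition in a database, or to the schema in Solr. It is needed because Lucene has to know how to index and store a document's fields when it indexes the document.
ES supports two kinds of mapping: static and dynamic.
The dynamic parameter controls the dynamic behaviour of a mapping. It accepts the following options: true (the default; new fields are added to the mapping), false (new fields are ignored: they stay in _source but are not indexed), and strict (a document containing an unknown field is rejected with an error).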
##################################################
PUT /books
PUT /books/_mapping/science
{
"properties": {
"name": {"type": "text"},
"author": {"type": "text"}
}
}
# View all the properties
GET /books/_mappings
# Running this automatically adds a new property => content
POST /books/science
{
"content": "this is the content of the book, and this properties will auto add into de science"
}
# Similar to the above, but this value is detected as type boolean
POST /books/science
{
"for_child": true
}
# Similar again, but this string is detected as a date, so the field is mapped as type date
POST /books/science
{
"publish_date": "2022-11-10"
}
# However, this can cause errors in some cases, for example when we intended to store the field as a plain string all along:
POST /books/science
{
"publish_date": "this is a string"
}
# This is the error message
#{
# "error": {
# "root_cause": [
# {
# "type": "mapper_parsing_exception",
# "reason": "failed to parse field [publish_date] of type [date] in document with id '_vAoYYQBSpIcVAlSfUl9'"
# }
# ],
# "type": "mapper_parsing_exception",
# "reason": "failed to parse field [publish_date] of type [date] in document with id '_vAoYYQBSpIcVAlSfUl9'",
# "caused_by": {
# "type": "illegal_argument_exception",
# "reason": "Invalid format: \"this is a string\""
# }
# },
# "status": 400
#}
# Turn off automatic date detection
PUT /books/_mapping/science
{
"date_detection": false
}
# Now the value is no longer detected as a date
POST /books/science
{
"publish_date_new": "2022-11-10"
}
###########################
PUT /teachers
PUT /teachers/_mappings/math
{
"dynamic":"strict",
"properties": {
"name": {"type": "text"},
"gender":{"type": "boolean"}
}
}
# This fails, because "dynamic" is set to "strict"
POST /teachers/math
{
"age":27
}
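The behaviour seen in the requests of this section (type detection, date detection, and the strict setting) can be mimicked with a toy model. Everything in it is an assumption made for illustration: the class `DynamicMappingSketch` is not an ES API, a plain `yyyy-MM-dd` check stands in for the date formats ES really tries, and only a few value types are handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of dynamic mapping (hypothetical names; not how ES is implemented).
class DynamicMappingSketch {
    enum Dynamic { TRUE, FALSE, STRICT }

    final Map<String, String> properties = new LinkedHashMap<>();
    Dynamic dynamic = Dynamic.TRUE;
    boolean dateDetection = true;

    // Assumption: a simple yyyy-MM-dd pattern stands in for ES's date formats.
    static boolean looksLikeDate(Object value) {
        return value instanceof String && ((String) value).matches("\\d{4}-\\d{2}-\\d{2}");
    }

    // Guess a field type from the first value seen for that field.
    String detectType(Object value) {
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Integer || value instanceof Long) return "long";
        if (dateDetection && looksLikeDate(value)) return "date";
        return "text";
    }

    void index(Map<String, Object> doc) {
        for (Map.Entry<String, Object> field : doc.entrySet()) {
            String mapped = properties.get(field.getKey());
            if (mapped == null) {
                if (dynamic == Dynamic.STRICT) {
                    throw new IllegalArgumentException("strict mapping, unknown field: " + field.getKey());
                }
                if (dynamic == Dynamic.TRUE) {
                    properties.put(field.getKey(), detectType(field.getValue()));
                }
                // Dynamic.FALSE: the field is not added to the mapping.
            } else if (mapped.equals("date") && !looksLikeDate(field.getValue())) {
                throw new IllegalArgumentException("failed to parse field [" + field.getKey() + "] of type [date]");
            }
        }
    }

    public static void main(String[] args) {
        DynamicMappingSketch books = new DynamicMappingSketch();
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("for_child", true);
        doc.put("publish_date", "2022-11-10");
        books.index(doc);
        System.out.println(books.properties);

        DynamicMappingSketch teachers = new DynamicMappingSketch();
        teachers.dynamic = Dynamic.STRICT;
        Map<String, Object> teacher = new LinkedHashMap<>();
        teacher.put("age", 27);
        try {
            teachers.index(teacher);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The key point is that the first value seen for a field fixes its type, which is why a later "this is a string" value for publish_date is rejected.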
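12. Mappings in detail: field types
The field type defines how a field's values are indexed; ES provides a rich set of field types.
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/mapping-types.html
The difference between the text and keyword types
text fields are analyzed (split into terms); keyword fields are not analyzed
For everything else, see the official documentation.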
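A minimal Java sketch of that difference, under stated assumptions: splitting on non-alphanumeric characters and lowercasing only approximates what the default analyzer does to a text field, and the class and method names are invented for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrates which terms end up in the index for text vs keyword (hypothetical names).
class TextVsKeywordSketch {

    // text: the value is analyzed into lowercase terms.
    // Assumption: this split only approximates the default analyzer.
    static List<String> textTerms(String value) {
        return Arrays.asList(value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"));
    }

    // keyword: the whole value is kept as one single term.
    static List<String> keywordTerms(String value) {
        return Arrays.asList(value);
    }

    public static void main(String[] args) {
        String singer = "James Blunt";
        // A search for the term "james" matches the text field...
        System.out.println(textTerms(singer).contains("james"));
        // ...but not the keyword field, which only matches the exact value.
        System.out.println(keywordTerms(singer).contains("james"));
        System.out.println(keywordTerms(singer).contains("James Blunt"));
    }
}
```

This is why `q=singer:james` finds the James Blunt document when singer is mapped as text.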
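13. Mappings in detail: mapping parameters
A field's type (datatype) defines how the field's values are indexed and stored; further parameters let us override the defaults or add special definitions as needed. See the official documentation for details:
Current version: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-params.html
The earlier 6.5 version: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/mapping-params.html
The content usually differs between versions, so consult the documentation for the specific version you are using.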
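14. Mappings in detail: analyzers
When the value of an analyzed string field is passed to an analyzer, it is converted into a series of terms. ES ships with a set of built-in analyzers:
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/analysis-analyzers.html (this is the 6.5 version; for other versions see the official documentation)
For Chinese text you generally need to configure a custom analyzer.
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/analysis-custom-analyzer.html
Analyzer: in ES an analyzer is composed of the following three kinds of components:
- character filter: processes the text at the character level, for example handling HTML tag characters, and then hands the result to the tokenizer. An analyzer may contain zero or more character filters; several are applied in the configured order.
- tokenizer: splits the text into tokens. An analyzer must contain exactly one tokenizer.
- token filter: processes the tokens produced by the tokenizer, for example lowercasing, stop-word removal, and synonym handling. An analyzer may contain zero or more token filters, applied in the configured order.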
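The three stages can be sketched as plain Java functions chained in that order. This is a sketch under my own assumptions, not ES code: the regex-based HTML stripping, the whitespace tokenizer, and the tiny stop-word list are stand-ins chosen for brevity, and `AnalyzerSketch` is a made-up name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy analyzer: character filter -> tokenizer -> token filters (hypothetical names).
class AnalyzerSketch {
    // Assumption: a tiny stop-word list, just for the demo.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("a", "an", "the", "is", "to"));

    // character filter: replace HTML tags with a space before tokenizing
    static String stripHtml(String text) {
        return text.replaceAll("<[^>]+>", " ");
    }

    // tokenizer: split the text on whitespace
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // token filters: lowercase each token, then drop stop words
    static List<String> filter(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                out.add(lower);
            }
        }
        return out;
    }

    static List<String> analyze(String text) {
        return filter(tokenize(stripHtml(text)));
    }

    public static void main(String[] args) {
        // prints [take, me, your, heart]
        System.out.println(analyze("<b>Take</b> me to your Heart"));
    }
}
```

Swapping the tokenizer is what a Chinese analyzer changes: whitespace splitting does not work for Chinese text, so a custom analyzer plugs in a tokenizer that knows where the words are.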
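15. Mappings in detail: multi-fields
When a field needs to be indexed in several different ways, use the fields parameter to define multi-fields. For example, a string field may need to be indexed as analyzed text and also as a keyword to support sorting and aggregations, or it may need to be indexed with several different analyzers.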
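16. doc values, fielddata, index
- doc_values: most fields are indexed in an inverted index and can therefore be searched, but sorting, aggregations, and scripts need a forward index (document to value), which doc_values provide.
- fielddata: most fields can use doc_values for sorting, aggregations, and scripts, but doc_values do not support text fields; text fields use the fielddata mechanism instead.
- index: the doc_values parameter controls whether a field gets a forward index, and the index parameter controls whether it gets an inverted index.
For example, in a query like the following, the where condition is the kind of lookup an inverted index serves, while group by needs the forward index:
select count(*),age from bank where genders='F' group by age;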
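To make the two directions concrete, here is a small Java sketch that answers the SQL statement above with two maps: an inverted index (term to document IDs) for the where part and a forward index (document ID to value) for the group by part. The data, the class `DocValuesSketch`, and its methods are invented for illustration; real doc_values are an on-disk columnar structure, not a HashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Inverted index vs forward index, in miniature (hypothetical names and data).
class DocValuesSketch {
    // inverted index on gender: term -> IDs of the documents containing it
    final Map<String, List<Integer>> genderInverted = new HashMap<>();
    // forward index on age: document ID -> value (the role doc_values play)
    final Map<Integer, Integer> ageDocValues = new HashMap<>();

    void index(int docId, String gender, int age) {
        genderInverted.computeIfAbsent(gender, g -> new ArrayList<>()).add(docId);
        ageDocValues.put(docId, age);
    }

    // select count(*), age from bank where genders = ? group by age
    Map<Integer, Integer> countByAge(String gender) {
        Map<Integer, Integer> counts = new TreeMap<>();
        List<Integer> ids = genderInverted.get(gender); // search: inverted index
        if (ids == null) {
            return counts;
        }
        for (int docId : ids) {
            // aggregation: look up each matching document's value in the forward index
            counts.merge(ageDocValues.get(docId), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        DocValuesSketch bank = new DocValuesSketch();
        bank.index(1, "F", 30);
        bank.index(2, "M", 30);
        bank.index(3, "F", 30);
        bank.index(4, "F", 25);
        // prints {25=1, 30=2}
        System.out.println(bank.countByAge("F"));
    }
}
```

Without the forward index, the aggregation would have to scan every term of the age field to find out which value each matching document holds.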
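17. store
By default _source stores all the fields of a document; when a field's store attribute is set to true, ES additionally stores a separate copy of that field.
18. Mappings in detail: meta-fields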
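18. Syncing DB data to ES
Commonly discussed tools for syncing data from a DB to ES include logstash-input-jdbc, go-mysql-elasticsearch, and elasticsearch-jdbc; we use logstash-input-jdbc for the data migration.
18.1 Installing logstash-input-jdbc
- Problems may occur if Ruby is not installed, so make sure Ruby is installed first; if it is not, install it yourself
These are the commands for CentOS
yum install ruby -y
- Download the matching version of Logstash from:
https://www.elastic.co/cn/downloads/past-releases#logstash
- Then extract it
tar -zxvf logstash-xxxxxx.tar.gz
- Test it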
[es@192 bin]$ ./logstash -e 'input { stdin {} } output { stdout {} }'
Sending Logstash logs to /home/es/es6/logstash-6.8.23/logs which is now configured via log4j2.properties
[2022-11-10T08:53:45,360][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"/home/es/es6/logstash-6.8.23/data/queue"}
[2022-11-10T08:53:45,390][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/home/es/es6/logstash-6.8.23/data/dead_letter_queue"}
[2022-11-10T08:53:46,688][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2022-11-10T08:53:46,742][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.8.23"}
[2022-11-10T08:53:46,804][INFO ][logstash.agent ] No persistent UUID file found. Generating new UUID {:uuid=>"0c0bbbcc-39d8-46e6-87c0-8ea6b34d1e4e", :path=>"/home/es/es6/logstash-6.8.23/data/uuid"}
[2022-11-10T08:54:17,776][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2022-11-10T08:54:18,171][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x6ea21823 run>"}
The stdin plugin is now waiting for input:
[2022-11-10T08:54:18,397][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2022-11-10T08:54:20,144][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
HelloWorld
/home/es/es6/logstash-6.8.23/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
"message" => "HelloWorld",
"host" => "192.168.1.6",
"@version" => "1",
"@timestamp" => 2022-11-10T13:55:48.106Z
}
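Remember to stop it when you are done
- Edit Gemfile and Gemfile.lock, mainly to replace the source URL with a mirror in China, and remember to back them up first. (In practice, leaving them unchanged does not seem to cause any problems, and changing them can easily lead to puzzling errors, so I do not recommend changing them if you can avoid it.)
That is, change
https://rubygems.org
to
https://gems.ruby-china.com/
- Install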
[es@192 bin]$ ./logstash-plugin install logstash-input-jdbc
Validating logstash-input-jdbc
Installing logstash-input-jdbc
Installation successful
- MySQL setup
# Create the database
mysql> create database music;
Query OK, 1 row affected (0.10 sec)
# Relax password validation; only needed if you want a very simple password, otherwise skip this
mysql> set global validate_password_policy=0;
Query OK, 0 rows affected (0.00 sec)
mysql> set global validate_password_length=1;
Query OK, 0 rows affected (0.10 sec)
# Create the account and grant privileges
mysql> grant all privileges on music.* to 'music'@'%' identified by 'music';
Query OK, 0 rows affected, 1 warning (0.02 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
# Create the table
mysql> use music;
Database changed
mysql> create table songs(
-> song_id int primary key auto_increment,
-> song_name varchar(18),
-> singer varchar(18),
-> lyrics varchar(3000),
-> create_time DATETIME
-> );
Query OK, 0 rows affected (0.61 sec)
# The table creation statement
create table songs(
song_id int primary key auto_increment,
song_name varchar(18),
singer varchar(18),
lyrics varchar(3000),
create_time DATETIME
);
- Create the configuration file
mkdir jdbc_conf
cd jdbc_conf
vi jdbc.conf
jdbc.conf
input {
stdin {
}
jdbc {
jdbc_connection_string => "jdbc:mysql://192.168.1.6:3306/music?serverTimezone=GMT%2B8&useUnicode=true&characterEncoding=utf-8&useSSL=false"
jdbc_user => "music"
jdbc_password => "music"
jdbc_driver_library => "/root/mysql-connector-java-5.1.48.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_paging_enabled => "true"
jdbc_page_size => "50000"
statement =>"select song_id, song_name, singer, lyrics, create_time from music.songs where create_time >= :sql_last_value"
schedule => "* * * * *"
type => "jdbc"
lowercase_column_names => "false"
}
}
filter {}
output {
elasticsearch {
hosts => ["192.168.1.6:9200"]
index => "songs_v11"
document_id => "%{song_id}"
template_overwrite => true
}
stdout {
codec => json_lines
}
}
Pay particular attention to
jdbc_connection_string => "jdbc:mysql://192.168.1.6:3306/music?serverTimezone=GMT%2B8&useUnicode=true&characterEncoding=utf-8&useSSL=false"
and its useSSL=false parameter. Without useSSL=false, the sync attempt runs into problems such as the following:
2018 error:
2018 2018
2018 Sequel::DatabaseConnectionError
2018 Java::ComMysqlJdbcExceptionsJdbc4::CommunicationsException: Communications link failure
- Start
bin/logstash -f jdbc_conf/jdbc.conf
Then run the following statement in MySQL
insert into music.songs(song_name, singer, lyrics, create_time) values('take me', 'James Blunt', 'fasdfasfasdfasdfasdfasdfas', now());
Now look at the Logstash output again:
[root@192 logstash-6.8.23]# bin/logstash -f jdbc_conf/jdbc.conf
Sending Logstash logs to /home/es/es6/logstash-6.8.23/logs which is now configured via log4j2.properties
[2022-11-10T10:31:53,096][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2022-11-10T10:31:53,137][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.8.23"}
[2022-11-10T10:32:09,558][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2022-11-10T10:32:10,927][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://192.168.1.6:9200/]}}
[2022-11-10T10:32:11,495][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://192.168.1.6:9200/"}
[2022-11-10T10:32:11,617][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2022-11-10T10:32:11,637][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2022-11-10T10:32:11,723][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//192.168.1.6:9200"]}
[2022-11-10T10:32:11,782][INFO ][logstash.outputs.elasticsearch] Using default mapping template
[2022-11-10T10:32:11,856][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2022-11-10T10:32:11,964][INFO ][logstash.outputs.elasticsearch] Installing elasticsearch template to _template/logstash
The stdin plugin is now waiting for input:
[2022-11-10T10:32:12,365][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x47c02d3d run>"}
[2022-11-10T10:32:12,529][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2022-11-10T10:32:13,867][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
/home/es/es6/logstash-6.8.23/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/cronline.rb:77: warning: constant ::Fixnum is deprecated
[2022-11-10T10:33:03,909][INFO ][logstash.inputs.jdbc ] (0.021284s) SELECT version()
[2022-11-10T10:33:04,018][INFO ][logstash.inputs.jdbc ] (0.011402s) SELECT version()
[2022-11-10T10:33:04,466][INFO ][logstash.inputs.jdbc ] (0.037989s) SELECT count(*) AS `count` FROM (select song_id, song_name, singer, lyrics, create_time from music.songs where create_time >= '1970-01-01 00:00:00') AS `t1` LIMIT 1
[2022-11-10T10:33:04,579][INFO ][logstash.inputs.jdbc ] (0.023222s) SELECT * FROM (select song_id, song_name, singer, lyrics, create_time from music.songs where create_time >= '1970-01-01 00:00:00') AS `t1` LIMIT 50000 OFFSET 0
{"song_id":1,"type":"jdbc","singer":"James Blunt","song_name":"take me","lyrics":"fasdfasfasdfasdfasdfasdfas","@version":"1","@timestamp":"2022-11-10T15:33:04.759Z","create_time":"2022-11-10T14:36:34.000Z"}
{"song_id":2,"type":"jdbc","singer":"James Blunt","song_name":"take me","lyrics":"fasdfasfasdfasdfasdfasdfas","@version":"1","@timestamp":"2022-11-10T15:33:04.813Z","create_time":"2022-11-10T14:53:15.000Z"}
{"song_id":3,"type":"jdbc","singer":"James Blunt","song_name":"take me","lyrics":"fasdfasfasdfasdfasdfasdfas","@version":"1","@timestamp":"2022-11-10T15:33:04.815Z","create_time":"2022-11-10T14:53:17.000Z"}
{"song_id":4,"type":"jdbc","singer":"James Blunt","song_name":"take me","lyrics":"fasdfasfasdfasdfasdfasdfas","@version":"1","@timestamp":"2022-11-10T15:33:04.816Z","create_time":"2022-11-10T14:53:17.000Z"}
{"song_id":5,"type":"jdbc","singer":"James Blunt","song_name":"take me","lyrics":"fasdfasfasdfasdfasdfasdfas","@version":"1","@timestamp":"2022-11-10T15:33:04.817Z","create_time":"2022-11-10T14:53:18.000Z"}
{"song_id":6,"type":"jdbc","singer":"James Blunt","song_name":"take me","lyrics":"fasdfasfasdfasdfasdfasdfas","@version":"1","@timestamp":"2022-11-10T15:33:04.829Z","create_time":"2022-11-10T14:53:18.000Z"}
{"song_id":7,"type":"jdbc","singer":"James Blunt","song_name":"take me","lyrics":"fasdfasfasdfasdfasdfasdfas","@version":"1","@timestamp":"2022-11-10T15:33:04.830Z","create_time":"2022-11-10T14:53:19.000Z"}
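The jdbc.conf above combines two ideas: each scheduled run selects only rows with create_time >= :sql_last_value, and each row is written with document_id => "%{song_id}", so a row that is fetched twice overwrites itself instead of creating a duplicate. The Java sketch below models just that logic in memory. It is an illustration under my own assumptions, not Logstash code: the class `IncrementalSyncSketch`, its fields, and the use of plain long timestamps are all invented, and the tracked value is simply set to the time of the run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model of incremental DB-to-index sync (hypothetical names).
class IncrementalSyncSketch {
    static class Song {
        final int songId;
        final String songName;
        final long createTime;

        Song(int songId, String songName, long createTime) {
            this.songId = songId;
            this.songName = songName;
            this.createTime = createTime;
        }
    }

    final List<Song> table = new ArrayList<>();      // stands in for music.songs
    final Map<Integer, Song> index = new TreeMap<>(); // stands in for the ES index, keyed by document ID
    long sqlLastValue = 0L;                           // plays the role of :sql_last_value

    // One scheduled run: fetch rows with createTime >= sqlLastValue and upsert them.
    int runOnce(long now) {
        int fetched = 0;
        for (Song row : table) {
            if (row.createTime >= sqlLastValue) {
                index.put(row.songId, row); // same ID overwrites, so re-fetched rows do not duplicate
                fetched++;
            }
        }
        sqlLastValue = now; // assumption: the tracked value becomes the time of this run
        return fetched;
    }

    public static void main(String[] args) {
        IncrementalSyncSketch sync = new IncrementalSyncSketch();
        sync.table.add(new Song(1, "take me", 100L));
        System.out.println(sync.runOnce(200L)); // first run fetches the existing row
        sync.table.add(new Song(2, "take me", 250L));
        System.out.println(sync.runOnce(300L)); // second run fetches only the new row
        System.out.println(sync.index.size());
    }
}
```

Keying the index by song_id is what makes the >= comparison safe: a row sitting exactly on the boundary may be read twice, but it still ends up as one document.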