Importing a MySQL Data Source with the Elasticsearch JDBC Plugin

Author: lookphp | Published: 2018-01-31 17:00

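To make testing easier, we first create a test table named article in MySQL and add a few rows. Note that the update_time column is declared with ON UPDATE CURRENT_TIMESTAMP, so MySQL refreshes it whenever a row changes.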

1. Create the test data in MySQL:

DROP TABLE IF EXISTS `article`;
CREATE TABLE `article` (
  `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `subject` varchar(150) NOT NULL,
  `author` varchar(15) DEFAULT NULL,
  `create_time` timestamp NULL DEFAULT NULL,
  `update_time` timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8;
# Data
INSERT INTO `article` VALUES ('1', '"闺蜜"崔顺实被韩检方传唤 韩总统府促彻查真相', 'jam', '2016-10-31 17:49:21', '2016-10-31 17:50:21');
INSERT INTO `article` VALUES ('2', '韩举行"护国训练" 青瓦台:决不许国家安全出问题', 'jam00', '2016-10-31 17:50:39', '2016-10-31 17:50:51');
INSERT INTO `article` VALUES ('3', '媒体称FBI已经取得搜查令 检视希拉里电邮', 'tomi', '2016-10-31 17:51:03', '2016-10-31 17:51:08');
INSERT INTO `article` VALUES ('4', '村上春树获安徒生奖 演讲中谈及欧洲排外问题', 'jason', '2016-10-31 17:51:38', '2016-10-31 17:51:41');
INSERT INTO `article` VALUES ('5', '希拉里团队炮轰FBI 参院民主党领袖批其“违法”', 'tommy', '2016-10-31 17:52:07', '2016-10-31 17:52:09');

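2. Write the import script

We start with a full import of all rows (note: ES runs with its default configuration).
We write a bash script named mysql-article.sh and save it as /usr/local/elasticsearch-2.2.0/jdbc2.2/bin/mysql-article.sh, with the following contents: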

# Run it with:
/usr/local/elasticsearch-2.2.0/jdbc2.2/bin/mysql-article.sh
# File contents:
#!/bin/bash
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
bin=${DIR}/../bin
lib=${DIR}/../lib

echo '
{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://localhost:3306/test",
        "user" : "root",
        "password" : "123456",
        "sql" : "select *, id as _id from article",
        "index" : "jdbctest",
        "type" : "article",
        "index_settings" : {
            "analysis" : {
                "analyzer" : {
                    "ik" : {
                        "tokenizer" : "ik"
                    }
                }
            }
        },
        "type_mapping": {
            "article" : {
                "properties" : {
                    "id" : {
                        "type" : "integer",
                        "index" : "not_analyzed"
                    },
                    "subject" : {
                        "type" : "string",
                        "analyzer" : "ik"
                    },
                    "author" : {
                        "type" : "string",
                        "analyzer" : "ik"
                    },
                    "create_time" : {
                        "type" : "date"
                    },
                    "update_time" : {
                        "type" : "date"
                    }
                }
            }
        }
    }
}
' | java \
    -cp "${lib}/*" \
    -Dlog4j.configurationFile="${bin}/log4j2.xml" \
    org.xbib.tools.Runner \
    org.xbib.tools.JDBCImporter
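
The script resolves lib and bin relative to its own location, so it only works when it sits inside the importer's bin directory. The following is a minimal preflight sketch, not part of the JDBC importer itself: the function name check_importer_layout is hypothetical, and it only checks the two things the script above relies on (jar files under lib/ and bin/log4j2.xml).

```shell
#!/bin/bash
# Hypothetical helper, not shipped with the JDBC importer.
# Returns 0 when the directory layout that mysql-article.sh relies on
# exists under the given base directory (the parent of bin/ and lib/).
check_importer_layout() {
    base="$1"
    status=0
    # The script passes "${lib}/*" as the Java classpath, so lib/ must contain jars.
    if ! ls "${base}"/lib/*.jar >/dev/null 2>&1; then
        echo "no jar files found in ${base}/lib" >&2
        status=1
    fi
    # The script points log4j at bin/log4j2.xml.
    if [ ! -f "${base}/bin/log4j2.xml" ]; then
        echo "missing ${base}/bin/log4j2.xml" >&2
        status=1
    fi
    return "${status}"
}
```

With the paths used in this article, it could be called as check_importer_layout /usr/local/elasticsearch-2.2.0/jdbc2.2 before launching the import script.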

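Running the script automatically creates the jdbctest index (if it does not already exist), the article type, and the corresponding fields. Because the data contains Chinese text, I use the ik analyzer, which must already be installed in Elasticsearch as a plugin.
If the import fails, check the log file; the JDBC importer writes its log to /usr/local/elasticsearch-2.2.0/logs/jdbc.log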
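
When the import fails, the log can be long. The sketch below pulls out only the lines that look like failures. The function name show_import_errors is hypothetical, and the ERROR and Exception patterns are an assumption about what the log4j2 output contains, so adjust them to match the actual log.

```shell
#!/bin/bash
# Hypothetical helper: print recent lines of an importer log that look like failures.
# The default path is the jdbc.log location mentioned in this article;
# the ERROR / Exception patterns are assumptions about the log format.
show_import_errors() {
    logfile="${1:-/usr/local/elasticsearch-2.2.0/logs/jdbc.log}"
    if [ ! -r "${logfile}" ]; then
        echo "cannot read ${logfile}" >&2
        return 2
    fi
    # -n prefixes each match with its line number; tail keeps the 20 most recent matches.
    grep -n -E 'ERROR|Exception' "${logfile}" | tail -n 20
}
```

Called without arguments it reads the default log path; pass another path as the first argument to inspect a different file.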

To check whether the import succeeded, run the following command:

curl -XGET 'http://localhost:9200/jdbctest/article/_search?pretty'
# Response
{
  "took" : 33,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "jdbctest",
      "_type" : "article",
      "_id" : "5",
      "_score" : 1.0,
      "_source" : {
        "id" : 5,
        "subject" : "希拉里团队炮轰FBI 参院民主党领袖批其“违法”",
        "author" : "tommy",
        "create_time" : "2016-10-31T17:52:07.000+08:00",
        "update_time" : "2016-10-31T17:52:09.000+08:00"
      }
    }, {
      "_index" : "jdbctest",
      "_type" : "article",
      "_id" : "2",
      "_score" : 1.0,
      "_source" : {
        "id" : 2,
        "subject" : "韩举行\"护国训练\" 青瓦台:决不许国家安全出问题",
        "author" : "jam00",
        "create_time" : "2016-10-31T17:50:39.000+08:00",
        "update_time" : "2016-10-31T17:50:51.000+08:00"
      }
    }, {
      "_index" : "jdbctest",
      "_type" : "article",
      "_id" : "4",
      "_score" : 1.0,
      "_source" : {
        "id" : 4,
        "subject" : "村上春树获安徒生奖 演讲中谈及欧洲排外问题",
        "author" : "jason",
        "create_time" : "2016-10-31T17:51:38.000+08:00",
        "update_time" : "2016-10-31T17:51:41.000+08:00"
      }
    }, {
      "_index" : "jdbctest",
      "_type" : "article",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : {
        "id" : 1,
        "subject" : "\"闺蜜\"崔顺实被韩检方传唤 韩总统府促彻查真相",
        "author" : "jam",
        "create_time" : "2016-10-31T17:49:21.000+08:00",
        "update_time" : "2016-10-31T17:50:21.000+08:00"
      }
    }, {
      "_index" : "jdbctest",
      "_type" : "article",
      "_id" : "3",
      "_score" : 1.0,
      "_source" : {
        "id" : 3,
        "subject" : "媒体称FBI已经取得搜查令 检视希拉里电邮",
        "author" : "tomi",
        "create_time" : "2016-10-31T17:51:03.000+08:00",
        "update_time" : "2016-10-31T17:51:08.000+08:00"
      }
    } ]
  }
}

If the response looks like the one above, the data has been imported into Elasticsearch successfully.
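
Reading the full search response by eye works for five rows, but a scripted check only needs the document count. The sketch below uses the Elasticsearch _count endpoint instead of _search. The function names parse_count and check_import_count are hypothetical, the host and port are assumed to match the search command above, and the parsing is plain text matching rather than a real JSON parser.

```shell
#!/bin/bash
# Extract the numeric "count" field from a _count API response read on stdin.
# Text-based matching; adequate for this flat response, not a general JSON parser.
parse_count() {
    grep -o '"count" *: *[0-9][0-9]*' | grep -o '[0-9][0-9]*$'
}

# Hypothetical wrapper: succeeds only if jdbctest/article holds the expected
# number of documents. Assumes Elasticsearch listens on localhost:9200.
check_import_count() {
    expected="$1"
    actual="$(curl -s -XGET 'http://localhost:9200/jdbctest/article/_count' | parse_count)"
    [ "${actual}" = "${expected}" ]
}
```

For the five test rows inserted earlier, check_import_count 5 should succeed once the import has finished and the index has been refreshed.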
