Integrating Presto with Hadoop2 and Hive

Author: 在路上_Rogge | Published 2017-04-12 18:07

    I. Environment: hadoop-2.6.4, hive-2.1.1, presto-server-0.172.tar.gz, jdk1.8u121
    II. Configuration

    1. Set up the hadoop distributed cluster
    2. Set up and configure hive
    3. Installation
      Upload the archive to the install directory /usr/local/
    tar -zxvf apache-hive-2.1.1-bin.tar.gz
    mv apache-hive-2.1.1-bin hive-2.1
    
    4. Configure environment variables: edit /etc/profile
    export HIVE_HOME=/usr/local/hive-2.1
    export PATH=$HIVE_HOME/bin:$PATH


    Then run: source /etc/profile

    5. Hive configuration (under .../conf/)
      • Edit hive-env.sh and add:
    export JAVA_HOME=/usr/local/jdk
    export HADOOP_HOME=/usr/local/hadoop2
    export HIVE_HOME=/usr/local/hive-2.1
    
      • Edit the log4j file

    cp hive-log4j.properties.template hive-log4j.properties
    Change the EventCounter appender class to org.apache.hadoop.log.metrics.EventCounter:
    #log4j.appender.EventCounter=org.apache.hadoop.hive.shims.HiveEventCounter
    log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter
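This substitution is a one-line edit; as a sketch, the same change expressed in Python (shown on an in-memory sample line, not the real conf file):

```python
# The hive-log4j.properties fix above, applied to an in-memory sample line.
# In practice, edit conf/hive-log4j.properties in place instead.
sample = "log4j.appender.EventCounter=org.apache.hadoop.hive.shims.HiveEventCounter"

fixed = sample.replace(
    "org.apache.hadoop.hive.shims.HiveEventCounter",
    "org.apache.hadoop.log.metrics.EventCounter",
)
print(fixed)
```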
    
    • Configure hive-site.xml
      touch hive-site.xml
      and write:
    <configuration>
    <property>
     <name>hive.metastore.warehouse.dir</name>  <value>/usr/hivedata/warehouse</value>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
    </property>
    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://192.168.172.103:9083</value>
    </property>
    </configuration>
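A malformed hive-site.xml makes hive fail at startup with an unhelpful stack trace, so a quick well-formedness check is worth doing. A minimal sketch using Python's standard library (shown on an inline sample; in practice point ET.parse at the real conf file):

```python
import xml.etree.ElementTree as ET

# Inline sample of the file above; in practice use
# ET.parse("/usr/local/hive-2.1/conf/hive-site.xml").getroot()
doc = """<configuration>
<property><name>hive.metastore.warehouse.dir</name><value>/usr/hivedata/warehouse</value></property>
<property><name>hive.metastore.uris</name><value>thrift://192.168.172.103:9083</value></property>
</configuration>"""

root = ET.fromstring(doc)  # raises ParseError if the XML is malformed
props = {p.findtext("name"): p.findtext("value") for p in root.findall("property")}
print(props["hive.metastore.uris"])
```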
    
    6. Install mysql and set up the hive database and permissions
    yum install mysql
    service mysqld start
    Grant privileges on the hive metastore database, allowing both remote and localhost connections:
    grant all privileges on *.* to root@"%" identified by "root" with grant option;
    grant all privileges on *.* to root@"localhost" identified by "root" with grant option;
    
    7. Problems and solutions
     • If you get the error "Terminal initialization failed; falling back to unsupported":

    Fix: replace hadoop's .../hadoop-2.6.4/share/hadoop/yarn/lib/jline-0.9*.jar with the jline-2.12 jar from .../hive-2.1/lib.
     • Exception in thread "main" java.lang.RuntimeException: Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql);
    Fix: schematool -dbType mysql -initSchema
     • When inserting data, hive reports that a column exceeds the length limit; fix it in MySQL with: alter database hive character set latin1;

    III. Presto setup and configuration

    1. For the setup process, refer to the official documentation (a Chinese translation is also available).
    2. In the etc folder created under the presto root directory, the configuration files are as follows:
      2.1. config.properties
    #coordinator
    coordinator=true
    node-scheduler.include-coordinator=false
    http-server.http.port=8080
    task.max-memory=1GB
    discovery-server.enabled=true
    discovery.uri=http://192.168.172.103:8080
    #worker
    #coordinator=false
    #http-server.http.port=8080
    #task.max-memory=512m
    #discovery.uri=http://192.168.172.103:8080
    #To test on a single machine, that machine acts as both coordinator and worker:
    #coordinator=true
    #node-scheduler.include-coordinator=true
    #http-server.http.port=8080
    #task.max-memory=1GB
    #discovery-server.enabled=true
    #discovery.uri=http://example.net:8080
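Since config.properties is plain key=value lines with '#' comments, a small parser (my own helper, not part of presto) makes it easy to double-check the active settings before starting the server:

```python
# Minimal key=value parser for presto-style .properties files,
# used here to sanity-check the coordinator configuration above.
def parse_props(text):
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

conf = parse_props("""\
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://192.168.172.103:8080
""")
print(conf["discovery.uri"])
```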
    

    2.2. jvm.config

    -server
    -Xmx1G
    -XX:+UseConcMarkSweepGC
    -XX:+ExplicitGCInvokesConcurrent
    -XX:+CMSClassUnloadingEnabled
    -XX:+AggressiveOpts
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:OnOutOfMemoryError=kill -9 %p
    -XX:ReservedCodeCacheSize=150M
    

    2.3. log.properties
    com.facebook.presto=INFO
    2.4. node.properties (node.id must be unique across nodes)

    node.environment=production
    node.id=XXXX
    node.data-dir=/usr/presto/data
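Because node.id must differ on every node, generating it from a random UUID is an easy way to avoid collisions. A sketch (the helper name and default paths are my own choices, not presto's):

```python
# Render a node.properties file with a unique node.id per call.
# node.environment and node.data-dir default to the values used above.
import uuid

def render_node_properties(environment="production", data_dir="/usr/presto/data"):
    return (
        f"node.environment={environment}\n"
        f"node.id={uuid.uuid4()}\n"
        f"node.data-dir={data_dir}\n"
    )

print(render_node_properties())
```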
    

    2.5. The catalog directory holds the data source configurations: hive, kafka, relational databases, and so on. Here I configured hive.properties:

    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://192.168.172.103:9083
     # change this to the host where the hive-metastore service runs; here it is installed on node cdh1
    hive.config.resources=/usr/local/hadoop2/etc/hadoop/core-site.xml,/usr/local/hadoop2/etc/hadoop/hdfs-site.xml
    

    2.6. Set up the client: download presto-cli-0.172-executable.jar, rename it to presto-cli, and make it executable with chmod +x presto-cli. Then run:
    ./presto-cli --server 192.168.172.103:8080 --catalog hive --schema default

    3. Starting and stopping
      bin/launcher start
      You can also run it in the foreground to watch the log output:
      bin/launcher run
      To stop it:
      bin/launcher stop

    IV. Startup and testing

    1. Start hdfs
    2. Start hive: hive --service metastore, then open the hive CLI (.../hive)
    # create a table and load data
    create table t1(name string);
    load data local inpath '/usr/name.txt' into table t1;
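load data local inpath expects a plain text file with one value per line for the single-column table t1. A sketch that writes such a sample file (the names are made up, and /tmp is used here instead of the article's /usr/name.txt):

```python
# Write a sample data file for the single-column table t1 above.
# One name per line is the layout "load data local inpath" expects here.
sample_names = ["alice", "bob", "carol"]  # made-up sample data
path = "/tmp/name.txt"  # the article loads from /usr/name.txt

with open(path, "w") as f:
    f.write("\n".join(sample_names) + "\n")

with open(path) as f:
    lines = f.read().splitlines()
print(lines)
```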
    
    3. Start presto: bin/launcher run
    4. Run a test:
      ./presto-cli --server 192.168.172.103:8080 --catalog hive --schema default
      show tables;

    Original link: https://www.haomeiwen.com/subject/gedmattx.html