Spark scanning Kerberos HBase: environment configuration

Author: 阿甘骑士 | Published 2018-06-13 21:47
    After Kerberos is integrated, the code of many programs that use cluster services has to be rewritten, for example Java connecting to Impala over JDBC, Java scanning HBase tables, Java Kafka clients, and so on. Spark programs are the exception.
    The following describes what has to be prepared after the Kerberos integration so that Spark programs can run normally.
    Before going through the steps, the reader is assumed to be basically familiar with the following (not covered in detail):
    • Spark programs and the spark-submit script
    • CDH is already integrated with Kerberos and working
    • HBase itself is working
    • Spark code for reading HBase is not provided here
    Implementation
    • After the installation completes, Spark2 shows up in Cloudera Manager


      spark2.png
    • Integrate Spark2 with Kerberos


      spark集成kerberos.png
    • Restart Spark2 and deploy the client configuration

    • Add soft links: list the extra HBase (and other third-party) jars in a classpath file

    [root@node1 spark2]# vi /etc/extra-lib/hbase/classpath.txt
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/jars/htrace-core-3.0.4.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/jars/htrace-core-3.2.0-incubating.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/jars/htrace-core4-4.0.1-incubating.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/jars/commons-pool2-2.4.3.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/jars/jedis-2.9.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/jars/druid-1.1.6.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-annotations-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-annotations-1.2.0-cdh5.11.0-tests.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-client-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-client.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-common-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-common-1.2.0-cdh5.11.0-tests.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-examples-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-examples.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-external-blockcache-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-external-blockcache.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-hadoop2-compat-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-hadoop2-compat-1.2.0-cdh5.11.0-tests.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-hadoop-compat-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-hadoop-compat-1.2.0-cdh5.11.0-tests.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-it-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-it-1.2.0-cdh5.11.0-tests.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-prefix-tree-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-procedure-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-protocol-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-resource-bundle-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-rest-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-rsgroup-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-rsgroup-1.2.0-cdh5.11.0-tests.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-server-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-server-1.2.0-cdh5.11.0-tests.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-shell-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-spark-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hbase/hbase-thrift-1.2.0-cdh5.11.0.jar
    /opt/cloudera/parcels/CDH/jars/ojdbc6.jar
    /etc/extra-lib/hbase/spring/spring-beans-5.0.2.RELEASE.jar
    /etc/extra-lib/hbase/spring/spring-core-5.0.2.RELEASE.jar
    /etc/extra-lib/hbase/spring/spring-expression-5.0.2.RELEASE.jar
    /etc/extra-lib/hbase/spring/spring-jcl-5.0.2.RELEASE.jar
    :wq
    # every node needs this file
    scp /etc/extra-lib/hbase/classpath.txt  root@bi-slave1:/etc/extra-lib/hbase/
    ...
    

    Modify the Spark2 configuration in CM:
    "Spark 2 Service Advanced Configuration Snippet (Safety Valve) for spark2-conf/spark-env.sh"
    "Spark 2 Client Advanced Configuration Snippet (Safety Valve) for spark2-conf/spark-env.sh"
    Add the following line to both of these configuration boxes (paste -sd: joins the lines of classpath.txt into a single colon-separated string):
    export SPARK_DIST_CLASSPATH="$SPARK_DIST_CLASSPATH:$(paste -sd: "/etc/extra-lib/hbase/classpath.txt")"

    spark2_server.png
    spark2_client.png
    • Restart Spark2 and redeploy the client configuration

    • Add hbase-site.xml to $SPARK_HOME/conf
    • Download the HBase client configuration


      客户端配置_hbase.png
    • Unpack it to get the hbase-site.xml file and add it to $SPARK_HOME/conf


      hbase_site.png

    The same XML file also needs to be present in yarn-conf

    • Copy it to the other nodes


      拷贝8.png
    • That completes the setup!

    Spark test
    • Because the Spark program needs to read HBase tables, first grant the user the required permissions; the deng_yb account is used as the example here
    • Log in to the hbase shell as the hbase account
    [root@node1 ~]# kinit -kt /opt/cm-5.11.0/run/cloudera-scm-agent/process/1247-hbase-HBASERESTSERVER/hbase.keytab hbase/node1@W.COM
    [root@node1 ~]# klist
    Ticket cache: FILE:/tmp/krb5cc_0
    Default principal: hbase/node1@W.COM
    
    Valid starting     Expires            Service principal
    06/13/18 21:20:03  06/14/18 21:20:03  krbtgt/W.COM@W.COM
            renew until 06/18/18 21:20:03
    [root@node1 ~]# hbase shell
    .....
    
    • Grant permissions to the deng_yb account

    • Here read/write (RW) is granted on U:DAY_ORG_CMP_OSI and U:DAY_ORG_PRO_CATE_SPARK

    hbase(main):001:0> list
    TABLE                                                                                                                           
    U:DAY_ORG_CMP_OSI                                                                                                     
    U:DAY_ORG_PRO_CATE_SPARK                                                                                          
    test                                                                                                                            
    test_user                                                                                                                       
    user_info                                                                                                                       
    5 row(s) in 0.5330 seconds
    
    => ["U:DAY_ORG_CMP_OSI", "U:DAY_ORG_PRO_CATE_SPARK", "test", "test_user", "user_info"]
    hbase(main):002:0> grant 'deng_yb','RW','U:DAY_ORG_CMP_OSI'
    0 row(s) in 0.6180 seconds
    
    hbase(main):003:0> grant 'deng_yb','RW','U:DAY_ORG_PRO_CATE_SPARK'
    0 row(s) in 0.1320 seconds
    
    • Log in as the deng_yb account
    [root@node1 ~]# kinit deng_yb
    Password for deng_yb@W.COM: 
    [root@node1 ~]# klist
    Ticket cache: FILE:/tmp/krb5cc_0
    Default principal: deng_yb@W.COM
    
    Valid starting     Expires            Service principal
    06/13/18 21:24:18  06/14/18 21:24:18  krbtgt/W.COM@W.COM
            renew until 06/20/18 21:24:18
    
    • List the tables this account can see in hbase
    hbase(main):001:0> list
    TABLE                                                                                                                           
    U:DAY_ORG_CMP_OSI                                                                                                     
    U:DAY_ORG_PRO_CATE_SPARK                                                                                          
    2 row(s) in 0.5380 seconds
    
    => ["U:DAY_ORG_CMP_OSI", "U:DAY_ORG_PRO_CATE_SPARK"]
    hbase(main):002:0>
    # grants succeeded
    
    • Run the Spark program (the account that runs it must be able to access the Hive metastore)
     spark2-submit --master yarn --deploy-mode cluster  --executor-memory 4G --total-executor-cores 4 --driver-memory 4g --class com.W.Main  /usr/local/W/bi-bdap-0.1.0-SNAPSHOT.jar
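
    The submitted class itself is not shown in this article. Purely for orientation, a minimal Scala sketch of a Kerberos-aware HBase scan from Spark could look like the following; the object name is hypothetical and the table name is taken from the grants above, so this is not the author's actual com.W.Main code:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.spark.sql.SparkSession

    // hypothetical example object, not the author's com.W.Main
    object HBaseScanSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("hbase-scan-sketch").getOrCreate()

        // hbase-site.xml must be on the classpath ($SPARK_HOME/conf, see above)
        // so that the Kerberos-related HBase settings are picked up
        val hbaseConf = HBaseConfiguration.create()
        hbaseConf.set(TableInputFormat.INPUT_TABLE, "U:DAY_ORG_CMP_OSI")

        // scan the table as an RDD of (rowkey, Result) pairs
        val rdd = spark.sparkContext.newAPIHadoopRDD(
          hbaseConf,
          classOf[TableInputFormat],
          classOf[ImmutableBytesWritable],
          classOf[Result])

        println(s"rows scanned: ${rdd.count()}")
        spark.stop()
      }
    }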
    
    • After the job finishes, remember to check the AM (ApplicationMaster) log


      am_log.png
    • If the AM log looks fine, go on and check the executor logs


      executor_1.png
      executor_2.png
    • At this point, if the logs show no exceptions, the Spark job ran successfully

    Troubleshooting
    • Some developers run into authentication failures


      异常.png

      This usually happens because the HBase jars are missing from $SPARK_CLASSPATH.
      The following jars are required:

    spark_必备hbase包.png
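
    If those jars are present and authentication still fails, another pattern sometimes used is to log in from a keytab explicitly in the driver code instead of relying on the submitter's kinit ticket. A minimal sketch, where the principal and keytab path are placeholders:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.security.UserGroupInformation

    val conf = new Configuration()
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)
    // the principal and keytab path below are placeholders
    UserGroupInformation.loginUserFromKeytab("deng_yb@W.COM", "/path/to/deng_yb.keytab")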
    • Connections to HBase keep timing out


      链接hbase超时.png

    This usually happens because the hbase-site.xml file is missing from $SPARK_HOME/conf.
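
    If editing $SPARK_HOME/conf on every node is not practical, a workaround sketch is to load hbase-site.xml (or set the ZooKeeper quorum) explicitly when building the HBase configuration inside the job; the hostnames and port below are placeholders:

    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.hbase.HBaseConfiguration

    val hbaseConf = HBaseConfiguration.create()
    // either point at an hbase-site.xml shipped alongside the job ...
    hbaseConf.addResource(new Path("hbase-site.xml"))
    // ... or set the essentials directly (values are placeholders)
    hbaseConf.set("hbase.zookeeper.quorum", "node1,node2,node3")
    hbaseConf.set("hbase.zookeeper.property.clientPort", "2181")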
