Deploying Hive 3 with a Custom Password Authentication Mechanism, and Configuring Hadoop 3 Proxy Users

Author: 旋转马达 | Published 2021-03-15 18:16

    This article walks through the problems encountered while deploying Hive 3, and their solutions.

    Part 1: Deploying Hive

    For installation steps, see the companion post on installing and configuring Hive 3 with Hadoop 3.

    Part 2: Installing Tez as Hive's execution engine

    Download the Tez release package (see the download page), extract it to the installation directory, and refer to the install guide (English).
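
    As a sketch, assuming the standard Apache archive layout for the 0.9.2 bin release (adjust version and paths to your environment):

    # download the pre-built Tez bin release and unpack it into /opt/programs
    wget https://archive.apache.org/dist/tez/0.9.2/apache-tez-0.9.2-bin.tar.gz
    tar -xzf apache-tez-0.9.2-bin.tar.gz -C /opt/programs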

    A brief summary of the installation steps:
    1: Make sure Hadoop (version 2.7.0 or later) is deployed before deploying Tez.
    2: Build Tez. If you downloaded a pre-built bin release this step can be skipped; we use the bin release.
    3: Copy the Tez tarball to HDFS and configure tez-site.xml:

    hadoop fs -mkdir /user/tez
    hadoop fs -put ${TEZ_HOME}/tez.tar.gz /user/tez
    

    3.1 In tez-site.xml, set the tez.lib.uris parameter to point at the HDFS path we just uploaded the tarball to:

            <property>
                    <name>tez.lib.uris</name>
                    <value>/user/tez/tez.tar.gz</value> <!-- points to the tez.tar.gz package on HDFS -->
            </property>
    

    Make sure tez.use.cluster.hadoop-libs is not set in tez-site.xml; if it is set, its value should be false.
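
    A quick sanity check, as a sketch (assumes the upload and the setting above):

    # the path tez.lib.uris points at should exist on HDFS
    hadoop fs -ls /user/tez/tez.tar.gz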
    4: To run MapReduce jobs on top of Tez, change the following parameter in Hadoop's mapred-site.xml:

            <property>
                    <name>mapreduce.framework.name</name>
                    <value>yarn-tez</value>
            </property>
    

    5: On the client nodes, make sure the Tez libraries are on Hadoop's classpath.
    Edit hadoop-env.sh and append the following at the end of the file:

    TEZ_CONF_DIR=/opt/programs/hadoop-3.2.2/etc/hadoop/tez-site.xml
    TEZ_JARS=/opt/programs/apache-tez-0.9.2-bin
    export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
    

    Note the "*": it is a required part of the classpath entry when the directory being added contains jar files.
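
    You can confirm the result, as a quick sketch:

    # the Tez jars and config should now appear in Hadoop's effective classpath
    hadoop classpath | tr ':' '\n' | grep -i tez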

    6: tez-examples.jar contains a basic example that runs an MRR job; see OrderedWordCount.java in the source. To run the example:

    hadoop jar tez-examples.jar orderedwordcount <input> <output>
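
    Note that the mapred-site.xml change above only routes MapReduce jobs through Tez; for Hive queries themselves, the hive.execution.engine property must be set to tez. A per-session check, as a sketch (test_db.u_data is the sample table used later in this post):

    beeline -u "jdbc:hive2://hadoop000:10000" -e "set hive.execution.engine=tez; select count(*) from test_db.u_data;"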
    

    Part 3: Multi-user Hive with a custom authentication mechanism, and the Hadoop configuration changes it requires

    By default Hive performs no authentication, so anyone can access the data directly, which is insecure. Here we add a custom authentication mechanism, after which connections through Beeline or JDBC must supply valid credentials.

    The code implements the PasswdAuthenticationProvider interface, overriding its abstract method with our authentication logic:

    import javax.security.sasl.AuthenticationException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hive.conf.HiveConf;
    import org.apache.hive.service.auth.PasswdAuthenticationProvider;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class BasicUsernamePasswdAuthenticator implements PasswdAuthenticationProvider {
        private final static Logger LOGGER = LoggerFactory.getLogger(BasicUsernamePasswdAuthenticator.class);
        // each user's password is looked up under the property hive.jdbc.passwd.<username>
        private static final String HIVE_JDBC_PASSWD_AUTH_PREFIX = "hive.jdbc.passwd.%s";

        private Configuration conf = null;

        @Override
        public void Authenticate(String user, String password) throws AuthenticationException {
            LOGGER.info("user: " + user + " try login.");
            String passwdFromConf = getConf().get(String.format(HIVE_JDBC_PASSWD_AUTH_PREFIX, user));
            LOGGER.info("configured password for user {} is {}, supplied password is {}", user, passwdFromConf, password);
            if (passwdFromConf == null) {
                String message = "user's ACL configuration is not found. user:" + user + ",passwdFromConf:" + passwdFromConf;
                LOGGER.info(message);
                throw new AuthenticationException(message);
            }
            if (!passwdFromConf.equals(password)) {
                String message = "user name and password is mismatch. user:" + user + ",passwdFromConf:" + passwdFromConf;
                LOGGER.error(message);
                throw new AuthenticationException(message);
            }
            LOGGER.info("authentication passed");
        }

        public Configuration getConf() {
            if (conf == null) {
                // lazily load hive-site.xml so the per-user password properties are visible
                this.conf = new Configuration(new HiveConf());
            }
            return conf;
        }

        public void setConf(Configuration conf) {
            this.conf = conf;
        }
    }
    

    Then package this class into a jar and copy it into Hive's lib directory, then update Hive's configuration. A minimal packaging sketch, assuming a Maven build:
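
    mvn clean package
    # copy the built jar into Hive's lib directory (the jar name here is hypothetical)
    cp target/hive-basic-auth-1.0.jar /opt/programs/apache-hive-3.1.2-bin/lib/

    With the jar in place, the hive-site.xml additions are: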

     <property>
        <name>hive.server2.authentication</name>
        <value>CUSTOM</value>
        <description>
          Expects one of [nosasl, none, ldap, kerberos, pam, custom].
          Client authentication types.
            NONE: no authentication check
            LDAP: LDAP/AD based authentication
            KERBEROS: Kerberos/GSSAPI authentication
            CUSTOM: Custom authentication provider
                    (Use with property hive.server2.custom.authentication.class)
            PAM: Pluggable authentication module
            NOSASL:  Raw transport
        </description>
      </property>
      <property>
        <name>hive.server2.custom.authentication.class</name>
        <value>org.puppy.hive.auth.basic.BasicUsernamePasswdAuthenticator</value>
      </property>
      <property>
        <name>hive.jdbc.passwd.hadoop</name>
        <value>123456789</value>
      </property>
    

    The configuration above defines one user, hadoop, with the password 123456789; we use this account to connect to Hive and work with the data in HDFS. The user name hadoop is substituted into the property-name pattern via String.format in the code, so a single property per user is all that is needed.

    If we stop the configuration here, start Hive, and connect with Beeline:

    $ nohup hiveserver2 >> /opt/programs/apache-hive-3.1.2-bin/logs/hive.log &
    $ beeline
    Beeline version 3.1.2 by Apache Hive
    beeline> !connect jdbc:hive2://hadoop000:10000
    Enter username for jdbc:hive2://hadoop000:10000: hadoop
    Enter password for jdbc:hive2://hadoop000:10000: *********
    

    After pressing Enter we get an error:

    Error: Could not open client transport with JDBC Uri:
     jdbc:hive2://hadoop000:10000: Failed to open new session: java.lang.RuntimeException: 
    org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): 
    User: puppy is not allowed to impersonate hadoop (state=08S01,code=0)
    

    Looking at the Hive server log, you will see the corresponding exception:

    puppy is not allowed to impersonate hadoop
    

    To explain: the message means that puppy (the OS user running HiveServer2) is not allowed to impersonate the user hadoop. Why does this happen?
    This is Hadoop's security mechanism at work: not just any user may act on behalf of another. Hadoop provides an impersonation feature,
    "Superusers Acting On Behalf Of Other Users", under which a designated super user may submit jobs on behalf of proxy users; here puppy must be allowed to stand in for the connecting user before operations on HDFS will go through. The following configuration changes are needed:
    hive-site.xml

    <property>
        <name>hive.server2.enable.doAs</name>
        <value>true</value>
    </property>
    

    core-site.xml

    <property>
            <name>hadoop.proxyuser.puppy.hosts</name>
            <value>hadoop000,172.24.163.174,localhost,127.0.0.1</value>
    </property>
    <property>
            <name>hadoop.proxyuser.puppy.groups</name>
            <value>supergroup</value>
    </property>
    <property>
            <name>hadoop.proxyuser.puppy.users</name>
            <value>hadoop,bob,joe</value>
    </property>
    

    Refresh the permissions:

    hdfs dfsadmin -refreshSuperUserGroupsConfiguration
    hdfs dfsadmin -refreshUserToGroupsMappings
    yarn rmadmin -refreshSuperUserGroupsConfiguration

    To be safe, restart Hadoop and everything should work.
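
    At this point the connection should succeed; a one-line check, as a sketch:

    # connect as the hadoop user defined above and run a trivial query
    beeline -u "jdbc:hive2://hadoop000:10000" -n hadoop -p 123456789 -e "show databases;"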
    

    If the hadoop.proxyuser.puppy.hosts setting above is misconfigured, you will get the following error:

    Caused by: org.apache.hadoop.ipc.RemoteException: Unauthorized connection for super-user: puppy from IP /172.24.163.174
            at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1562) ~[hadoop-common-3.2.2.jar:?]
            at org.apache.hadoop.ipc.Client.call(Client.java:1508) ~[hadoop-common-3.2.2.jar:?]
            at org.apache.hadoop.ipc.Client.call(Client.java:1405) ~[hadoop-common-3.2.2.jar:?]
    

    Part 4: Connecting to Hive from a JDBC client

    The code is as follows:

    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import com.alibaba.druid.pool.DruidDataSource;
    import com.alibaba.druid.pool.DruidPooledConnection;

    public class HiveJdbcDemo { // wrapper class name is arbitrary

        public static void main(String[] args) throws SQLException {
            // Druid connection pool pointing at HiveServer2, using the account configured above
            DruidDataSource source = new DruidDataSource();
            source.setUrl("jdbc:hive2://hadoop000:10000");
            source.setDbType("hive");
            source.setUsername("hadoop");
            source.setPassword("123456789");
            DruidPooledConnection connection = source.getConnection();
            System.out.println("Got connection: " + connection);
            PreparedStatement statement = connection.prepareStatement("select * from test_db.u_data");
            try {
                ResultSet resultSet = statement.executeQuery();
                while (resultSet.next()) {
                    String phone = resultSet.getString("phone");
                    System.out.println("Phone number: " + phone);
                }
            } catch (SQLException e) {
                e.printStackTrace();
            } finally {
                // close the statement before returning the pooled connection
                statement.close();
                connection.close();
            }
        }
    }
    

    Maven dependencies:

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>3.1.2</version>
        </dependency>

        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid</artifactId>
            <version>1.2.5</version>
        </dependency>
    
