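Background
- The cluster runs the CDH distribution of HDFS, and the goal is to connect through the Java API and collect data stored on HDFS.
- You could also load the local Hadoop configuration files, such as hdfs-site.xml or core-site.xml, directly, but then those files have to be maintained for every cluster, which is tedious. The approach here sets the required properties in code instead.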
Code
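import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class HdfsKerberos {
    // Replace every <...> placeholder with the values for your own cluster.
    public static final String USER_KEY = "<your_principal>";
    public static final String KEY_TAB_PATH = "<your_keytab>";
    static Configuration conf = new Configuration();

    static {
        // Point the JVM at the Kerberos client configuration.
        System.setProperty("java.security.krb5.conf", "<your_krb5.conf>");
        // Normally this key holds the NameNode's service principal (for example hdfs/_HOST@REALM).
        conf.set("dfs.namenode.kerberos.principal", "<your_principal>");
        // Accept any server principal advertised by the NameNode.
        conf.set("dfs.namenode.kerberos.principal.pattern", "*");
        conf.set("hadoop.security.authentication", "kerberos");
        conf.set("fs.defaultFS", "hdfs://<active_nn_ip>:8020");
        try {
            // Log in once from the keytab; later FileSystem calls reuse this login.
            UserGroupInformation.setConfiguration(conf);
            UserGroupInformation.loginUserFromKeytab(USER_KEY, KEY_TAB_PATH);
        } catch (IOException e) {
            // Fail fast: without a Kerberos login every later HDFS call would fail anyway.
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path dstPath = new Path("/tmp/hive");
        // List the entries under /tmp/hive and print their full paths.
        FileStatus[] listStatus = fs.listStatus(dstPath);
        for (FileStatus fileStatus : listStatus) {
            Path path = fileStatus.getPath();
            System.out.println(path);
        }
    }
}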
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.5</version>
</dependency>
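Errors encountered
- At first the connection kept failing with the "Server has invalid Kerberos principal" error shown below. I won't go into the root cause in detail; the fix is to add conf.set("dfs.namenode.kerberos.principal.pattern", "*").
- The full error is as follows: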
java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: xxxxxxx; Host Details : local host is: "xxxxx/xxxxx"; destination host is: "xxxxx":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1474)
at org.apache.hadoop.ipc.Client.call(Client.java:1401)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.getListing(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getListing(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1958)
at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1941)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:693)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
at com.netease.kantlin.hdfs.method1.Test.main(Test.java:24)
Caused by: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/xxxxxxxxxx
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:732)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
at org.apache.hadoop.ipc.Client.call(Client.java:1440)
... 20 more
Caused by: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/xxxxxxxxxx
at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:334)
at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:231)
at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:159)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:724)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:720)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:719)
... 23 more
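To make the fix a little less of a black box: as far as I understand the Hadoop RPC client, the exception is thrown in SaslRpcClient.getServerPrincipal (the last frame of the trace) when the principal advertised by the NameNode fails a client-side check. Without a pattern, the advertised principal must equal the value configured under dfs.namenode.kerberos.principal. With dfs.namenode.kerberos.principal.pattern set, it is matched against that glob instead. The sketch below only imitates that decision with plain Java. It is not Hadoop's code, and the class name PrincipalCheckSketch, the helpers globToRegex and isPrincipalValid, and the example principals are all hypothetical.

```java
import java.util.regex.Pattern;

public class PrincipalCheckSketch {

    // Simplified glob-to-regex conversion (an assumption, not Hadoop's GlobPattern):
    // '*' matches any run of characters, '?' matches one character,
    // every other character is taken literally.
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Imitates the client-side decision: if a pattern is configured, the
    // advertised principal must match it; otherwise it must equal the
    // configured principal exactly.
    static boolean isPrincipalValid(String advertised, String configured, String pattern) {
        if (pattern != null && !pattern.isEmpty()) {
            return globToRegex(pattern).matcher(advertised).matches();
        }
        return advertised.equals(configured);
    }

    public static void main(String[] args) {
        // Hypothetical principals, for illustration only.
        String advertised = "hdfs/nn1.example.com@EXAMPLE.COM";
        String configured = "etl_user@EXAMPLE.COM";

        // No pattern, mismatching principal: the case that raises the error.
        System.out.println(isPrincipalValid(advertised, configured, null));
        // Pattern "*": any advertised principal is accepted.
        System.out.println(isPrincipalValid(advertised, configured, "*"));
        // A narrower pattern still accepts the NameNode principal.
        System.out.println(isPrincipalValid(advertised, configured, "hdfs/*@EXAMPLE.COM"));
    }
}
```

Running main prints false, true, true. The practical takeaway is that "*" switches the check off entirely, so if that feels too permissive, a narrower glob along the lines of hdfs/*@<your_realm> may be worth trying on your cluster.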