20190824 Class Notes

Author: 赛尔木 | Published: 2019-08-26 15:33

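    Set up keyboard shortcuts

    Configure the compiler

    Create the project

    Choose the quickstart archetype

    Set the GAV (groupId, artifactId, version)

    Configure the project

    Modify the POM

    Add the hadoop-version property and the Cloudera repository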

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <hadoop-version>2.6.4</hadoop-version>
    </properties>

    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
    </repositories>

    Add the Hadoop dependency

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop-version}</version>
    </dependency>

    Check that the dependency was added

    Run a Maven reimport

    Create a test project

    FileSystem is the entry point for all operations

    The mkdirs operation

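    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.junit.Assert;
    import org.junit.Test;

    public static final String HDFS_PATH = "hdfs://192.168.1.64:8020";

    @Test
    public void mkdir() throws Exception {
        Configuration configuration = new Configuration();
        // Pass the constant HDFS_PATH here, not the string literal "HDFS_PATH"
        FileSystem fileSystem = FileSystem.get(new URI(HDFS_PATH), configuration);
        boolean isSuccess = fileSystem.mkdirs(new Path("/ruozedata/hdfsapi"));
        Assert.assertTrue(isSuccess);
    }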
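The comment about HDFS_PATH can be checked without a cluster. The sketch below uses only java.net.URI to compare how the constant's value and the string literal "HDFS_PATH" are parsed. The class name HdfsUriCheck and the describe helper are made up for illustration and are not part of the Hadoop API.

```java
import java.net.URI;

final class HdfsUriCheck {
    // Same value as the HDFS_PATH constant in the notes.
    static final String HDFS_PATH = "hdfs://192.168.1.64:8020";

    /** Returns "scheme|host|port" for the given URI text. */
    static String describe(String uriText) {
        URI uri = URI.create(uriText);
        return uri.getScheme() + "|" + uri.getHost() + "|" + uri.getPort();
    }

    public static void main(String[] args) {
        // The constant carries a scheme, host and port.
        System.out.println(describe(HDFS_PATH));   // prints hdfs|192.168.1.64:8020 split as hdfs|192.168.1.64|8020
        // The quoted name is just a relative path: no scheme, host or port.
        System.out.println(describe("HDFS_PATH")); // prints null|null|-1
    }
}
```

The literal parses as a relative URI with no scheme, host, or port, so nothing in it points at the NameNode.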

    Create a directory as a specific user

    @Test
    public void mkdir02() throws Exception {
        Configuration configuration = new Configuration();
        // The third argument is the user to act as on HDFS
        FileSystem fileSystem = FileSystem.get(new URI(HDFS_PATH), configuration, "hadoop");
        boolean isSuccess = fileSystem.mkdirs(new Path("/ruozedata/hdfsapi"));
        Assert.assertTrue(isSuccess);
    }

    The directory is created successfully.

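    Copy a file from the local file system to HDFS

    @Test
    public void copyFromLocalFile() throws Exception {
        // fileSystem is assumed to be a field initialized as in the tests above
        Path srcPath = new Path("D:/BDP/api/testapi.py");
        Path dstPath = new Path("/ruozedata/hdfsapi");
        fileSystem.copyFromLocalFile(srcPath, dstPath);
    }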

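    The file copied above has a replication factor of 3: the client-side Configuration does not read the cluster's hdfs-site.xml, so it falls back to the default dfs.replication of 3.

    There are two ways to make the replication factor match the cluster configuration:

    1. Set the replication factor explicitly on the client

    configuration.set("dfs.replication", "2");

    2. Copy the cluster's hdfs-site.xml into the project so that it is on the classpath

    After either change, run the copy again to check the result.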
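For the second option, the relevant part of hdfs-site.xml would look roughly like the fragment below. The value 2 mirrors the configuration.set call above. Placing the file under src/main/resources is an assumption about the project layout, chosen because Maven puts that directory on the classpath.

```xml
<!-- src/main/resources/hdfs-site.xml (location is an assumption) -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
```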

    Copy a file from HDFS to the local file system

    @Test
    public void copyToLocal() throws Exception {
        Path srcPath = new Path("/ruozedata/hdfsapi/test20190825.txt");
        Path dstPath = new Path("D:/BDP/ruoze/ruoze20190825.txt");
        fileSystem.copyToLocalFile(srcPath, dstPath);
    }

    Run this way, the test throws a NullPointerException. The version below runs correctly.

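    @Test
    public void copyToLocal() throws Exception {
        Path srcPath = new Path("/ruozedata/hdfsapi/test20190825.txt");
        Path dstPath = new Path("D:/BDP/ruoze/ruoze20190825.txt");
        // Arguments: delSrc, src, dst, useRawLocalFileSystem
        fileSystem.copyToLocalFile(false, srcPath, dstPath, true);
    }

    false: delSrc (the source file on HDFS is not deleted)

    true: useRawLocalFileSystem (the local side uses RawLocalFileSystem, so no local .crc checksum file is written)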

    Rename

    @Test
    public void rename() throws Exception {
        Path srcPath = new Path("/ruozedata/hdfsapi/test20190825.txt");
        Path dstPath = new Path("/ruozedata/hdfsapi/test20190825-2.txt");
        // rename returns true on success
        boolean isSuccess = fileSystem.rename(srcPath, dstPath);
        Assert.assertTrue(isSuccess);
    }

    The rename succeeds.

    List directory contents

    @Test
    public void listFiles() throws Exception {
        // The second argument (true) makes the listing recursive.
        // listFiles returns files only, so isDirectory() is false for every entry.
        RemoteIterator<LocatedFileStatus> files = fileSystem.listFiles(new Path("/ruozedata/hdfsapi"), true);
        while (files.hasNext()) {
            LocatedFileStatus fileStatus = files.next();
            String isDir = fileStatus.isDirectory() ? "directory" : "file";
            String permission = fileStatus.getPermission().toString();
            short replication = fileStatus.getReplication();
            long length = fileStatus.getLen();
            String path = fileStatus.getPath().toString();
            System.out.println(isDir + "\t"
                    + permission + "\t"
                    + replication + "\t"
                    + length + "\t"
                    + path + "\t");
        }
    }

    Output

    File download

    @Test
    public void download01() throws Exception {
        FSDataInputStream in = fileSystem.open(new Path("/ruozedata/hdfsapi/spark-2.3.0.tgz"));
        FileOutputStream out = new FileOutputStream(new File("D:/BDP/ruoze/spark01.tgz.part02"));
        // Skip the first 128 MB so that reading starts at the second part of the file
        in.seek(1024 * 1024 * 128);
        byte[] buffer = new byte[1024];
        // 1024 * 128 passes of 1 KB each: always writes 128 MB, even after read() reaches end of file
        for (int i = 0; i < 1024 * 128; i++) {
            in.read(buffer);
            out.write(buffer);
        }
        IOUtils.closeStream(out);
        IOUtils.closeStream(in);
    }

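    With this approach the downloaded file is always 128 MB, because the loop writes the full 1 KB buffer 1024 * 128 times regardless of how many bytes read() actually returned.

    The approach below produces a file of the correct size.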

    @Test
    public void download01() throws Exception {
        FSDataInputStream in = fileSystem.open(new Path("/ruozedata/hdfsapi/spark-2.3.0.tgz"));
        FileOutputStream out = new FileOutputStream(new File("D:/BDP/ruoze/spark01.tgz.part02"));
        in.seek(1024 * 1024 * 128);
        // Replaces the fixed-count read/write loop: copyBytes (org.apache.hadoop.io.IOUtils)
        // copies until end of stream. configuration is assumed to be a field of the test class.
        IOUtils.copyBytes(in, out, configuration);
        IOUtils.closeStream(out);
        IOUtils.closeStream(in);
    }
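The difference between the two download variants comes down to whether the copy loop honors the return value of read. The following sketch reproduces both loops on in-memory streams from java.io, so it runs without HDFS. ReadLoopDemo, its method names, and the sizes used in main are illustrative assumptions, and skip stands in for FSDataInputStream.seek.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

final class ReadLoopDemo {
    /** Fixed-count loop: writes the whole buffer on every pass and ignores what read() returned. */
    static int fixedCountCopy(byte[] source, int offset, int bufferSize, int passes) {
        ByteArrayInputStream in = new ByteArrayInputStream(source);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.skip(offset); // stand-in for seek(offset)
        byte[] buffer = new byte[bufferSize];
        for (int i = 0; i < passes; i++) {
            in.read(buffer, 0, buffer.length);   // result ignored: may be short, or -1 at end of data
            out.write(buffer, 0, buffer.length); // always writes bufferSize bytes
        }
        return out.size();
    }

    /** Length-aware loop: writes only the bytes that were actually read and stops at end of data. */
    static int lengthAwareCopy(byte[] source, int offset, int bufferSize) {
        ByteArrayInputStream in = new ByteArrayInputStream(source);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.skip(offset);
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.size();
    }

    public static void main(String[] args) {
        byte[] source = new byte[1000]; // hypothetical 1000-byte "file"
        // 400 bytes remain after the offset, but 10 passes of 64 bytes are written.
        System.out.println(fixedCountCopy(source, 600, 64, 10)); // prints 640
        System.out.println(lengthAwareCopy(source, 600, 64));    // prints 400
    }
}
```

The fixed-count result depends only on the buffer size and the number of passes, which matches the constant 128 MB seen in the first download test.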

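    Print block information

    // fileStatus is a LocatedFileStatus, such as one returned by the listFiles iterator
    BlockLocation[] blockLocations = fileStatus.getBlockLocations();
    for (BlockLocation location : blockLocations) {
        // Hosts that store a replica of this block
        String[] hosts = location.getHosts();
        for (String host : hosts) {
            System.out.println(host);
        }
    }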
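Each BlockLocation describes one block of the file. As a rough sketch of how a file length maps onto blocks, the code below computes the offset and length of every block, assuming a 128 MiB block size (the Hadoop 2.x default for dfs.blocksize; the cluster in these notes may be configured differently). BlockLayout and the 300 MiB file length are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class BlockLayout {
    // Assumed block size: 128 MiB, the Hadoop 2.x default.
    static final long BLOCK_SIZE = 128L * 1024 * 1024;

    /** Returns one {offset, length} pair per block for a file of the given length. */
    static List<long[]> blocks(long fileLength, long blockSize) {
        List<long[]> result = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // The last block holds whatever is left and may be shorter than blockSize.
            result.add(new long[] {offset, Math.min(blockSize, fileLength - offset)});
        }
        return result;
    }

    public static void main(String[] args) {
        long fileLength = 300L * 1024 * 1024; // hypothetical 300 MiB file
        for (long[] block : blocks(fileLength, BLOCK_SIZE)) {
            System.out.println("offset=" + block[0] + " length=" + block[1]);
        }
    }
}
```

Under this assumption the second block starts at 1024 * 1024 * 128 bytes, the same offset the download tests seek to. On a real cluster the corresponding values can be read from BlockLocation.getOffset() and BlockLocation.getLength().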


    Article link: https://www.haomeiwen.com/subject/kwtdectx.html