hdfs vs webhdfs vs httpfs

Author: OldChicken_ | Source: published 2018-11-14 15:12

Hadoop provides several ways of accessing HDFS.
All of the following support almost all features of the filesystem:

  1. FileSystem (FS) shell commands: Provide easy access to Hadoop file system operations, as well as to the other file systems that Hadoop supports, such as Local FS, HFTP FS, and S3 FS.
    This requires a Hadoop client to be installed, and the client writes blocks directly to a DataNode. Not all versions of Hadoop support all options for copying between filesystems.
  2. WebHDFS: Defines a public HTTP REST API that lets clients access Hadoop from multiple languages without installing Hadoop. Its advantage is that it is language-agnostic (curl, PHP, and so on).
    WebHDFS needs access to all nodes of the cluster, and when data is read, it is transmitted directly from the source node. HTTP adds overhead compared with the FS shell (1), but WebHDFS works independently of the client language and avoids problems between clusters running different Hadoop versions.
  3. HttpFS: Reads and writes data to HDFS in a cluster behind a firewall. A single node acts as a gateway through which all data is transferred. Performance-wise I believe this can be even slower, but it is preferred when data must be pulled from a public source into a secured cluster.
    See the Cloudera documentation about HttpFS.
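
To make the difference between options 2 and 3 concrete, here is a minimal sketch of how a client might build REST URLs for either one. It assumes the standard `/webhdfs/v1` URL prefix, which HttpFS also exposes. The host names are hypothetical, `webhdfs_url` is a helper invented for illustration, and port 14000 for the HttpFS gateway is an assumed default that may differ in your environment.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, params=None):
    """Build a WebHDFS-style REST URL: http://host:port/webhdfs/v1/<path>?op=<OP>."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute, e.g. /user/alice/data.txt")
    query = {"op": op}
    query.update(params or {})
    # quote() leaves "/" intact, so the HDFS path keeps its directory structure.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


# Hypothetical hosts. 50070 is the default NameNode HTTP port used in this article;
# 14000 is assumed here as the HttpFS gateway port.
direct = webhdfs_url("namenode.example.com", 50070, "/user/alice", "LISTSTATUS")
via_gateway = webhdfs_url(
    "httpfs.example.com", 14000, "/user/alice", "LISTSTATUS", {"user.name": "alice"}
)
print(direct)
print(via_gateway)
```

With WebHDFS the client is later redirected to individual DataNodes, which is why it needs access to every node. With HttpFS every request goes to the single gateway host.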
hdfs vs webhdfs Q&A from Cloudera community

1. Which one will be faster?
The native protocol of HDFS is hdfs:// and this is the fastest type (purely TCP, with efficient data packet transfers). Other protocols such as webhdfs:// or the deprecated hftp:// add overheads due to their HTTP usage that make them slower overall.

2. Can we use one protocol at the source and another at the destination (that is, a combination of both)?
Yes.

3. When should we use webhdfs in particular?
See http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_admin_distcp_da... for details.
The rule of thumb is:

  • Use webhdfs:// for the source when it's a different major version (such as a CDH4 source to a CDH5 target).
  • Use hdfs:// otherwise, when the major version is the same (such as between any CDH 5.x).
  • Prefer webhdfs:// over hftp://, unless it's a very old version (pre CDH3u5) that has no WebHDFS support.
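
The rule of thumb above can be sketched as a small helper that picks the source scheme and assembles a `hadoop distcp` command line. This is only an illustration: the function names and host names are hypothetical, the ports are the defaults quoted in this article, and the command is built as a list but not executed.

```python
# Default ports as quoted in this article; adjust for your environment.
DEFAULT_PORTS = {"hdfs": 8020, "webhdfs": 50070, "hftp": 50070}


def pick_source_scheme(source_major, target_major, source_has_webhdfs=True):
    """Apply the rule of thumb: hdfs:// within one major version, else webhdfs://,
    falling back to hftp:// only when the source is too old to offer WebHDFS."""
    if source_major == target_major:
        return "hdfs"
    return "webhdfs" if source_has_webhdfs else "hftp"


def distcp_command(src_host, dst_host, path, source_major, target_major,
                   source_has_webhdfs=True):
    """Return the argv list for a DistCp run launched on the target cluster."""
    scheme = pick_source_scheme(source_major, target_major, source_has_webhdfs)
    src = f"{scheme}://{src_host}:{DEFAULT_PORTS[scheme]}{path}"
    dst = f"hdfs://{dst_host}:{DEFAULT_PORTS['hdfs']}{path}"
    return ["hadoop", "distcp", src, dst]


# Hypothetical CDH4 -> CDH5 copy: different major versions, so webhdfs:// is chosen.
cmd = distcp_command("cdh4-nn.example.com", "cdh5-nn.example.com", "/data", 4, 5)
print(" ".join(cmd))
```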

4. Will there be any difference in transfer speed between these protocols?
Yes. This is a repeat of (1), which I've answered above.

5. Which port numbers are needed for each of these? (Somewhere I saw commands with 50070 and 8020; when should each be used?)
Follow the CDH5 ports guide at http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_ig_ports_cdh5.h... to find the right ports for your environment. The statements below use the defaults.
HDFS native protocol transfers require every host on the DistCp job cluster (usually the target) to be able to reach the source's port 8020 (on the NameNode(s)) and ports 50010/1004 and 50020 (on all DataNodes).
WebHDFS or HFTP transfers, which are HTTP based, require every host on the DistCp job cluster (usually the target) to be able to reach the source's port 50070 (on the NameNode(s)) and port 50075/1006 (on all DataNodes).
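
As a rough aid for checking these requirements, the sketch below encodes the default ports from the answer above and offers a simple TCP reachability probe. It assumes that 1004 and 1006 are the DataNode ports used in place of 50010 and 50075 on secured clusters. The names `required_ports` and `is_reachable` are hypothetical helpers, not part of any Hadoop tooling.

```python
import socket

# Default ports from the answer above, keyed by protocol and node role.
DEFAULT_PORTS = {
    "hdfs": {"namenode": [8020], "datanode": [50010, 50020]},
    "webhdfs": {"namenode": [50070], "datanode": [50075]},
}
# Assumption: on secured clusters these DataNode ports replace the defaults.
SECURE_DATANODE_PORTS = {50010: 1004, 50075: 1006}


def required_ports(protocol, role, secure=False):
    """Return the source-side ports a DistCp host must reach for a protocol and role."""
    ports = DEFAULT_PORTS[protocol][role]
    if secure and role == "datanode":
        return [SECURE_DATANODE_PORTS.get(p, p) for p in ports]
    return list(ports)


def is_reachable(host, port, timeout=3.0):
    """Try a plain TCP connection; True if it succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(required_ports("hdfs", "datanode"))
print(required_ports("webhdfs", "datanode", secure=True))
```

Running `is_reachable(host, port)` from each host of the DistCp job cluster against every source NameNode and DataNode gives a quick first indication of firewall problems before a long copy is started.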
