Big Data Hadoop Tools Python Tutorial 2: Accessing HDFS from Python

Author: python测试开发 | Published: 2019-01-21 18:49

    https://pypi.org/project/hdfs3 (no longer maintained)
    PyArrow
    https://pypi.org/project/hdfs/
    https://pypi.org/project/snakebite/ (works well on Python 2; Python 3 support is poor)

    Of these, hdfs and PyArrow are the most commonly used. This tutorial uses hdfs for its example.
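    Since the rest of the tutorial sticks to hdfs, here is a brief sketch of what the PyArrow side could look like, for comparison only. It assumes the `pyarrow.fs` interface; `read_text` and `connect_hdfs` are hypothetical helper names, and the host and port defaults are placeholders that must be adjusted to the cluster.

```python
def read_text(filesystem, path, encoding="utf-8"):
    """Return the contents of `path` as text.

    `filesystem` is assumed to be a pyarrow.fs.FileSystem, for example
    pyarrow.fs.HadoopFileSystem. `read_text` itself is an illustrative
    helper, not a PyArrow function.
    """
    # open_input_stream() returns a readable binary stream.
    with filesystem.open_input_stream(path) as stream:
        return stream.read().decode(encoding)


def connect_hdfs(host="localhost", port=8020, user=None):
    """Open an HDFS connection through PyArrow.

    Host and port are assumptions: they must match the NameNode RPC
    address (fs.defaultFS), which is not the WebHDFS HTTP port used by
    the hdfs package. PyArrow also needs a local Hadoop client
    installation (libhdfs and a suitable CLASSPATH).
    """
    # Imported lazily so that read_text has no hard dependency on pyarrow.
    from pyarrow import fs

    return fs.HadoopFileSystem(host, port=port, user=user)
```

    On a machine with a working Hadoop client, `read_text(connect_hdfs(), '/user/hduser/input.txt')` would be the PyArrow counterpart of the read in the quick start.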

    Quick start

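    from hdfs import InsecureClient

    # Connect to the NameNode's WebHDFS endpoint.
    # 50070 is the Hadoop 2.x default HTTP port; Hadoop 3.x defaults to 9870.
    client = InsecureClient('http://localhost:50070', user='hduser_')

    # List the entries directly under the HDFS root.
    fs_folders_list = client.list("/")
    print(fs_folders_list)

    # Read a text file line by line, decoding it as UTF-8.
    with client.read('/user/hduser/input.txt', encoding='utf-8') as reader:
        for line in reader:
            print(line)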
    

    Output:

    ['user']
    https://china-testing.github.io/
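
    The quick start only lists and reads. As a minimal sketch of the opposite direction, the helper below writes a text file and reads it back, assuming the same hdfs package. `round_trip` and the path `/tmp/hello.txt` are hypothetical names chosen for illustration. The client is passed in as a parameter, so the helper needs no import of its own.

```python
def round_trip(client, hdfs_path, text):
    """Write `text` to `hdfs_path`, then read it back and return it.

    `client` is assumed to be an hdfs.InsecureClient (or any object with
    compatible write() and read() methods). The function name and the
    paths used with it are illustrative, not part of the hdfs package.
    """
    # overwrite=True replaces an existing file instead of raising an error.
    client.write(hdfs_path, data=text, overwrite=True, encoding="utf-8")

    # read() is a context manager; with an encoding set it yields text.
    with client.read(hdfs_path, encoding="utf-8") as reader:
        return reader.read()
```

    With the client from the quick start, `round_trip(client, '/tmp/hello.txt', 'hello\n')` should return the string that was written, provided the connecting user has write permission on `/tmp`.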
    

