Introduction to Python os.walk()


Author: Nisen | Published 2016-11-26 23:51 | Read 83 times

    Directory traversal with os.walk

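    Every month there are a few days when I feel like slacking off, and today is one of them, so here is a quick write-up of the directory-traversal functions I just happened to be using.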

    os.walk

    The signature of os.walk is:

    os.walk(top, topdown=True, onerror=None, followlinks=False)
    

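    Where:

    • top is the directory to traverse.
    • topdown controls whether the traversal runs top-down (True, the default) or bottom-up (False).
    • onerror sets a function to call when an error occurs during the traversal (it receives an OSError instance as its argument); when left as None, errors are ignored.
    • followlinks says whether to follow symbolic links to directories and keep traversing into them. Note that os.walk does not keep track of the directories it has already visited, so following links can recurse indefinitely, for example when a link points to one of its own parent directories.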
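
    As a minimal sketch of the onerror parameter described above: the helper below collects the errors instead of silently ignoring them. The names collect_walk_errors and remember are made up for this illustration, and the demo path is a placeholder that is assumed not to exist.

```python
import os

def collect_walk_errors(top):
    """Walk `top`; return (visited directories, errors reported by os.walk)."""
    errors = []

    def remember(err):
        # os.walk calls this with the OSError instance;
        # err.filename is the directory that could not be listed.
        errors.append(err)

    visited = [root for root, dirs, files in os.walk(top, onerror=remember)]
    return visited, errors

if __name__ == "__main__":
    # Hypothetical path, assumed missing, so the handler is triggered.
    visited, errors = collect_walk_errors("/path/that/does/not/exist")
    print(visited)
    print([e.filename for e in errors])
```

    With onerror left as None the same walk would simply yield nothing, which is easy to mistake for an empty directory.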

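    os.walk returns a generator that yields a 3-tuple (root, dirs, files) for each directory it visits: the path of the current directory, the list of subdirectory names in it, and the list of file names in it. Note that the entries in dirs and files are bare names, not paths; when you need a full path (one that starts from root), use os.path.join(root, name).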
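
    To make the point about bare names concrete, here is a small sketch that turns the yielded names into full paths. The function name iter_file_paths is an assumption for this example, not part of the os module.

```python
import os

def iter_file_paths(top):
    """Yield the full path of every file under `top`."""
    for root, dirs, files in os.walk(top):
        for name in files:
            # `name` is a bare file name; join it with `root` to get a usable path.
            yield os.path.join(root, name)

if __name__ == "__main__":
    for path in iter_file_paths("."):
        print(path)
```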

    Example

    Suppose the following files and directories exist:

    ➜  test_os_walk git:(master) ✗ tree
    .
    ├── a.py
    ├── b.py
    ├── c.py
    ├── dir1
    │   ├── dir4
    │   │   ├── g.py
    │   │   └── h.py
    │   ├── dirx
    │   │   ├── diry
    │   │   │   └── k.py
    │   │   └── z.py
    │   ├── e.py
    │   ├── f.py
    │   └── g.py
    ├── dir2
    │   ├── dira
    │   │   └── dirb
    │   │       └── dirc
    │   │           └── aha.py
    │   ├── k.py
    │   ├── l.py
    │   └── m.py
    └── dir3
        ├── dir5
        │   └── z.py
        ├── x.py
        └── y.py
    
    10 directories, 17 files
    

    Testing topdown

    When I walk this directory with os.walk, the program and its output look like this:

    import os
    
    path = '/Users/nisen/Projects/python_advanced_class/test/test_os_walk'
    
    # topdown=True (the default): a directory is yielded before its subdirectories
    for root, dirs, files in os.walk(path, topdown=True):
        print('root: %s' % root)
        print('dirs: %s' % dirs)
        print('files: %s' % files)
        print('')
    

    The result is shown below; the root paths show that the traversal runs top-down:

    ➜  test git:(master) ✗ python test11.py
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
    dirs: ['dir1', 'dir2', 'dir3']
    files: ['a.py', 'b.py', 'c.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
    dirs: ['dir4', 'dirx']
    files: ['e.py', 'f.py', 'g.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
    dirs: []
    files: ['g.py', 'h.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
    dirs: ['diry']
    files: ['z.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
    dirs: []
    files: ['k.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
    dirs: ['dira']
    files: ['k.py', 'l.py', 'm.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
    dirs: ['dirb']
    files: []
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
    dirs: ['dirc']
    files: []
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
    dirs: []
    files: ['aha.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
    dirs: ['dir5']
    files: ['x.py', 'y.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
    dirs: []
    files: ['z.py']
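
    The names in dirs and files come from the operating system's directory listing, which is not guaranteed to be sorted, so the alphabetical order above should not be relied on. As a hedged sketch (the wrapper name walk_sorted is made up here), sorting dirs in place during a top-down walk gives a deterministic visiting order:

```python
import os

def walk_sorted(top):
    """Like os.walk(top), but visits subdirectories in alphabetical order."""
    for root, dirs, files in os.walk(top, topdown=True):
        dirs.sort()   # in place, so os.walk descends in this order
        files.sort()  # only affects the list handed to the caller
        yield root, dirs, files

if __name__ == "__main__":
    for root, dirs, files in walk_sorted("."):
        print(root, dirs, files)
```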
    

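    With topdown set to False, the result is as follows; you can see that the traversal now runs bottom-up, yielding each directory only after all of its subdirectories: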

    ➜  test git:(master) ✗ python test11.py
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
    dirs: []
    files: ['g.py', 'h.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
    dirs: []
    files: ['k.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
    dirs: ['diry']
    files: ['z.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
    dirs: ['dir4', 'dirx']
    files: ['e.py', 'f.py', 'g.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
    dirs: []
    files: ['aha.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
    dirs: ['dirc']
    files: []
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
    dirs: ['dirb']
    files: []
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
    dirs: ['dira']
    files: ['k.py', 'l.py', 'm.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
    dirs: []
    files: ['z.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
    dirs: ['dir5']
    files: ['x.py', 'y.py']
    
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
    dirs: ['dir1', 'dir2', 'dir3']
    files: ['a.py', 'b.py', 'c.py']
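
    One situation where the bottom-up order helps is any computation in which a directory depends on its children. The following is a sketch only (tree_sizes is a hypothetical helper, and it assumes every file can be stat-ed): it accumulates the total size of each directory, relying on the subdirectories having been yielded first.

```python
import os

def tree_sizes(top):
    """Return {directory: total bytes of the files in it and below it}."""
    totals = {}
    for root, dirs, files in os.walk(top, topdown=False):
        size = sum(os.path.getsize(os.path.join(root, name)) for name in files)
        # Bottom-up order: every walked subdirectory already has its total.
        # Symlinked directories are listed in dirs but not walked, so default to 0.
        size += sum(totals.get(os.path.join(root, d), 0) for d in dirs)
        totals[root] = size
    return totals

if __name__ == "__main__":
    for directory, size in tree_sizes(".").items():
        print(directory, size)
```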
    

    Modifying the directories to traverse at run time

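    When topdown is True, you can modify the yielded dirs list in place while processing a directory, and os.walk will then descend only into the subdirectories that remain in it. The example below, for instance, leaves every "CVS" directory out of the traversal: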

    import os
    from os.path import join, getsize
    for root, dirs, files in os.walk('python/Lib/email'):
        print(root, "consumes", end=" ")
        print(sum(getsize(join(root, name)) for name in files), end=" ")
        print("bytes in", len(files), "non-directory files")
        if 'CVS' in dirs:
            dirs.remove('CVS')  # don't visit CVS directories
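
    Pruning only works when the very list object that os.walk yielded is changed (remove, del, or slice assignment); rebinding the name with dirs = [...] has no effect on the traversal. A minimal sketch of pruning several names at once, where the function name walk_pruned and the default skip names are assumptions for illustration:

```python
import os

def walk_pruned(top, skip=("CVS", ".git")):
    """Yield (root, dirs, files) under `top`, never descending into `skip` names."""
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment mutates the list os.walk holds on to;
        # `dirs = [...]` would only rebind the local name and prune nothing.
        dirs[:] = [d for d in dirs if d not in skip]
        yield root, dirs, files

if __name__ == "__main__":
    for root, dirs, files in walk_pruned("."):
        print(root)
```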
    

