Traversing the Files in a Directory with Recursion, Depth-First Search, and Breadth-First Search

Author: 夏威夷的芒果 | Published: 2018-06-24 16:50 | Read 19 times

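    The os.path.join() function

    Syntax:

    os.path.join(path1[, path2[, ...]])
    

    Return value:

    The given path components joined into a single path.

    Note: if any component is an absolute path, every component before it is discarded and joining continues from that absolute component.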
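
A minimal sketch of that rule. It uses posixpath, the implementation behind os.path on macOS and Linux, so the results are the same on any platform; the file name test.py and the /tmp component are made up for illustration.

```python
import posixpath  # os.path is this module on macOS and Linux

# Relative components are simply joined with "/".
joined = posixpath.join("/Users/miraco", "PycharmProjects", "test.py")
print(joined)  # /Users/miraco/PycharmProjects/test.py

# An absolute component discards everything that came before it.
restarted = posixpath.join("/Users/miraco", "/tmp", "test.py")
print(restarted)  # /tmp/test.py
```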

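    import os

    def getall(path):
        # The printed labels are Chinese: "目录" = directory, "文件" = file
        filelist = os.listdir(path)
        for filename in filelist:
            filepath = os.path.join(path,filename)
            if os.path.isdir(filepath):
                getall(filepath)              # recurse into the subdirectory first
                print("目录:",filepath)       # then print the directory itself
            else:
                print("文件:",filename)
    getall(r"/Users/miraco/PycharmProjects")   # put your own path here; this one is only an example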
    

    Output (each line is labeled 文件 for a file or 目录 for a directory):

    /Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/miraco/PycharmProjects/untitled/test333.py
    文件: .DS_Store
    文件: pimrc2017.txt
    文件: pimrc2017.txt
    文件: globecom2017.txt
    文件: wcnc2017.txt
    目录: /Users/miraco/PycharmProjects/Paper Research/所有文章
    文件: 物联网.txt
    目录: /Users/miraco/PycharmProjects/Paper Research/物联网
    文件: .DS_Store
    文件: 众包.txt
    目录: /Users/miraco/PycharmProjects/Paper Research/众包
    文件: Combining Dynamic Clustering and Scheduling for Coordinated Multi-Point Transmission in LTE.pdf
    文件: Capacity of Infrastructure-based Cooperative Vehicular Networks.pdf
    文件: Cooperative Transmission in Cognitive and Energy Harvesting-based D2D Networks.pdf
    文件: .DS_Store
    文件: Cournot-Nash Equilibria for Bandwidth Allocation under Base-Station Cooperation.pdf
    文件: A Benchmark for D2D in Cellular Networks- The Importance of Information.pdf
    文件: Mobility Aware Caching Incentive Scheme for D2D Cellular Networks.pdf
    文件: Hybrid Coordination Function Controlled Channel Access for Latency-Sensitive Tactile Applications .pdf
    文件: 协作通信调研结果:.docx
    文件: An Optimal LTE-V2I-Based Cooperative Communication Scheme for Vehicular Networks.pdf
    文件: A D2D Mode Selection Scheme with Energy Consumption Minimization Underlaying Two-tier Heterogeneous Cellular Networks.pdf
    文件: 08292169.pdf
    文件: Power Allocation for Full-Duplex Cooperative Non-Orthogonal Multiple Access Systems.pdf
    文件: ON:OFF Reporting Mechanism for Robust Cooperative Sensing in Cognitive IoT Networks.pdf
    文件: 协作通信.txt
    文件: User Scheduling for Non-orthogonal Transmission in UAV-Assisted Relay Network.pdf
    文件: Computation Collaboration in Ultra Dense Network Integrated with Mobile Edge Computing.pdf
    文件: High-Throughput and Fair Scheduling for Access Point Cooperation in Dense Wireless Networks.pdf
    目录: /Users/miraco/PycharmProjects/Paper Research/协作通信
    文件: Delay Efficient Disconnected RSU Placement Algorithm for VANET Safety Applications.pdf
    文件: On the Handover Security Key Update and Residence Management in LTE Networks.pdf
    文件: Increasing the Security of Wireless Communication Through Relaying and Interference Generation.pdf
    文件: A Semi-Outsourcing Secure Data Privacy Scheme for IoT Data Transmission.pdf
    文件: Security Enhancement to Successive Interference Cancellation Algorithm for Non-Orthogonal Multiple Access (NOMA).pdf
    文件: Lightweight and Privacy-preserving Fog-assisted Information Sharing Scheme for Health Big Data.pdf
    文件: A Comparative Study of Possible Solutions for Transmission of Vehicular Safety Messages in LTE-based Networks.pdf
    文件: Privacy-Aware Offloading in Mobile-Edge Computing.pdf
    文件: 安全隐私.txt
    文件: Towards Scalable and Privacy Preserving Commercial Content Dissemination in Social Wireless Networks.pdf
    文件: Fairness and Safety Capacity Oriented Resource Allocation Scheme for D2D Communications.pdf
    文件: Physical Layer Security in D2D-enabled Cellular Networks- Artificial Noise Assisted.pdf
    文件: Privacy-Preserving Data Forwarding in VANETs- A Personal-Social Behavior Based Approach.pdf
    文件: Privacy-preserving and Multi-dimensional Range Query in Two-tiered Wireless Sensor Networks.pdf
    文件: UAV Assisted Public Safety Communications with LTE-Advanced HetNets and FeICIC.pdf
    文件: Dependent Interferer Arrangement for Physical Layer Security- Secrecy Outage Probability in Clustered Wireless Networks.pdf
    文件: A Load Balancing Scheme for Supporting Safety Applications in Heterogeneous Software Defined LTE-V Networks.pdf
    文件: Promoting Security and Efficiency in D2D Underlay Communication- A Bargaining Game Approach.pdf
    文件: Enhancing Physical Layer Security of OFDM Systems Using Channel Shortening.pdf
    目录: /Users/miraco/PycharmProjects/Paper Research/安全隐私
    文件: Content-Centric Event-Insensitive Big Data Reduction in Internet of Things .pdf
    文件: Twitter as a Source for Spatial Traffic Information in Big Data-Enabled Self-Organizing Networks.pdf
    文件: Edge Big Data-Enabled Low-Cost Indoor Localization Based on Bayesian Analysis of RSS.pdf
    文件: Reliable Content Dissemination in Internet of Vehicles Using Social Big Data.pdf
    文件: Big Data Driven Similarity Based U-Model for Online Social Networks.pdf
    文件: Lightweight and Privacy-preserving Fog-assisted Information Sharing Scheme for Health Big Data.pdf
    文件: Multi-Keyword Fuzzy and Sortable Ciphertext Retrieval Scheme for big data .pdf
    文件: Profit Maximization Auction and Data Management in Big Data Markets.pdf
    文件: 大数据.txt
    文件: Features Selection Model for Internet of e-Health Things using Big Data.pdf
    文件: A Big Data Deep Reinforcement Learning Approach to Next Generation Green Wireless Networks.pdf
    文件: Big Data Synchronization among Isolated Data Servers in Disaster.pdf
    文件: A Hybrid Location Privacy Protection Scheme in Big Data Environment.pdf
    目录: /Users/miraco/PycharmProjects/Paper Research/大数据
    文件: 自组织和传感器.txt
    目录: /Users/miraco/PycharmProjects/Paper Research/自组织和传感器
    文件: downtitle.cpython-36.pyc
    目录: /Users/miraco/PycharmProjects/Paper Research/__pycache__
    文件: researching.py
    文件: 干扰协调管理缓解.txt
    目录: /Users/miraco/PycharmProjects/Paper Research/干扰协调管理缓解
    文件: D2d中继.txt
    目录: /Users/miraco/PycharmProjects/Paper Research/D2d中继
    文件: 刘绍博的论文调研.zip
    文件: 车联网.txt
    目录: /Users/miraco/PycharmProjects/Paper Research/车联网
    文件: downtitle.py
    文件: sortandfilter.py
    文件: globecom2017.txt
    文件: researching.py
    文件: downtitle.py
    文件: sortandfilter.py
    文件: 运行脚本之前阅读.rtf
    目录: /Users/miraco/PycharmProjects/Paper Research/代码
    文件: wcnc2017.txt
    文件: A Contract-Based Incentive Mechanism for Data Caching in Ultra-Dense Small-Cells Networks .pdf
    文件: Mobility Aware Caching Incentive Scheme for D2D Cellular Networks.pdf
    文件: Fine-grained Incentive Mechanism for Sensing Augmented Spectrum Database.pdf
    文件: Distributed Caching via Rewarding- An Incentive Caching Model for ICN.pdf
    文件: QoS-based Incentive Mechanism for Mobile Data Offloading.pdf
    文件: Incentive Mechanism for Cached-Enabled Small Cell Sharing- A Stackelberg Game Approach.pdf
    文件: 合作激励.txt
    文件: 合作激励调研.docx
    文件: Incentive Based Cooperative Content Caching in Social Wireless Networks.pdf
    目录: /Users/miraco/PycharmProjects/Paper Research/合作激励
    目录: /Users/miraco/PycharmProjects/Paper Research
    文件: .DS_Store
    文件: convert.py
    文件: test.py
    文件: replica.conf.txt
    文件: hosts.txt
    文件: leetcode.py
    文件: hosts2.txt
    文件: encodings.xml
    文件: hosts.iml
    文件: profiles_settings.xml
    目录: /Users/miraco/PycharmProjects/hosts/.idea/inspectionProfiles
    文件: workspace.xml
    文件: modules.xml
    文件: misc.xml
    目录: /Users/miraco/PycharmProjects/hosts/.idea
    目录: /Users/miraco/PycharmProjects/hosts
    文件: .DS_Store
    文件: 666.py
    文件: Wcnc151617Statistics.py
    文件: Globecom141516.py
    文件: WCNC2015.py
    文件: downtitle.cpython-36.pyc
    文件: exp3.cpython-36.pyc
    文件: exp2.cpython-36.pyc
    文件: exp.cpython-36.pyc
    目录: /Users/miraco/PycharmProjects/untitled/__pycache__
    文件: test.py
    文件: exp2.py
    文件: exp3.py
    文件: downtitle.py
    文件: test333.py
    文件: exp.py
    文件: encodings.xml
    文件: profiles_settings.xml
    目录: /Users/miraco/PycharmProjects/untitled/.idea/inspectionProfiles
    文件: workspace.xml
    文件: untitled.iml
    文件: modules.xml
    文件: misc.xml
    目录: /Users/miraco/PycharmProjects/untitled/.idea
    目录: /Users/miraco/PycharmProjects/untitled
    

    Of course, the results can also be collected in lists and printed all at once:

    import os
    allfilepath = []   # full paths of all files found
    allfilename = []   # bare names of all files found
    def getall(path):
        filelist = os.listdir(path)
        for filename in filelist:
            filepath = os.path.join(path,filename)
            if os.path.isdir(filepath):
                getall(filepath)
            else:
                allfilepath.append(filepath)
                allfilename.append(filename)
    
    getall(r"/Users/miraco/PycharmProjects")   # put your own path here
    print("文件:", allfilename)
    
    This prints the list of all file names that were found.
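
The module-level lists keep growing if getall() is called more than once. One possible variant, sketched here on the assumption that returning a fresh list is acceptable, avoids the globals. The name collect_files and the temporary directory layout are invented for this example.

```python
import os
import tempfile

def collect_files(path):
    """Recursively return the full paths of all files under path."""
    found = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isdir(full):
            found.extend(collect_files(full))  # merge the subdirectory's results
        else:
            found.append(full)
    return found

# Build a small throwaway tree so the sketch can run anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("demo")
    names = sorted(os.path.relpath(p, root) for p in collect_files(root))

print(names)
```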

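    There are several ways to traverse a directory tree, chiefly depth-first and breadth-first traversal.

    Depth-first traversal with an explicit stack that simulates the call stack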

    import os

    def getall(path):
        realfilelist = []
        mystack = []
        # push the starting directory
        mystack.append(path)
    
        while len(mystack)!=0:
            # pop the most recently pushed directory
            openpath = mystack.pop()
            # list everything inside that directory
            filelist = os.listdir(openpath)
            for filename in filelist:
                abspath = os.path.join(openpath,filename)  # full path (absolute if the starting path is)
                if os.path.isdir(abspath):
                    # a directory: push it so it is visited later
                    mystack.append(abspath)
                else:
                    # a file: record its path
                    realfilelist.append(abspath)
        return realfilelist
    arr = getall(r"/Users/miraco/PycharmProjects")
    for item in arr:
        print(item)
    

    Output:

    [Screenshot in the original post: image.png]
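
For comparison, the standard library already ships a directory walker, os.walk, which yields a (dirpath, dirnames, filenames) tuple for each directory it visits. The sketch below is not the method used in this article; the helper name walk_files and the temporary tree are assumptions made for illustration.

```python
import os
import tempfile

def walk_files(path):
    """Return the full paths of all files under path, using os.walk."""
    result = []
    for dirpath, dirnames, filenames in os.walk(path):
        for filename in filenames:
            result.append(os.path.join(dirpath, filename))
    return result

# Build a small throwaway tree so the sketch can run anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "x", "y"))
    for rel in ("top.txt",
                os.path.join("x", "mid.txt"),
                os.path.join("x", "y", "deep.txt")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("demo")
    walked = sorted(os.path.relpath(p, root) for p in walk_files(root))

print(walked)
```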

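    About the collections module (material from Liao Xuefeng)

    collections is a built-in Python module that provides many useful collection classes.

    namedtuple

    We know that a tuple can represent an immutable collection; for example, the two-dimensional coordinates of a point can be written as: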

    >>> p = (1, 2)
    
    

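    However, looking at (1, 2), it is hard to tell that this tuple represents a coordinate.

    Defining a class for this would be overkill, and this is where namedtuple comes in: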

    >>> from collections import namedtuple
    >>> Point = namedtuple('Point', ['x', 'y'])
    >>> p = Point(1, 2)
    >>> p.x
    1
    >>> p.y
    2
    
    

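    namedtuple is a function that creates a custom tuple subclass with a fixed number of elements, whose elements can be referenced by attribute name instead of by index.

    This makes it easy to define a lightweight data type that keeps the immutability of a tuple while allowing access by attribute, which is very convenient.

    We can verify that the Point object we created is an instance of a subclass of tuple: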

    >>> isinstance(p, Point)
    True
    >>> isinstance(p, tuple)
    True
    

    Similarly, a circle described by its center coordinates and radius can also be defined with namedtuple:

    # namedtuple('name', [list of field names]):
    Circle = namedtuple('Circle', ['x', 'y', 'r'])
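
A short sketch of how such a Circle might be used. The concrete values are arbitrary; _replace and _asdict are standard namedtuple methods.

```python
import math
from collections import namedtuple

Circle = namedtuple('Circle', ['x', 'y', 'r'])

c = Circle(x=0, y=0, r=2)
area = math.pi * c.r ** 2        # fields are read by name
x, y, r = c                      # it still unpacks like a plain tuple
bigger = c._replace(r=3)         # returns a new Circle; c itself is unchanged
as_dict = c._asdict()            # maps field names to values

try:
    c.r = 10                     # namedtuples are immutable, so this fails
except AttributeError:
    immutable = True
else:
    immutable = False

print(area, bigger, dict(as_dict), immutable)
```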
    

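    deque

    When data is stored in a list, accessing elements by index is fast, but inserting or deleting elements anywhere other than the end is slow: a list is stored contiguously, so every element after that position has to be shifted, which gets expensive as the list grows.

    deque is a double-ended queue designed for efficient insertion and deletion at both ends, which makes it suitable for queues and stacks: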

    >>> from collections import deque
    >>> q = deque(['a', 'b', 'c'])
    >>> q.append('x')
    >>> q.appendleft('y')
    >>> q
    deque(['y', 'a', 'b', 'c', 'x'])
    
    

    Besides the append() and pop() methods of list, deque also supports appendleft() and popleft(), so elements can be added to or removed from the head very efficiently.
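
A minimal sketch of the same deque used first as a queue and then as a stack; the element values are arbitrary.

```python
from collections import deque

d = deque(['a', 'b', 'c'])

# Used as a queue: add on the right, remove from the left (FIFO).
d.append('x')
first_out = d.popleft()      # 'a', the oldest element

# Used as a stack: add and remove on the right (LIFO).
d.append('y')
last_out = d.pop()           # 'y', the newest element

print(first_out, last_out, d)
```

This is the only difference between the two traversals in this article: the depth-first version takes the newest directory with pop(), the breadth-first version takes the oldest with popleft().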

    defaultdict

    With a dict, accessing a key that does not exist raises KeyError. If you want a default value returned when the key is missing, use defaultdict:

    >>> from collections import defaultdict
    >>> dd = defaultdict(lambda: 'N/A')
    >>> dd['key1'] = 'abc'
    >>> dd['key1'] # key1 exists
    'abc'
    >>> dd['key2'] # key2 does not exist, so the default value is returned
    'N/A'
    
    

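    Note that the default value is produced by calling a function, and that function is passed in when the defaultdict is created.

    Apart from returning a default value for a missing key (and storing it under that key), defaultdict behaves exactly like dict.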
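
A common use is grouping. The sketch below groups file names by extension with defaultdict(list); the file names are hypothetical, chosen in the style of the listing earlier in the article.

```python
import os
from collections import defaultdict

# Hypothetical file names, for illustration only.
filenames = ["downtitle.py", "wcnc2017.txt", "test.py", "hosts.txt", "misc.xml"]

by_ext = defaultdict(list)           # a missing key starts out as an empty list
for name in filenames:
    ext = os.path.splitext(name)[1]  # e.g. ".py"
    by_ext[ext].append(name)         # no "if ext not in by_ext" check needed

print(dict(by_ext))
```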

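    OrderedDict

    In Python versions before 3.7, the language did not guarantee the key order of a dict, so iteration order could not be relied on (CPython 3.6 happens to preserve insertion order, but only as an implementation detail). Since Python 3.7, dict preserves insertion order.

    To keep keys in order explicitly on any version, use OrderedDict: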

    >>> from collections import OrderedDict
    >>> d = dict([('a', 1), ('b', 2), ('c', 3)])
    >>> d # on Python 3.5 and earlier the key order of a dict is arbitrary
    {'a': 1, 'c': 3, 'b': 2}
    >>> od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
    >>> od # the keys of an OrderedDict are ordered
    OrderedDict([('a', 1), ('b', 2), ('c', 3)])
    
    

    Note that the keys of an OrderedDict are kept in insertion order, not sorted by the keys themselves:

    >>> od = OrderedDict()
    >>> od['z'] = 1
    >>> od['y'] = 2
    >>> od['x'] = 3
    >>> list(od.keys()) # keys are returned in insertion order
    ['z', 'y', 'x']
    
    

    OrderedDict can be used to implement a FIFO (first in, first out) dict that removes the oldest key once its capacity is exceeded:

    from collections import OrderedDict
    
    class LastUpdatedOrderedDict(OrderedDict):
    
        def __init__(self, capacity):
            super(LastUpdatedOrderedDict, self).__init__()
            self._capacity = capacity
    
        def __setitem__(self, key, value):
            containsKey = 1 if key in self else 0
            if len(self) - containsKey >= self._capacity:
                last = self.popitem(last=False)
                print('remove:', last)
            if containsKey:
                del self[key]
                print('set:', (key, value))
            else:
                print('add:', (key, value))
            OrderedDict.__setitem__(self, key, value)
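
A usage sketch of the same idea. To stay self-contained it defines a trimmed-down class without the print calls; the name FIFODict and the capacity of 2 are assumptions made for this example.

```python
from collections import OrderedDict

class FIFODict(OrderedDict):
    """Hypothetical trimmed-down FIFO dict with a fixed capacity."""

    def __init__(self, capacity):
        super().__init__()
        self._capacity = capacity

    def __setitem__(self, key, value):
        if key in self:
            del self[key]                  # re-inserting moves the key to the end
        elif len(self) >= self._capacity:
            self.popitem(last=False)       # evict the oldest key
        super().__setitem__(key, value)

cache = FIFODict(capacity=2)
cache['a'] = 1
cache['b'] = 2
cache['c'] = 3      # capacity exceeded: 'a' is evicted
cache['b'] = 20     # existing key: updated and moved to the end
keys = list(cache.keys())
print(keys)  # ['c', 'b']
```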
    
    

    Counter

    Counter is a simple counter, for example for counting how often each character occurs:

    >>> from collections import Counter
    >>> c = Counter()
    >>> for ch in 'programming':
    ...     c[ch] = c[ch] + 1
    ...
    >>> c
    Counter({'g': 2, 'm': 2, 'r': 2, 'a': 1, 'i': 1, 'o': 1, 'n': 1, 'p': 1})
    
    

    Counter is in fact a subclass of dict. The result above shows that the characters 'g', 'm', and 'r' each occur twice, while every other character occurs once.
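
The explicit loop is not required: Counter also accepts an iterable directly. A brief sketch, using the same word:

```python
from collections import Counter

c = Counter('programming')       # counts every character in one call
top = c.most_common(3)           # the three most frequent characters with their counts
missing = c['z']                 # an absent key counts as 0 instead of raising KeyError

print(c['g'], c['p'], missing, top)
```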

    Breadth-first traversal: first in, first out

    import os
    import collections
    
    
    def getall(path):
        queue = collections.deque([])  # a queue of directories still to visit
        realfilelist = []  # list that collects the file paths
        # enqueue the starting directory
        queue.append(path)
    
        while len(queue) != 0:
            onepath = queue.popleft()  # FIFO queue: take the element from the left end
            filelist  = os.listdir(onepath)    # list the contents of that directory
            for filename in filelist:     # look at every file or folder
                abspath = os.path.join(onepath,filename)     # build the full path
                if os.path.isdir(abspath):        # if the path is a directory
                    queue.append(abspath)         # enqueue it
                else:
                    realfilelist.append(abspath)   # if it is a file, record its path
        return realfilelist
    
    arr = getall(r"/Users/miraco/PycharmProjects")
    for item in arr:
        print(item)
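
To see the first-in, first-out effect, the sketch below runs the same breadth-first idea on a small temporary tree and records how deep each file sits. The function name bfs_files and the tree layout are invented for this example.

```python
import os
import tempfile
from collections import deque

def bfs_files(path):
    """Breadth-first listing of all files under path."""
    queue = deque([path])
    files = []
    while queue:
        current = queue.popleft()          # oldest directory first
        for name in os.listdir(current):
            full = os.path.join(current, name)
            if os.path.isdir(full):
                queue.append(full)
            else:
                files.append(full)
    return files

# One file per level, so the visiting order is unambiguous.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub", "deep"))
    for rel in ("top.txt",
                os.path.join("sub", "mid.txt"),
                os.path.join("sub", "deep", "bottom.txt")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("demo")
    # Depth of each file relative to root, in the order BFS found it.
    depths = [os.path.relpath(p, root).count(os.sep) for p in bfs_files(root)]

print(depths)  # [0, 1, 2]
```

Files closer to the starting directory always come out before deeper ones, which the stack-based version does not guarantee.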
    
