美文网首页
isEmpty vs count == 0

isEmpty vs count == 0

作者: 土豆董 | 来源:发表于2020-06-20 22:38 被阅读0次

    我们需要检查一个数组,集合, 字符串或者是其他的集合类型是空,这在日常开发中是常见的操作之一。之前在oc中会这么判断,我们先排除字符串为nil 或者 null的情况:

    - (void)test6{
        NSString *text = @"hello world";
        NSLog(@"%ld",[text length]);
    
        NSArray *array = @[@"a",@"b",@"c"];
        NSLog(@"%ld",[array count]);
    }
    
    2020-06-20 16:32:49.657371+0800 objcTest[99024:8064113] 11
    2020-06-20 16:32:49.657554+0800 objcTest[99024:8064113] 3
    

    Swift中的运用

    按照以往的习惯我们可能会这么使用

    func test(){
            let text = "hello world"
            print("\(text.count)")
            let array = ["a","b","c"]
            print("\(array.count)")
     }
    输出:11
    输出:3
    

    这么写也是没有一点毛病,但是我们会发现在String系列 以及Array系列都会有个isEmpty属性
    在String.Swift中有isEmpty的描述

    /// To check whether a string is empty, use its `isEmpty` property instead of
    /// comparing the length of one of the views to `0`. Unlike with `isEmpty`,
    /// calculating a view's `count` property requires iterating through the
    /// elements of the string.
    

    大意就是如果要判断string是否是空,请用isEmpty属性,而不要用count属性,因为这会迭代整个string的元素。
    我们知道在Swift中面向协议的思想贯穿始终,我们在Collection协议中找到isEmpty的定义

    public protocol Collection: Sequence {
    /// A Boolean value indicating whether the collection is empty.
      ///
      /// When you need to check whether your collection is empty, use the
      /// `isEmpty` property instead of checking that the `count` property is
      /// equal to zero. For collections that don't conform to
      /// `RandomAccessCollection`, accessing the `count` property iterates
      /// through the elements of the collection.
      /// - Complexity: O(1)
      var isEmpty: Bool { get }
    }
    
    

    同样是判断空建议用isEmpty ,这里还补充了一下 对于不符合 RandomAccessCollection 协议的类型,count 属性会迭代遍历集合里面的全部元素。

    public protocol RandomAccessCollection: BidirectionalCollection
    where SubSequence: RandomAccessCollection, Indices: RandomAccessCollection
    {
    ...
      override var startIndex: Index { get }
      override var endIndex: Index { get }
    ...
    }
    

    我们现在只讨论两个典型的类型String和 Array,它们都符合Collection协议。

    String的isEmpty、count 比较

    先上代码

    func test(){
            
            let text = "hello world,coding is happy,so we will going all along"
            let time1 = CFAbsoluteTimeGetCurrent()
            for i in 0..<500000 {
                print("\(text.count == 0) index:\(i)")
            }
            let time2 = CFAbsoluteTimeGetCurrent()
            print("time:\(time2 - time1)")
        }
    time:4.98923397064209
    
    func test6(){
            let text = "hello world,coding is happy,so we will going all along"
            let time1 = CFAbsoluteTimeGetCurrent()
            for i in 0..<500000 {
                print("\(text.isEmpty) index:\(i)")
            }
            let time2 = CFAbsoluteTimeGetCurrent()
            print("time:\(time2 - time1)")
        }
    time:5.166522026062012
    

    这不对啊,这和Swift源码代码注释说的不大一样啊,两个几乎差不多,甚至isEmpty 比count更加耗时,另外Check if a Swift Array Is Empty: Best Practice Revisited 说的,也是String 建议用isEmpty 来判断是否为空,不然会迭代整个集合的元素来计算。见鬼了,还是去翻翻源码把。

    public struct String {
      public // @SPI(Foundation)
      var _guts: _StringGuts
    
      @inlinable @inline(__always)
      internal init(_ _guts: _StringGuts) {
        self._guts = _guts
        _invariantCheck()
      }
    ...
    }
    

    在StringLegacy.swift 中

    extension String {
    ...
      /// A Boolean value indicating whether a string has no characters.
      @inlinable
      public var isEmpty: Bool {
        @inline(__always) get { return _guts.isEmpty }
      }
    }
    

    从这些能知道String的本质是_StringGuts ,String的方法基本都是基于_StringGuts构建的,在_StringGuts中

    xtension _StringGuts {
      // The number of code units
      @inlinable @inline(__always)
      internal var count: Int { return _object.count }
    
      @inlinable @inline(__always)
      internal var isEmpty: Bool { return count == 0 }
    ...
    }
    

    能看出来其实String 的isEmpty 和 count 其实还是一样的,isEmpty的具体实现还是拿count == 0来判断的

    暂时结论:在String中,isEmpty 和 count 其实是相同的性能,都是通过迭代集合内元素来计算得出
    为啥是暂时结论呢,因为我也不明白为啥文档说String的isEmpty 是O(1)操作 现在看来是O(n)操作。

    Array的isEmpty、count 比较(也包含ContiguousArray)

    在Collection协议中

    public var isEmpty: Bool {
        return startIndex == endIndex
      }
    

    Array.swift中

    @inlinable
      public var startIndex: Int {
        return 0
      }
    ...
    /// If the array is empty, `endIndex` is equal to `startIndex`.
      @inlinable
      public var endIndex: Int {
        @inlinable
        get {
          return _getCount()
        }
      }
    ...
    @_semantics("array.get_count")
      internal func _getCount() -> Int {
        return _buffer.immutableCount
      }
    

    immutableCount 是从_ArrayBody里面的count的值取出来的,既然Array 是符合 RandomAccessCollection协议的,count 和 isEmpty 的操作性能几乎一样都是O(1)操作,从哪里得出来的呢,得需要从array的扩容来研究,先从append方法来看

    @inlinable
      @_semantics("array.append_element")
      public mutating func append(_ newElement: __owned Element) {
        // Separating uniqueness check and capacity check allows hoisting the
        // uniqueness check out of a loop.
        _makeUniqueAndReserveCapacityIfNotUnique()
        let oldCount = _buffer.mutableCount
        _reserveCapacityAssumingUniqueBuffer(oldCount: oldCount)
        _appendElementAssumeUniqueAndCapacity(oldCount, newElement: newElement)
        _endMutation()
      }
    

    #源码分析
    ·第一行的函数,看名字表示,如果这个数组不是惟一的,就把它变成惟一的,而且保留数组容量。在Xcode里面搜索一下这个函数:

    @inlinable
      @_semantics("array.make_mutable")
      internal mutating func _makeUniqueAndReserveCapacityIfNotUnique() {
        if _slowPath(!_buffer.beginCOWMutation()) {
          _createNewBuffer(bufferIsUnique: false,
                           minimumCapacity: count + 1,
                           growForAppend: true)
        }
      }
    
    /// Returns `true` and puts the buffer in a mutable state iff the buffer's
      /// storage is uniquely-referenced.
      ///
      /// - Precondition: The buffer must be immutable.
      ///
      /// - Warning: It's a requirement to call `beginCOWMutation` before the buffer
      ///   is mutated.
      @_alwaysEmitIntoClient
      internal mutating func beginCOWMutation() -> Bool {
        let isUnique: Bool
        if !_isClassOrObjCExistential(Element.self) {
          isUnique = _storage.beginCOWMutationUnflaggedNative()
        } else if !_storage.beginCOWMutationNative() {
          return false
        } else {
          isUnique = _isNative
        }
    #if INTERNAL_CHECKS_ENABLED
        if isUnique {
          _native.isImmutable = false
        }
    #endif
        return isUnique
      }
    

    如果数组不是唯一引用的,就会调用_createNewBuffer创建新的数组buffer,假设我们的数组是唯一引用的,这个方法就不会执行 ,咱们继续走
    ·第二行用一个变量oldCount保存数组当前长度。
    ·第三行表示:保存数组容量,假如是唯一引用的情况下。之所以做出这样的假设,是因为此前已经调用过_makeUniqueAndReserveCapacityIfNotUnique方法,即使这个数组不是唯一引用,也被复制了一份新的。我们来看看_reserveCapacityAssumingUniqueBuffer方法的实现:

    @inlinable
      @_semantics("array.mutate_unknown")
      internal mutating func _reserveCapacityAssumingUniqueBuffer(oldCount: Int) {
        // Due to make_mutable hoisting the situation can arise where we hoist
        // _makeMutableAndUnique out of loop and use it to replace
        // _makeUniqueAndReserveCapacityIfNotUnique that preceeds this call. If the
        // array was empty _makeMutableAndUnique does not replace the empty array
        // buffer by a unique buffer (it just replaces it by the empty array
        // singleton).
        // This specific case is okay because we will make the buffer unique in this
        // function because we request a capacity > 0 and therefore _copyToNewBuffer
        // will be called creating a new buffer.
        let capacity = _buffer.mutableCapacity
        _internalInvariant(capacity == 0 || _buffer.isMutableAndUniquelyReferenced())
    
        if _slowPath(oldCount + 1 > capacity) {
          _createNewBuffer(bufferIsUnique: capacity > 0,
                           minimumCapacity: oldCount + 1,
                           growForAppend: true)
        }
      }
    

    检查如果oldCount + 1 > capacity 那就说明容量不够了,就需要扩容了,这时候调用_createNewBuffer 方法

    @_alwaysEmitIntoClient
      internal mutating func _createNewBuffer(
        bufferIsUnique: Bool, minimumCapacity: Int, growForAppend: Bool
      ) {
        _internalInvariant(!bufferIsUnique || _buffer.isUniquelyReferenced())
        _buffer = _buffer._consumeAndCreateNew(bufferIsUnique: bufferIsUnique,
                                               minimumCapacity: minimumCapacity,
                                               growForAppend: growForAppend)
      }
    
    internal __consuming func _consumeAndCreateNew(
        bufferIsUnique: Bool, minimumCapacity: Int, growForAppend: Bool
      ) -> _ArrayBuffer {
        let newCapacity = _growArrayCapacity(oldCapacity: capacity,
                                             minimumCapacity: minimumCapacity,
                                             growForAppend: growForAppend)
        let c = count
        _internalInvariant(newCapacity >= c)
        
        let newBuffer = _ContiguousArrayBuffer<Element>(
          _uninitializedCount: c, minimumCapacity: newCapacity)
    
        if bufferIsUnique {
          // As an optimization, if the original buffer is unique, we can just move
          // the elements instead of copying.
          let dest = newBuffer.firstElementAddress
          dest.moveInitialize(from: mutableFirstElementAddress,
                              count: c)
          _native.mutableCount = 0
        } else {
          _copyContents(
            subRange: 0..<c,
            initializing: newBuffer.mutableFirstElementAddress)
        }
        return _ArrayBuffer(_buffer: newBuffer, shiftedToStartIndex: 0)
      }
    

    另外提下:在_consumeAndCreateNew方法中, _growArrayCapacity 方法里面做了对容量的处理

    internal func _growArrayCapacity(_ capacity: Int) -> Int {
      return capacity * 2
    }
    
    @_alwaysEmitIntoClient
    internal func _growArrayCapacity(
      oldCapacity: Int, minimumCapacity: Int, growForAppend: Bool
    ) -> Int {
      if growForAppend {
        if oldCapacity < minimumCapacity {
          // When appending to an array, grow exponentially.
          return Swift.max(minimumCapacity, _growArrayCapacity(oldCapacity))
        }
        return oldCapacity
      }
      // If not for append, just use the specified capacity, ignoring oldCapacity.
      // This means that we "shrink" the buffer in case minimumCapacity is less
      // than oldCapacity.
      return minimumCapacity
    }
    

    当旧的数组容量小于minimumCapacity的时候,minimumCapacity就是oldCount + 1,这时候容量的扩容其实是呈指数增长,每次扩容,旧的容量就翻两倍。
    同时也解释了append方法不是线程安全的,如果在某一个线程中插入新元素,导致了数组扩容,那么Swift会创建一个新的数组(意味着地址完全不同)。然后ARC会为我们自动释放旧的数组,但这时候可能另一个线程还在访问旧的数组对象。

    初始化新的newBuffer 用_ContiguousArrayBuffer构造器 同时对数组storageHeader进行初始化

    internal func _initStorageHeader(count: Int, capacity: Int) {
    #if _runtime(_ObjC)
        let verbatim = _isBridgedVerbatimToObjectiveC(Element.self)
    #else
        let verbatim = false
    #endif
    
        // We can initialize by assignment because _ArrayBody is a trivial type,
        // i.e. contains no references.
        _storage.countAndCapacity = _ArrayBody(
          count: count,
          capacity: capacity,
          elementTypeIsBridgedVerbatim: verbatim)
      }
    
    

    数组storage起始于_ArrayBody,结束于连续的数组元素 , _ArrayBody主要是描述当前array 的信息同理ContiguousArray也是,此时数组count ,capacity等都会到_ArrayBody 对象里;
    取count最终我们会追踪到

    internal var count: Int {
        get {
          return _storage.countAndCapacity.count
        }
        nonmutating set {
          _internalInvariant(newValue >= 0)
    
          _internalInvariant(
            newValue <= mutableCapacity,
            "Can't grow an array buffer past its capacity")
    
          mutableStorage.countAndCapacity.count = newValue
        }
      }
    

    我们会直接从countAndCapacity(_ArrayBody类型)里面取出count,操作为O(1),另外array的isEmpty是根据startIndex endIndex是否相等来判断的,endIndex 取值就是取count,所以对于array 这些服从RandomAccessCollection协议的类型来说,isEmpty 和 count == 0 判断是一模一样的,都是O(1)操作。

    总结:

    1.String 的isEmpty 和 count==0判断效率是一样的,并不像想象中 count==0 会迭代集合里面的全部元素 O(n)。
    2.Array这些符合RandomAccessCollection协议的类型同样,isEmpty 和 count==0判断效率是一样的 O(1)。
    3.另外呢对于数据量大的数组,注意数据过大导致指数级增长的扩容带来的性能消耗。
    4.数组的变更注意如果不是唯一引用的情况,要注意线程安全处理,因为会摧毁旧的数组,创建扩容两倍长度于旧数组的新数组,把旧数组的元素拷贝过来,再把元素插入到新数组中,会形成新的数组地址。

    参考资料:
    官方文档
    Check if a Swift Array Is Empty: Best Practice Revisited
    Swift数组扩容原理

    相关文章

      网友评论

          本文标题:isEmpty vs count == 0

          本文链接:https://www.haomeiwen.com/subject/jyqgxktx.html