美文网首页
isEmpty vs count == 0

isEmpty vs count == 0

作者: 土豆董 | 来源:发表于2020-06-20 22:38 被阅读0次

我们需要检查一个数组,集合, 字符串或者是其他的集合类型是空,这在日常开发中是常见的操作之一。之前在oc中会这么判断,我们先排除字符串为nil 或者 null的情况:

- (void)test6{
    NSString *text = @"hello world";
    NSLog(@"%ld",[text length]);

    NSArray *array = @[@"a",@"b",@"c"];
    NSLog(@"%ld",[array count]);
}

2020-06-20 16:32:49.657371+0800 objcTest[99024:8064113] 11
2020-06-20 16:32:49.657554+0800 objcTest[99024:8064113] 3

Swift中的运用

按照以往的习惯我们可能会这么使用

func test(){
        let text = "hello world"
        print("\(text.count)")
        let array = ["a","b","c"]
        print("\(array.count)")
 }
输出:11
输出:3

这么写也是没有一点毛病,但是我们会发现在String系列 以及Array系列都会有个isEmpty属性
在String.Swift中有isEmpty的描述

/// To check whether a string is empty, use its `isEmpty` property instead of
/// comparing the length of one of the views to `0`. Unlike with `isEmpty`,
/// calculating a view's `count` property requires iterating through the
/// elements of the string.

大意就是如果要判断string是否是空,请用isEmpty属性,而不要用count属性,因为这会迭代整个string的元素。
我们知道在Swift中面向协议的思想贯穿始终,我们在Collection协议中找到isEmpty的定义

public protocol Collection: Sequence {
/// A Boolean value indicating whether the collection is empty.
  ///
  /// When you need to check whether your collection is empty, use the
  /// `isEmpty` property instead of checking that the `count` property is
  /// equal to zero. For collections that don't conform to
  /// `RandomAccessCollection`, accessing the `count` property iterates
  /// through the elements of the collection.
  /// - Complexity: O(1)
  var isEmpty: Bool { get }
}

同样是判断空建议用isEmpty ,这里还补充了一下 对于不符合 RandomAccessCollection 协议的类型,count 属性会迭代遍历集合里面的全部元素。

public protocol RandomAccessCollection: BidirectionalCollection
where SubSequence: RandomAccessCollection, Indices: RandomAccessCollection
{
...
  override var startIndex: Index { get }
  override var endIndex: Index { get }
...
}

我们现在只讨论两个典型的类型String和 Array,它们都符合Collection协议。

String的isEmpty、count 比较

先上代码

func test(){
        
        let text = "hello world,coding is happy,so we will going all along"
        let time1 = CFAbsoluteTimeGetCurrent()
        for i in 0..<500000 {
            print("\(text.count == 0) index:\(i)")
        }
        let time2 = CFAbsoluteTimeGetCurrent()
        print("time:\(time2 - time1)")
    }
time:4.98923397064209
func test6(){
        let text = "hello world,coding is happy,so we will going all along"
        let time1 = CFAbsoluteTimeGetCurrent()
        for i in 0..<500000 {
            print("\(text.isEmpty) index:\(i)")
        }
        let time2 = CFAbsoluteTimeGetCurrent()
        print("time:\(time2 - time1)")
    }
time:5.166522026062012

这不对啊,这和Swift源码代码注释说的不大一样啊,两个几乎差不多,甚至isEmpty 比count更加耗时,另外Check if a Swift Array Is Empty: Best Practice Revisited 说的,也是String 建议用isEmpty 来判断是否为空,不然会迭代整个集合的元素来计算。见鬼了,还是去翻翻源码把。

public struct String {
  public // @SPI(Foundation)
  var _guts: _StringGuts

  @inlinable @inline(__always)
  internal init(_ _guts: _StringGuts) {
    self._guts = _guts
    _invariantCheck()
  }
...
}

在StringLegacy.swift 中

extension String {
...
  /// A Boolean value indicating whether a string has no characters.
  @inlinable
  public var isEmpty: Bool {
    @inline(__always) get { return _guts.isEmpty }
  }
}

从这些能知道String的本质是_StringGuts ,String的方法基本都是基于_StringGuts构建的,在_StringGuts中

xtension _StringGuts {
  // The number of code units
  @inlinable @inline(__always)
  internal var count: Int { return _object.count }

  @inlinable @inline(__always)
  internal var isEmpty: Bool { return count == 0 }
...
}

能看出来其实String 的isEmpty 和 count 其实还是一样的,isEmpty的具体实现还是拿count == 0来判断的

暂时结论:在String中,isEmpty 和 count 其实是相同的性能,都是通过迭代集合内元素来计算得出
为啥是暂时结论呢,因为我也不明白为啥文档说String的isEmpty 是O(1)操作 现在看来是O(n)操作。

Array的isEmpty、count 比较(也包含ContiguousArray)

在Collection协议中

public var isEmpty: Bool {
    return startIndex == endIndex
  }

Array.swift中

@inlinable
  public var startIndex: Int {
    return 0
  }
...
/// If the array is empty, `endIndex` is equal to `startIndex`.
  @inlinable
  public var endIndex: Int {
    @inlinable
    get {
      return _getCount()
    }
  }
...
@_semantics("array.get_count")
  internal func _getCount() -> Int {
    return _buffer.immutableCount
  }

immutableCount 是从_ArrayBody里面的count的值取出来的,既然Array 是符合 RandomAccessCollection协议的,count 和 isEmpty 的操作性能几乎一样都是O(1)操作,从哪里得出来的呢,得需要从array的扩容来研究,先从append方法来看

@inlinable
  @_semantics("array.append_element")
  public mutating func append(_ newElement: __owned Element) {
    // Separating uniqueness check and capacity check allows hoisting the
    // uniqueness check out of a loop.
    _makeUniqueAndReserveCapacityIfNotUnique()
    let oldCount = _buffer.mutableCount
    _reserveCapacityAssumingUniqueBuffer(oldCount: oldCount)
    _appendElementAssumeUniqueAndCapacity(oldCount, newElement: newElement)
    _endMutation()
  }

#源码分析
·第一行的函数,看名字表示,如果这个数组不是惟一的,就把它变成惟一的,而且保留数组容量。在Xcode里面搜索一下这个函数:

@inlinable
  @_semantics("array.make_mutable")
  internal mutating func _makeUniqueAndReserveCapacityIfNotUnique() {
    if _slowPath(!_buffer.beginCOWMutation()) {
      _createNewBuffer(bufferIsUnique: false,
                       minimumCapacity: count + 1,
                       growForAppend: true)
    }
  }
/// Returns `true` and puts the buffer in a mutable state iff the buffer's
  /// storage is uniquely-referenced.
  ///
  /// - Precondition: The buffer must be immutable.
  ///
  /// - Warning: It's a requirement to call `beginCOWMutation` before the buffer
  ///   is mutated.
  @_alwaysEmitIntoClient
  internal mutating func beginCOWMutation() -> Bool {
    let isUnique: Bool
    if !_isClassOrObjCExistential(Element.self) {
      isUnique = _storage.beginCOWMutationUnflaggedNative()
    } else if !_storage.beginCOWMutationNative() {
      return false
    } else {
      isUnique = _isNative
    }
#if INTERNAL_CHECKS_ENABLED
    if isUnique {
      _native.isImmutable = false
    }
#endif
    return isUnique
  }

如果数组不是唯一引用的,就会调用_createNewBuffer创建新的数组buffer,假设我们的数组是唯一引用的,这个方法就不会执行 ,咱们继续走
·第二行用一个变量oldCount保存数组当前长度。
·第三行表示:保存数组容量,假如是唯一引用的情况下。之所以做出这样的假设,是因为此前已经调用过_makeUniqueAndReserveCapacityIfNotUnique方法,即使这个数组不是唯一引用,也被复制了一份新的。我们来看看_reserveCapacityAssumingUniqueBuffer方法的实现:

@inlinable
  @_semantics("array.mutate_unknown")
  internal mutating func _reserveCapacityAssumingUniqueBuffer(oldCount: Int) {
    // Due to make_mutable hoisting the situation can arise where we hoist
    // _makeMutableAndUnique out of loop and use it to replace
    // _makeUniqueAndReserveCapacityIfNotUnique that preceeds this call. If the
    // array was empty _makeMutableAndUnique does not replace the empty array
    // buffer by a unique buffer (it just replaces it by the empty array
    // singleton).
    // This specific case is okay because we will make the buffer unique in this
    // function because we request a capacity > 0 and therefore _copyToNewBuffer
    // will be called creating a new buffer.
    let capacity = _buffer.mutableCapacity
    _internalInvariant(capacity == 0 || _buffer.isMutableAndUniquelyReferenced())

    if _slowPath(oldCount + 1 > capacity) {
      _createNewBuffer(bufferIsUnique: capacity > 0,
                       minimumCapacity: oldCount + 1,
                       growForAppend: true)
    }
  }

检查如果oldCount + 1 > capacity 那就说明容量不够了,就需要扩容了,这时候调用_createNewBuffer 方法

@_alwaysEmitIntoClient
  internal mutating func _createNewBuffer(
    bufferIsUnique: Bool, minimumCapacity: Int, growForAppend: Bool
  ) {
    _internalInvariant(!bufferIsUnique || _buffer.isUniquelyReferenced())
    _buffer = _buffer._consumeAndCreateNew(bufferIsUnique: bufferIsUnique,
                                           minimumCapacity: minimumCapacity,
                                           growForAppend: growForAppend)
  }
internal __consuming func _consumeAndCreateNew(
    bufferIsUnique: Bool, minimumCapacity: Int, growForAppend: Bool
  ) -> _ArrayBuffer {
    let newCapacity = _growArrayCapacity(oldCapacity: capacity,
                                         minimumCapacity: minimumCapacity,
                                         growForAppend: growForAppend)
    let c = count
    _internalInvariant(newCapacity >= c)
    
    let newBuffer = _ContiguousArrayBuffer<Element>(
      _uninitializedCount: c, minimumCapacity: newCapacity)

    if bufferIsUnique {
      // As an optimization, if the original buffer is unique, we can just move
      // the elements instead of copying.
      let dest = newBuffer.firstElementAddress
      dest.moveInitialize(from: mutableFirstElementAddress,
                          count: c)
      _native.mutableCount = 0
    } else {
      _copyContents(
        subRange: 0..<c,
        initializing: newBuffer.mutableFirstElementAddress)
    }
    return _ArrayBuffer(_buffer: newBuffer, shiftedToStartIndex: 0)
  }

另外提下:在_consumeAndCreateNew方法中, _growArrayCapacity 方法里面做了对容量的处理

internal func _growArrayCapacity(_ capacity: Int) -> Int {
  return capacity * 2
}

@_alwaysEmitIntoClient
internal func _growArrayCapacity(
  oldCapacity: Int, minimumCapacity: Int, growForAppend: Bool
) -> Int {
  if growForAppend {
    if oldCapacity < minimumCapacity {
      // When appending to an array, grow exponentially.
      return Swift.max(minimumCapacity, _growArrayCapacity(oldCapacity))
    }
    return oldCapacity
  }
  // If not for append, just use the specified capacity, ignoring oldCapacity.
  // This means that we "shrink" the buffer in case minimumCapacity is less
  // than oldCapacity.
  return minimumCapacity
}

当旧的数组容量小于minimumCapacity的时候,minimumCapacity就是oldCount + 1,这时候容量的扩容其实是呈指数增长,每次扩容,旧的容量就翻两倍。
同时也解释了append方法不是线程安全的,如果在某一个线程中插入新元素,导致了数组扩容,那么Swift会创建一个新的数组(意味着地址完全不同)。然后ARC会为我们自动释放旧的数组,但这时候可能另一个线程还在访问旧的数组对象。

初始化新的newBuffer 用_ContiguousArrayBuffer构造器 同时对数组storageHeader进行初始化

internal func _initStorageHeader(count: Int, capacity: Int) {
#if _runtime(_ObjC)
    let verbatim = _isBridgedVerbatimToObjectiveC(Element.self)
#else
    let verbatim = false
#endif

    // We can initialize by assignment because _ArrayBody is a trivial type,
    // i.e. contains no references.
    _storage.countAndCapacity = _ArrayBody(
      count: count,
      capacity: capacity,
      elementTypeIsBridgedVerbatim: verbatim)
  }

数组storage起始于_ArrayBody,结束于连续的数组元素 , _ArrayBody主要是描述当前array 的信息同理ContiguousArray也是,此时数组count ,capacity等都会到_ArrayBody 对象里;
取count最终我们会追踪到

internal var count: Int {
    get {
      return _storage.countAndCapacity.count
    }
    nonmutating set {
      _internalInvariant(newValue >= 0)

      _internalInvariant(
        newValue <= mutableCapacity,
        "Can't grow an array buffer past its capacity")

      mutableStorage.countAndCapacity.count = newValue
    }
  }

我们会直接从countAndCapacity(_ArrayBody类型)里面取出count,操作为O(1),另外array的isEmpty是根据startIndex endIndex是否相等来判断的,endIndex 取值就是取count,所以对于array 这些服从RandomAccessCollection协议的类型来说,isEmpty 和 count == 0 判断是一模一样的,都是O(1)操作。

总结:

1.String 的isEmpty 和 count==0判断效率是一样的,并不像想象中 count==0 会迭代集合里面的全部元素 O(n)。
2.Array这些符合RandomAccessCollection协议的类型同样,isEmpty 和 count==0判断效率是一样的 O(1)。
3.另外呢对于数据量大的数组,注意数据过大导致指数级增长的扩容带来的性能消耗。
4.数组的变更注意如果不是唯一引用的情况,要注意线程安全处理,因为会摧毁旧的数组,创建扩容两倍长度于旧数组的新数组,把旧数组的元素拷贝过来,再把元素插入到新数组中,会形成新的数组地址。

参考资料:
官方文档
Check if a Swift Array Is Empty: Best Practice Revisited
Swift数组扩容原理

相关文章

网友评论

      本文标题:isEmpty vs count == 0

      本文链接:https://www.haomeiwen.com/subject/jyqgxktx.html