A Tour of the Swift Standard Library Source: Collection

Author: Zafir_zzf | Published 2020-10-09 15:09

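Background

Collection is the second most fundamental container protocol after Sequence, and it is still a long way from the Array we use every day.

One of the more important refinement chains looks like this:

Collection -> BidirectionalCollection -> RandomAccessCollection -> Array

MutableCollection and RangeReplaceableCollection sit alongside that chain, and together these protocols make up Array's capabilities.
So many protocols can be overwhelming at first, and you may even wonder whether splitting things up this finely is really necessary. With that question in mind, let's start at the top with Collection.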
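
To make the split a little more concrete, here is a small sketch in which each generic function asks only for the protocol it needs. The function names lastElement, doubled and appendingZero are made up for illustration. Array can be passed to all three because it conforms to every protocol mentioned above.

```swift
// Needs backwards traversal only: BidirectionalCollection provides `last`.
func lastElement<C: BidirectionalCollection>(_ c: C) -> C.Element? {
    return c.last
}

// Needs in-place element replacement: MutableCollection provides the
// subscript setter.
func doubled<C: MutableCollection>(_ c: C) -> C where C.Element == Int {
    var copy = c
    var i = copy.startIndex
    while i != copy.endIndex {
        copy[i] *= 2
        copy.formIndex(after: &i)
    }
    return copy
}

// Needs to change the length: RangeReplaceableCollection provides `append`.
func appendingZero<C: RangeReplaceableCollection>(_ c: C) -> C where C.Element == Int {
    var copy = c
    copy.append(0)
    return copy
}

let sample = [1, 2, 3]
print(lastElement(sample) as Any)
print(doubled(sample))
print(appendingZero(sample))
```

A type that can only be walked forwards could still be used with plain Collection algorithms, without having to pretend it supports mutation or random access.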

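Differences from Sequence

Collection refines Sequence, so it inherits everything Sequence offers and adds a few capabilities on top:

Traversal is non-destructive: iterating a Collection several times yields the same elements each time, whereas a Sequence does not guarantee that it can be traversed more than once.
A Collection has a finite number of elements, so iteration always terminates. That is why it has a count property.
Individual elements can be accessed through an index.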
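
The three points above can be sketched with standard library types only. The function names twoPassesOverSequence and twoPassesOverCollection are hypothetical helpers for this example; AnyIterator is used here simply as a convenient single-pass Sequence.

```swift
// Single-pass Sequence: AnyIterator is itself a Sequence, and once its
// closure has run dry a second pass finds nothing left.
func twoPassesOverSequence() -> (first: [Int], second: [Int]) {
    var n = 0
    let countdown = AnyIterator<Int> {
        n += 1
        return n <= 3 ? n : nil
    }
    return (Array(countdown), Array(countdown))
}

// Collection: every pass yields the same elements.
func twoPassesOverCollection() -> (first: [String], second: [String]) {
    let letters = ["a", "b", "c"]
    return (letters.map { $0 }, letters.map { $0 })
}

let letters = ["a", "b", "c"]
print(twoPassesOverSequence())
print(twoPassesOverCollection())
// A Collection also knows its size and supports index access.
print(letters.count, letters[letters.startIndex])
```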

Implementing a minimal Collection

Suppose we have a Family type defined as follows:

struct Family {
    let father: String
    let me: String
    let son: String
}

To make Family a Collection (even though it doesn't really look like one), we only need to implement the following members:

struct Family: Collection {
    
    let father: String
    let me: String
    let son: String

    var startIndex: Int { 0 }
    var endIndex: Int { 3 }

    func index(after i: Int) -> Int {
        return i + 1
    }
    
    subscript(position: Int) -> String {
        get {
            switch position {
            case 0: return father
            case 1: return me
            case 2: return son
            default:
                fatalError()
            }
        }
    }
    
}

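The first step is to define startIndex and endIndex, which also fixes the Index type (Int here). This gives the collection a beginning and an end; both computing count and traversing rely on these two indices. For Family they are simply hard-coded as 0 and 3.

The second step is index(after:), which says how to advance an index during traversal. For most collections whose index type is Int this is just i + 1, but that does not hold for every type, so the method is left for the conforming type to implement.

The third step is the subscript that returns the element at a given index.

With that, Family has all the capabilities listed above. The actual definition of the Collection protocol, however, is considerably more involved than this.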
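
As a quick check of what those three members buy us, the sketch below repeats the Family conformance and then uses a few members that Family never defined itself. The sample names "Tom", "Jack" and "Leo" are made up for illustration.

```swift
struct Family: Collection {
    let father: String
    let me: String
    let son: String

    var startIndex: Int { 0 }
    var endIndex: Int { 3 }

    func index(after i: Int) -> Int { i + 1 }

    subscript(position: Int) -> String {
        switch position {
        case 0: return father
        case 1: return me
        case 2: return son
        default: fatalError("Index out of range")
        }
    }
}

let family = Family(father: "Tom", me: "Jack", son: "Leo")

// for-in works through the default IndexingIterator.
for member in family { print(member) }

// count, first, map and contains all come from default implementations.
print(family.count)
print(family.first ?? "none")
print(family.map { $0.uppercased() })
print(family.contains("Jack"))
```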

The definition of the Collection protocol

protocol Collection: Sequence {
    override associatedtype Element
    associatedtype Index: Comparable
    var startIndex: Index { get }
    var endIndex: Index { get }

    func index(after i: Index) -> Index

    override __consuming func makeIterator() -> Iterator

    subscript(position: Index) -> Element { get }
    subscript(bounds: Range<Index>) -> SubSequence { get }

    associatedtype Indices: Collection = DefaultIndices<Self>
    where Indices.Element == Index, 
          Indices.Index == Index,
          Indices.SubSequence == Indices

    var indices: Indices { get }
  
    var isEmpty: Bool { get }

    var count: Int { get }

    func index(_ i: Index, offsetBy distance: Int) -> Index

    func formIndex(after i: inout Index)
}

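Although the protocol declares many members, everything apart from the few we had to write ourselves comes with a default implementation. They are declared in the protocol itself so that a conforming type can supply its own implementation when it has a better one.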
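
A minimal sketch of why that matters, using count as the example. The types CallCounter, WalkedCount and StoredCount and the function size(of:) are hypothetical names invented for this illustration. One collection relies on the default count, the other supplies its own, and a counter records how often index(after:) gets called.

```swift
// Counts how often index(after:) is called. A class, so that copies of the
// collection share one counter.
final class CallCounter {
    var indexAfterCalls = 0
}

// Relies on the default `count`, which walks from startIndex to endIndex.
struct WalkedCount: Collection {
    let storage: [Int]
    let counter = CallCounter()

    var startIndex: Int { 0 }
    var endIndex: Int { storage.count }
    func index(after i: Int) -> Int {
        counter.indexAfterCalls += 1
        return i + 1
    }
    subscript(position: Int) -> Int { storage[position] }
}

// Same collection, but it supplies its own constant-time `count`.
struct StoredCount: Collection {
    let storage: [Int]
    let counter = CallCounter()

    var startIndex: Int { 0 }
    var endIndex: Int { storage.count }
    func index(after i: Int) -> Int {
        counter.indexAfterCalls += 1
        return i + 1
    }
    subscript(position: Int) -> Int { storage[position] }

    var count: Int { storage.count }
}

// Generic code only knows `C: Collection`, yet it still reaches the custom
// `count`, because `count` is declared as a protocol requirement.
func size<C: Collection>(of c: C) -> Int {
    return c.count
}

let walked = WalkedCount(storage: [1, 2, 3])
let stored = StoredCount(storage: [1, 2, 3])
print(size(of: walked), walked.counter.indexAfterCalls) // walks the indices
print(size(of: stored), stored.counter.indexAfterCalls) // no walking needed
```

Had count been defined only in a protocol extension, the generic function would always have used the walking version.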

For example, the Iterator needed for traversal defaults to IndexingIterator:

associatedtype Iterator = IndexingIterator<Self>

This iterator drives the traversal with startIndex, endIndex and index(after:), so we rarely need to define an iterator type of our own.

extension Collection where Iterator == IndexingIterator<Self> {
  func makeIterator() -> IndexingIterator<Self> {
    return IndexingIterator(_elements: self)
  }
}

Let's look at its next() implementation directly:

mutating func next() -> Elements.Element? {
  if _position == _elements.endIndex { return nil }
  let element = _elements[_position]
  _elements.formIndex(after: &_position)
  return element
}

Simple and clear.
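
The same loop can be written by hand against any Collection. This is only a sketch of the idea, not the standard library's code, and elementsByIndexWalk is a made-up name:

```swift
// Collects the elements of any Collection by walking its indices, the same
// way IndexingIterator.next() does one step at a time.
func elementsByIndexWalk<C: Collection>(_ c: C) -> [C.Element] {
    var result: [C.Element] = []
    var position = c.startIndex
    while position != c.endIndex {
        result.append(c[position])
        c.formIndex(after: &position)
    }
    return result
}

print(elementsByIndexWalk([1, 2, 3]))
print(elementsByIndexWalk("héllo"))
```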

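I'm not yet sure why it uses formIndex(after:) rather than writing _position = _elements.index(after: _position) directly. A likely reason is that formIndex(after:) is itself a protocol requirement, so a collection whose indices are expensive to copy can advance them in place. The default implementation simply forwards to index(after:):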

extension Collection {
  public func formIndex(after i: inout Index) {
    i = index(after: i)
  }
}

Some default implementations

// Provides makeIterator() for types that use the default IndexingIterator
extension Collection where Iterator == IndexingIterator<Self> {
  func makeIterator() -> IndexingIterator<Self> {
    return IndexingIterator(_elements: self)
  }
}

// Complexity: O(1)
var isEmpty: Bool {
  return startIndex == endIndex
}

// O(1) if the type conforms to RandomAccessCollection, otherwise O(n)
var count: Int {
  return distance(from: startIndex, to: endIndex)
}

// RandomAccessCollection requires this to run in O(1)
func distance(from start: Index, to end: Index) -> Int {
  var start = start
  var count = 0
  while start != end {
    count = count + 1
    formIndex(after: &start)
  }
  return count
}
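
A short usage sketch of these members on String, whose index type is not Int and which is not a RandomAccessCollection, so distance(from:to:) and count have to walk the indices. The variable names are arbitrary.

```swift
let word = "Swift"
let start = word.startIndex

// index(_:offsetBy:) advances an index by a number of steps.
let third = word.index(start, offsetBy: 2)
print(word[third])

// distance(from:to:) counts the steps between two indices.
print(word.distance(from: start, to: third))

// count is just the distance from startIndex to endIndex.
print(word.distance(from: start, to: word.endIndex) == word.count)

// isEmpty only compares the two end points.
print("".isEmpty, word.isEmpty)
```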

var first: Element? {
  let start = startIndex
  if start != endIndex { return self[start] }
  else { return nil }
}

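Why doesn't first just use self[startIndex] directly?
My first guess was thread safety: if startIndex were changed on another thread between the startIndex != endIndex check and return self[startIndex], something unexpected could happen. That explanation is weak, though, because self cannot change inside a non-mutating getter unless there is already a data race.
A more likely reason is cost: startIndex is not guaranteed to be cheap for every collection (a lazily filtered collection has to scan for its first matching element), so reading it once into a constant avoids computing it twice.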
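
The cost argument can be observed with a lazily filtered array. In this sketch, predicateCallsForStartIndex is a hypothetical helper that counts how often the filter predicate runs when startIndex is read once and then a second time. The explicit LazyFilterSequence<[Int]> annotation is there to make sure the lazy overload of filter is chosen.

```swift
// Returns the number of predicate calls after reading startIndex once,
// and the running total after reading it a second time.
func predicateCallsForStartIndex() -> (once: Int, twice: Int) {
    var calls = 0
    let evens: LazyFilterSequence<[Int]> = [1, 3, 5, 6, 8].lazy.filter { value in
        calls += 1
        return value % 2 == 0
    }

    _ = evens.startIndex   // scans until the first even number
    let once = calls
    _ = evens.startIndex   // scans again, nothing is cached
    return (once, calls)
}

let result = predicateCallsForStartIndex()
print(result.once, result.twice)
```

Each read of startIndex repeats the scan, so a getter that read startIndex in the check and again in the subscript would do that work twice.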

Indices, SubSequence

associatedtype Indices: Collection = DefaultIndices<Self>
    where Indices.Element == Index, 
          Indices.Index == Index,
          Indices.SubSequence == Indices

var indices: Indices { get }

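The index collection, Indices: a collection of indices for an arbitrary collection.

It is a collection of all the valid indices of a Collection. When the index type is Int we already have endIndex and count, so I can't yet think of where it would be needed; the String implementation may make this clearer. Below is a simplified re-implementation of DefaultIndices, in which MyCollection stands in for the standard library's Collection: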

struct DefaultIndices<Base: MyCollection> {
    let _base: Base
    let _startIndex: Base.Index
    let _endIndex: Base.Index
    
    init(base: Base, startIndex: Base.Index, endIndex: Base.Index) {
        (self._base, self._startIndex, self._endIndex) = (base, startIndex, endIndex)
    }
}

extension DefaultIndices: MyCollection {
    
    typealias Index = Base.Index
    typealias Element = Base.Index
    typealias Indices = DefaultIndices<Base>
    typealias SubSequence = DefaultIndices<Base>
    
    var startIndex: Index { self._startIndex }
    var endIndex: Index { self._endIndex }
    
    subscript(position: Index) -> Index {
        position
    }
    
    var indices: DefaultIndices<Base> {
        self
    }
    
    subscript(bounds: Range<Base.Index>) -> DefaultIndices<Base> {
        // A sub-range of the indices is delimited by `bounds`, not by the full range
        .init(base: _base, startIndex: bounds.lowerBound, endIndex: bounds.upperBound)
    }
    func index(after i: Index) -> Index {
        _base.index(after: i)
    }
}
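
Two places where indices is handy, sketched with standard library types. First, a slice keeps the indices of its base, so even with Int indices they need not start at 0. Second, generic code cannot assume anything about the index type. The helper indexedElements is a made-up name for this example.

```swift
// Pairs every valid index with its element, for any Collection.
func indexedElements<C: Collection>(_ c: C) -> [(C.Index, C.Element)] {
    return c.indices.map { ($0, c[$0]) }
}

let numbers = [10, 20, 30, 40, 50]
let tail = numbers[2...]   // ArraySlice<Int> sharing the base array's indices

// The slice's indices start at 2, so looping over 0..<tail.count would be wrong.
print(Array(tail.indices))
for i in tail.indices {
    print(i, tail[i])
}

// String indices are opaque, so indices is the natural way to enumerate them.
for (_, character) in indexedElements("héllo") {
    print(character)
}
```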

Slices

associatedtype SubSequence: Collection = Slice<Self>
  where SubSequence.Index == Index,
        Element == SubSequence.Element,
        SubSequence.SubSequence == SubSequence

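Slicing a collection to a sub-range returns a slice. The associated type is called SubSequence and defaults to Slice<Self>, a type that stores the base collection together with a start and an end index. It is much like Indices, except that its subscript returns the base collection's Element.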

/// - Complexity: O(1)
subscript(bounds: Range<Index>) -> SubSequence { get }

If slicing returned a brand-new collection (an Array, say) instead of a slice, the extra storage would often be wasted, because the result may only be traversed once.

struct Slice<Base: MyCollection> {
    let startIndex: Base.Index
    let endIndex: Base.Index
    let base: Base
    
    init(base: Base, bounds: Range<Base.Index>) {
        self.base = base
        self.startIndex = bounds.lowerBound
        self.endIndex = bounds.upperBound
    }
}

extension Slice: MyCollection {
    
    typealias Index = Base.Index
    typealias Element = Base.Element
    typealias Iterator = MyIndexIterator<Slice<Base>>
    
    func index(after i: Base.Index) -> Base.Index {
        base.index(after: i)
    }
    
    subscript(position: Base.Index) -> Base.Element {
        base[position]
    }
}
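
The behaviour of slices can be checked against the standard library itself. This sketch uses Array's own slice type, ArraySlice, and the generic Slice wrapper; the variable names are arbitrary.

```swift
let numbers = [10, 20, 30, 40, 50]

// Array's SubSequence is ArraySlice, which keeps the base array's indices.
let middle = numbers[1..<4]
print(middle.startIndex, middle.endIndex, middle[1])

// The generic Slice type stores the base plus the bounds, like the sketch above.
let wrapped = Slice(base: numbers, bounds: 1..<4)
print(wrapped.startIndex, wrapped.endIndex, wrapped[1], wrapped.base.count)

// Converting to Array makes an independent collection with indices from 0.
let copy = Array(middle)
print(copy.startIndex, copy[0])
```

Because a slice shares indices with its base, an index obtained from one can be used with the other, which is also why a slice's startIndex is not necessarily 0.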
