Swift探索（十）: Sequence && Collect

作者: Lee_Toto | 来源:发表于2022-05-31 22:29 被阅读0次

一：Sequence

对于 Sequence 协议来说，表达的是既可以是一个有限的集合，也可以是一个无限的集合，而它只需要提供集合中的元素，和如何访问这些元素的接口即可。

Sequence和Collection的关系.png

1.1 迭代器 Iterator

Sequence 是通过迭代器 Iterator 来访问元素的，那么什么是迭代器？直接来看 for..in 函数

let numbers = [1, 2, 3, 4]

for num in numbers {
    print(num)
}

for..in 函数其实是一种语法糖，他的本质是怎么去调用的呢？编译成 SIL 并定位到 main 函数中 for..in 的调用不重要的代码我就直接省略了

// main
sil @main : $@convention(c) (Int32, UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>) -> Int32 {
bb0(%0 : $Int32, %1 : $UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>):
  ...
  // function_ref Collection<>.makeIterator()
  %36 = function_ref @$sSlss16IndexingIteratorVyxG0B0RtzrlE04makeB0ACyF : $@convention(method) <τ_0_0 where τ_0_0 : Collection, τ_0_0.Iterator == IndexingIterator<τ_0_0>> (@in τ_0_0) -> @out IndexingIterator<τ_0_0> // user: %37
  %37 = apply %36<Array<Int>>(%31, %34) : $@convention(method) <τ_0_0 where τ_0_0 : Collection, τ_0_0.Iterator == IndexingIterator<τ_0_0>> (@in τ_0_0) -> @out IndexingIterator<τ_0_0>
  %38 = tuple ()
  dealloc_stack %34 : $*Array<Int>                // id: %39
  br bb1                                          // id: %40

bb1:                                              // Preds: bb3 bb0
  %41 = alloc_stack $Optional<Int>                // users: %47, %44, %48
  %42 = begin_access [modify] [static] %31 : $*IndexingIterator<Array<Int>> // users: %44, %46
  // function_ref IndexingIterator.next()
  %43 = function_ref @$ss16IndexingIteratorV4next7ElementQzSgyF : $@convention(method) <τ_0_0 where τ_0_0 : Collection> (@inout IndexingIterator<τ_0_0>) -> @out Optional<τ_0_0.Element> // user: %44
  %44 = apply %43<Array<Int>>(%41, %42) : $@convention(method) <τ_0_0 where τ_0_0 : Collection> (@inout IndexingIterator<τ_0_0>) -> @out Optional<τ_0_0.Element>
  %45 = tuple ()
  end_access %42 : $*IndexingIterator<Array<Int>> // id: %46
  ...
} // end sil function 'main'

// function_ref Collection<>.makeIterator()
%36 = function_ref @$sSlss16IndexingIteratorVyxG0B0RtzrlE04makeB0ACyF : $@convention(method) <τ_0_0 where τ_0_0 : Collection, τ_0_0.Iterator == IndexingIterator<τ_0_0>> (@in τ_0_0) -> @out IndexingIterator<τ_0_0> // user: %37
这里我们可以看到调用了一个 makeIterator() 的方法，这个方法需要两个参数一个是在上文中创建的 IndexingIterator , 一个是 Array 的引用。
// function_ref IndexingIterator.next()
%43 = function_ref @$ss16IndexingIteratorV4next7ElementQzSgyF : $@convention(method) <τ_0_0 where τ_0_0 : Collection> (@inout IndexingIterator<τ_0_0>) -> @out Optional<τ_0_0.Element> // user: %44
这里调用 IndexingIterator 的 next() 方法来遍历数组中一个又一个的元素。
因此可以发现执行 for..in 的时候，本质上会通过 Collection 创建一个迭代器 Iterator，然后把当前的数组传给迭代器 Iterator，最后调用迭代器 Iterator 的 next 方法将数组的元素遍历出来。

接着我们打开 Swift源码定位到 collection.swift 中的 IndexingIterator

@frozen
public struct IndexingIterator<Elements: Collection> {
  @usableFromInline
  internal let _elements: Elements
  @usableFromInline
  internal var _position: Elements.Index

  /// Creates an iterator over the given collection.
  @inlinable
  @inline(__always)
  public /// @testable
  init(_elements: Elements) {
    self._elements = _elements
    self._position = _elements.startIndex
  }

  /// Creates an iterator over the given collection.
  @inlinable
  @inline(__always)
  public /// @testable
  init(_elements: Elements, _position: Elements.Index) {
    self._elements = _elements
    self._position = _position
  }
}

extension IndexingIterator: IteratorProtocol, Sequence {
  public typealias Element = Elements.Element
  public typealias Iterator = IndexingIterator<Elements>
  public typealias SubSequence = AnySequence<Element>

  @inlinable
  @inline(__always)
  public mutating func next() -> Elements.Element? {
    if _position == _elements.endIndex { return nil }
    let element = _elements[_position]
    _elements.formIndex(after: &_position)
    return element
  }
}

可以看见 IndexingIterator 是一个接收泛型参数 Collection 的结构体，并且这个结构体遵循了两个协议，分别是 IteratorProtocol 和 Sequence 。 IteratorProtocol 是一个一次提供一个序列值的类型，它和 Sequence 协议是息息相关的，每次通过创建迭代器来访问序列中的元素。
所以我们每次在使用 for..in 的时候，其实都是使用这个集合的迭代器来遍历当前集合或者序列当中的元素。

1.2 IteratorProtocol协议

在 sequence.swift 文件中定位到 IteratorProtocol

public protocol IteratorProtocol {
  /// The type of element traversed by the iterator.
  associatedtype Element

  mutating func next() -> Element?
}

可以看到有一个关联类型，实现该协议时需要执行 Element 的类型，还有一个 next() 函数返回的是这个关联类型。

1.3 Sequence协议

接着在 sequence.swift 文件中定位到 Sequence

public protocol Sequence {
  /// A type representing the sequence's elements.
  associatedtype Element

  /// A type that provides the sequence's iteration interface and
  /// encapsulates its iteration state.
  associatedtype Iterator: IteratorProtocol where Iterator.Element == Element

  /// A type that represents a subsequence of some of the sequence's elements.
  // associatedtype SubSequence: Sequence = AnySequence<Element>
  //   where Element == SubSequence.Element,
  //         SubSequence.SubSequence == SubSequence
  // typealias SubSequence = AnySequence<Element>

  /// Returns an iterator over the elements of this sequence.
  __consuming func makeIterator() -> Iterator

  ...
}

可以看到这里也是有一个关联类型 Element
还有一个关联类型 Iterator 并且 Iterator 要遵循 IteratorProtocol 协议并且 Iterator 对于协议 IteratorProtocol 的关联类型要与 Sequence 的关联类型相等
有一个创建迭代器的方法 makeIterator() 返回当前关联类型 Iterator (这个就类似与车间一样，Iterator 就是一条流水线)
因此对于 Sequence 协议来说，表达的是既可以是一个有限的集合，也可以是一个无限的集合，而它只需要提供集合中的元素，和如何访问这些元素的接口即可。

1.4 通过Sequence协议自定义有限的集合

直接上代码

struct MyIterator: IteratorProtocol {
    typealias Element = Int
    
    let seq: MySequence
    
    init(_ seq: MySequence) {
        self.seq = seq
    }
    
    var count = 0
    
    // 对于迭代器来说要遍历当前 `sequence` 中的元素
    mutating func next() -> Int? {
        guard count < self.seq.arrayCount else {
            return nil
        }
        count += 1
        return count
    }
}

struct MySequence: Sequence {
    typealias Element = Int
    
    var arrayCount: Int
    
    // 对于 sequence 来说需要一个迭代器
    func makeIterator() -> MyIterator {
        return MyIterator.init(self)
    }
}


var seq = MySequence(arrayCount: 10)

for element in seq {
    print(element)
}

打印结果：
1
2
3
4
5
6
7
8
9
10

对于 sequence 来说需要一个迭代器，而对于对于迭代器来说要遍历当前 sequence 中的元素，所以需要 MyIterator 提供一个遍历的构造函数 init(_ seq: MySequence)
同样的我们也可以创建一个无限的序列，但是在实际开发当中一般不会出现这种场景，所以这里就不继续了。

二：Collection

2.1 环形数组

环形数组是一种用于表示一个头尾相连的缓冲区的数据结构。跟环形队列差不多。接下来通过 Collection 来表达一个环形数组。

// 参照源码
extension FixedWidthInteger {
    /// Returns the next power of two.
    @inlinable
    func nextPowerOf2() -> Self {
        guard self != 0 else {
            return 1
        }

        return 1 << (Self.bitWidth - (self - 1).leadingZeroBitCount)
    }
}


struct RingBuffer <Element>{
    
    //ContiguousArray 可以理解为存粹的 Swift 的数组。和 OC 没有任何关系的数组，它的效率比 Array 更快。
    internal var _buffer: ContiguousArray<Element?>
    internal var headIndex: Int = 0
    internal var tailIndex: Int = 0

    internal var mask: Int{
        return self._buffer.count - 1
    }

    // 初始化方法
    init(initalCapacity: Int) {
        let capcatiy = initalCapacity.nextPowerOf2()

        self._buffer = ContiguousArray<Element?>.init(repeating: nil, count:capcatiy)
    }

    // 移动环形数组的尾
    mutating func advancedTailIndex(by: Int){
        self.tailIndex = self.indexAdvanced(index: self.tailIndex, by: by)
    }

    // 移动环形数组的头
    mutating func advancedHeadIndex(by: Int){
        self.headIndex = self.indexAdvanced(index: self.headIndex, by: by)
    }

    // 获取元素下标
    func indexAdvanced(index: Int, by: Int) -> Int{
        return (index + by) & self.mask
    }

    // 添加新元素
    mutating func append(_ value: Element){
        _buffer[self.tailIndex] = value
        self.advancedTailIndex(by: 1)

        if self.tailIndex == self.headIndex {
            fatalError("out of bounds")
        }
    }

    // 读取元素
    mutating func read() -> Element?{
        let element = _buffer[self.headIndex]
        self.advancedHeadIndex(by: 1)
        return element
    }
}

这个结构体通过一个 ContiguousArray 类型的 _buffer 来记录环形数组的元素，ContiguousArray 可以理解为存粹的 Swift 的数组。和 OC 没有任何关系的数组，它的效率比 Array 更快。

并且通过 headIndex 和 tailIndex 来表示环形数组的起始和结束的位置。以及一些方法。

在初始化方法中 nextPowerOf2 这个方法表示 2^n，这个表示使 capacity 始终为 2^n。

2.2 MutableCollection

MutableCollection 允许集合通过下标修改自身元素。对于 MutableCollection 需要

定义 startIndex 和 endIndex 属性，表示集合起始和结束位置
定义一个只读的下标操作符
实现一个 index(after:) 方法用于在集合中移动索引位置

extension RingBuffer: Collection, MutableCollection {
    var startIndex: Int{
        return self.headIndex
    }

    var endIndex: Int{
        return self.tailIndex
    }

    subscript(position: Int) -> Element? {
        get{
            return self._buffer[position]
        }
        set{
            self._buffer[position] = newValue
        }
    }

    // 移动当前索引的位置
    func index(after i: Int) -> Int {
        return (i + 1) & self.mask
    }
}

这里实现了 subscript 的 get 和 set ，对于 Collection 来说可以不提供 set ,这个时候我们就没有办法通过下标的方式改变自身元素了，所以对 MutableColletion 来说下标语法提供一个 setter 就行。就好比

var array = [1, 2, 3, 4]
array[1] = 5

这里直接通过下标修改元素

2.3 RangeReplaceableCollection

RangeReplaceableCollection 允许集合修改任意区间的元素。对于 RangeReplaceableCollection 我们需要提供一个默认的 init 方法。其他的如果不需要 RangeReplaceableCollection 都有默认的实现。

extension RingBuffer: RangeReplaceableCollection {
    
    init() {
        self.init(initalCapacity: 10)
    }

    mutating func remove(at i: Int) -> Element? {
        var currentIndex = i
        let element = _buffer[i]
        self._buffer[currentIndex] = nil
        
        switch i {
        case self.headIndex:
            self.advancedHeadIndex(by: 1)
        default:
            var nextIndex = self.indexAdvanced(index: i, by: 1)
            while nextIndex != self.tailIndex {
                self._buffer.swapAt(currentIndex, nextIndex)
                currentIndex = nextIndex
                nextIndex = self.indexAdvanced(index: currentIndex, by: 1)
            }
        }
        return element
    }
}

对于环形数组来说这里的 remove 方法：首先都是把当前位置的元素删除，然后根据删除的位置来进行判断

当删除的位置刚好是 headIndex 的位置，那么就将 headIndex 往后移动一个位置。
不是 headIndex 的位置，就需要将后面的元素往前移一个位置，然后再把尾节点指向当前空闲节点的位置。

2.4 BidirectionalCollection

BidirectionalCollection 可以向前或向后遍历集合。比如可以获取最后一个元素、反转序列、快速的获取倒序等等。既然正序是通过 subscript 和 index(after:) 来实现的，那么倒序添加一个 index(before:) 就可以往前递归了，这就好像双向链表 Remove 一样，只不过双向链表获取的是值，而这里的集合获取的都是索引。

extension RingBuffer: BidirectionalCollection{
    func index(before i: Int) -> Int {
        return (i - 1) & self.mask
    }
}

2.5 RandomAccessCollection

RandomAccessCollection 任意访问集合元素。对于 RandomAccessCollection 需要实现 index(_:offsetBy:) 和 distance(from:to:)

extension RingBuffer: RandomAccessCollection{
    func index(_ i: Int, offsetBy distance: Int) -> Int {
        return (i + distance) & self.mask
    }

    func distance(from start: Int, to end: Int) -> Int {
        return end - start
    }
}

当然，对于这个集合 (RingBuffer) 我们不去实现也是可以的。因为我们把 index(after:) 和 index(before:) 已经实现了，对于系统来说，它是可以通过这两个方法来推断当前要移动的距离。但是考虑到效率的问题，在这里直接实现会比系统去推断要好得多，因为直接省去了系统推断的时间。

使用示例代码

var buffer: RingBuffer = RingBuffer<Int>.init(initalCapacity: 10)
for i in 0..<10 {
    buffer.append(i)
}

print(buffer.index(0, offsetBy: 5))

打印结果:
5

Swift探索（十）: Sequence && Collect

一：Sequence

1.1 迭代器 Iterator

1.2 IteratorProtocol协议

1.3 Sequence协议

1.4 通过Sequence协议自定义有限的集合

二：Collection

2.1 环形数组

2.2 MutableCollection

2.3 RangeReplaceableCollection

2.4 BidirectionalCollection

2.5 RandomAccessCollection

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

Swift探索

Swift探索（ 十）: Sequence && Collect

一：Sequence

1.1 迭代器 Iterator

1.2 IteratorProtocol协议

1.3 Sequence协议

1.4 通过Sequence协议自定义有限的集合

二：Collection

2.1 环形数组

2.2 MutableCollection

2.3 RangeReplaceableCollection

2.4 BidirectionalCollection

2.5 RandomAccessCollection

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

Swift探索

Swift探索（十）: Sequence && Collect