![](https://img.haomeiwen.com/i14495828/94f801b53f7f62ed.jpg)
译者注:
原文后还附有相关视频,原文地址:https://marcosantadev.com/implement-cache-lru-swift/
Introduction
概述
A cache LRU () is similar to a dictionary. It stores data associated to a key. The difference between a dictionary and a Cache is that the latter has a limited capacity. Every time we reach the capacity, the Cache deletes the least recently used element.
LRU(Least Recently Used)缓存和字典相似,存储着key和对应的data。和字典不同的是缓存有着容量限制。每次到达容量限制后,缓存会删除最近使用的元素。
In this article, we are going to see how to implement a Cache LRU with Swift.
本文中我们来看下如何用Swift实现一个 LRU 缓存。
Happy Reading!
Getting Started
开始
First of all, we must understand what data structures we should use to implement our Cache LRU. There are different ways to implement it. In this version, we will use:
首先我们要理解实现LRU Cache的数据结构。LRU Cache有许多种实现方式。本文中我们会用到:
Doubly Linked List: This is the core of our implementation. We need this list to store the elements of our cache. We don’t use an Array because it would be slower. The Cache LRU policy is to move the elements recently used in the head very often. If we move an element in the head—at the index 0—in an array, we should perform a shift to right for all the other elements.
双向链表:这是我们实现的核心。我们需要链表存储Cache的元素。我们不用数组的原因是太慢了。LRU 缓存的策略是频繁移动在头部位置最近使用的元素。如果我们移动在头部的元素(也就是Array中index 0的元素),我们会平移其它所有的元素。
Dictionary<Key, ListNode>: The problem of using a doubly linked list is that its lookup complexity is O(n). We can solve this bottleneck using a dictionary—which has a lookup complexity of O(1). We’ll use this dictionary to store the nodes of our list.
Dictionary<Key, ListNode>: 用双向链表的问题在于它的查询复杂度是O(n)。我们用字典可以解决这个瓶颈(查询复杂度为O(1))。我们会用字典存储链表的节点。
In the next section, we are going to see how to implement the doubly linked list. If you already know it, feel free to jump at the section Cache LRU.
在下一节中,我们会看看如何实现双向链表。如果你已经知道,可以跳到Cache LRU这一节。
Doubly Linked List
双向链表
For this article, we don’t need a complete doubly linked list implementation. For this reason, we’ll implement only the methods used in the Cache.
The first step is to create a new class DoublyLinkedList which accepts a generic value T to store inside the nodes:
本文中,我们不需要完整的双向链表实现。因此,我们仅实现缓存所需的方法。
第一步是创建DoublyLinkedList类(接受用于存储在节点的值泛型T)
final class DoublyLinkedList {
}
Then, we must create a class for the nodes:
然后,我创建一个链表类:
final class DoublyLinkedList<T> {
final class Node<T> {
var payload: T
var previous: Node<T>?
var next: Node<T>?
init(payload: T) {
self.payload = payload
}
}
}
In this implementation, we use a nested Node class. If you use a Swift version older than 3.1, you must create this class outside DoublyLinkedList. Nested classes with generic values are supported from Swift 3.1.
在这个实现中,我们使用了Node嵌套类,如果使用老的Swift版本(<3.1),你必须在DoublyLinkedList类外创建Node类.
Then, we must provide the information of how many elements are stored in the list:
然后我们提供一个链表存储多少元素的属性。
private(set) var count: Int = 0
The operations on a linked list may be sometimes complex to implement. For this reason, we can store the first and last elements to keep our life easier:
链表的操作有时实现起来太复杂。因此我们存储下第一个和最后一个元素(为了生活轻松点...)。
private var head: Node<T>?
private var tail: Node<T>?
Now, we can start implementing the list methods:
现在我们开始实现链表的方法:
addHead
We need a method to add a new element in the list. We add it in the head to be compliant with the Cache LRU policy—a new element is the recently used:
我们需要一个往链表里添加一个新元素的方法。我们根据LRU缓存策略在头部添加最近用的新元素:
func addHead(_ payload: T) -> Node<T> {
let node = Node(payload: payload)
defer {
head = node
count += 1
}
guard let head = head else {
tail = node
return node
}
head.previous = node
node.previous = nil
node.next = head
return node
}
moveToHead
The concept of a Cache LRU is keeping the recently element used at the beginning of our list. For this reason, we need a method to move a node at the head:
LRU Cache的概念是将最近用过的元素保存在链表的头部。因此我们需要把节点移到头部的方法:
func moveToHead(_ node: Node<T>) {
guard node !== head else { return }
let previous = node.previous
let next = node.next
previous?.next = next
next?.previous = previous
node.next = head
node.previous = nil
if node === tail {
tail = previous
}
self.head = node
}
removeLast
When our Cache is full, we need a method to remove the last element—which is the least recently used:
当缓存满了,我们就需要移除最后一个元素的方法(也就是最近最少使用的元素):
func removeLast() -> Node<T>? {
guard let tail = self.tail else { return nil }
let previous = tail.previous
previous?.next = nil
self.tail = previous
if count == 1 {
head = nil
}
count -= 1
// 1
return tail
}
1.The value of this tail is not the same of self.tail. It’s the value of the old tail which comes from the optional binding in the guard at the beginning of this method.
1.tail的值和self.tail的值不一样。tail的值是老的tail值(来自方法开始guard位置的optional绑定)。
译者注:也就是tail是被删除的元素,在方法结束时返回
Finally, we can add a typealias for our Node type to use in the Cache implementation:
最后,我们可以在Cache实现文件中增加一个Node类型的别名:
typealias DoublyLinkedListNode<T> = DoublyLinkedList<T>.Node<T>
The final version of our list implementation should be like this:
链表的最终实现如下:
typealias DoublyLinkedListNode<T> = DoublyLinkedList<T>.Node<T>
final class DoublyLinkedList<T> {
final class Node<T> {
var payload: T
var previous: Node<T>?
var next: Node<T>?
init(payload: T) {
self.payload = payload
}
}
private(set) var count: Int = 0
private var head: Node<T>?
private var tail: Node<T>?
func addHead(_ payload: T) -> Node<T> {
let node = Node(payload: payload)
defer {
head = node
count += 1
}
guard let head = head else {
tail = node
return node
}
head.previous = node
node.previous = nil
node.next = head
return node
}
func moveToHead(_ node: Node<T>) {
guard node !== head else { return }
let previous = node.previous
let next = node.next
previous?.next = next
next?.previous = previous
node.next = head
node.previous = nil
if node === tail {
tail = previous
}
self.head = node
}
func removeLast() -> Node<T>? {
guard let tail = self.tail else { return nil }
let previous = tail.previous
previous?.next = nil
self.tail = previous
if count == 1 {
head = nil
}
count -= 1
return tail
}
}
Cache LRU
It’s time to implement our Cache. We can start creating a new CacheLRU class:
现在该实现Cache了。我们创建一个CacheLRU类:
final class CacheLRU<Key: Hashable, Value> {
The generic value Key must be Hashable since it’s the key of the value stored in the doubly linked list.
泛型值Key必须是可哈希化的,因为它是存储在双链表中的值的键。
A Cache stores data associated to keys like a dictionary. Unfortunately, our doubly linked list accepts only a value payload and not also a key. To solve this problem, we can create a struct which wraps the value and its key. In this way, our list nodes will store the object CachePayload which contains both value and key:
缓存像字典一样存储着和key和对应的data。不幸的是,我们的双向链表只能存储payload而不能存储
key。为了解决这个问题,我们可以创建带有value和key的结构体。通过这种方法,我们的链表节点就可以存储
CachePayload(包含value和key)对象了。
final class CacheLRU<Key: Hashable, Value> {
private struct CachePayload {
let key: Key
let value: Value
}
}
Then, we should add the two data structures—a doubly linked list and a dictionary:
然后,我们添加两个数据结构--双向链表和字典:
private let list = DoublyLinkedList<CachePayload>()
private var nodesDict = [Key: DoublyLinkedListNode<CachePayload>]()
As we saw in Introduction, Cache LRU has a limited capacity. We can inject this capacity in the init method and store it in a private property:
正如概述所说,LRU 缓存有着容量限制。我们可以在init方法里初始化这个属性,并使用私有property存储。
private let capacity: Int
init(capacity: Int) {
self.capacity = max(0, capacity)
}
We use the method max to avoid invalid capacity values less than zero.
我们使用max函数避免capacity小于0的无效值。
Now, we can implement the two Cache methods to get and set the elements:
现在,我们实现下Cache的get元素、set元素方法:
setValue
With the method set, we can add/update an element for a specific key. The value is always moved at the beginning of the list as recently used element:
通过set方法,我可以给指定的key添加、更新对应的元素值。最近用过的元素值总是被移动到链表的开头:
func setValue(_ value: Value, for key: Key) {
// 1
let payload = CachePayload(key: key, value: value)
// 2
if let node = nodesDict[key] {
node.payload = payload
list.moveToHead(node)
} else {
let node = list.addHead(payload)
nodesDict[key] = node
}
// 3
if list.count > capacity {
let nodeRemoved = list.removeLast()
if let key = nodeRemoved?.payload.key {
nodesDict[key] = nil
}
}
}
1.Create a payload object to wrap key and value to be stored in the list.
2.If the list already stores an element for that specific key, we update the value and move it at the beginning of the list. Otherwise, we create a new node and add it as head of the list.
3.If we exceeded the capacity of the cache adding the new element, we remove the last element of the list.
1.创建要存储在链表中的payload对象(包含key、value)
2.如果列表已经存储了指定key的元素,我们就更新值然后移动到链表的头部。否则,我们创建一个新节点并把它加到链表的头部
3.如果在新增元素时超过了缓存限制,我们就移除链表最后一个元素。
getValue
With the method get, we can retrieve an element for a specific key. Every time we retrieve an element, it is moved at the beginning of the list as recently used element:
通过get方法,我们可以通过指定key获取对应元素。每次我们取元素时,链表就移动最近用过的元素到链表的开头:
func getValue(for key: Key) -> Value? {
guard let node = nodesDict[key] else { return nil }
list.moveToHead(node)
return node.payload.value
}
The final version of our Cache implementation should be this:
最终Cache的实现如下:
final class CacheLRU<Key: Hashable, Value> {
private struct CachePayload {
let key: Key
let value: Value
}
private let capacity: Int
private let list = DoublyLinkedList<CachePayload>()
private var nodesDict = [Key: DoublyLinkedListNode<CachePayload>]()
init(capacity: Int) {
self.capacity = max(0, capacity)
}
func setValue(_ value: Value, for key: Key) {
let payload = CachePayload(key: key, value: value)
if let node = nodesDict[key] {
node.payload = payload
list.moveToHead(node)
} else {
let node = list.addHead(payload)
nodesDict[key] = node
}
if list.count > capacity {
let nodeRemoved = list.removeLast()
if let key = nodeRemoved?.payload.key {
nodesDict[key] = nil
}
}
}
func getValue(for key: Key) -> Value? {
guard let node = nodesDict[key] else { return nil }
list.moveToHead(node)
return node.payload.value
}
}
We can use this cache like this:
我们可以这样使用cache:
let cache = CacheLRU<Int, String>(capacity: 2)
cache.getValue(for: 5) // nil
cache.setValue("One", for: 1)
cache.setValue("Eleven", for: 11)
cache.setValue("Twenty", for: 20)
cache.getValue(for: 1) // nil. We exceeded the capacity with the previous `setValue` and `1` was the last element.
cache.getValue(for: 11) // Eleven
You can find a Gist with the Cache LRU implementation here.
你可以在Gist连接里查看Cache LRU的实现
Conclusion
That’s all for our Cache LRU.
到此就是Cache LRU介绍的全部了。
Nowadays, we have a lot of memory available for our Apps. Despite this, we may need a Cache with a limited capacity to save memory space. For example, when we have to cache objects which are space-consuming like the images.
现今,我们的App有好多可用内存。尽管如此,我们仍然可能需要带有容量限制的缓存来节省内存。例如,当我们遇到占用空间大的对象,比如说image。
Update:
I’ve figured out that Array is faster than linked list. Since the pure version of Cache LRU uses a doubly linked list, I leave the current implementation. But, keep in mind that with a Swift Array we would have a faster implementation.
我发现数组比链表快。由于LRU缓存基础版本使用了双向链表,所以我将保留当前的实现。但是请留意,Swift 数组有着更快的实现。
译者的补充
Swift数组内部实现可能和C++ deque类似,并不是数组就比链表快,deque可以在内存占用空间和速度做比较好的平衡,原文评论中也有人指出使用heap堆也可实现LRU。
原文作者的核心思路是使用一个node字典和一个双向链表,node字典在查找时使用(get方法),双向链表用于实现删除最久没用的元素。
实际项目中更多的是用对象占用内存大小来计算limitSize。
有兴趣的同学可以看下https://github.com/robinzhangx/LruCache里的实现,其中考虑到了线程安全,核心思路是使用:
NSMutableDictionary来存储对象值(用于查找)
NSMutableOrderedSet用于实现LRU淘汰机制(类似本文中的双向链表),其中存储着key,到达缓存上限时找到最后的key,删除对应NSMutableDictionary中的数据对象
其中实现思路比较好的一点是类的抽象程度比较高,实现子类去告诉对象占用内存大小。
网友评论