LRU和LFU的CPP实现

作者: 山中散人的博客 | 来源:发表于2019-05-01 15:12 被阅读30次

LRU和LFU的CPP实现
LRU和LFU（Go和CPP版本）
Algorithm进阶计划 -- LRU 与 LFU 算法
LRU和LFU
[LeetCode] LRU和LFU详解之一
groupcache 源码系列三 LRU
[LeetCode] LRU和LFU详解之二
LRU 和 LFU 缓存淘汰算法
jodd.cache源码阅读笔记
LRU算法原理与实践

LRU(least recently used cache) LFU(least frequently used cache) 是两种典型的缓存设计，其主要的思想是用一个有限长的队列缓存一系列键值对数据，随着新数据的加入，老数据会被逐渐抛弃，用缓存的遗忘方法来保证重复访问的命中率。

其中LRU中越近访问的数据越不会被遗忘，而LFU中越频繁访问的数据越不容易遗忘，下面我们用CPP来实现这两种缓存模型。

LFU 缓存的实现

首先定义缓存数据节点，节点中除了键值对外，还有节点访问次数和最后访问时间。同时定义了节点排列的顺序，访问次数越小的节点排列越前，当访问次数相同时，更早被访问的节点排前。排列越前意味着节点在缓存队列越容易被遗忘抛弃。

#include <unordered_map>
#include <set>
#include <iostream>
#include <list>

using namespace std;

typedef int key_t;
typedef int val_t;

struct cache_node_t
{
    key_t key;
    val_t val;
    int freq, tick; //freq is the times of visits, tick is the time of recent visit
    //less if is less freq or add before
    bool operator < (const cache_node_t& rhs) const {
        return freq < rhs.freq ||
            (freq == rhs.freq && tick < rhs.tick);
    }
};

然后是LFU缓存队列的数据结构，其中包括了队列当前的时间计数，队列容量，键值对和访问队列。访问队列中最排列前的节点最先被遗忘。

class lfu_cache_t{
    int cur_tick, capacity; //currect time
    unordered_map<key_t, cache_node_t> kv;
    set<cache_node_t>                  hist; //visit queue
};

其后要设计一个更新访问队列节点的函数。每次更新节点的范围次数和最后访问时间，插入访问队列然后重新排序。

private:
    void touch(cache_node_t &node){
        hist.erase(node); //temporary remove node
        node.freq++; //update visit times and last 
        node.tick = ++cur_tick; //visit time
        hist.insert(node); //add node again
    }

接着完成函数加入新键值对。加入新键值对或者更新已有键值对，但是不管如何，被更改的键值节点也要更新。

void put(key_t k, val_t v){
        if(capacity <= 0) //invalid cache
            return;

        if(kv.count(k) > 0){ //update exist k-v
            kv[k].val = v;
        }else{
            //remove old node
            if(hist.size() == capacity){
                cache_node_t evict_node = *hist.cbegin();
                kv.erase(evict_node.key);
                hist.erase(evict_node);
            }

            //insert new node
            cache_node_t new_node{k, v, 0, cur_tick};
            kv[k] = new_node;
            hist.insert(new_node);
        }
        touch(kv[k]); //update node whatever
    }

然后设计函数从缓存中取值，这一步也要更新键值在访问队列中的位置。

 lfu_cache_t(int cap):capacity(cap){}
    int get(key_t k){
        if(kv.count(k) > 0){ //if key exist
            touch(kv[k]);
            return kv[k].val;
        }else{ //if key not valid
            return -1;
        }
    }

最后设计一组测试用例来验证程序的正确性。

void lfu_cache_test(){
    lfu_cache_t cache( 2 /* capacity */ );

    cache.put(1, 1);
    cache.put(2, 2);
    cout<<cache.get(1);       // returns 1
    cache.put(3, 3);    // evicts key 2
    cout<<cache.get(2);       // returns -1 (not found)
    cout<<cache.get(3);       // returns 3.
    cache.put(4, 4);    // evicts key 1.
    cout<<cache.get(1);       // returns -1 (not found)
    cout<<cache.get(3);       // returns 3
    cout<<cache.get(4);       // returns 4
    cout<<endl;
}

2. LRU缓存队列的实现

LRU和LFU不同的是它不受访问次数的影响，缓存节点的抛弃只取决于是否超时无人访问。

这里首先实现LRU的数据结构，核心是访问队列hist，它扮演的是一个队列的角色，新访问的节点从头部加入，超时无访问的节点从尾部抛弃。而键值对的访问通过指针来实现。

class lru_cache_t{
    int                           capacity;
    list<kv_pair>                 hist;
    unordered_map<key_t, 
    list<kv_pair>::iterator>      kv; 
};

然后设计一个加入键值对的函数，若主键存在，更新指针所指向缓存节点中的值，若不存在，新建缓存节点。同时，新访问的缓存节点都要被移动到hist的最前端。一旦缓存队列容量不足以承载节点数，立即抛弃hist最后端的节点及其键值。

 lru_cache_t(int cap):capacity(cap){}

    void put(key_t k, val_t v) {
        auto it = kv.find(k);

        if(it != kv.end()){
            it->second->second = v; //update value
            hist.splice(hist.begin(), hist, it->second); //move visit record to front
        }else{
            if(hist.size() == capacity){
                auto evict_node = hist.back();
                kv.erase(evict_node.first); 
                hist.pop_back(); //remove the key in back
            }

            hist.emplace_front(k, v); //insert new key
            kv[k] = hist.begin();
        }
    }

取值得函数相对简单，

   val_t get(key_t k){
        auto it = kv.find(k);
        if(it == kv.end()){
            return -1;
        }else{
            hist.splice(hist.begin(), hist, it->second);
            return it->second->second;
        }
    }

最后添加测试用例。

void lru_cache_test(){
    lru_cache_t cache( 2 /* capacity */ );

    cache.put(1, 1);
    cache.put(2, 2);
    cout<< cache.get(1);       // returns 1
    cache.put(3, 3);    // evicts key 2
    cout<< cache.get(2);       // returns -1 (not found)
    cache.put(4, 4);    // evicts key 1
    cout<< cache.get(1);       // returns -1 (not found)
    cout<<cache.get(3);       // returns 3
    cout<<cache.get(4);       // returns 4   
}

代码清单：https://github.com/KevinACoder/Learning/blob/master/ds_cpp/cache_design.cpp