How LRU Works: A Walkthrough of the LinkedHashMap Source Code

Author: Frank_Kivi | Published 2017-09-12 00:40

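    To many people the LRU algorithm sounds sophisticated, but once you look under the hood it turns out to be quite simple.
    LRU stands for Least Recently Used: the entry touched most recently is the last to leave the queue, and the entries that have gone unused the longest are evicted first. Put plainly, create, retrieve, and update each move the affected element to the tail of the queue (delete removes the element outright, so it needs no special handling). Once that bookkeeping is in place, you have a basic LRU algorithm. Java ships a data structure designed around exactly this idea: LinkedHashMap.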
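
    Before turning to the JDK class, the queue idea described above can be sketched by hand. The class below is a hypothetical, deliberately naive illustration (the names NaiveLruCache, touch, and keysEldestFirst are made up for this example, and null keys are not supported); it is not how LinkedHashMap is implemented, and its O(n) queue removal is exactly the cost that the linked-list design avoids.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, deliberately naive LRU cache: a HashMap for lookups plus a
// queue that records recency (head = least recently used, tail = most recent).
// Null keys are not supported because ArrayDeque rejects null elements.
class NaiveLruCache<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Deque<K> order = new ArrayDeque<>();

    NaiveLruCache(int capacity) {
        this.capacity = capacity;
    }

    // Retrieve: a hit moves the key to the tail; a miss changes nothing.
    V get(K key) {
        if (!values.containsKey(key)) {
            return null;
        }
        touch(key);
        return values.get(key);
    }

    // Create or update: both move the key to the tail.
    // Only a create can push the size over capacity and evict the head.
    void put(K key, V value) {
        boolean isNew = !values.containsKey(key);
        values.put(key, value);
        touch(key);
        if (isNew && values.size() > capacity) {
            K eldest = order.pollFirst();
            values.remove(eldest);
        }
    }

    // Delete: drop the key from both structures, no reordering needed.
    void remove(K key) {
        values.remove(key);
        order.remove(key);
    }

    // Moves the key to the tail. The queue removal is O(n): fine for a sketch only.
    private void touch(K key) {
        order.remove(key);
        order.addLast(key);
    }

    // Keys from least recently used to most recently used.
    List<K> keysEldestFirst() {
        return new ArrayList<>(order);
    }
}
```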

    
    /**
     * <p>Hash table and linked list implementation of the <tt>Map</tt> interface,
     * with predictable iteration order.  This implementation differs from
     * <tt>HashMap</tt> in that it maintains a doubly-linked list running through
     * all of its entries.  This linked list defines the iteration ordering,
     * which is normally the order in which keys were inserted into the map
     * (<i>insertion-order</i>).  Note that insertion order is not affected
     * if a key is <i>re-inserted</i> into the map.  (A key <tt>k</tt> is
     * reinserted into a map <tt>m</tt> if <tt>m.put(k, v)</tt> is invoked when
     * <tt>m.containsKey(k)</tt> would return <tt>true</tt> immediately prior to
     * the invocation.)
     *
     * <p>This implementation spares its clients from the unspecified, generally
     * chaotic ordering provided by {@link HashMap} (and {@link Hashtable}),
     * without incurring the increased cost associated with {@link TreeMap}.  It
     * can be used to produce a copy of a map that has the same order as the
     * original, regardless of the original map's implementation:
     * <pre>
     *     void foo(Map m) {
     *         Map copy = new LinkedHashMap(m);
     *         ...
     *     }
     * </pre>
     * This technique is particularly useful if a module takes a map on input,
     * copies it, and later returns results whose order is determined by that of
     * the copy.  (Clients generally appreciate having things returned in the same
     * order they were presented.)
     *
     * <p>A special {@link #LinkedHashMap(int,float,boolean) constructor} is
     * provided to create a linked hash map whose order of iteration is the order
     * in which its entries were last accessed, from least-recently accessed to
     * most-recently (<i>access-order</i>).  This kind of map is well-suited to
     * building LRU caches.  Invoking the {@code put}, {@code putIfAbsent},
     * {@code get}, {@code getOrDefault}, {@code compute}, {@code computeIfAbsent},
     * {@code computeIfPresent}, or {@code merge} methods results
     * in an access to the corresponding entry (assuming it exists after the
     * invocation completes). The {@code replace} methods only result in an access
     * of the entry if the value is replaced.  The {@code putAll} method generates one
     * entry access for each mapping in the specified map, in the order that
     * key-value mappings are provided by the specified map's entry set iterator.
     * <i>No other methods generate entry accesses.</i>  In particular, operations
     * on collection-views do <i>not</i> affect the order of iteration of the
     * backing map.
     *
     * <p>The {@link #removeEldestEntry(Map.Entry)} method may be overridden to
     * impose a policy for removing stale mappings automatically when new mappings
     * are added to the map.
     *
     * <p>This class provides all of the optional <tt>Map</tt> operations, and
     * permits null elements.  Like <tt>HashMap</tt>, it provides constant-time
     * performance for the basic operations (<tt>add</tt>, <tt>contains</tt> and
     * <tt>remove</tt>), assuming the hash function disperses elements
     * properly among the buckets.  Performance is likely to be just slightly
     * below that of <tt>HashMap</tt>, due to the added expense of maintaining the
     * linked list, with one exception: Iteration over the collection-views
     * of a <tt>LinkedHashMap</tt> requires time proportional to the <i>size</i>
     * of the map, regardless of its capacity.  Iteration over a <tt>HashMap</tt>
     * is likely to be more expensive, requiring time proportional to its
     * <i>capacity</i>.
     *
     * <p>A linked hash map has two parameters that affect its performance:
     * <i>initial capacity</i> and <i>load factor</i>.  They are defined precisely
     * as for <tt>HashMap</tt>.  Note, however, that the penalty for choosing an
     * excessively high value for initial capacity is less severe for this class
     * than for <tt>HashMap</tt>, as iteration times for this class are unaffected
     * by capacity.
     *
     * <p><strong>Note that this implementation is not synchronized.</strong>
     * If multiple threads access a linked hash map concurrently, and at least
     * one of the threads modifies the map structurally, it <em>must</em> be
     * synchronized externally.  This is typically accomplished by
     * synchronizing on some object that naturally encapsulates the map.
     *
     * If no such object exists, the map should be "wrapped" using the
     * {@link Collections#synchronizedMap Collections.synchronizedMap}
     * method.  This is best done at creation time, to prevent accidental
     * unsynchronized access to the map:<pre>
     *   Map m = Collections.synchronizedMap(new LinkedHashMap(...));</pre>
     *
     * A structural modification is any operation that adds or deletes one or more
     * mappings or, in the case of access-ordered linked hash maps, affects
     * iteration order.  In insertion-ordered linked hash maps, merely changing
     * the value associated with a key that is already contained in the map is not
     * a structural modification.  <strong>In access-ordered linked hash maps,
     * merely querying the map with <tt>get</tt> is a structural modification.
     * </strong>)
     *
     * <p>The iterators returned by the <tt>iterator</tt> method of the collections
     * returned by all of this class's collection view methods are
     * <em>fail-fast</em>: if the map is structurally modified at any time after
     * the iterator is created, in any way except through the iterator's own
     * <tt>remove</tt> method, the iterator will throw a {@link
     * ConcurrentModificationException}.  Thus, in the face of concurrent
     * modification, the iterator fails quickly and cleanly, rather than risking
     * arbitrary, non-deterministic behavior at an undetermined time in the future.
     *
     * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
     * as it is, generally speaking, impossible to make any hard guarantees in the
     * presence of unsynchronized concurrent modification.  Fail-fast iterators
     * throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
     * Therefore, it would be wrong to write a program that depended on this
     * exception for its correctness:   <i>the fail-fast behavior of iterators
     * should be used only to detect bugs.</i>
     *
     * <p>The spliterators returned by the spliterator method of the collections
     * returned by all of this class's collection view methods are
     * <em><a href="Spliterator.html#binding">late-binding</a></em>,
     * <em>fail-fast</em>, and additionally report {@link Spliterator#ORDERED}.
     *
     * <p>This class is a member of the
     * <a href="{@docRoot}/../technotes/guides/collections/index.html">
     * Java Collections Framework</a>.
     *
     * @implNote
     * The spliterators returned by the spliterator method of the collections
     * returned by all of this class's collection view methods are created from
     * the iterators of the corresponding collections.
     *
     * @param <K> the type of keys maintained by this map
     * @param <V> the type of mapped values
     *
     * @author  Josh Bloch
     * @see     Object#hashCode()
     * @see     Collection
     * @see     Map
     * @see     HashMap
     * @see     TreeMap
     * @see     Hashtable
     * @since   1.4
     */
    public class LinkedHashMap<K,V>
        extends HashMap<K,V>
        implements Map<K,V>
    {
    
        /*
         * Implementation note.  A previous version of this class was
         * internally structured a little differently. Because superclass
         * HashMap now uses trees for some of its nodes, class
         * LinkedHashMap.Entry is now treated as intermediary node class
         * that can also be converted to tree form. The name of this
         * class, LinkedHashMap.Entry, is confusing in several ways in its
         * current context, but cannot be changed.  Otherwise, even though
         * it is not exported outside this package, some existing source
         * code is known to have relied on a symbol resolution corner case
         * rule in calls to removeEldestEntry that suppressed compilation
         * errors due to ambiguous usages. So, we keep the name to
         * preserve unmodified compilability.
         *
         * The changes in node classes also require using two fields
         * (head, tail) rather than a pointer to a header node to maintain
         * the doubly-linked before/after list. This class also
         * previously used a different style of callback methods upon
         * access, insertion, and removal.
         */
    
        /**
         * HashMap.Node subclass for normal LinkedHashMap entries.
         */
        static class Entry<K,V> extends HashMap.Node<K,V> {
            Entry<K,V> before, after;
            Entry(int hash, K key, V value, Node<K,V> next) {
                super(hash, key, value, next);
            }
        }
    
        private static final long serialVersionUID = 3801124242820219131L;
    
        /**
         * The head (eldest) of the doubly linked list.
         */
        transient LinkedHashMap.Entry<K,V> head;
    
        /**
         * The tail (youngest) of the doubly linked list.
         */
        transient LinkedHashMap.Entry<K,V> tail;
    
        /**
         * The iteration ordering method for this linked hash map: <tt>true</tt>
         * for access-order, <tt>false</tt> for insertion-order.
         *
         * @serial
         */
        final boolean accessOrder;
    }
    

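    LinkedHashMap is a subclass of HashMap. As the class comment explains, it is essentially a HashMap with one addition that guarantees iteration order: the head and tail fields, both of type LinkedHashMap.Entry. The comment states that together they form a doubly linked list.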

        /**
         * HashMap.Node subclass for normal LinkedHashMap entries.
         */
        static class Entry<K,V> extends HashMap.Node<K,V> {
            Entry<K,V> before, after;
            Entry(int hash, K key, V value, Node<K,V> next) {
                super(hash, key, value, next);
            }
        }
    

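    The source confirms that Entry is a doubly linked list node (it carries before and after references) and that it is a subclass of HashMap.Node.
    The other important field is the boolean accessOrder. Its comment is explicit: true means access order, false means insertion order. To implement LRU we need it to be true, and since the field is final, the only way to set it is the three-argument constructor.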

      /**
         * Constructs an empty insertion-ordered <tt>LinkedHashMap</tt> instance
         * with the specified initial capacity and load factor.
         *
         * @param  initialCapacity the initial capacity
         * @param  loadFactor      the load factor
         * @throws IllegalArgumentException if the initial capacity is negative
         *         or the load factor is nonpositive
         */
        public LinkedHashMap(int initialCapacity, float loadFactor) {
            super(initialCapacity, loadFactor);
            accessOrder = false;
        }
    
        /**
         * Constructs an empty insertion-ordered <tt>LinkedHashMap</tt> instance
         * with the specified initial capacity and a default load factor (0.75).
         *
         * @param  initialCapacity the initial capacity
         * @throws IllegalArgumentException if the initial capacity is negative
         */
        public LinkedHashMap(int initialCapacity) {
            super(initialCapacity);
            accessOrder = false;
        }
    
        /**
         * Constructs an empty insertion-ordered <tt>LinkedHashMap</tt> instance
         * with the default initial capacity (16) and load factor (0.75).
         */
        public LinkedHashMap() {
            super();
            accessOrder = false;
        }
    
        /**
         * Constructs an insertion-ordered <tt>LinkedHashMap</tt> instance with
         * the same mappings as the specified map.  The <tt>LinkedHashMap</tt>
         * instance is created with a default load factor (0.75) and an initial
         * capacity sufficient to hold the mappings in the specified map.
         *
         * @param  m the map whose mappings are to be placed in this map
         * @throws NullPointerException if the specified map is null
         */
        public LinkedHashMap(Map<? extends K, ? extends V> m) {
            super();
            accessOrder = false;
            putMapEntries(m, false);
        }
    
        /**
         * Constructs an empty <tt>LinkedHashMap</tt> instance with the
         * specified initial capacity, load factor and ordering mode.
         *
         * @param  initialCapacity the initial capacity
         * @param  loadFactor      the load factor
         * @param  accessOrder     the ordering mode - <tt>true</tt> for
         *         access-order, <tt>false</tt> for insertion-order
         * @throws IllegalArgumentException if the initial capacity is negative
         *         or the load factor is nonpositive
         */
        public LinkedHashMap(int initialCapacity,
                             float loadFactor,
                             boolean accessOrder) {
            super(initialCapacity, loadFactor);
            this.accessOrder = accessOrder;
        }
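
    To see what the accessOrder flag changes in practice, here is a small sketch that uses only the public LinkedHashMap API. The class and method names (AccessOrderDemo, keyOrder) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AccessOrderDemo {
    // Fills a map with a, b, c, then reads "a" and overwrites "b",
    // and returns the resulting iteration order of the keys.
    static List<String> keyOrder(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a");     // retrieve
        map.put("b", 20); // update
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keyOrder(false)); // [a, b, c]: insertion order is untouched
        System.out.println(keyOrder(true));  // [c, a, b]: accessed keys move to the tail
    }
}
```

    With accessOrder set to false, neither the read nor the overwrite changes the order; with true, the head of the iteration is always the least recently accessed key.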
    

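    As noted earlier, the operations that affect LRU order are create, retrieve, and update, which for a map correspond to put, get, and putAll.
    LinkedHashMap does not override put or putAll, so we have to look at the HashMap source (interested readers can see the article "HashMap去重原理和内部实现", on how HashMap deduplicates keys and how it works internally). Both methods end up calling putVal.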

    /**
         * Implements Map.put and related methods
         *
         * @param hash hash for key
         * @param key the key
         * @param value the value to put
         * @param onlyIfAbsent if true, don't change existing value
         * @param evict if false, the table is in creation mode.
         * @return previous value, or null if none
         */
        final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                       boolean evict) {
            Node<K,V>[] tab; Node<K,V> p; int n, i;
            if ((tab = table) == null || (n = tab.length) == 0)
                n = (tab = resize()).length;
            if ((p = tab[i = (n - 1) & hash]) == null)
                tab[i] = newNode(hash, key, value, null);
            else {
                Node<K,V> e; K k;
                if (p.hash == hash &&
                    ((k = p.key) == key || (key != null && key.equals(k))))
                    e = p;
                else if (p instanceof TreeNode)
                    e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
                else {
                    for (int binCount = 0; ; ++binCount) {
                        if ((e = p.next) == null) {
                            p.next = newNode(hash, key, value, null);
                            if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                                treeifyBin(tab, hash);
                            break;
                        }
                        if (e.hash == hash &&
                            ((k = e.key) == key || (key != null && key.equals(k))))
                            break;
                        p = e;
                    }
                }
                if (e != null) { // existing mapping for key
                    V oldValue = e.value;
                    if (!onlyIfAbsent || oldValue == null)
                        e.value = value;
                    afterNodeAccess(e);
                    return oldValue;
                }
            }
            ++modCount;
            if (++size > threshold)
                resize();
            afterNodeInsertion(evict);
            return null;
        }
    

    When the key already exists (an update), putVal calls afterNodeAccess(e); when a new entry is created, it calls afterNodeInsertion(evict).
    In HashMap itself, both methods have empty bodies.

     // Callbacks to allow LinkedHashMap post-actions
        void afterNodeAccess(Node<K,V> p) { }
        void afterNodeInsertion(boolean evict) { }
    

    These two hooks are clearly there for LinkedHashMap to override.
    Let's look at the first one.

    void afterNodeAccess(Node<K,V> e) { // move node to last
            LinkedHashMap.Entry<K,V> last;
            if (accessOrder && (last = tail) != e) {
                LinkedHashMap.Entry<K,V> p =
                    (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
                p.after = null;
                if (b == null)
                    head = a;
                else
                    b.after = a;
                if (a != null)
                    a.before = b;
                else
                    last = b;
                if (last == null)
                    head = p;
                else {
                    p.before = last;
                    last.after = p;
                }
                tail = p;
                ++modCount;
            }
        }
    

    It moves the given node to the tail of the list, but only when accessOrder is true and the node is not already the tail.
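
    The pointer manipulation is easier to follow in isolation. The following is a simplified, hypothetical stand-alone list (RecencyList, append, and moveToTail are names invented here) that mirrors the before/after/head/tail bookkeeping in two steps: unlink the node, then relink it as the tail. It is a sketch of the same idea, not the JDK code, which additionally guards against cases that cannot occur in this reduced version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone doubly linked list that mirrors the
// before/after/head/tail bookkeeping; it is not the JDK class itself.
class RecencyList {
    static final class Node {
        final String key;
        Node before, after;

        Node(String key) {
            this.key = key;
        }
    }

    Node head, tail;

    // Appends a new node at the tail, which is what an insertion does.
    Node append(String key) {
        Node node = new Node(key);
        if (tail == null) {
            head = node;
        } else {
            node.before = tail;
            tail.after = node;
        }
        tail = node;
        return node;
    }

    // Moves an existing node to the tail, which is what an access does.
    void moveToTail(Node p) {
        if (p == tail) {
            return; // already the most recent entry
        }
        Node b = p.before, a = p.after;
        // Step 1: unlink p. Because p is not the tail, a is never null here.
        if (b == null) {
            head = a;
        } else {
            b.after = a;
        }
        a.before = b;
        // Step 2: relink p behind the current tail.
        p.before = tail;
        p.after = null;
        tail.after = p;
        tail = p;
    }

    // Keys from head (eldest) to tail (youngest).
    List<String> keys() {
        List<String> result = new ArrayList<>();
        for (Node n = head; n != null; n = n.after) {
            result.add(n.key);
        }
        return result;
    }
}
```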

     void afterNodeInsertion(boolean evict) { // possibly remove eldest
            LinkedHashMap.Entry<K,V> first;
            if (evict && (first = head) != null && removeEldestEntry(first)) {
                K key = first.key;
                removeNode(hash(key), key, null, false, true);
            }
        }
    

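    In the put and putAll paths, evict is always true (it is false only when the map is being populated from a constructor, as in the putMapEntries(m, false) call shown earlier). The head != null check is easy to understand: if head is null the list is empty and there is nothing to remove. The interesting part is what removeEldestEntry(first) returns.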

    /**
         * Returns <tt>true</tt> if this map should remove its eldest entry.
         * This method is invoked by <tt>put</tt> and <tt>putAll</tt> after
         * inserting a new entry into the map.  It provides the implementor
         * with the opportunity to remove the eldest entry each time a new one
         * is added.  This is useful if the map represents a cache: it allows
         * the map to reduce memory consumption by deleting stale entries.
         *
         * <p>Sample use: this override will allow the map to grow up to 100
         * entries and then delete the eldest entry each time a new entry is
         * added, maintaining a steady state of 100 entries.
         * <pre>
         *     private static final int MAX_ENTRIES = 100;
         *
         *     protected boolean removeEldestEntry(Map.Entry eldest) {
         *        return size() > MAX_ENTRIES;
         *     }
         * </pre>
         *
         * <p>This method typically does not modify the map in any way,
         * instead allowing the map to modify itself as directed by its
         * return value.  It <i>is</i> permitted for this method to modify
         * the map directly, but if it does so, it <i>must</i> return
         * <tt>false</tt> (indicating that the map should not attempt any
         * further modification).  The effects of returning <tt>true</tt>
         * after modifying the map from within this method are unspecified.
         *
         * <p>This implementation merely returns <tt>false</tt> (so that this
         * map acts like a normal map - the eldest element is never removed).
         *
         * @param    eldest The least recently inserted entry in the map, or if
         *           this is an access-ordered map, the least recently accessed
         *           entry.  This is the entry that will be removed it this
         *           method returns <tt>true</tt>.  If the map was empty prior
         *           to the <tt>put</tt> or <tt>putAll</tt> invocation resulting
         *           in this invocation, this will be the entry that was just
         *           inserted; in other words, if the map contains a single
         *           entry, the eldest entry is also the newest.
         * @return   <tt>true</tt> if the eldest entry should be removed
         *           from the map; <tt>false</tt> if it should be retained.
         */
        protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
            return false;
        }
    

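    The default implementation returns false, but the comment already gives an example: define a MAX_ENTRIES limit and return true once the size exceeds it. In other words, LRU eviction is conditional, and the condition is ours to choose; the simplest is a size limit, removing the eldest entry whenever the count goes over it.
    To summarize: on an update, the node is moved to the tail (when accessOrder is true), and putVal returns before reaching afterNodeInsertion, so no eviction is triggered (the entry count has not changed anyway). On a create, the insertion itself appends the node to the tail, so the only remaining question is whether the head should be removed.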
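
    Putting the two pieces together, a size-bounded LRU cache can be sketched in a few lines. The class name SimpleLruCache and the maxEntries parameter are assumptions made for this example; the constructor arguments 16 and 0.75f are simply the defaults quoted in the constructor comments above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal sketch of an LRU cache on top of LinkedHashMap.
// Like LinkedHashMap itself, it is not synchronized.
class SimpleLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    SimpleLruCache(int maxEntries) {
        // accessOrder = true: put and get move the touched entry to the tail.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called after each insertion; returning true removes the head (eldest) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

    For example, with a limit of 2, inserting a and b, reading a, and then inserting c evicts b, because b is the least recently accessed entry at that point.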
    Finally, let's look at retrieve.

    /**
         * Returns the value to which the specified key is mapped,
         * or {@code null} if this map contains no mapping for the key.
         *
         * <p>More formally, if this map contains a mapping from a key
         * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
         * key.equals(k))}, then this method returns {@code v}; otherwise
         * it returns {@code null}.  (There can be at most one such mapping.)
         *
         * <p>A return value of {@code null} does not <i>necessarily</i>
         * indicate that the map contains no mapping for the key; it's also
         * possible that the map explicitly maps the key to {@code null}.
         * The {@link #containsKey containsKey} operation may be used to
         * distinguish these two cases.
         */
        public V get(Object key) {
            Node<K,V> e;
            if ((e = getNode(hash(key), key)) == null)
                return null;
            if (accessOrder)
                afterNodeAccess(e);
            return e.value;
        }
    

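    This is essentially HashMap's get, so there is no need to analyze the lookup again. The only addition is the final check: when accessOrder is true, it calls afterNodeAccess(e), which behaves exactly as described above.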
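
    One practical consequence follows from the ++modCount in afterNodeAccess and from the class comment's remark that get is a structural modification in access-ordered maps: calling get while iterating can trip the fail-fast iterator. The sketch below (the class and method names are made up for illustration) shows the effect in a single thread. Since the class comment describes fail-fast behavior as best-effort, treat this as a demonstration rather than a guarantee.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

class GetDuringIterationDemo {
    // Returns true if calling get() inside a keySet() loop makes the
    // fail-fast iterator throw ConcurrentModificationException.
    static boolean getBreaksIteration(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // In access order, this moves a non-tail entry and bumps modCount.
                map.get(key);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(getBreaksIteration(false));
        System.out.println(getBreaksIteration(true));
    }
}
```

    Iterating over entrySet() and reading each value with getValue() avoids the problem, because, as the class comment says, operations on collection views do not count as entry accesses.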

    Article link: https://www.haomeiwen.com/subject/uzrtsxtx.html