Reference in Java (Part 2): Refer in JDK 1.8


Author: 冬天里的懒喵 | Published: 2020-07-11 16:56


    1. Structure of Reference in Java 1.8

    In JDK 1.8, Reference lives in the java.lang.ref package.


    image.png

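    The main classes are Reference, SoftReference, WeakReference, PhantomReference, FinalReference, and Finalizer. The core is the abstract class Reference, which all other reference classes extend. SoftReference, WeakReference, and PhantomReference correspond to Java's soft, weak, and phantom references. A strong reference is the default kind of reference, created by ordinary assignment with the equals sign, so it has no dedicated class. FinalReference exists mainly to support the Finalizer mechanism. Finalizer has many problems of its own; JDK 9 deprecated Object.finalize() and introduced java.lang.ref.Cleaner, built on PhantomReference, as the recommended alternative. This article covers JDK 1.8 only and does not go into JDK 9.
    Another key class is ReferenceQueue, which receives reference objects after the collector has processed their referents.
    The relationships among the classes in the java.lang.ref package are shown in the figure below: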


    image.png

    You can also view them with the Diagram feature of IntelliJ IDEA:


    image.png

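    The reference classes above are summarized in the following table:

    Class | Reference type | Description
    SoftReference | Soft reference | The referent is reclaimed when heap memory runs low
    WeakReference | Weak reference | The referent is reclaimed at the next garbage collection once it is no longer strongly or softly reachable
    PhantomReference | Phantom reference | Has no effect on the referent's lifetime; used only to be notified that the object is being reclaimed
    FinalReference | - | Internal class that Java uses to implement finalization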
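The three public reference types in the table can be compared with a minimal sketch. The class name ReferenceKindsDemo and the method probe are made up for illustration; only the java.lang.ref classes are real API. It checks what get() returns while the target is still strongly reachable.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class ReferenceKindsDemo {
    // Returns { soft sees target, weak sees target, phantom hides target }.
    static boolean[] probe(Object target) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        SoftReference<Object> soft = new SoftReference<>(target);
        WeakReference<Object> weak = new WeakReference<>(target);
        PhantomReference<Object> phantom = new PhantomReference<>(target, queue);
        // target is still strongly reachable here, so soft and weak return it;
        // a phantom reference never exposes its referent.
        return new boolean[] {
            soft.get() == target,
            weak.get() == target,
            phantom.get() == null
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probe(new Object()))); // [true, true, true]
    }
}
```

Nothing here depends on when the collector runs, because the target stays strongly reachable for the whole call.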

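    2. References and Reachability

    To understand Reference, you need a closer look at how GC works.
    The previous article showed how the reference types defined by the JVM are used.
    To decide whether an object can be reclaimed, the GC searches downward from the GC Root nodes and computes each object's reachability, then acts on the result. There are five levels of reachability; the first four correspond to the four reference types.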

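    Reachability | Reference type | Description
    Strongly Reachable | Strong Reference | An object is strongly reachable if some thread can reach it through strong references alone.
    Softly Reachable | Soft Reference | An object is softly reachable if it is not strongly reachable but can be reached through a soft reference.
    Weakly Reachable | Weak Reference | An object is weakly reachable if it is neither strongly nor softly reachable but can be reached through a weak reference.
    Phantom Reachable | Phantom Reference | An object is phantom reachable if it is neither strongly, softly, nor weakly reachable, it has been finalized, and some phantom reference refers to it.
    Unreachable | - | An object is unreachable if it cannot be reached in any of the ways above; it is then eligible for reclamation.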

    That is the concept of reachability. The following example makes it more concrete:


    image.png

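    In the example above, objects A through D each have exactly one reference: A a strong reference, B a soft reference, C a weak reference, and D a phantom reference. Their reachability is therefore: A strongly reachable, B softly reachable, C weakly reachable, D phantom reachable. E has no reference chain from a GC Root, so it is unreachable.
    Now consider a more complex example: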


    image.png
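    • A still has only a strong reference, so A is strongly reachable.
    • B has two references, one strong and one soft. Because B can be reached through the strong reference, B is strongly reachable.
    • C can be reached only through a weak reference, so it is weakly reachable.
    • D has a weak reference and a phantom reference, so it is weakly reachable.
    • Although a strong reference exists between E and F, E cannot be reached from any GC Root, so it is still unreachable.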

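    These are the five levels of reachability in the JVM. As you can see, the JVM relies on the four subclasses of Reference to let the GC treat objects differently once they are no longer strongly reachable.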
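The rule that the strongest path wins (object B in the second example) can be checked with a small sketch. StrongestPathDemo and survivesGc are hypothetical names; System.gc() is only a hint to the JVM, and the sketch relies only on the guarantee that a strongly reachable object is never cleared.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the JDK.
class StrongestPathDemo {
    static boolean survivesGc() {
        Object b = new Object();                       // strong path to B
        SoftReference<Object> soft = new SoftReference<>(b);
        WeakReference<Object> weak = new WeakReference<>(b);
        System.gc();                                   // a request, not a guarantee
        // b is used below, so it is strongly reachable across the GC request
        // and neither reference may have been cleared.
        return soft.get() == b && weak.get() == b;
    }

    public static void main(String[] args) {
        System.out.println(survivesGc());
    }
}
```

What happens after the strong path is dropped is up to the collector, so the sketch deliberately does not assert on that case.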

    3. Reference Source Code

    3.1 Core Source

    Start with the class comment of Reference:

    /**
     * Abstract base class for reference objects.  This class defines the
     * operations common to all reference objects.  Because reference objects are
     * implemented in close cooperation with the garbage collector, this class may
     * not be subclassed directly.
     *
     * @author   Mark Reinhold
     * @since    1.2
     */
    

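    The comment says that this abstract class is the base class of all reference objects, defines the operations common to them, and is implemented in close cooperation with the garbage collector. Its constructors are package-private, so application code cannot subclass Reference directly. In other words, the JVM treats this class and its subclasses specially: the collector is hard-coded to recognize the concrete classes SoftReference, WeakReference, PhantomReference, and so on, and to handle their referent field specially, which is what gives each reference type its behavior. Without that, Reference would be no different from an ordinary class.
    Reference provides two core capabilities:

    • Implementing the specific reference types
    • Letting users be notified after an object has been reclaimed

    The first is already clear from the above. For the second, how does the GC send a notification once an object has been collected? The GC is highly performance-sensitive, so a conventional callback model is not an option: if a callback blocked or ran slowly during a Full GC, the whole JVM would stall. Reference takes a different approach and adds the Reference whose referent was reclaimed to a queue, from which users fetch it later when they need to. This also explains why soft and weak references offer two constructors, with or without a ReferenceQueue. A phantom reference exists only to deliver notifications, so it must be used with a ReferenceQueue.
    Now look at the source of Reference: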
    public abstract class Reference<T> {
        // Treated specially by the GC
        private T referent;         /* Treated specially by GC */ 
        // The Reference is added to this queue after its referent has been reclaimed
        volatile ReferenceQueue<? super T> queue;
        
        
        /* -- Constructors -- */
        // For callers that only want the special reference semantics and do not need GC notifications, so no ReferenceQueue is required
        Reference(T referent) {
            this(referent, null);
        }
        // A queue is passed in: once the GC has reclaimed the referent, this Reference is added to that queue
        Reference(T referent, ReferenceQueue<? super T> queue) {
            this.referent = referent;
            this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
        }
    
    }
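The difference between the two constructors can be observed through the public API without waiting for the collector, by calling enqueue() by hand. This is a minimal sketch; QueueRegistrationDemo and run are hypothetical names.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class QueueRegistrationDemo {
    static boolean[] run() {
        Object target = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> unregistered = new WeakReference<>(target);
        WeakReference<Object> registered = new WeakReference<>(target, queue);

        boolean first = unregistered.enqueue();   // no queue registered: nothing to do
        boolean second = registered.enqueue();    // goes into the registered queue
        Reference<?> polled = queue.poll();       // the user fetches it later
        return new boolean[] { first, second, polled == registered };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // [false, true, true]
    }
}
```

Calling enqueue() manually stands in for what normally happens after the collector has processed the referent; the polling side is the same in both cases.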
    

    3.2 Reference States

    Reference defines the following states for an instance:

     /* A Reference instance is in one of four possible internal states:
         *
         *     Active: Subject to special treatment by the garbage collector.  Some
         *     time after the collector detects that the reachability of the
         *     referent has changed to the appropriate state, it changes the
         *     instance's state to either Pending or Inactive, depending upon
         *     whether or not the instance was registered with a queue when it was
         *     created.  In the former case it also adds the instance to the
         *     pending-Reference list.  Newly-created instances are Active.
         *
         *     Pending: An element of the pending-Reference list, waiting to be
         *     enqueued by the Reference-handler thread.  Unregistered instances
         *     are never in this state.
         *
         *     Enqueued: An element of the queue with which the instance was
         *     registered when it was created.  When an instance is removed from
         *     its ReferenceQueue, it is made Inactive.  Unregistered instances are
         *     never in this state.
         *
         *     Inactive: Nothing more to do.  Once an instance becomes Inactive its
         *     state will never change again.
         *
         * The state is encoded in the queue and next fields as follows:
         *
         *     Active: queue = ReferenceQueue with which instance is registered, or
         *     ReferenceQueue.NULL if it was not registered with a queue; next =
         *     null.
         *
         *     Pending: queue = ReferenceQueue with which instance is registered;
         *     next = this
         *
         *     Enqueued: queue = ReferenceQueue.ENQUEUED; next = Following instance
         *     in queue, or this if at end of list.
         *
         *     Inactive: queue = ReferenceQueue.NULL; next = this.
         *
         * With this scheme the collector need only examine the next field in order
         * to determine whether a Reference instance requires special treatment: If
         * the next field is null then the instance is active; if it is non-null,
         * then the collector should treat the instance normally.
         *
         * To ensure that a concurrent collector can discover active Reference
         * objects without interfering with application threads that may apply
         * the enqueue() method to those objects, collectors should link
         * discovered objects through the discovered field. The discovered
         * field is also used for linking Reference objects in the pending list.
         */
    

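    That is a long English comment. When studying the Java source, understanding comments like this is often more important than reading the code: the code only shows how something is implemented, while the comments often explain why it is implemented that way.
    The comment divides the life of a Reference into four states: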

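    State | Description
    Active | A newly created instance is Active. After the collector detects that the referent's reachability has changed to the appropriate state, it moves the instance to Pending if a ReferenceQueue was registered at creation (and adds it to the pending-Reference list), or to Inactive otherwise
    Pending | The instance is on the pending-Reference list, waiting for the Reference-handler thread to enqueue it in its ReferenceQueue. Instances with no registered ReferenceQueue are never in this state
    Enqueued | The instance is in the ReferenceQueue it was registered with. When it is removed from the ReferenceQueue it becomes Inactive. Instances with no registered ReferenceQueue are never in this state
    Inactive | The final state; once an instance becomes Inactive its state never changes again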

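    For these four states, the queue and next fields of a Reference hold the following values:

    State | queue | next
    Active | The registered ReferenceQueue, or ReferenceQueue.NULL | null
    Pending | The registered ReferenceQueue | this
    Enqueued | ReferenceQueue.ENQUEUED | The next instance in the queue, or this if at the end of the list
    Inactive | ReferenceQueue.NULL | this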
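Part of this state machine is visible from user code. The following sketch uses manual enqueue(), which goes straight from Active to Enqueued and skips the Pending state that only the collector produces. ReferenceStateDemo and trace are hypothetical names; isEnqueued() is real API, though it is deprecated in recent JDKs.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the JDK.
class ReferenceStateDemo {
    static String trace() {
        Object target = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(target, queue);

        StringBuilder sb = new StringBuilder();
        sb.append(ref.isEnqueued());                 // Active: not enqueued
        sb.append(',').append(ref.enqueue());        // Active -> Enqueued
        sb.append(',').append(ref.isEnqueued());     // queue field is now ENQUEUED
        sb.append(',').append(queue.poll() == ref);  // Enqueued -> Inactive
        sb.append(',').append(ref.isEnqueued());     // Inactive: not enqueued
        sb.append(',').append(ref.enqueue());        // Inactive is final: cannot re-enqueue
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace()); // false,true,true,true,false,false
    }
}
```

The last value shows the rule from the table: once an instance is Inactive, its state never changes again.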

    The state diagram is shown below:


    image.png

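    The comment above mentions a pending-Reference list in addition to the ReferenceQueue. What is the list for? You might expect the JVM to put a Reference straight into its ReferenceQueue once the GC has processed it, but that is not what happens. The GC has to stay fast, and data in a ReferenceQueue does not need to be delivered that promptly, so the GC only adds the Reference to the pending-Reference list. This is a lightweight operation and very cheap. The static field pending in Reference is the head of this list, and the discovered field of each Reference points to the next node.
    Here is the relevant Reference source: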

        /* List of References waiting to be enqueued.  The collector adds
         * References to this list, while the Reference-handler thread removes
         * them.  This list is protected by the above lock object. The
         * list uses the discovered field to link its elements.
         */
        private static Reference<Object> pending = null;
        
            /* When active:   next element in a discovered reference list maintained by GC (or this if last)
         *     pending:   next element in the pending list (or null if last)
         *   otherwise:   NULL
         */
        transient private Reference<T> discovered;  /* used by VM */
    

    The GC moves Active references whose referents' reachability has changed onto the pending list.
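The shape of the pending list, a static head plus a discovered link inside each element, can be modelled in a few lines. This is a simplified model, not the JDK code: PendingListModel, Ref, addPending, and takePending are hypothetical names, and inserting at the head is an assumption about what the collector does.

```java
// Simplified, hypothetical model of the pending-Reference list.
class PendingListModel {
    static final class Ref {
        final String name;
        Ref discovered;                 // link to the next element of the pending list
        Ref(String name) { this.name = name; }
    }

    private Ref pending;                // head of the list (static in the real class)

    // Plays the role of the collector: link the element in at the head (assumed order).
    void addPending(Ref r) {
        r.discovered = pending;
        pending = r;
    }

    // Plays the role of the handler: unlink the head, as tryHandlePending does.
    Ref takePending() {
        Ref r = pending;
        if (r != null) {
            pending = r.discovered;
            r.discovered = null;
        }
        return r;
    }

    public static void main(String[] args) {
        PendingListModel list = new PendingListModel();
        list.addPending(new Ref("a"));
        list.addPending(new Ref("b"));
        System.out.println(list.takePending().name); // b
        System.out.println(list.takePending().name); // a
        System.out.println(list.takePending());      // null
    }
}
```

Because the link lives in the element itself, adding to the list allocates nothing, which fits the requirement that the operation be cheap.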

    3.3 ReferenceHandler

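    As described above, the GC only adds a reference to the pending-Reference list. When does it reach its ReferenceQueue? That step runs on a separate thread, ReferenceHandler, which is a nested class of Reference. For thread safety there is also a global lock: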

        /* Object used to synchronize with the garbage collector.  The collector
         * must acquire this lock at the beginning of each collection cycle.  It is
         * therefore critical that any code holding this lock complete as quickly
         * as possible, allocate no new objects, and avoid calling user code.
         */
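         // The GC must acquire this lock during collection, which synchronizes it with the ReferenceHandler thread and keeps the pending list thread-safe.
         // Because the GC needs this lock too, code in ReferenceHandler that holds it must finish quickly, allocate no new objects, and call no user code, so that it does not hold up the GC.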
        static private class Lock { }
        private static Lock lock = new Lock();
        /* High-priority thread to enqueue pending References
         */
        private static class ReferenceHandler extends Thread {
    
            private static void ensureClassInitialized(Class<?> clazz) {
                try {
                    Class.forName(clazz.getName(), true, clazz.getClassLoader());
                } catch (ClassNotFoundException e) {
                    throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
                }
            }
    
            static {
                // pre-load and initialize InterruptedException and Cleaner classes
                // so that we don't get into trouble later in the run loop if there's
                // memory shortage while loading/initializing them lazily.
                ensureClassInitialized(InterruptedException.class);
                ensureClassInitialized(Cleaner.class);
            }
    
            ReferenceHandler(ThreadGroup g, String name) {
                super(g, name);
            }
    
            public void run() {
                while (true) {
                    tryHandlePending(true);
                }
            }
        }
    
    

    The core logic of the thread is in tryHandlePending:

    /**
         * Try handle pending {@link Reference} if there is one.<p>
         * Return {@code true} as a hint that there might be another
         * {@link Reference} pending or {@code false} when there are no more pending
         * {@link Reference}s at the moment and the program can do some other
         * useful work instead of looping.
         *
         * @param waitForNotify if {@code true} and there was no pending
         *                      {@link Reference}, wait until notified from VM
         *                      or interrupted; if {@code false}, return immediately
         *                      when there is no pending {@link Reference}.
         * @return {@code true} if there was a {@link Reference} pending and it
         *         was processed, or we waited for notification and either got it
         *         or thread was interrupted before being notified;
         *         {@code false} otherwise.
         */
        static boolean tryHandlePending(boolean waitForNotify) {
            Reference<Object> r;
            Cleaner c;
            try {
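            // Take the lock so we do not run concurrently with the garbage collector
                synchronized (lock) {
                 // Check whether the pending-Reference list has any entries
                    if (pending != null) {
                     // There is a pending Reference: take it off the head of the list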
                        r = pending;
                        // 'instanceof' might throw OutOfMemoryError sometimes
                        // so do this before un-linking 'r' from the 'pending' chain...
                        c = r instanceof Cleaner ? (Cleaner) r : null;
                        // unlink 'r' from 'pending' chain
                        pending = r.discovered;
                        r.discovered = null;
                    } else {
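                     // No pending Reference: call wait() if requested.
                        //
                        // Waiting on the lock can throw OutOfMemoryError: if the thread is
                        // interrupted, an InterruptedException object must be allocated, and
                        // that allocation can fail when memory is short. The OutOfMemoryError
                        // is caught below so the ReferenceHandler thread does not exit silently.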
                        // The waiting on the lock may cause an OutOfMemoryError
                        // because it may try to allocate exception objects.
                        if (waitForNotify) {
                            lock.wait();
                        }
                        // retry if waited
                        return waitForNotify;
                    }
                }
            } catch (OutOfMemoryError x) {
                // Give other threads CPU time so they hopefully drop some live references
                // and GC reclaims some space.
                // Also prevent CPU intensive spinning in case 'r instanceof Cleaner' above
                // persistently throws OOME for some time...
                Thread.yield();
                // retry
                return true;
            } catch (InterruptedException x) {
                // retry
                return true;
            }
          // Cleaner instances are cleaned directly instead of being enqueued
            // Fast path for cleaners
            if (c != null) {
                c.clean();
                return true;
            }
    
            ReferenceQueue<? super Object> q = r.queue;
            // Enqueue only if the reference was registered with a queue (queue != ReferenceQueue.NULL)
            if (q != ReferenceQueue.NULL) q.enqueue(r);
            return true;
        }
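The control flow of tryHandlePending (take the head under the lock, wait when the list is empty, do the delivery outside the lock) can be modelled with ordinary objects. This is a simplified sketch, not the JDK implementation: HandlerLoopModel, publish, delivered, and demo are hypothetical names, and strings stand in for Reference objects.

```java
import java.util.ArrayDeque;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Simplified, hypothetical model of the handler loop.
class HandlerLoopModel {
    private final Object lock = new Object();
    private final ArrayDeque<String> pending = new ArrayDeque<>();
    final BlockingQueue<String> delivered = new LinkedBlockingQueue<>();

    // Producer side, playing the role of the collector.
    void publish(String item) {
        synchronized (lock) {
            pending.push(item);
            lock.notifyAll();
        }
    }

    // One iteration of the loop; returns true if the caller should try again.
    boolean tryHandlePending(boolean waitForNotify) {
        String item;
        try {
            synchronized (lock) {
                if (!pending.isEmpty()) {
                    item = pending.pop();          // unlink the head while holding the lock
                } else {
                    if (waitForNotify) {
                        lock.wait();               // sleep until publish() notifies
                    }
                    return waitForNotify;          // retry only if we actually waited
                }
            }
        } catch (InterruptedException e) {
            return true;                           // retry
        }
        delivered.add(item);                       // delivery happens outside the lock
        return true;
    }

    static String demo() {
        HandlerLoopModel model = new HandlerLoopModel();
        Thread handler = new Thread(() -> {
            while (true) {
                model.tryHandlePending(true);
            }
        }, "model-handler");
        handler.setDaemon(true);
        handler.start();
        model.publish("ref-1");
        try {
            return model.delivered.poll(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // ref-1
    }
}
```

As in the original, the work done while holding the lock is limited to unlinking one element, and the non-blocking variant (waitForNotify set to false) returns false as soon as the list is empty.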
    
    

    The ReferenceHandler thread is started from the static initializer of Reference:

      static {
            ThreadGroup tg = Thread.currentThread().getThreadGroup();
            for (ThreadGroup tgn = tg;
                 tgn != null;
                 tg = tgn, tgn = tg.getParent());
            Thread handler = new ReferenceHandler(tg, "Reference Handler");
            /* If there were a special system-only priority greater than
             * MAX_PRIORITY, it would be used here
             */
            handler.setPriority(Thread.MAX_PRIORITY);
            handler.setDaemon(true);
            handler.start();
    
            // provide access in SharedSecrets
            SharedSecrets.setJavaLangRefAccess(new JavaLangRefAccess() {
                @Override
                public boolean tryHandlePendingReference() {
                    return tryHandlePending(false);
                }
            });
        }
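The effect of this static initializer can usually be seen in a running JVM by looking for a thread named "Reference Handler". The sketch below assumes a HotSpot-based JDK where that thread shows up in Thread.getAllStackTraces(); FindReferenceHandler and find are hypothetical names, and find returns null if no such thread is visible.

```java
// Hypothetical helper, not part of the JDK.
class FindReferenceHandler {
    // Looks for the thread by the name passed to the ReferenceHandler constructor.
    static Thread find() {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if ("Reference Handler".equals(t.getName())) {
                return t;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Thread handler = find();
        if (handler == null) {
            System.out.println("Reference Handler thread not visible on this JVM");
        } else {
            System.out.println(handler.getName()
                    + " priority=" + handler.getPriority()
                    + " daemon=" + handler.isDaemon());
        }
    }
}
```

On a JVM where the thread is visible, the printed priority and daemon flag should match the setPriority and setDaemon calls in the initializer.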
    

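    As the code shows, ReferenceHandler runs at Thread.MAX_PRIORITY, the highest priority, as a daemon thread. Its main job is to move each Reference from the pending-Reference list to its ReferenceQueue. Because the GC takes the same lock, code holding that lock must finish quickly, allocate no new objects, and call no user code, so that it does not interfere with the GC.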

    4. ReferenceQueue

    Next, look at the source of ReferenceQueue.

    /**
     * Reference queues, to which registered reference objects are appended by the
     * garbage collector after the appropriate reachability changes are detected.
     *
     * @author   Mark Reinhold
     * @since    1.2
     */
    

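    After a Reference has been registered with a queue, it is appended to that queue once the GC detects the appropriate reachability change. The queue itself is implemented as a linked list.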

        // Head of the linked list of references
        private volatile Reference<? extends T> head = null;
        // Queue length: incremented on enqueue, decremented on dequeue
        private long queueLength = 0;
    

    To work across multiple threads, it also uses a lock:

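        // Static nested class used as the lock object
        static private class Lock { };
        /* Mutex that synchronizes enqueue (called by ReferenceHandler) with the remove and poll operations called by user threads */
        private Lock lock = new Lock();
        
          // Sentinel meaning "not registered with any queue"
        static ReferenceQueue<Object> NULL = new Null<>();
        // Sentinel meaning "already enqueued in its queue"
        static ReferenceQueue<Object> ENQUEUED = new Null<>();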
    
    

    The key method is enqueue:

     boolean enqueue(Reference<? extends T> r) { /* Called only by Reference class */
            // Acquire the lock
            synchronized (lock) {
                // Decide whether the reference still needs to be enqueued
                // Check that since getting the lock this reference hasn't already been
                // enqueued (and even then removed)
                ReferenceQueue<?> queue = r.queue;
                  // If the reference's queue is ReferenceQueue.NULL or ReferenceQueue.ENQUEUED, enqueueing fails and returns false
                if ((queue == NULL) || (queue == ENQUEUED)) {
                    return false;
                }
                assert queue == this;
                // Mark the reference as enqueued by pointing r.queue at the shared ENQUEUED sentinel
                r.queue = ENQUEUED;
                // If the list is empty, r links to itself to mark the end of the list; otherwise its next is the old head
                r.next = (head == null) ? r : head;
                // r becomes the new head: every newly enqueued reference is inserted at the head, ahead of the existing ones
                head = r;
                // Increment the queue length
                queueLength++;
                // Special case for FinalReference: the VM keeps a count
                if (r instanceof FinalReference) {
                    sun.misc.VM.addFinalRefCount(1);
                }
                lock.notifyAll();
                return true;
            }
        }
    

    The poll and reallyPoll methods:

     // The actual poll operation; must be called with the lock held
        private Reference<? extends T> reallyPoll() {       /* Must hold lock */
            Reference<? extends T> r = head;
            if (r != null) {
                @SuppressWarnings("unchecked")
                Reference<? extends T> rn = r.next;
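                // Advance head to the next node; if next is r itself, r was the last element, so the queue becomes empty
                head = (rn == r) ? null : rn;
                r.queue = NULL;
                // Self-link r.next: together with queue = NULL this marks r as Inactive and prevents it from being dequeued again
                r.next = r;
                // Decrement the queue length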
                queueLength--;
                if (r instanceof FinalReference) {
                    sun.misc.VM.addFinalRefCount(-1);
                }
                return r;
            }
            return null;
        }
    
        // Public poll: returns immediately if the queue is empty, otherwise takes the lock and calls reallyPoll
        public Reference<? extends T> poll() {
            if (head == null)
                return null;
            synchronized (lock) {
                return reallyPoll();
            }
        }
    

    The remove method, which removes the next reference from the queue:

    // Removes the next reference from the queue, blocking up to the timeout; built on reallyPoll plus the wait/notify mechanism of Object
        public Reference<? extends T> remove(long timeout)
            throws IllegalArgumentException, InterruptedException
        {
            if (timeout < 0) {
                throw new IllegalArgumentException("Negative timeout value");
            }
            synchronized (lock) {
                Reference<? extends T> r = reallyPoll();
                if (r != null) return r;
                long start = (timeout == 0) ? 0 : System.nanoTime();
                for (;;) {
                    lock.wait(timeout);
                    r = reallyPoll();
                    if (r != null) return r;
                    if (timeout != 0) {
                        long end = System.nanoTime();
                        timeout -= (end - start) / 1000_000;
                        if (timeout <= 0) return null;
                        start = end;
                    }
                }
            }
        }
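From the caller's side, remove(timeout) behaves like a blocking poll. The sketch below uses only public API and a manual enqueue() so that it does not depend on GC timing; RemoveWithTimeoutDemo and run are hypothetical names, and the timeouts are arbitrary.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class RemoveWithTimeoutDemo {
    static boolean[] run() {
        Object target = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(target, queue);
        try {
            // Nothing has been enqueued yet, so this waits about 50 ms and returns null.
            boolean timedOut = queue.remove(50) == null;
            ref.enqueue();
            // The reference is already in the queue, so this returns without waiting.
            boolean delivered = queue.remove(1000) == ref;
            return new boolean[] { timedOut, delivered, target != null };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new boolean[] { false, false, false };
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

As the source shows, a timeout of 0 means lock.wait(0), so remove(0) blocks until a reference becomes available.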
    

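    As the code shows, ReferenceQueue itself stores only the head of the Reference linked list. The nodes of the list are the Reference instances themselves, chained through their next field. ReferenceQueue provides the enqueue, poll, and remove operations on that list.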
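The linking scheme used by enqueue and reallyPoll (insert at the head, self-link the last node, self-link a removed node) can be reproduced in a stand-alone model. This is a simplified sketch, not the JDK class: IntrusiveQueueModel and Node are hypothetical names, and the model folds the queue/next state encoding into the next field alone.

```java
// Simplified, hypothetical model of the intrusive list inside ReferenceQueue.
class IntrusiveQueueModel {
    static final class Node {
        final String name;
        Node next;                              // the link lives in the element itself
        Node(String name) { this.name = name; }
    }

    private Node head;                          // the queue stores only the head
    private long length;

    synchronized boolean enqueue(Node n) {
        if (n.next != null) {
            return false;                       // already enqueued, or already removed
        }
        n.next = (head == null) ? n : head;     // self-link marks the end of the list
        head = n;                               // new elements go in at the head
        length++;
        return true;
    }

    synchronized Node poll() {
        Node n = head;
        if (n == null) {
            return null;
        }
        head = (n.next == n) ? null : n.next;   // last element: the queue becomes empty
        n.next = n;                             // self-link marks the node as removed
        length--;
        return n;
    }

    synchronized long length() {
        return length;
    }

    public static void main(String[] args) {
        IntrusiveQueueModel q = new IntrusiveQueueModel();
        q.enqueue(new Node("a"));
        q.enqueue(new Node("b"));
        System.out.println(q.poll().name);      // b
        System.out.println(q.poll().name);      // a
        System.out.println(q.poll());           // null
    }
}
```

One consequence of head insertion is visible in the output: the model hands elements back in last-in, first-out order, not in arrival order.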
    The complete relationship between Reference and ReferenceQueue is shown in the figure below:


    image.png

    5. Source of the Other Reference Classes

    5.1 SoftReference

    The implementation of SoftReference is simple: it extends Reference and only adds a timestamp.

        /**
         * Timestamp clock, updated by the garbage collector
         */
        static private long clock;
    
        /**
         * Timestamp updated by each invocation of the get method.  The VM may use
         * this field when selecting soft references to be cleared, but it is not
         * required to do so.
         */
        private long timestamp;
    

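    SoftReference has a global variable clock (the static field clock of java.lang.ref.SoftReference), which holds the time of the most recent GC in milliseconds; it is reset every time a GC runs. Each java.lang.ref.SoftReference instance also has a timestamp field, which is set to the value of clock (the time of the latest GC) whenever the referent is successfully retrieved through the SoftReference. The timestamp of a SoftReference instance therefore records the time of the latest GC as of the instance's last access.
    The get method is as follows: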

        /**
         * Returns this reference object's referent.  If this reference object has
         * been cleared, either by the program or by the garbage collector, then
         * this method returns <code>null</code>.
         *
         * @return   The object to which this reference refers, or
         *           <code>null</code> if this reference object has been cleared
         */
        public T get() {
            T o = super.get();
            if (o != null && this.timestamp != clock)
                this.timestamp = clock;
            return o;
        }
    

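    Each call to get only updates this timestamp, while the GC updates clock on each collection. Comparing the two tells the collector how long the SoftReference has gone unaccessed, and it can then decide whether to clear it.
    When a GC runs, two factors determine whether the object held by a SoftReference is reclaimed:
    1. How old the timestamp of the SoftReference instance is;
    2. How much free memory is available.
    The details of the reclamation process are beyond the scope of this article.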
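One way the two factors could be combined is sketched below. It is loosely modelled on the idea behind HotSpot's -XX:SoftRefLRUPolicyMSPerMB option (allow a fixed number of idle milliseconds per free megabyte of heap), but it is an illustration only and not the collector's actual code. SoftRefPolicySketch, shouldClear, and all parameter names are hypothetical.

```java
// Hypothetical policy sketch; not the JVM's implementation.
class SoftRefPolicySketch {
    /**
     * @param clock      time of the latest GC, in ms (the static clock field)
     * @param timestamp  value of clock when the reference was last read via get()
     * @param freeHeapMb free heap in megabytes (assumed input)
     * @param msPerMb    idle milliseconds tolerated per free megabyte (assumed input)
     */
    static boolean shouldClear(long clock, long timestamp, long freeHeapMb, long msPerMb) {
        long idleMs = clock - timestamp;            // factor 1: how old the timestamp is
        long allowedIdleMs = freeHeapMb * msPerMb;  // factor 2: how much memory is free
        return idleMs > allowedIdleMs;
    }

    public static void main(String[] args) {
        // Idle for 6000 ms with 5 MB free and 1000 ms/MB: allowed 5000 ms, so clear.
        System.out.println(shouldClear(10_000, 4_000, 5, 1000));   // true
        // Same idle time with 10 MB free: allowed 10000 ms, so keep.
        System.out.println(shouldClear(10_000, 4_000, 10, 1000));  // false
    }
}
```

Under a policy of this shape, a soft reference survives longer when it is read often or when plenty of heap is free.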

    5.2 WeakReference

    WeakReference declares only constructors; all of its other methods are inherited from Reference.

        /**
         * Creates a new weak reference that refers to the given object.  The new
         * reference is not registered with any queue.
         *
         * @param referent object the new weak reference will refer to
         */
        public WeakReference(T referent) {
            super(referent);
        }
    
        /**
         * Creates a new weak reference that refers to the given object and is
         * registered with the given queue.
         *
         * @param referent object the new weak reference will refer to
         * @param q the queue with which the reference is to be registered,
         *          or <tt>null</tt> if registration is not required
         */
        public WeakReference(T referent, ReferenceQueue<? super T> q) {
            super(referent, q);
        }
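A common use of the two-argument constructor is a cache whose entries disappear once their values are collected. The sketch below is a minimal, single-threaded illustration: WeakValueCache, Entry, entry, and expunge are hypothetical names, and entry exists only so that the behavior can be exercised without waiting for a real GC.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache, not part of the JDK and not thread-safe.
class WeakValueCache<K, V> {
    // The reference remembers its key so the map entry can be removed later.
    static final class Entry<K, V> extends WeakReference<V> {
        final K key;
        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    void put(K key, V value) {
        expunge();
        map.put(key, new Entry<>(key, value, queue));
    }

    V get(K key) {
        expunge();
        Entry<K, V> e = map.get(key);
        return e == null ? null : e.get();
    }

    int size() {
        expunge();
        return map.size();
    }

    // Hook for demonstrations and tests only.
    Entry<K, V> entry(K key) {
        return map.get(key);
    }

    // Drop map entries whose references have arrived in the queue.
    private void expunge() {
        Reference<? extends V> r;
        while ((r = queue.poll()) != null) {
            Entry<?, ?> e = (Entry<?, ?>) r;
            map.remove(e.key, e);       // remove only if the key still maps to this entry
        }
    }
}
```

A short usage example, simulating collection by clearing and enqueueing the entry by hand:

```java
WeakValueCache<String, Object> cache = new WeakValueCache<>();
Object value = new Object();
cache.put("a", value);
cache.entry("a").clear();      // stand-in for the GC clearing the referent
cache.entry("a").enqueue();    // stand-in for the ReferenceHandler enqueueing it
System.out.println(cache.size());   // 0
```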
    

    5.3 PhantomReference

    PhantomReference has a single constructor, which takes a ReferenceQueue. It must be used together with a ReferenceQueue.

        /**
         * Creates a new phantom reference that refers to the given object and
         * is registered with the given queue.
         *
         * <p> It is possible to create a phantom reference with a <tt>null</tt>
         * queue, but such a reference is completely useless: Its <tt>get</tt>
         * method will always return null and, since it does not have a queue, it
         * will never be enqueued.
         *
         * @param referent the object the new phantom reference will refer to
         * @param q the queue with which the reference is to be registered,
         *          or <tt>null</tt> if registration is not required
         */
        public PhantomReference(T referent, ReferenceQueue<? super T> q) {
            super(referent, q);
        }
    

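    At the code level, then, PhantomReference differs from WeakReference only in having a single constructor that takes a queue and in its get method, which always returns null.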
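Because get() always returns null, a phantom reference usually carries its own bookkeeping data in a subclass. The sketch below shows that pattern under stated assumptions: PhantomCleanupSketch, Tracked, track, drain, and released are hypothetical names, and a string stands in for whatever resource would need releasing.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical cleanup tracker, not part of the JDK and not thread-safe.
class PhantomCleanupSketch {
    // The subclass stores what must be released, since the owner itself is unreachable by then.
    static final class Tracked extends PhantomReference<Object> {
        final String resourceName;
        Tracked(Object owner, String resourceName, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.resourceName = resourceName;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps the Tracked objects themselves alive until they have been processed.
    private final Set<Tracked> live = new HashSet<>();
    final List<String> released = new ArrayList<>();

    Tracked track(Object owner, String resourceName) {
        Tracked t = new Tracked(owner, resourceName, queue);
        live.add(t);
        return t;
    }

    // Processes every notification currently in the queue; returns how many were handled.
    int drain() {
        int handled = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            Tracked t = (Tracked) r;
            live.remove(t);
            released.add(t.resourceName);   // a real tracker would free the resource here
            handled++;
        }
        return handled;
    }
}
```

The tracker must hold the Tracked objects strongly (the live set), because a reference object that itself becomes unreachable is never enqueued.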

    Finalizer and FinalReference will be covered in a separate article.
    References for this article:
    JDK源码阅读-Reference
    阿里面试: 说说强引用、软引用、弱引用、虚引用吧


        Link to this article: https://www.haomeiwen.com/subject/ftpicktx.html