ArrayList Source Code Analysis


Author: Leocat | Published 2017-01-23 19:22

    ArrayList

    Original article: "Java 容器源码分析之 ArrayList" (Java container source code analysis: ArrayList)

    Overview

    ArrayList is one of the most frequently used collections; when a List is needed, ArrayList is usually the first choice. Let's start with its declaration:

    public class ArrayList<E> extends AbstractList<E>
            implements List<E>, RandomAccess, Cloneable, java.io.Serializable
    
    

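    Of the interfaces ArrayList implements, RandomAccess, Cloneable, and Serializable are all marker interfaces, so ArrayList is a pure implementation of the List interface. Its sibling LinkedList, by contrast, also implements Deque and can serve as a double-ended queue.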

    Structure

    transient Object[] elementData;
    
    // inherited from the parent class AbstractList
    protected transient int modCount = 0;
    

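    As the name suggests, ArrayList is a List backed by an array, in other words a resizable array; the data is stored in the Object array elementData. Besides elementData, one more member variable deserves attention: modCount, which is inherited from the parent class AbstractList. modCount records how many times the list has been structurally modified, where a structural modification is any operation that changes the size of the list. modCount is mainly used by iterators: if a list is structurally modified while it is being iterated, the iteration can produce wrong results. So if the structure changes during iteration by any means other than the iterator's own methods, whether from another thread or from the iterating thread itself, the iterator throws ConcurrentModificationException. This is the iterator's fail-fast mechanism.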
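
    The fail-fast behavior can be reproduced without any second thread. The following is a minimal sketch, assuming a small three-element list; the class name FailFastDemo and its method are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

final class FailFastDemo {
    /** Returns true if removing an element inside a for-each loop trips the fail-fast check. */
    static boolean removeDuringForEach() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    // Structural modification that bypasses the iterator:
                    // modCount changes, the iterator's expectedModCount does not.
                    list.remove(s);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // Thrown by the next call to Iterator.next() after the removal.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEach()); // true
    }
}
```

    The check is best-effort: it runs inside next(), so a modification that happens to make hasNext() return false (for example, removing the second-to-last element) ends the loop without an exception.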

    Adding elements

    /**
     * Appends the specified element to the end of this list.
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }
    
    /**
     * Inserts the specified element at the specified position in this
     * list. Shifts the element currently at that position (if any) and
     * any subsequent elements to the right (adds one to their indices).
     */
    public void add(int index, E element) {
        rangeCheckForAdd(index);
    
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        elementData[index] = element;
        size++;
    }
    
    /**
     * Appends all of the elements in the specified collection to the end of
     * this list, in the order that they are returned by the
     * specified collection's Iterator.  The behavior of this operation is
     * undefined if the specified collection is modified while the operation
     * is in progress.  (This implies that the behavior of this call is
     * undefined if the specified collection is this list, and this
     * list is nonempty.)
     */
    public boolean addAll(Collection<? extends E> c) {
        Object[] a = c.toArray();
        int numNew = a.length;
        ensureCapacityInternal(size + numNew);  // Increments modCount
        System.arraycopy(a, 0, elementData, size, numNew);
        size += numNew;
        return numNew != 0;
    }
    
    /**
     * Inserts all of the elements in the specified collection into this
     * list, starting at the specified position.  Shifts the element
     * currently at that position (if any) and any subsequent elements to
     * the right (increases their indices).  The new elements will appear
     * in the list in the order that they are returned by the
     * specified collection's iterator.
     */
    public boolean addAll(int index, Collection<? extends E> c) {
        rangeCheckForAdd(index);
    
        Object[] a = c.toArray();
        int numNew = a.length;
        ensureCapacityInternal(size + numNew);  // Increments modCount
    
        int numMoved = size - index;
        if (numMoved > 0)
            System.arraycopy(elementData, index, elementData, index + numNew,
                             numMoved);
    
        System.arraycopy(a, 0, elementData, index, numNew);
        size += numNew;
        return numNew != 0;
    }
    
    private void rangeCheck(int index) {
        if (index >= size)
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }
    
    private void rangeCheckForAdd(int index) {
        if (index > size || index < 0)
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }
    

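    There are several methods for adding elements to an ArrayList: add(E e) appends to the end of the array, add(int index, E element) inserts at a given position, addAll(Collection<? extends E> c) appends a batch of elements to the end, and addAll(int index, Collection<? extends E> c) inserts a batch at a given position.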

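    These methods all work in essentially the same way. The index-based variants first call rangeCheckForAdd to verify that index is valid. Then ensureCapacityInternal makes sure the array has enough capacity: it checks whether the current capacity suffices and grows the array if not, as described in the next section. Note that adding elements is a structural modification of the ArrayList, so modCount must be incremented. The source places that increment inside ensureCapacityInternal, which feels a little odd to me: the name says the method is about ensuring capacity, yet it also modifies a member variable that has nothing to do with capacity, so I don't find the design very clean. The authors presumably felt the same, which is why they call it out in a comment: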

    ensureCapacityInternal(size + 1); // Increments modCount!!

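    Back to the main topic: once the capacity is guaranteed, the static method System.arraycopy() shifts the existing elements to the right to make room, and the new elements are written into the gap. Appending to the end requires no shifting; the new element is simply stored at the end. Finally, size is updated and the add operation is complete.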
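
    The shift-then-write step can be sketched on a plain array. This is an illustration only, not the JDK code: the class ShiftInsertDemo and the method insertAt are hypothetical names, and the sketch assumes the caller has already made sure the array has a spare slot (the job ensureCapacityInternal does in ArrayList).

```java
import java.util.Arrays;

final class ShiftInsertDemo {
    /**
     * Inserts value at index among the first `size` slots of data.
     * Assumes data has at least one unused slot at the end.
     */
    static void insertAt(Object[] data, int size, int index, Object value) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        if (size >= data.length) {
            throw new IllegalStateException("no spare capacity");
        }
        // Move data[index .. size-1] one slot to the right; arraycopy handles the overlap.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "d", null};
        insertAt(data, 3, 2, "c");
        System.out.println(Arrays.toString(data)); // [a, b, c, d]
    }
}
```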

    Growing the capacity

    ArrayList is built on a resizable array: when the backing array runs out of room, it is replaced with a larger one. The code is as follows:

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
    
        ensureExplicitCapacity(minCapacity);
    }
    
    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;
    
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }
    
    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
    
    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
    

    The methods whose names start with ensure check whether the current array can hold minCapacity elements; only if it cannot do they call grow(int minCapacity). Let's look at grow() directly.

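    grow() first computes a new capacity of roughly 1.5 times the old one: int newCapacity = oldCapacity + (oldCapacity >> 1). If that still falls short of the required minimum minCapacity, newCapacity is set to minCapacity. Next it checks whether newCapacity exceeds the maximum array size MAX_ARRAY_SIZE. If so, hugeCapacity() returns Integer.MAX_VALUE when minCapacity itself exceeds MAX_ARRAY_SIZE and MAX_ARRAY_SIZE otherwise; a negative minCapacity means the int computation overflowed, and OutOfMemoryError is thrown. Finally, Arrays.copyOf copies the contents of the old array into a new array of that capacity.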
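
    To make the growth rule concrete, the sketch below reproduces the arithmetic of grow() and hugeCapacity() as a pure function. The class GrowthDemo and the method nextCapacity are hypothetical names; the value of MAX_ARRAY_SIZE is not shown in the excerpt above and is taken here from the JDK 8 source (Integer.MAX_VALUE - 8).

```java
final class GrowthDemo {
    // Not part of the excerpt above; this is the value used in the JDK 8 source.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /** Computes the capacity grow(minCapacity) would choose for an array of oldCapacity slots. */
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // about 1.5x, rounded down
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (minCapacity < 0) { // int overflow in the caller
                throw new OutOfMemoryError();
            }
            newCapacity = (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        int cap = 10; // DEFAULT_CAPACITY
        sb.append(cap);
        for (int i = 0; i < 5; i++) {
            cap = nextCapacity(cap, cap + 1); // one more element than currently fits
            sb.append(" -> ").append(cap);
        }
        System.out.println(sb); // 10 -> 15 -> 22 -> 33 -> 49 -> 73
    }
}
```

    Because the capacity grows geometrically, appending n elements one at a time triggers only on the order of log n array copies.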

    Removing elements

    /**
     * Removes the element at the specified position in this list.
     * Shifts any subsequent elements to the left (subtracts one from their
     * indices).
     */
    public E remove(int index) {
        rangeCheck(index);
    
        modCount++;
        E oldValue = elementData(index);
    
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work
    
        return oldValue;
    }
    
    /**
     * Removes the first occurrence of the specified element from this list,
     * if it is present.  If the list does not contain the element, it is
     * unchanged.  More formally, removes the element with the lowest index
     */
    public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }
    
    /*
     * Private remove method that skips bounds checking and does not
     * return the value removed.
     */
    private void fastRemove(int index) {
        modCount++;
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work
    }
    
    /**
     * Removes all of the elements from this list.  The list will
     * be empty after this call returns.
     */
    public void clear() {
        modCount++;
    
        // clear to let GC do its work
        for (int i = 0; i < size; i++)
            elementData[i] = null;
    
        size = 0;
    }
    
    /**
     * Removes from this list all of the elements whose index is between
     * {@code fromIndex}, inclusive, and {@code toIndex}, exclusive.
     * Shifts any succeeding elements to the left (reduces their index).
     * This call shortens the list by {@code (toIndex - fromIndex)} elements.
     * (If {@code toIndex==fromIndex}, this operation has no effect.)
     */
    protected void removeRange(int fromIndex, int toIndex) {
        modCount++;
        int numMoved = size - toIndex;
        System.arraycopy(elementData, toIndex, elementData, fromIndex,
                         numMoved);
    
        // clear to let GC do its work
        int newSize = size - (toIndex-fromIndex);
        for (int i = newSize; i < size; i++) {
            elementData[i] = null;
        }
        size = newSize;
    }
    

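    Removal is simple in principle: System.arraycopy moves the elements that should be kept into their correct positions, and size is adjusted. To avoid memory leaks, the slots that are no longer in use are explicitly set to null so that the objects they referenced can be garbage collected. The principle is simple, but there are many details to get right, mostly around index arithmetic.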
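
    On the usage side, the two remove overloads shown above are easy to confuse when the element type is Integer. The sketch below illustrates the difference; RemoveOverloadDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveOverloadDemo {
    static List<Integer> removeByIndex() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        list.remove(1); // an int argument selects remove(int index): drops the element at index 1
        return list;
    }

    static List<Integer> removeByValue() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        list.remove(Integer.valueOf(10)); // an Integer selects remove(Object o): drops the first 10
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex()); // [10, 30]
        System.out.println(removeByValue()); // [20, 30]
    }
}
```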

    Next, let's look at the methods that remove or retain elements in bulk.

    /**
     * Removes from this list all of its elements that are contained in the
     * specified collection.
     */
    public boolean removeAll(Collection<?> c) {
        Objects.requireNonNull(c);
        return batchRemove(c, false);
    }
    
    /**
     * Retains only the elements in this list that are contained in the
     * specified collection.  In other words, removes from this list all
     * of its elements that are not contained in the specified collection.
     */
    public boolean retainAll(Collection<?> c) {
        Objects.requireNonNull(c);
        return batchRemove(c, true);
    }
    
    private boolean batchRemove(Collection<?> c, boolean complement) {
        final Object[] elementData = this.elementData;
        int r = 0, w = 0;
        boolean modified = false;
        try {
            for (; r < size; r++)
                // 1) removeAll: complement == false,
                //    keep elementData[r] if it is NOT in c
                // 2) retainAll: complement == true,
                //    keep elementData[r] if it IS in c
                if (c.contains(elementData[r]) == complement)
                    elementData[w++] = elementData[r];
        } finally {
            // Preserve behavioral compatibility with AbstractCollection,
            // even if c.contains() throws.
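            // 1) r == size: the loop completed normally
            // 2) r != size: c.contains threw an exception,
            //      perhaps because an element is incompatible with c's element type
            //      or because c does not support null elements;
            //      copy the elements not yet examined forward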
            if (r != size) {
                System.arraycopy(elementData, r,
                                 elementData, w,
                                 size - r);
                w += size - r;
            }
            if (w != size) {
                // clear to let GC do its work
                for (int i = w; i < size; i++)
                    elementData[i] = null;
                modCount += size - w;
                size = w;
                modified = true;
            }
        }
        return modified;
    }
    

    Both the bulk removal method removeAll() and the bulk retention method retainAll() delegate to batchRemove, so let's look at that method directly.

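    Here is how it works. The method traverses the whole array, picks out the elements to keep, and writes them back into elementData starting at index 0. If the traversal finishes without an exception (r == size), the slots from w onward that are no longer in use are set to null so the GC can reclaim the objects. If an exception occurs during the traversal (r != size), the elements that have not yet been examined are first copied to index w and beyond so that none of them is lost, and then the same cleanup runs. I am not yet sure whether this is the most sensible way to handle the exception.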
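
    The read-pointer/write-pointer idea can be shown on a plain array. The sketch below is a simplified illustration, not the JDK method: it omits the try/finally recovery and the modCount bookkeeping, and the names CompactDemo and compact are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;

final class CompactDemo {
    /**
     * Keeps, among the first `size` slots of data, the elements whose membership
     * in c equals `complement`, and returns the new logical size.
     */
    static int compact(Object[] data, int size, Collection<?> c, boolean complement) {
        int w = 0; // next slot to write a kept element into
        for (int r = 0; r < size; r++) { // r reads every element once
            if (c.contains(data[r]) == complement) {
                data[w++] = data[r];
            }
        }
        Arrays.fill(data, w, size, null); // clear the unused tail so the GC can reclaim it
        return w;
    }

    public static void main(String[] args) {
        Object[] data = {1, 2, 3, 4, 5};
        int newSize = compact(data, 5, Arrays.asList(2, 4), false); // removeAll semantics
        System.out.println(newSize + " " + Arrays.toString(data)); // 3 [1, 3, 5, null, null]
    }
}
```

    Passing true for complement gives retainAll semantics with the same loop, which is why one private method can serve both public methods.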

    Lookup and update

    public boolean contains(Object o) {
        return indexOf(o) >= 0;
    }
    
    /**
     * Returns the index of the first occurrence of the specified element
     * in this list, or -1 if this list does not contain the element.
     * More formally, returns the lowest index <tt>i</tt> such that
     * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>,
     * or -1 if there is no such index.
     */
    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }
    
    /**
     * Returns the index of the last occurrence of the specified element
     * in this list, or -1 if this list does not contain the element.
     * More formally, returns the highest index <tt>i</tt> such that
     * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>,
     * or -1 if there is no such index.
     */
    public int lastIndexOf(Object o) {
        if (o == null) {
            for (int i = size-1; i >= 0; i--)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = size-1; i >= 0; i--)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }
    
    /**
     * Returns the element at the specified position in this list.
     */
    public E get(int index) {
        rangeCheck(index);
    
        return elementData(index);
    }
    
    /**
     * Replaces the element at the specified position in this list with
     * the specified element.
     */
    public E set(int index, E element) {
        rangeCheck(index);
    
        E oldValue = elementData(index);
        elementData[index] = element;
        return oldValue;
    }
    

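    Because the list is backed by an array, lookup and update are straightforward: get and set access a slot directly by index, while contains, indexOf, and lastIndexOf scan the array linearly. None of these methods changes the structure of the list, so none of them modifies modCount.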
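
    A short usage sketch of these methods, including the null handling visible in indexOf and lastIndexOf. The class name LookupDemo and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LookupDemo {
    /** Builds a sample list that contains duplicates and nulls. */
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", null, "b", "a", null));
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.indexOf("a"));      // 0  (first occurrence)
        System.out.println(list.lastIndexOf("a"));  // 3  (last occurrence)
        System.out.println(list.indexOf(null));     // 1  (null is compared with ==)
        System.out.println(list.contains("z"));     // false
        System.out.println(list.set(2, "c"));       // b  (set returns the old value)
        System.out.println(list.get(2));            // c
    }
}
```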

    Iteration

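    Iterating over a list is very common in everyday development, especially with the for-each statement. Collection extends Iterable, and ArrayList implements Collection indirectly, so it has to implement Iterable's iterator() method. Let's take a look.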

    public Iterator<E> iterator() {
        return new Itr();
    }
    /**
     * An optimized version of AbstractList.Itr
     */
    private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;
    
        public boolean hasNext() {
            return cursor != size;
        }
    
        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }
    
        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();
    
            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }
    
        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }
    

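    The iterator uses cursor to mark the index of the next element to return and lastRet to mark the index of the last element returned. ArrayList is not thread-safe, and its fail-fast behavior is built on the modCount variable. Both next() and remove() call checkForComodification(), which compares the modCount captured when the iterator was created (expectedModCount) with the current modCount. If they differ, the ArrayList has been structurally modified since the iterator was created, and ConcurrentModificationException is thrown. The iterator's own remove() resynchronizes expectedModCount afterwards, so it does not trip the check.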

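    One thing to note: remove() checks lastRet < 0 and throws IllegalStateException if it is negative. lastRet < 0 occurs in only two situations: right after the iterator is created, before next() has been called, and right after a call to remove(), which resets lastRet to -1. Calling remove() twice in a row therefore throws an exception.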
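
    Both points can be seen in a few lines: removing through the iterator is safe during iteration, and a second remove() without an intervening next() fails. This is a minimal sketch; IteratorRemoveDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class IteratorRemoveDemo {
    /** Returns a copy of input with all even numbers removed via Iterator.remove(). */
    static List<Integer> dropEvens(List<Integer> input) {
        List<Integer> list = new ArrayList<>(input);
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove(); // keeps expectedModCount in sync, so no exception
            }
        }
        return list;
    }

    /** Returns true if calling remove() twice in a row throws IllegalStateException. */
    static boolean secondRemoveFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        Iterator<Integer> it = list.iterator();
        it.next();
        it.remove(); // fine: lastRet was 0, now reset to -1
        try {
            it.remove(); // lastRet < 0
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(dropEvens(Arrays.asList(1, 2, 3, 4, 5, 6))); // [1, 3, 5]
        System.out.println(secondRemoveFails());                        // true
    }
}
```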

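    The List interface also supports another iterator, ListIterator. It can move forward with next() and backward with previous(), and besides removing elements with remove() it can also insert elements during iteration with add().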
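
    A brief sketch of the two extra capabilities, insertion during iteration and backward traversal. The class ListIteratorDemo and its helper methods are hypothetical and serve only as an illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

final class ListIteratorDemo {
    /** Returns a copy of input with sep inserted after every element, using ListIterator.add(). */
    static List<String> insertAfterEach(List<String> input, String sep) {
        List<String> list = new ArrayList<>(input);
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            it.next();
            it.add(sep); // inserted before the cursor; the next call to next() skips it
        }
        return list;
    }

    /** Returns the elements of input in reverse order, walking backward with previous(). */
    static List<String> reversed(List<String> input) {
        List<String> out = new ArrayList<>();
        ListIterator<String> it = input.listIterator(input.size()); // start past the last element
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(insertAfterEach(Arrays.asList("a", "b"), "-")); // [a, -, b, -]
        System.out.println(reversed(Arrays.asList("a", "b", "c")));        // [c, b, a]
    }
}
```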

    Summary

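    ArrayList is backed by an array and offers efficient random access. Inserting or removing elements anywhere but at the end, however, requires copying part of the array, which is comparatively expensive. ArrayList is a good fit when the container is accessed heavily after creation but insertions and removals are relatively rare.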
