ArrayList源码分析

作者: 史路比 | 来源:发表于2018-02-05 12:42 被阅读30次

ArrayList的源码分析（一）【转载】
ArrayList 源码分析
手写ArrayList（JDK1.8,部分功能）
ArrayList实现原理
ArrayList源码分析(基于jdk1.8)（二）：subLi
Java集合源码分析-LinkedList
ArrayList源码解析基于jdk1.8
ArrayList学习笔记
ArrayList源码分析
java集合ArrayList与LinkedList源码分析

ArrayList

ArrayList就是传说中的动态数组，就是Array的复杂版本，它提供了如下一些好处：动态的增加和减少元素、灵活的设置数组的大小.

ArrayList的定义：

public class ArrayList<E> extends AbstractList<E>  implements List<E>, RandomAccess, Cloneable, java.io.Serializable

从ArrayList<E>可以看出它是支持泛型的，它继承自AbstractList，实现了List、RandomAccess、Cloneable、java.io.Serializable接口。

AbstractList提供了List接口的默认实现（个别方法为抽象方法）。
List接口定义了列表必须实现的方法。
RandomAccess是一个标记接口，接口内没有定义任何内容。
实现了Cloneable接口的类，可以调用Object.clone方法返回该对象的浅拷贝。
通过实现 java.io.Serializable 接口以启用其序列化功能。未实现此接口的类将无法使其任何状态序列化或反序列化。序列化接口没有方法或字段，仅用于标识可序列化的语义。

ArrayList的属性

ArrayList定义只定义类两个私有属性：

 /** 
  * The array buffer into which the elements of the ArrayList are stored. 
  * The capacity of the ArrayList is the length of this array buffer. 
  */  
 private transient Object[] elementData;  

 /** 
  * The size of the ArrayList (the number of elements it contains). 
  * 
  * @serial 
  */  
 private int size;

很容易理解，elementData存储ArrayList内的元素，size表示它包含的元素的数量。
有个关键字需要解释：transient。
Java的serialization提供了一种持久化对象实例的机制。当持久化对象时，可能有一个特殊的对象数据成员，我们不想用serialization机制
来保存它。为了在一个特定对象的一个域上关闭serialization，可以在这个域前加上关键字transient。
tansient是Java语言的关键字，用来表示一个域不是该对象串行化的一部分。当一个对象被串行化的时候，transient型变量的值不包括在串行化的表示中，然而非transient型的变量是被包括进去的。

有点抽象，看个例子应该能明白。

public class UserInfo implements Serializable {
    private static final long serialVersionUID = 996890129747019948L;
    private String name;
    private transient String psw;

    public UserInfo(String name, String psw) {
        this.name = name;
        this.psw = psw;
    }

    public String toString() {
        return "name=" + name + ", psw=" + psw;
    }
}

public class TestTransient {
    public static void main(String[] args) {
        UserInfo userInfo = new UserInfo("张三", "123456");
        System.out.println(userInfo);
        try {
            // 序列化，被设置为transient的属性没有被序列化
            ObjectOutputStream o = new ObjectOutputStream(new FileOutputStream("UserInfo.out"));
            o.writeObject(userInfo);
            o.close();
        } catch (Exception e) {
            // TODO: handle exception
            e.printStackTrace();
        }
        try {
            // 重新读取内容
            ObjectInputStream in = new ObjectInputStream(new FileInputStream("UserInfo.out"));
            UserInfo readUserInfo = (UserInfo) in.readObject();
            // 读取后psw的内容为null
            System.out.println(readUserInfo.toString());
        } catch (Exception e) {
            // TODO: handle exception
            e.printStackTrace();
        }
    }
}

被标记为transient的属性在对象被序列化的时候不会被保存。

ArrayList的构造方法

看完属性看构造方法。ArrayList提供了三个构造方法

/** 
 * Constructs an empty list with the specified initial capacity. 
 */  
public ArrayList(int initialCapacity) {  
     super();  
    if (initialCapacity < 0)  
        throw new IllegalArgumentException("Illegal Capacity: "+  
                                           initialCapacity);  
    this.elementData = new Object[initialCapacity];  
}

/** 
 * Constructs an empty list with an initial capacity of ten. 
 */  
public ArrayList() {
    this(10);  
}

/** 
 * Constructs a list containing the elements of the specified 
 * collection, in the order they are returned by the collection's 
 * iterator. 
 */  
public ArrayList(Collection<? extends E> c) {  
    elementData = c.toArray();  
    size = elementData.length;  
    // c.toArray might (incorrectly) not return Object[] (see 6260652)  
    if (elementData.getClass() != Object[].class)  
        elementData = Arrays.copyOf(elementData, size, Object[].class);  
}

第一个构造方法使用提供的initialCapacity来初始化elementData数组的大小。第二个构造方法调用第一个构造方法并传入参数10，即默认elementData数组的大小为10。第三个构造方法则将提供的集合转成数组返回给elementData（返回若不是Object[]将调用Arrays.copyOf方法将其转为Object[]）。

ArrayList的其他方法

add(E e)

add(E e)都知道是在尾部添加一个元素，如何实现的呢？

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

书上都说ArrayList是基于数组实现的，属性中也看到了数组，具体是怎么实现的呢？比如就这个添加元素的方法，如果数组足够大，则在将某个位置的值设置为指定元素即可，如果数组容量不够了呢？

看到add(E e)中先调用了ensureCapacityInternal(size + 1)方法，之后将元素赋给elementData[size]，而后size自增。例如：初次添加时，size为0，add将elementData[0]赋值为e，然后size设置为1（类似执行以下两条语句elementData[0]=e;size=1）。将元素赋给elementData[size++]不会出现数组越界的情况吗？这里关键就在ensureCapacityInternal(size + 1)中了。

根据ensureCapacityInternal的方法名可以知道是确保容量用的。ensureCapacityInternal(size + 1)后面的注释可以明白是增加modCount的值（加了俩感叹号，应该蛮重要的，来看看）。

/**
 * Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;

/**
 * Shared empty array instance used for default sized empty instances. We
 * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
 * first element is added.
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

判断minCapacity（即size+1）是否大于oldCapacity（即elementData.length），若大于，则调整容量为oldCapacity + (oldCapacity >> 1)。调整elementData容量为新的容量，即返回一个内容为原数组元素，大小为新容量的数组赋给elementData；否则不做操作。

容量的拓展将导致数组元素的复制，多次拓展容量将执行多次整个数组内容的复制。若提前能大致判断list的长度，调用ensureCapacity调整容量，将有效的提高运行速度。

可以理解提前分配好空间可以提高运行速度，但是测试发现提高的并不是很大，而且若list原本数据量就不会很大效果将更不明显。

add(int index, E element)

add(int index,E element)在指定位置插入元素。

public void add(int index, E element) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException("Index: "+index+", Size: "+size);

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

首先判断指定位置index是否超出elementData的界限，之后调用ensureCapacityInternal调整容量（若容量足够则不会拓展），调用System.arraycopy将elementData从index开始的size-index个元素复制到index+1至size+1的位置（即index开始的元素都向后移动一个位置），然后将index位置的值指向element。

addAll(Collection<? extends E> c)

addAll(Collection<? extends E> c) 将c中的全部元素添加到集合的末尾

public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}

先将集合c转换成数组，根据转换后数组的长度和ArrayList的size拓展容量，之后调用System.arraycopy方法复制元素到elementData的尾部，调整size。根据返回的内容分析，只要集合c的大小不为0则返回true。

addAll(int index,Collection<? extends E> c)

addAll(int index,Collection<? extends E> c) 在指定位置将c中的全部元素添加到集合

public boolean addAll(int index, Collection<? extends E> c) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException("Index: "+index+", Size: "+size);

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount

    int numMoved = size - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew,
                         numMoved);

    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}

先判断index是否越界。其他内容与addAll(Collection<? extends E> c)基本一致，只是复制的时候先将index开始的元素向后移动X（c转为数组后的长度）个位置（也是一个复制的过程），之后将数组内容复制到elementData的index位置至index+X。

clear()

clear() 清空集合

public void clear() {
    modCount++;

    // clear to let GC do its work
    for (int i = 0; i < size; i++)
        elementData[i] = null;

    size = 0;
}

clear的时候并没有修改elementData的长度（好不容易申请、拓展来的，凭什么释放，留着搞不好还有用呢。这使得确定不再修改list内容之后最好调用trimToSize来释放掉一些空间），只是将所有元素置为null，size设置为0。

clone()

返回此 ArrayList 实例的浅表副本。（浅层复制，不复制这些元素本身。）

public Object clone() {
    try {
        ArrayList<?> v = (ArrayList<?>) super.clone();
        v.elementData = Arrays.copyOf(elementData, size);
        v.modCount = 0;
        return v;
    } catch (CloneNotSupportedException e) {
        // this shouldn't happen, since we are Cloneable
        throw new InternalError(e);
    }
}

调用父类的clone方法返回一个对象的副本，将返回对象的elementData数组的内容赋值为原对象elementData数组的内容，将副本的modCount设置为0。

contains(Object o)

contains(Object o)如果集合包含指定的元素，则返回值为true。注意：元素的值可以为null

public boolean contains(Object o) {
    return indexOf(o) >= 0;
}

indexOf方法返回值与0比较来判断对象是否在list中。接着看indexOf。

indexOf(Object o)

indexOf(Object o) 返回指定元素在集合中的位置，如果不存在则返回-1。

public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

通过遍历elementData数组来判断对象是否在list中，若存在，返回index（[0, size-1]），若不存在则返回-1。所以contains方法可以通过indexOf(Object)方法的返回值来判断对象是否被包含在list中。发现contains方法判断是否包含指定元素，调用的是equals方法

lastIndexOf(Object o)

lastIndexOf(Object o)返回指定元素在集合中最后一次出现的位置，如果不存在则返回-1

public int lastIndexOf(Object o) {
    if (o == null) {
        for (int i = size-1; i >= 0; i--)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = size-1; i >= 0; i--)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

采用了从后向前遍历elementData数组，若遇到Object则返回index值，若没有遇到，返回-1。

get(int index)

get(int index) 返回指定位置的元素，如果index超出数组界限了，就抛出IndexOutBoundsException异常

public E get(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException("Index: "+index+", Size: "+size);

    return elementData(index);
}

remove(int index)

remove(int index) 移除指定位置的元素，并把元素返回。

public E remove(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException("Index: "+index+", Size: "+size);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;
}

首先是检查范围，修改modCount，保留将要被移除的元素，将移除位置之后的元素向前挪动一个位置，将list末尾元素置空（null），返回被移除的元素。

remove(Object o)

remove(Object o) 移除指定的元素，移除成功则返回true。注意：传入的元素可为null

public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

首先通过代码可以看到，当移除成功后返回true，否则返回false。remove(Object o)中通过遍历elementData寻找是否存在传入对象，一旦找到就调用fastRemove移除对象。为什么找到了元素就知道了index，不通过remove(index)来移除元素呢？因为fastRemove跳过了判断边界的处理，因为找到元素就相当于确定了index不会超过边界，而且fastRemove并不返回被移除的元素。下面是fastRemove的代码，基本和remove(index)一致。

private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}

set(int index, E element)

set(int index, E element) 将集合中指定位置的元素替换为指定元素，并返回被替换的元素

public E set(int index, E element) {
    if (index >= size)
        throw new IndexOutOfBoundsException("Index: "+index+", Size: "+size);

    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;
}

toArray()

toArray()返回集合的数组表示形式

public Object[] toArray() {
    return Arrays.copyOf(elementData, size);
}

trimToSize()

trimToSize() 将集合的容量整理至集合实际包含的元素个数

public void trimToSize() {
    modCount++;
    if (size < elementData.length) {
        elementData = (size == 0)
          ? EMPTY_ELEMENTDATA
          : Arrays.copyOf(elementData, size);
    }
}

关于modCount

/**
 * The number of times this list has been <i>structurally modified</i>.
 * Structural modifications are those that change the size of the
 * list, or otherwise perturb it in such a fashion that iterations in
 * progress may yield incorrect results.
 *
 * <p>This field is used by the iterator and list iterator implementation
 * returned by the {@code iterator} and {@code listIterator} methods.
 * If the value of this field changes unexpectedly, the iterator (or list
 * iterator) will throw a {@code ConcurrentModificationException} in
 * response to the {@code next}, {@code remove}, {@code previous},
 * {@code set} or {@code add} operations.  This provides
 * <i>fail-fast</i> behavior, rather than non-deterministic behavior in
 * the face of concurrent modification during iteration.
 *
 * <p><b>Use of this field by subclasses is optional.</b> If a subclass
 * wishes to provide fail-fast iterators (and list iterators), then it
 * merely has to increment this field in its {@code add(int, E)} and
 * {@code remove(int)} methods (and any other methods that it overrides
 * that result in structural modifications to the list).  A single call to
 * {@code add(int, E)} or {@code remove(int)} must add no more than
 * one to this field, or the iterators (and list iterators) will throw
 * bogus {@code ConcurrentModificationExceptions}.  If an implementation
 * does not wish to provide fail-fast iterators, this field may be
 * ignored.
 */
protected transient int modCount = 0;

在父类AbstractList中定义了一个int型的属性：modCount，记录了ArrayList结构性变化的次数。

在ArrayList的所有涉及结构变化的方法中都增加modCount的值，包括：add()、remove()、addAll()、removeRange()及clear()方法。这些方法每调用一次，modCount的值就加1。

关于迭代器

Iterator<E> iterator()

iterator()方法是在AbstractList中实现的，该方法返回AbstractList的一个内部类Itr的对象。

public Iterator<E> iterator() {
    return new Itr();
}

private class Itr implements Iterator<E> {
    
    int cursor = 0; //标记位：标记遍历到哪一个元素
    int expectedModCount = modCount; //标记位：用于判断在遍历的过程中，是否发生了add、remove等操作

    //检测对象数组是否还有元素
    public boolean hasNext() {
        return cursor != size(); //如果cursor==size，说明已经遍历完了，上一次遍历的是最后一个元素
    }

    //获取元素
    public E next() {
        checkForComodification(); //检测在遍历的过程中，是否发生了add、remove等操作
        try {
            E next = ArrayList.this.get(cursor++);
            return next;
        } catch (IndexOutOfBoundsException e) { //捕获get(cursor++)方法的IndexOutOfBoundsException
            checkForComodification();
            throw new NoSuchElementException();
        }
    }
    
    //移除元素
    public void remove() {
        checkForComodification();
        try {
            ArrayList.this.remove(cursor--);
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) { //捕获IndexOutOfBoundsException
            throw new ConcurrentModificationException();
        }
    }

    //检测在遍历的过程中，是否发生了add、remove等操作
    final void checkForComodification() {
        if (modCount != expectedModCount) //发生了add、remove等操作,这个我们可以查看add等的源代码，发现会出现modCount++
            throw new ConcurrentModificationException();
    }
}

对集合的遍历操作，无论是for(int i=0; i<list.size(); i++)，还是增强for循环，进行编译之后都会解除语法糖，变成迭代器的形式。

现在对modCount和expectedModCount的作用应该非常清楚了。在对一个集合进行跌代操作的同时，并不限制对集合的元素进行操作。但当这些操作包括一些可能引起跌代错误的add()或remove()等危险操作时，就会抛出ConcurrentModificationException。这就是modCount和expectedModCount的作用所在。在迭代过程中，如果要移除集合中的元素，必须要使用it.remove()，否则会抛出ConcurrentModificationException

总结

ArrayList基于数组方式实现，无容量的限制（会自动扩容），每次以当前容量的50%进行扩容，即：10-->15-->22-->33....
添加元素时可能要扩容（所以最好预判一下），删除元素时不会减少容量（若希望减少容量，trimToSize()），删除元素时，将删除掉的位置元素置为null，下次gc就会回收这些元素所占的内存空间。
线程不安全
add(int index, E element)：添加元素到数组中指定位置的时候，需要将该位置及其后边所有的元素都整块向后复制一位
get(int index)：获取指定位置上的元素时，可以通过索引直接获取（O(1)）
remove(Object o)需要遍历数组（底层调用元素的equals方法）
remove(int index)不需要遍历数组，只需判断index是否符合条件即可，效率比remove(Object o)高，需要将该位置之后的所有元素都整块往前复制一位
contains(E)需要遍历数组（底层调用元素的equals方法）

ArrayList的源码分析（一）【转载】
ArrayList简介 ArrayList核心源码 ArrayList源码分析System.arraycopy()...
ArrayList 源码分析
公众号原文：ArrayList 源码分析博客原文：ArrayList 源码分析以下源码分析使用的 J...
手写ArrayList（JDK1.8,部分功能）
先分析下ArrayList的源码, 1、ArrayList源码分析 1.1 类实现的接口 ArrayList继承L...
ArrayList实现原理
ArrayList源码分析 1、ArrayList概述 ArrayList 是List 接口的可变...
ArrayList源码分析(基于jdk1.8)（二）：subLi
在上文ArrayList源码分析(基于jdk1.8)（一）：源码及基本操作中对ArrayList源码进行了分析，那...
Java集合源码分析-LinkedList
上一篇文章Java集合源码分析-ArrayList分析了ArrayList的源码，这篇文章分析下LinkedLis...
ArrayList源码解析基于jdk1.8
ArrayList源码解析 ArrayList继承结构开始分析 ArrayList是继承于AbstractList...
ArrayList学习笔记
ArrayList--动态数组 ArrayList介绍：从源码分析：遍历方式：
ArrayList源码分析
ArrayList 原文见：Java 容器源码分析之 ArrayList 概述 ArrayList是使用频率最高的...
java集合ArrayList与LinkedList源码分析
Array源码分析首先分析new ArrayList<>() new ArrayList()时，会创建一个Obj...

ArrayList源码分析

ArrayList

ArrayList的定义：

ArrayList的属性

ArrayList的构造方法

ArrayList的其他方法

关于modCount

关于迭代器

总结

相关文章

ArrayList的源码分析（一）【转载】

ArrayList 源码分析

手写ArrayList（JDK1.8,部分功能）

ArrayList实现原理

ArrayList源码分析(基于jdk1.8)（二）：subLi

Java集合源码分析-LinkedList

ArrayList源码解析基于jdk1.8

ArrayList学习笔记

ArrayList源码分析

java集合ArrayList与LinkedList源码分析

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

Java学习笔记

程序员

技术干货