SparseArray 与 HashMap

作者: 斌斌爱学习 | 来源:发表于2018-06-03 15:57 被阅读0次

之前我们分析过HashMap的源码，了解了它的扩容机制。不了解的朋友可以看一下我的另一篇文章：HashMap、HashTable与ConcurrentHashMap源码解析.HashMap总体来说还是很好用的，但是我们可以发现，当数量一达到扩容条件时就开始扩容，即使我们只需要使用一点点内存，这样就会使得内存大大的浪费，当容量越大时浪费的越明显。在android这种内存特别敏感的平台，我们应该尽量避免这样的事情发生。因此，google推出了更适合自己的api，SparseArray和ArrayMap。今天我们就来简单分析一下SparseArray的源码。

首先我们看一下官方介绍：

/**
 * SparseArrays map integers to Objects.  Unlike a normal array of Objects,
 * there can be gaps in the indices.  It is intended to be more memory efficient
 * than using a HashMap to map Integers to Objects, both because it avoids
 * auto-boxing keys and its data structure doesn't rely on an extra entry object
 * for each mapping.
 *
 * <p>Note that this container keeps its mappings in an array data structure,
 * using a binary search to find keys.  The implementation is not intended to be appropriate for
 * data structures
 * that may contain large numbers of items.  It is generally slower than a traditional
 * HashMap, since lookups require a binary search and adds and removes require inserting
 * and deleting entries in the array.  For containers holding up to hundreds of items,
 * the performance difference is not significant, less than 50%.</p>
 *
 * <p>To help with performance, the container includes an optimization when removing
 * keys: instead of compacting its array immediately, it leaves the removed entry marked
 * as deleted.  The entry can then be re-used for the same key, or compacted later in
 * a single garbage collection step of all removed entries.  This garbage collection will
 * need to be performed at any time the array needs to be grown or the the map size or
 * entry values are retrieved.</p>
 *
 * <p>It is possible to iterate over the items in this container using
 * {@link #keyAt(int)} and {@link #valueAt(int)}. Iterating over the keys using
 * <code>keyAt(int)</code> with ascending values of the index will return the
 * keys in ascending order, or the values corresponding to the keys in ascending
 * order in the case of <code>valueAt(int)</code>.</p>

简单翻译一下：大概意思是说，SparseArray比Hashmap内存使用效率更高，主要体现在两个方面，一方面它允许key的自动装箱，另一方面它不需依赖额外的Entry。需要注意的是，这个容器使得键值对一直保持数组的数据结构，并且用二分法查找key，所以它并不适合存储大量数据。SparseArray在增删改查的效率要比传统的Hashmap慢，因为它每次处理之前都需要进行二分法的查找。当数据量小于100的时候，这个性能差距不会很明显，会低于50%。为了提高性能，SparseArray在删除key的时候做了一项优化，他并不会立即将key从数组中删除，而是将其标记成删除状态，后续可以复用它，或者后续再将其移入一个专门收集的容器中，以便数组需要扩容时使用。

废话不多说，我们先来看一下它的数据结构吧！

    private int[] mKeys;
    private Object[] mValues;

正如注释当中说到的，SparseArray主要用两个数组分别来存储key和value，而且key只能为int。

看完了数据结构，我们看一下增删改查。
首先看增：增有两个方法：一个是append，另一个put；我们分别来看一下
首先看put方法：

 public void put(int key, E value) {
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
     
        if (i >= 0) {
            mValues[i] = value;
        } else {
            i = ~i;

            if (i < mSize && mValues[i] == DELETED) {
                mKeys[i] = key;
                mValues[i] = value;
                return;
            }

            if (mGarbage && mSize >= mKeys.length) {
                gc();

                // Search again because indices may have changed.
                i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
            }

            mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
            mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
            mSize++;
        }
    }

可以看出首先进行二分查找，判断key数组当中是否存在对应的key，如果存在命中if条件，直接将value值复制给对应索引位置。如果不存在则根据二分查找所返回的索引的位置判断对应位置的value是否是被标记是deleted，如果是则将key和value分别赋值给对应的索引位置。如果返回的索引位置比当前的size还要大，则判断是否有垃圾回收，如果有的话先进行gc，再二分法重新查找获得对应索引，最后将对应的key和value插入到对应的索引位置。

再看append方法：

public void append(int key, E value) {
        if (mSize != 0 && key <= mKeys[mSize - 1]) {
            put(key, value);
            return;
        }

        if (mGarbage && mSize >= mKeys.length) {
            gc();
        }

        mKeys = GrowingArrayUtils.append(mKeys, mSize, key);
        mValues = GrowingArrayUtils.append(mValues, mSize, value);
        mSize++;
    }

可以看出当size>0并且key小于最后一个元素的key时，则调用put方法。否则当之前还存在garbage并且size大于等于数组的长度的时候，先进行gc，然后再将对应的key，value追加到数组后面。那么gc方法做了什么呢？

private void gc() {
        // Log.e("SparseArray", "gc start with " + mSize);

        int n = mSize;
        int o = 0;
        int[] keys = mKeys;
        Object[] values = mValues;

        for (int i = 0; i < n; i++) {
            Object val = values[i];

            if (val != DELETED) {
                if (i != o) {
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;
                }

                o++;
            }
        }

        mGarbage = false;
        mSize = o;

        // Log.e("SparseArray", "gc end with " + mSize);
    }

可以看出，其实就是将已经被标记为DELETE的对应索引位置的key，value重新赋值，也就是对数组进行了压缩。

看完了增，我们继续看删除方法：
删除系列方法一共有六个：

 public void remove(int key) {
        delete(key);
    }

public void removeAt(int index) {
        if (mValues[index] != DELETED) {
            mValues[index] = DELETED;
            mGarbage = true;
        }
    }

  public void removeAtRange(int index, int size) {
        final int end = Math.min(mSize, index + size);
        for (int i = index; i < end; i++) {
            removeAt(i);
        }
    }

public E removeReturnOld(int key) {
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i >= 0) {
            if (mValues[i] != DELETED) {
                final E old = (E) mValues[i];
                mValues[i] = DELETED;
                mGarbage = true;
                return old;
            }
        }
        return null;
    }

public void delete(int key) {
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i >= 0) {
            if (mValues[i] != DELETED) {
                mValues[i] = DELETED;
                mGarbage = true;
            }
        }
    }

 public void clear() {
        int n = mSize;
        Object[] values = mValues;

        for (int i = 0; i < n; i++) {
            values[i] = null;
        }

        mSize = 0;
        mGarbage = false;
    }

可以看出，其实大部分是相同的，主要就是将对应索引或者key对应value的值设为Delete。clear方法是将所有value的值清空并将size设置为0.

再来看修改：setValueAt方法

public void setValueAt(int index, E value) {
        if (mGarbage) {
            gc();
        }

        mValues[index] = value;
    }

很简单，就是判断是否需要gc，然后再对values对应的索引赋值。

最后就是查询方法了：
查询方法比较多：

public E get(int key) {
       return get(key, null);
   }


   @SuppressWarnings("unchecked")
   public E get(int key, E valueIfKeyNotFound) {
       int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

       if (i < 0 || mValues[i] == DELETED) {
           return valueIfKeyNotFound;
       } else {
           return (E) mValues[i];
       }
   }

这两个方法主要是通过key来获取到value。

 */
    public int indexOfKey(int key) {
        if (mGarbage) {
            gc();
        }

        return ContainerHelpers.binarySearch(mKeys, mSize, key);
    }

    public int indexOfValue(E value) {
        if (mGarbage) {
            gc();
        }

        for (int i = 0; i < mSize; i++) {
            if (mValues[i] == value) {
                return i;
            }
        }

        return -1;
    }

 
    public int indexOfValueByValue(E value) {
        if (mGarbage) {
            gc();
        }

        for (int i = 0; i < mSize; i++) {
            if (value == null) {
                if (mValues[i] == null) {
                    return i;
                }
            } else {
                if (value.equals(mValues[i])) {
                    return i;
                }
            }
        }
        return -1;
    }

第一个方法是获取key对应的索引，后两个方法是获取到对应value的索引，两者的不同在于一个是用equals判断另一个是通过==来判断。

还有最后两个方法：keyAt和valueAt

public int keyAt(int index) {
        if (mGarbage) {
            gc();
        }

        return mKeys[index];
    }
 public E valueAt(int index) {
        if (mGarbage) {
            gc();
        }

        return (E) mValues[index];
    }

分别返回索引所对应的key和value。

总的看起来，SparseArray要比HashMap的数据结构简单的多。没有hashmap的node结点，也不需要迭代器来遍历。二分法查找在查找上更有优势。缺点就是它的key只支持int类型。显然在我们实际开发当中是不够的，因此也就有了ArrayMap了，ArrayMap的源码我们就不再分析了。实际上它的用法以及对数据的存储都和SparseArray一样，都是由两个数组来完成的，并且也是基于二分查找。不同的地方在于，ArrayMap的key可以是任意类型的。

SparseArray 与 HashMap

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读