OkHttp Source Code: How the Disk Cache Is Implemented

Author: 低情商的大仙 | Published: 2018-11-09 13:59

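The previous article, "okhttp源码之缓存文件介绍" (an introduction to OkHttp's cache files), covered the on-disk layout of the OkHttp cache and how the cache files are initialized. This article continues with how the cache is read and written, plus a few related details.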

1. Reading the Cache

Before analyzing the read path, let's recall how Cache reads a cached response through DiskLruCache:

@Nullable Response get(Request request) {
    String key = key(request.url());
    DiskLruCache.Snapshot snapshot;
    Entry entry;
    try {
      snapshot = cache.get(key);
      if (snapshot == null) {
        return null;
      }
    } catch (IOException e) {
      // Give up because the cache cannot be read.
      return null;
    }
    try {
      entry = new Entry(snapshot.getSource(ENTRY_METADATA));
    } catch (IOException e) {
      Util.closeQuietly(snapshot);
      return null;
    }
    Response response = entry.response(snapshot);
    //……
}

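The code first computes the key for the URL, which is the hex string that appears on each line of the journal file. It then obtains a Snapshot, uses the Snapshot to construct a Cache.Entry (not to be confused with DiskLruCache.Entry), and finally gets a Response from that Entry.
These three steps are what we analyze below.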
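
To make the key step concrete, here is a minimal sketch that uses only the JDK. It assumes the key is the lowercase MD5 hex digest of the URL string, which matches the 32-character hex keys in the journal sample later in this article; the class name CacheKeySketch is hypothetical and this is not OkHttp's own code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of OkHttp.
final class CacheKeySketch {
  /** Returns the lowercase hex MD5 digest of the given URL string (assumed key format). */
  static String key(String url) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest = md5.digest(url.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        // Mask to 0..255 so negative bytes do not print as sign-extended values.
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 should be available on every Java platform", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(key("https://example.com/"));
  }
}
```

Because the key is a fixed-length hex string, it is safe to use as a file name prefix and as a token on a space-separated journal line.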

1.1 Obtaining the Snapshot

Let's go straight to the source:

public synchronized Snapshot get(String key) throws IOException {
    initialize();
    checkNotClosed();
    validateKey(key);
    // First look up the DiskLruCache.Entry in the in-memory map. If this URL was cached
    // before, it is recorded in the journal, so it should be in the map once initialize() has run.
    Entry entry = lruEntries.get(key);
    if (entry == null || !entry.readable) return null;
    Snapshot snapshot = entry.snapshot();
    if (snapshot == null) return null;
    redundantOpCount++;
    // Append a line starting with READ to the journal file.
    journalWriter.writeUtf8(READ).writeByte(' ').writeUtf8(key).writeByte('\n');
    if (journalRebuildRequired()) {
      executor.execute(cleanupRunnable);
    }
    return snapshot;
  }

The key generated from the URL is used to look up the DiskLruCache.Entry, and a Snapshot is created from that entry. As for what a Snapshot is, here is the source:

public final class Snapshot implements Closeable {
    private final String key;
    private final long sequenceNumber;
    private final Source[] sources;
    private final long[] lengths;
}

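Snapshot looks much like DiskLruCache.Entry, except that it holds a Source[] instead of the cache files themselves. Source is an Okio type; you can think of the array as one InputStream opened on each of the entry's two cache files. With a Snapshot in hand, reading the files is straightforward.
Note also that after the Snapshot is obtained, a line starting with READ is appended to the journal file.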

1.2 Building the Cache.Entry

Once the Snapshot has been obtained, the next step is to build a Cache.Entry. First, the structure of Cache.Entry:

private static final class Entry {
    /** Synthetic response header: the local time when the request was sent. */
    private static final String SENT_MILLIS = Platform.get().getPrefix() + "-Sent-Millis";

    /** Synthetic response header: the local time when the response was received. */
    private static final String RECEIVED_MILLIS = Platform.get().getPrefix() + "-Received-Millis";

    private final String url;
    private final Headers varyHeaders;
    private final String requestMethod;
    private final Protocol protocol;
    private final int code;
    private final String message;
    private final Headers responseHeaders;
    private final @Nullable Handshake handshake;
    private final long sentRequestMillis;
    private final long receivedResponseMillis;
}

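The fields show that Cache.Entry mainly stores the header information of the Response (together with the request URL and method). Here is how a Cache.Entry is constructed: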

 try {
      entry = new Entry(snapshot.getSource(ENTRY_METADATA));
    } catch (IOException e) {
      Util.closeQuietly(snapshot);
      return null;
    }

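What is passed to the Cache.Entry constructor is the Source of the .0 file (think of it as an InputStream), so construction should amount to reading the header content back from the .0 file: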

 Entry(Source in) throws IOException {
      try {
        BufferedSource source = Okio.buffer(in);
        url = source.readUtf8LineStrict();
        requestMethod = source.readUtf8LineStrict();
        Headers.Builder varyHeadersBuilder = new Headers.Builder();
        int varyRequestHeaderLineCount = readInt(source);
        for (int i = 0; i < varyRequestHeaderLineCount; i++) {
          varyHeadersBuilder.addLenient(source.readUtf8LineStrict());
        }
        varyHeaders = varyHeadersBuilder.build();

        StatusLine statusLine = StatusLine.parse(source.readUtf8LineStrict());
        protocol = statusLine.protocol;
        code = statusLine.code;
        message = statusLine.message;
      // Remaining code omitted.
}

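A lot of code is omitted here, but it simply reads the fields back in the same order they were written, so there is not much more to say.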
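
As a rough illustration of "read back in the order written", the sketch below parses the first part of a metadata file using only the JDK. The class MetadataReadSketch and the sample input are hypothetical; the real constructor uses Okio's BufferedSource and goes on to read the response headers and TLS information as well.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the first half of the Cache.Entry(Source) constructor.
final class MetadataReadSketch {
  final String url;
  final String requestMethod;
  final List<String> varyHeaders;
  final String statusLine;

  private MetadataReadSketch(String url, String requestMethod,
      List<String> varyHeaders, String statusLine) {
    this.url = url;
    this.requestMethod = requestMethod;
    this.varyHeaders = varyHeaders;
    this.statusLine = statusLine;
  }

  /** Parses url, method, a counted block of vary headers, then the status line. */
  static MetadataReadSketch parse(String metadata) {
    try (BufferedReader in = new BufferedReader(new StringReader(metadata))) {
      String url = in.readLine();
      String method = in.readLine();
      // The header block is prefixed with its line count, so the reader knows where it ends.
      int varyCount = Integer.parseInt(in.readLine());
      List<String> vary = new ArrayList<>();
      for (int i = 0; i < varyCount; i++) {
        vary.add(in.readLine());
      }
      String statusLine = in.readLine();
      return new MetadataReadSketch(url, method, vary, statusLine);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The count-prefixed block is the important detail: because each header section starts with its length, no terminator line is needed between sections.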

1.3 Building the Response

Building the Response from the Cache.Entry takes a single line:

Response response = entry.response(snapshot);

Here is the implementation:

public Response response(DiskLruCache.Snapshot snapshot) {
      String contentType = responseHeaders.get("Content-Type");
      String contentLength = responseHeaders.get("Content-Length");
      Request cacheRequest = new Request.Builder()
          .url(url)
          .method(requestMethod, null)
          .headers(varyHeaders)
          .build();
      return new Response.Builder()
          .request(cacheRequest)
          .protocol(protocol)
          .code(code)
          .message(message)
          .headers(responseHeaders)
           // Pay attention to how the body is obtained here.
          .body(new CacheResponseBody(snapshot, contentType, contentLength))
          .handshake(handshake)
          .sentRequestAtMillis(sentRequestMillis)
          .receivedResponseAtMillis(receivedResponseMillis)
          .build();
    }

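A Response has two parts, the headers and the body. The header content was already read into the Cache.Entry, so it is available directly. The interesting part is the body, which is built with a CacheResponseBody. Stepping into it: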

 private static class CacheResponseBody extends ResponseBody {
    final DiskLruCache.Snapshot snapshot;
    private final BufferedSource bodySource;
    private final @Nullable String contentType;
    private final @Nullable String contentLength;
}

The core job is simply to assign the input stream of the Snapshot's second file (the file that caches the body) to bodySource, and that is exactly what the constructor does:

CacheResponseBody(final DiskLruCache.Snapshot snapshot,
        String contentType, String contentLength) {
      this.snapshot = snapshot;
      this.contentType = contentType;
      this.contentLength = contentLength;

      Source source = snapshot.getSource(ENTRY_BODY);
      bodySource = Okio.buffer(new ForwardingSource(source) {
        @Override public void close() throws IOException {
          snapshot.close();
          super.close();
        }
      });
    }

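Because CacheResponseBody now holds the input stream of the .1 file, it can read the Response body from that file. Reading from a file and reading from the network are not fundamentally different; both are just stream reads.
That completes the cache read path.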
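
The ForwardingSource wrapper in the constructor above ties the lifetime of the Snapshot to the body stream: closing the body also closes the Snapshot. The sketch below shows the same idea with java.io types only; FakeSnapshot and OwnerClosingStreamSketch are hypothetical names used for illustration.

```java
import java.io.Closeable;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class OwnerClosingStreamSketch {
  /** Hypothetical stand-in for DiskLruCache.Snapshot; it only remembers whether it was closed. */
  static final class FakeSnapshot implements Closeable {
    boolean closed;
    @Override public void close() {
      closed = true;
    }
  }

  /** Wraps the body stream so that closing it also releases the snapshot that owns it. */
  static InputStream bodyStream(InputStream body, FakeSnapshot snapshot) {
    return new FilterInputStream(body) {
      @Override public void close() throws IOException {
        snapshot.close();
        super.close();
      }
    };
  }

  /** Reads the whole stream as UTF-8 and closes it (requires Java 9+ for readAllBytes). */
  static String readAndClose(InputStream in) {
    try (InputStream body = in) {
      return new String(body.readAllBytes(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this arrangement the caller only has to close the response body; it never needs to know that a snapshot exists behind it.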

2. Writing the Cache

Again we start from Cache, this time with the put method:

@Nullable CacheRequest put(Response response) {
    // Some code omitted here.
    // Build a Cache.Entry from the Response.
    Entry entry = new Entry(response);
    DiskLruCache.Editor editor = null;
    try {
      // Obtain an editor from DiskLruCache.
      editor = cache.edit(key(response.request().url()));
      if (editor == null) {
        return null;
      }
      // Write the Cache.Entry to the file.
      entry.writeTo(editor);
      // Return a CacheRequest for writing the body.
      return new CacheRequestImpl(editor);
    } catch (IOException e) {
      abortQuietly(editor);
      return null;
    }
  }

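The write path has four steps. The first constructs a Cache.Entry. From the analysis above we know that Cache.Entry stores the header information of the Response, so this step just copies the header content of the Response into the Cache.Entry, and we will not dig further. The remaining three steps follow.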

2.1 Obtaining the Editor

First, what is an Editor?

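public final class Editor {
    // This Entry is a DiskLruCache.Entry; it records the files that belong to one cached request.
    final Entry entry;
    // Which files have been written so far (used for entries that are not yet readable).
    final boolean[] written;
    private boolean done;
    // Returns an input stream for one of the files, similar to an InputStream.
    public Source newSource(int index) {
      // Implementation omitted.
    }
    // Returns an output stream for one of the files, similar to an OutputStream.
    public Sink newSink(int index) {
      // Implementation omitted.
    }
    // Remaining members omitted.
}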

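An Editor simply provides stream access to the cache files, much like the Snapshot discussed earlier. It additionally carries an array that records which of the files have been written.
Now let's see how an Editor is obtained (Cache calls edit(key), which delegates to the two-argument overload shown here):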

synchronized Editor edit(String key, long expectedSequenceNumber) throws IOException {
    initialize();
    // Some code omitted.
    Entry entry = lruEntries.get(key);
    // Some code omitted.
    // Flush the journal before creating files to prevent file leaks.
    journalWriter.writeUtf8(DIRTY).writeByte(' ').writeUtf8(key).writeByte('\n');
    journalWriter.flush();
    if (hasJournalErrors) {
      return null; // Don't edit; the journal can't be written.
    }
    if (entry == null) {
      entry = new Entry(key);
      lruEntries.put(key, entry);
    }
    Editor editor = new Editor(entry);
    entry.currentEditor = editor;
    return editor;
  }

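The method first appends a line starting with DIRTY to the journal file, which marks this request's cache files as being prepared for writing by some thread. If the request has not been cached before, a DiskLruCache.Entry is created first. Finally an Editor is created so the files can be written.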

2.2 Writing the Cache.Entry to the File

This is the following line:

    entry.writeTo(editor);

And the source:

public void writeTo(DiskLruCache.Editor editor) throws IOException {
      BufferedSink sink = Okio.buffer(editor.newSink(ENTRY_METADATA));

      sink.writeUtf8(url)
          .writeByte('\n');
      sink.writeUtf8(requestMethod)
          .writeByte('\n');
      sink.writeDecimalLong(varyHeaders.size())
          .writeByte('\n');
      for (int i = 0, size = varyHeaders.size(); i < size; i++) {
        sink.writeUtf8(varyHeaders.name(i))
            .writeUtf8(": ")
            .writeUtf8(varyHeaders.value(i))
            .writeByte('\n');
      }

      sink.writeUtf8(new StatusLine(protocol, code, message).toString())
          .writeByte('\n');
      sink.writeDecimalLong(responseHeaders.size() + 2)
          .writeByte('\n');
      for (int i = 0, size = responseHeaders.size(); i < size; i++) {
        sink.writeUtf8(responseHeaders.name(i))
            .writeUtf8(": ")
            .writeUtf8(responseHeaders.value(i))
            .writeByte('\n');
      }
      sink.writeUtf8(SENT_MILLIS)
          .writeUtf8(": ")
          .writeDecimalLong(sentRequestMillis)
          .writeByte('\n');
      sink.writeUtf8(RECEIVED_MILLIS)
          .writeUtf8(": ")
          .writeDecimalLong(receivedResponseMillis)
          .writeByte('\n');

      if (isHttps()) {
        sink.writeByte('\n');
        sink.writeUtf8(handshake.cipherSuite().javaName())
            .writeByte('\n');
        writeCertList(sink, handshake.peerCertificates());
        writeCertList(sink, handshake.localCertificates());
        sink.writeUtf8(handshake.tlsVersion().javaName()).writeByte('\n');
      }
      sink.close();
    }

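This is straightforward: it opens an output stream on the .0 file and writes the Response header content into it in a fixed order. In other words, this step caches the headers.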
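
One detail worth noticing in writeTo is the "+ 2" on the response header count: the two synthetic headers (sent and received timestamps) are written as if they were ordinary response headers. The sketch below reproduces the non-TLS part of the layout with a StringBuilder. The class name is hypothetical, and the literal header names are an assumption, since the real constants are built from Platform.get().getPrefix().

```java
import java.util.Map;

// Hypothetical illustration of the metadata layout; not OkHttp's own code.
final class MetadataWriteSketch {
  // Assumed values; the real names are derived from Platform.get().getPrefix().
  static final String SENT_MILLIS = "OkHttp-Sent-Millis";
  static final String RECEIVED_MILLIS = "OkHttp-Received-Millis";

  static String write(String url, String method, Map<String, String> varyHeaders,
      String statusLine, Map<String, String> responseHeaders,
      long sentMillis, long receivedMillis) {
    StringBuilder out = new StringBuilder();
    out.append(url).append('\n');
    out.append(method).append('\n');
    // Each header block is prefixed with its line count.
    out.append(varyHeaders.size()).append('\n');
    for (Map.Entry<String, String> h : varyHeaders.entrySet()) {
      out.append(h.getKey()).append(": ").append(h.getValue()).append('\n');
    }
    out.append(statusLine).append('\n');
    // "+ 2" accounts for the two synthetic timestamp headers appended below.
    out.append(responseHeaders.size() + 2).append('\n');
    for (Map.Entry<String, String> h : responseHeaders.entrySet()) {
      out.append(h.getKey()).append(": ").append(h.getValue()).append('\n');
    }
    out.append(SENT_MILLIS).append(": ").append(sentMillis).append('\n');
    out.append(RECEIVED_MILLIS).append(": ").append(receivedMillis).append('\n');
    return out.toString();
  }
}
```

A Map cannot represent repeated header names, so this sketch is narrower than OkHttp's Headers type; it is only meant to show the line order and the count prefixes.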

2.3 Returning a CacheRequestImpl

Finally, the put operation returns a CacheRequestImpl:

 return new CacheRequestImpl(editor);

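This may seem odd: so far only the header information has been written to the cache files, the body has not been cached at all, and the method returns a mysterious CacheRequestImpl. To make sense of it we need to go back to where Cache.put is called, which is in CacheInterceptor: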

 if (cache != null) {
      if (HttpHeaders.hasBody(response) && CacheStrategy.isCacheable(response, networkRequest)) {
        // Offer this request to the cache.
        CacheRequest cacheRequest = cache.put(response);
        return cacheWritingResponse(cacheRequest, response);
      }
  // Code omitted.
}

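We have just seen that cache.put caches the header information, so the body must be cached inside cacheWritingResponse. Before reading its code, consider this question: when is the right time to cache the Response body? The body is read as a stream and, unlike the headers, cannot be written in one go, so it has to be cached while it is being read: read from the stream and write to the file at the same time. A stream can only be read once. If its content were read out and handed straight to the calling app, there would be no way to read it again for the cache file, so the content has to be copied as it passes through. That is why the caller receives the Response processed by cacheWritingResponse() rather than the original one.
Now let's step into the method: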
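
The "copy while reading" idea can be sketched with plain java.io streams. This is only an analogy to the Okio-based code in OkHttp: TeeInputStreamSketch is a hypothetical class, and the cache target here is any OutputStream rather than a DiskLruCache editor sink.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical tee stream: every byte the caller reads is also written to cacheOut.
final class TeeInputStreamSketch extends FilterInputStream {
  private final OutputStream cacheOut;
  private boolean cacheClosed;

  TeeInputStreamSketch(InputStream upstream, OutputStream cacheOut) {
    super(upstream);
    this.cacheOut = cacheOut;
  }

  @Override public int read() throws IOException {
    byte[] one = new byte[1];
    int n = read(one, 0, 1);
    return n == -1 ? -1 : one[0] & 0xff;
  }

  @Override public int read(byte[] buffer, int offset, int count) throws IOException {
    int bytesRead = super.read(buffer, offset, count);
    if (bytesRead == -1) {
      // End of stream: the cached copy is complete, so finish it exactly once.
      if (!cacheClosed) {
        cacheClosed = true;
        cacheOut.close();
      }
      return -1;
    }
    // Mirror only the bytes that were actually read in this call.
    cacheOut.write(buffer, offset, bytesRead);
    return bytesRead;
  }

  /** Reads the stream to the end and closes it (requires Java 9+ for readAllBytes). */
  static byte[] drain(InputStream in) {
    try (InputStream body = in) {
      return body.readAllBytes();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

This sketch leaves out error handling: a complete version would discard the partial cache copy when the upstream read fails or when the caller closes the stream early.
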

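private Response cacheWritingResponse(final CacheRequest cacheRequest, Response response)
      throws IOException {
    // Null checks omitted.
    final BufferedSource source = response.body().source();
    final BufferedSink cacheBody = Okio.buffer(cacheRequest.body());

    Source cacheWritingSource = new Source() {
      boolean cacheRequestClosed;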
      @Override public long read(Buffer sink, long byteCount) throws IOException {
        long bytesRead;
        try {
          bytesRead = source.read(sink, byteCount);
        } catch (IOException e) {
          if (!cacheRequestClosed) {
            cacheRequestClosed = true;
            cacheRequest.abort(); // Failed to write a complete cache response.
          }
          throw e;
        }
        if (bytesRead == -1) {
          if (!cacheRequestClosed) {
            cacheRequestClosed = true;
            cacheBody.close(); // The cache response is complete!
          }
          return -1;
        }
        sink.copyTo(cacheBody.buffer(), sink.size() - bytesRead, bytesRead);
        cacheBody.emitCompleteSegments();
        return bytesRead;
      }
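      // timeout() and close() omitted.
    };
    // Remaining code omitted (including the contentType and contentLength lookups).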
return response.newBuilder()
        .body(new RealResponseBody(contentType, contentLength, Okio.buffer(cacheWritingSource)))
        .build();
}

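The method returns a Response whose body is a RealResponseBody, but every read on that body ends up in the read() method of cacheWritingSource, and that is where the cache is written. The key line is this one: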

sink.copyTo(cacheBody.buffer(), sink.size() - bytesRead, bytesRead);

It copies the bytes that were just read into cacheBody. Putting this together with the source above, cacheBody is backed by the body of the CacheRequestImpl:

CacheRequestImpl(final DiskLruCache.Editor editor) {
      // Some code omitted.
      this.body = new ForwardingSink(cacheOut) {
        @Override public void close() throws IOException {
          synchronized (Cache.this) {
            // Some code omitted.
          }
          super.close();
          editor.commit();
        }
      };
    }

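Once the body has been written to the file, the editor's commit() method is called. Both the headers and the body were written to dirty files, namely xxx.0.tmp and xxx.1.tmp, so commit essentially strips the .tmp suffix to turn them into clean files and appends a CLEAN line to the journal. For reasons of space we will not expand on it here.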
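
The "write to .tmp, then rename" step can be sketched with java.nio.file. This is a simplified illustration under the assumption that a commit is just a rename; CommitByRenameSketch is a hypothetical class, and the real commit also updates the recorded lengths and the journal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of the dirty-file/clean-file pattern.
final class CommitByRenameSketch {
  /** Writes content to "name.tmp" in dir, then renames it to "name" and returns the clean file. */
  static Path writeAndCommit(Path dir, String name, String content) {
    Path dirty = dir.resolve(name + ".tmp");
    Path clean = dir.resolve(name);
    try {
      // Readers never look at the .tmp file, so a half-written body is never served.
      Files.write(dirty, content.getBytes(StandardCharsets.UTF_8));
      // The rename publishes the finished file under its final name.
      Files.move(dirty, clean, StandardCopyOption.REPLACE_EXISTING);
      return clean;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The benefit of this pattern is that an interrupted write leaves at most a stray .tmp file behind, while the previous clean file (if any) stays intact.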

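3. Controlling Cache Size

Every cached request produces two files, and every operation on the cache files (read, delete, and so on) appends a line to the journal. Over time the cache directory is bound to grow, so DiskLruCache has its own cleanup mechanism.
That mechanism is nothing more than running a Runnable: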

private final Runnable cleanupRunnable = new Runnable() {
    public void run() {
      // Implementation omitted.
    }
  };

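3.1 When Cleanup Runs

Cleanup is needed when the cache has grown beyond its configured maximum size. Cleanup is therefore scheduled at every point that can affect the size of the cache, and whenever the maximum size is changed.

The journal file is too large

Every cache operation appends a line to the journal. With too many lines the file grows and reading it becomes slower. Reads, removals and writes all append to the journal, so all of them can trigger cleanup. Taking a read as the example: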

public synchronized Snapshot get(String key) throws IOException {
    // Code omitted.
    redundantOpCount++;
    journalWriter.writeUtf8(READ).writeByte(' ').writeUtf8(key).writeByte('\n');
    if (journalRebuildRequired()) {
      executor.execute(cleanupRunnable);
    }

    return snapshot;
  }

When cleanup is triggered by a journal change, the code first has to check whether the journal has actually grown too large.
The method to look at is journalRebuildRequired():

boolean journalRebuildRequired() {
    final int redundantOpCompactThreshold = 2000;
    return redundantOpCount >= redundantOpCompactThreshold
        && redundantOpCount >= lruEntries.size();
  }

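A rebuild is required when redundantOpCount has reached 2000 and is also at least as large as the number of entries in lruEntries. In other words, the journal holds at least 2000 lines more than are needed to describe the current cache state. The surplus exists because a single request produces several lines (DIRTY, CLEAN, READ and so on), so the journal always has more lines than there are cached requests.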
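
The two-part condition is easy to misread, so here is a small standalone restatement of it with the counts passed in as parameters. JournalRebuildSketch is a hypothetical class; the threshold of 2000 is taken from the excerpt above.

```java
// Hypothetical restatement of journalRebuildRequired() with explicit inputs.
final class JournalRebuildSketch {
  static final int REDUNDANT_OP_COMPACT_THRESHOLD = 2000;

  /**
   * Returns true when there are at least 2000 redundant journal lines
   * and the redundant lines are at least as numerous as the live entries.
   */
  static boolean rebuildRequired(int redundantOpCount, int entryCount) {
    return redundantOpCount >= REDUNDANT_OP_COMPACT_THRESHOLD
        && redundantOpCount >= entryCount;
  }

  public static void main(String[] args) {
    System.out.println(rebuildRequired(1999, 10));   // below the threshold
    System.out.println(rebuildRequired(2000, 10));   // both conditions hold
    System.out.println(rebuildRequired(2000, 5000)); // large cache, few redundant lines
  }
}
```

The second condition means that a cache with many entries tolerates proportionally more redundant lines before a rebuild, which keeps rebuilds from running too often on large caches.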

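The cache files exceed the size limit

Apart from the journal, every cache write recomputes the total size of the cached files and checks it against the limit: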

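synchronized void completeEdit(Editor editor, boolean success) throws IOException {
    // Code omitted.
    for (int i = 0; i < valueCount; i++) {
      File dirty = entry.dirtyFiles[i];
      if (success) {
        if (fileSystem.exists(dirty)) {
          // Promote the dirty (.tmp) file to a clean file.
          File clean = entry.cleanFiles[i];
          fileSystem.rename(dirty, clean);
          long oldLength = entry.lengths[i];
          long newLength = fileSystem.size(clean);
          entry.lengths[i] = newLength;
          // Update the running total size.
          size = size - oldLength + newLength;
        }
      } else {
        fileSystem.delete(dirty);
      }
    }
    // Code omitted.
    // Check both the total size and the journal.
    if (size > maxSize || journalRebuildRequired()) {
      executor.execute(cleanupRunnable);
    }
  }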

As the comments point out, the size check is added to the condition here.

The maximum cache size changes

public synchronized void setMaxSize(long maxSize) {
    this.maxSize = maxSize;
    if (initialized) {
      executor.execute(cleanupRunnable);
    }
  }

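Whenever the maximum cache size is changed at runtime, a cleanup is scheduled (provided the cache has been initialized).

3.2 How Cache Files Are Cleaned Up

Here is the implementation of cleanupRunnable: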

private final Runnable cleanupRunnable = new Runnable() {
    public void run() {
      synchronized (DiskLruCache.this) {
        if (!initialized | closed) {
          return; // Nothing to do
        }
        try {
          // Delete cache files as needed.
          trimToSize();
        } catch (IOException ignored) {
          mostRecentTrimFailed = true;
        }
        try {
          if (journalRebuildRequired()) {
            // Rebuild the journal file.
            rebuildJournal();
            redundantOpCount = 0;
          }
        } catch (IOException e) {
          mostRecentRebuildFailed = true;
          journalWriter = Okio.buffer(Okio.blackhole());
        }
      }
    }
  };

Cleanup has two parts: deleting cache files, and regenerating the journal file.
First, deleting cache files:

void trimToSize() throws IOException {
    while (size > maxSize) {
      Entry toEvict = lruEntries.values().iterator().next();
      removeEntry(toEvict);
    }
    mostRecentTrimFailed = false;
  }

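It is a simple loop that removes entries until the cache size is within the limit. Each iteration takes the first entry in lruEntries, which is the least recently used one. Removing an entry comes down to deleting its files, so we will not expand on it.
After the cache files have been trimmed, the journal is rebuilt if that is needed. How is the journal cleaned up?
Recall the journal structure described in the previous article: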
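
The eviction order can be illustrated with an access-ordered LinkedHashMap from the JDK. The sketch below tracks only a length per key instead of real files; LruTrimSketch and its members are hypothetical names, and the real removeEntry also deletes files and appends a journal line.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of trimToSize(); no files are involved.
final class LruTrimSketch {
  // accessOrder = true moves an entry to the end of the iteration order whenever it is read.
  final LinkedHashMap<String, Long> entries = new LinkedHashMap<>(0, 0.75f, true);
  final long maxSize;
  long size;

  LruTrimSketch(long maxSize) {
    this.maxSize = maxSize;
  }

  void put(String key, long length) {
    Long old = entries.put(key, length);
    size += length - (old == null ? 0L : old.longValue());
  }

  Long get(String key) {
    return entries.get(key);
  }

  /** Evicts from the front of the map (least recently used) until size fits within maxSize. */
  void trimToSize() {
    Iterator<Map.Entry<String, Long>> it = entries.entrySet().iterator();
    while (size > maxSize && it.hasNext()) {
      Map.Entry<String, Long> eldest = it.next();
      size -= eldest.getValue();
      it.remove();
    }
  }
}
```

For example, with a limit of 10 and three entries of length 4 inserted as a, b, c, reading a before trimming makes b the eldest entry, so b is the one evicted.
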

libcore.io.DiskLruCache
1
201105
2

DIRTY 2f6822d346ffd682c8e88bcd087a7d52
CLEAN 2f6822d346ffd682c8e88bcd087a7d52 275 197
READ 2f6822d346ffd682c8e88bcd087a7d52
READ 2f6822d346ffd682c8e88bcd087a7d52
DIRTY 2f6822d346ffd682c8e88bcd087a7d52
CLEAN 2f6822d346ffd682c8e88bcd087a7d52 275 192

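The journal currently records 6 operation lines, but they all describe one URL going through write --> read --> read --> write. Only the last line reflects the current cache state, so it is the only one that needs to be kept.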

synchronized void rebuildJournal() throws IOException {
    if (journalWriter != null) {
      journalWriter.close();
    }

    BufferedSink writer = Okio.buffer(fileSystem.sink(journalFileTmp));
    try {
      // Write the first 5 lines (the journal header).
      writer.writeUtf8(MAGIC).writeByte('\n');
      writer.writeUtf8(VERSION_1).writeByte('\n');
      writer.writeDecimalLong(appVersion).writeByte('\n');
      writer.writeDecimalLong(valueCount).writeByte('\n');
      writer.writeByte('\n');

      for (Entry entry : lruEntries.values()) {
        if (entry.currentEditor != null) {
          writer.writeUtf8(DIRTY).writeByte(' ');
          writer.writeUtf8(entry.key);
          writer.writeByte('\n');
        } else {
          writer.writeUtf8(CLEAN).writeByte(' ');
          writer.writeUtf8(entry.key);
          entry.writeLengths(writer);
          writer.writeByte('\n');
        }
      }
    } finally {
      writer.close();
    }
    if (fileSystem.exists(journalFile)) {
      fileSystem.rename(journalFile, journalFileBackup);
    }
    fileSystem.rename(journalFileTmp, journalFile);
    fileSystem.delete(journalFileBackup);

    journalWriter = newJournalWriter();
    hasJournalErrors = false;
    mostRecentRebuildFailed = false;
  }

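The source is simple. During a rebuild, an entry that is currently being written is kept as a DIRTY line, and every other entry is reduced to a single CLEAN line.

4. Summary

This article may look long at first glance, but these details are the key to absorbing the ideas behind OkHttp's cache. It is best read with the source open alongside, working through it patiently.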
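
To see the effect of compaction on the sample journal above, here is a sketch that replays the operation lines and keeps only the last state per key. This is an assumption-laden simplification: the real rebuildJournal() does not replay the file but writes lines from the in-memory lruEntries map, and JournalCompactSketch is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical replay-based compaction; the real code writes from in-memory entries instead.
final class JournalCompactSketch {
  /** Returns one line per surviving key: the last DIRTY or CLEAN line seen for it. */
  static List<String> compact(List<String> operations) {
    Map<String, String> lastState = new LinkedHashMap<>();
    for (String line : operations) {
      String[] parts = line.split(" ", 3);
      String op = parts[0];
      String key = parts[1];
      switch (op) {
        case "DIRTY":
        case "CLEAN":
          lastState.put(key, line);
          break;
        case "REMOVE":
          lastState.remove(key);
          break;
        default:
          // READ lines do not change an entry's state, so they are dropped.
          break;
      }
    }
    return new ArrayList<>(lastState.values());
  }
}
```

Applied to the six operation lines of the sample, this yields a single line, the final CLEAN record with lengths 275 and 192.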
