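1 Overview
This article is only an overview of the caches in ElasticSearch; it does not cover how each cache is implemented or how to use it. ElasticSearch caches fall into two groups: Node-level and Index-level. From reading the source code, the Node-level caches are the Query cache IndicesQueryCache, the Field data cache IndicesFieldDataCache, and the Request cache IndicesRequestCache. The only Index-level cache is the Filter cache BitsetFilterCache. The distinction used here is whether a single Cache instance is maintained per Node or a separate instance is maintained per Index, so a cache classified as Node-level in this article may still be managed through an Index-level wrapper.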
2 Node-level caches
As described above, the Node-level caches are the Query cache IndicesQueryCache, the Field data cache IndicesFieldDataCache, and the Request cache IndicesRequestCache.
2.1 Query cache: IndicesQueryCache
From the official documentation:
The query cache is responsible for caching the results of queries. There is one queries cache per node that is shared by all shards. The cache implements an LRU eviction policy: when a cache becomes full, the least recently used data is evicted to make way for new data. It is not possible to look at the contents being cached.
The query cache only caches queries which are being used in a filter context.
The definition of IndicesQueryCache shows that it uses org.apache.lucene.search.LRUQueryCache to cache query results.
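The LRU eviction policy mentioned in the documentation can be illustrated with a minimal sketch. This is not how LRUQueryCache works internally (Lucene's class also accounts for memory use); the class name TinyLruCache and the entry-count limit are assumptions made purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of LRU eviction; not Lucene's LRUQueryCache.
class TinyLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    TinyLruCache(int maxEntries) {
        // accessOrder = true: iteration runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; returning true evicts the least recently used entry
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        TinyLruCache<String, String> cache = new TinyLruCache<>(2);
        cache.put("q1", "r1");
        cache.put("q2", "r2");
        cache.get("q1");       // q1 becomes the most recently used entry
        cache.put("q3", "r3"); // the cache is full, so q2 is evicted
        System.out.println(cache.keySet()); // prints [q1, q3]
    }
}
```

Because q1 was read just before q3 was inserted, q2 is the least recently used entry and is the one that gets dropped.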
IndicesQueryCache is instantiated in the IndicesService constructor:
//IndicesService.IndicesService(...)
this.indicesQueryCache = new IndicesQueryCache(settings);
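The related settings are listed below; the descriptions come from the official documentation or from comments in the source code: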
- indices.queries.cache.size:
Controls the memory size for the filter cache, defaults to 10%. Accepts either a percentage value, like 5%, or an exact value, like 512mb.
- indices.queries.cache.count:
mostly a way to prevent queries from being the main source of memory usage of the cache
- indices.queries.cache.all_segments:
enables caching on all segments instead of only the larger ones, for testing only
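Since indices.queries.cache.size accepts either a percentage or an exact value, the sketch below shows one way such a value could be resolved to a byte count. The helper CacheSizeSketch.resolveBytes is hypothetical and only understands %, kb, mb and gb; Elasticsearch has its own parser for memory-size settings, which is not reproduced here.

```java
import java.util.Locale;

// Hypothetical helper; not the parser Elasticsearch uses for memory-size settings.
class CacheSizeSketch {
    // Resolves a value such as "5%" or "512mb" to bytes, relative to the given heap size.
    static long resolveBytes(String value, long heapBytes) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.endsWith("%")) {
            double percent = Double.parseDouble(v.substring(0, v.length() - 1));
            return (long) (heapBytes * percent / 100.0);
        }
        long multiplier;
        if (v.endsWith("kb")) {
            multiplier = 1024L;
        } else if (v.endsWith("mb")) {
            multiplier = 1024L * 1024;
        } else if (v.endsWith("gb")) {
            multiplier = 1024L * 1024 * 1024;
        } else {
            throw new IllegalArgumentException("unsupported size value [" + value + "]");
        }
        return Long.parseLong(v.substring(0, v.length() - 2).trim()) * multiplier;
    }

    public static void main(String[] args) {
        long heap = 4L * 1024 * 1024 * 1024; // assume a 4 GB heap for the example
        System.out.println(resolveBytes("10%", heap));
        System.out.println(resolveBytes("512mb", heap));
    }
}
```

A percentage scales with the heap, while an exact value stays fixed regardless of how much heap the node has.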
When IndicesService creates an Index, it passes the IndicesQueryCache to the constructor of the new IndexService. The call chain is:
IndicesService.createIndex -> IndicesService.createIndexService -> new IndexModule().newIndexService -> IndexService.IndexService()
In that call chain, new IndexModule().newIndexService uses the IndicesQueryCache passed in to instantiate a QueryCache object:
//IndexModule.newIndexService
final QueryCache queryCache;
// If index.queries.cache.enabled is on (it is by default), create a
// QueryCache that actually caches; otherwise create a DisabledQueryCache.
if (indexSettings.getValue(INDEX_QUERY_CACHE_ENABLED_SETTING)) {
    BiFunction<IndexSettings, IndicesQueryCache, QueryCache> queryCacheProvider = forceQueryCacheProvider.get();
    if (queryCacheProvider == null) {
        queryCache = new IndexQueryCache(indexSettings, indicesQueryCache);
    } else {
        queryCache = queryCacheProvider.apply(indexSettings, indicesQueryCache);
    }
} else {
    queryCache = new DisabledQueryCache(indexSettings);
}
IndexQueryCache, used above, is defined as follows; it is essentially a wrapper around IndicesQueryCache:
//IndexQueryCache
// Note the Javadoc below: IndexQueryCache is described as the index-level cache,
// but the actual caching is delegated to the Node-level IndicesQueryCache.
/**
 * The index-level query cache. This class mostly delegates to the node-level
 * query cache: {@link IndicesQueryCache}.
 */
public class IndexQueryCache extends AbstractIndexComponent implements QueryCache {
    final IndicesQueryCache indicesQueryCache;
    public IndexQueryCache(IndexSettings indexSettings, IndicesQueryCache indicesQueryCache) {
        super(indexSettings);
        this.indicesQueryCache = indicesQueryCache;
    }
    @Override
    public void close() throws ElasticsearchException {
        clear("close");
    }
    @Override
    public void clear(String reason) {
        logger.debug("full cache clear, reason [{}]", reason);
        indicesQueryCache.clearIndex(index().getName());
    }
    @Override
    public Weight doCache(Weight weight, QueryCachingPolicy policy) {
        return indicesQueryCache.doCache(weight, policy);
    }
}
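The delegation shown in IndexQueryCache can be reduced to a small standalone sketch: a per-index facade that stores nothing itself and forwards every call to one shared node-level object. The classes NodeLevelCache and IndexLevelCache, the string keys and the "index/query" key format are all assumptions for illustration; the real classes work with Lucene Weight objects.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a node-level cache shared by every index on the node.
class NodeLevelCache {
    // The key format "<index>/<query>" is an assumption made for this sketch.
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    void put(String index, String query, String result) {
        entries.put(index + "/" + query, result);
    }

    String get(String index, String query) {
        return entries.get(index + "/" + query);
    }

    // Drops only the entries that belong to the given index.
    void clearIndex(String index) {
        entries.keySet().removeIf(key -> key.startsWith(index + "/"));
    }

    int size() {
        return entries.size();
    }
}

// Hypothetical per-index facade: it holds no data and forwards to the shared cache.
class IndexLevelCache {
    private final String index;
    private final NodeLevelCache nodeCache;

    IndexLevelCache(String index, NodeLevelCache nodeCache) {
        this.index = index;
        this.nodeCache = nodeCache;
    }

    void put(String query, String result) {
        nodeCache.put(index, query, result);
    }

    String get(String query) {
        return nodeCache.get(index, query);
    }

    void clear() {
        nodeCache.clearIndex(index);
    }

    public static void main(String[] args) {
        NodeLevelCache node = new NodeLevelCache();
        IndexLevelCache logs = new IndexLevelCache("logs", node);
        IndexLevelCache users = new IndexLevelCache("users", node);
        logs.put("status:200", "hits-a");
        users.put("age:30", "hits-b");
        logs.clear(); // removes only the entries of the logs index
        System.out.println(node.size());
    }
}
```

Clearing one facade leaves the entries of other indices untouched, which mirrors how clear(...) above calls clearIndex with the index name.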
The resulting QueryCache is passed to the IndexService constructor, which uses it to instantiate an IndexCache object. IndexService passes two arguments when instantiating IndexCache: the QueryCache and the BitsetFilterCache described in section 3.
//IndexService.IndexService(...)
this.bitsetFilterCache = new BitsetFilterCache(indexSettings, new BitsetCacheListener(this));
this.indexCache = new IndexCache(indexSettings, queryCache, bitsetFilterCache);
2.2 Field data cache: IndicesFieldDataCache
From the official documentation:
The field data cache is used mainly when sorting on or computing aggregations on a field. It loads all the field values to memory in order to provide fast document based access to those values. The field data cache can be expensive to build for a field, so its recommended to have enough memory to allocate it, and to keep it loaded.
IndicesFieldDataCache is also instantiated in the IndicesService constructor, which makes it a Node-level cache:
//IndicesService.IndicesService(...)
this.indicesFieldDataCache = new IndicesFieldDataCache(settings, new IndexFieldDataCache.Listener() {
    @Override
    public void onRemoval(ShardId shardId, String fieldName, boolean wasEvicted, long sizeInBytes) {
        assert sizeInBytes >= 0 : "When reducing circuit breaker, it should be adjusted with a number higher or equal to 0 and not [" + sizeInBytes + "]";
        circuitBreakerService.getBreaker(CircuitBreaker.FIELDDATA).addWithoutBreaking(-sizeInBytes);
    }
});
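The listener above gives bytes back to the FIELDDATA circuit breaker whenever an entry leaves the cache. A minimal sketch of that accounting idea follows; ByteCounter and AccountedFieldCache are hypothetical stand-ins, and charging the counter inside load(...) is an assumption of this sketch rather than a description of where Elasticsearch increments the breaker.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a circuit breaker: it only tracks estimated bytes in use.
class ByteCounter {
    private final AtomicLong used = new AtomicLong();

    long addWithoutBreaking(long bytes) {
        return used.addAndGet(bytes);
    }

    long getUsed() {
        return used.get();
    }
}

// Hypothetical cache that notifies a listener whenever an entry is removed.
class AccountedFieldCache {
    interface RemovalListener {
        void onRemoval(String fieldName, long sizeInBytes);
    }

    private final Map<String, Long> sizes = new HashMap<>();
    private final ByteCounter counter;
    private final RemovalListener listener;

    AccountedFieldCache(ByteCounter counter, RemovalListener listener) {
        this.counter = counter;
        this.listener = listener;
    }

    // Loading a field charges its estimated size to the counter.
    void load(String fieldName, long sizeInBytes) {
        remove(fieldName); // release any previous entry for this field first
        sizes.put(fieldName, sizeInBytes);
        counter.addWithoutBreaking(sizeInBytes);
    }

    // Removing a field reports its size so the listener can give the bytes back.
    void remove(String fieldName) {
        Long size = sizes.remove(fieldName);
        if (size != null) {
            listener.onRemoval(fieldName, size);
        }
    }

    public static void main(String[] args) {
        ByteCounter breaker = new ByteCounter();
        AccountedFieldCache cache =
            new AccountedFieldCache(breaker, (field, bytes) -> breaker.addWithoutBreaking(-bytes));
        cache.load("user", 100);
        cache.load("tags", 50);
        cache.remove("user");
        System.out.println(breaker.getUsed());
    }
}
```

Passing a negative value to addWithoutBreaking, as the original listener does with -sizeInBytes, keeps the breaker's estimate in step with what is actually held in the cache.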
When IndicesFieldDataCache is instantiated, its constructor reads the following setting:
indices.fielddata.cache.size:
The max size of the field data cache, eg 30% of node heap space, or an absolute value, eg 12GB. Defaults to unbounded. Also see Field data circuit breaker.
The definition of IndicesFieldDataCache shows that it uses org.elasticsearch.common.cache.Cache internally to implement the Field data cache.
When an IndexService is instantiated, IndicesService passes the IndicesFieldDataCache instance to the IndexService constructor, which in turn uses it as an argument to construct an IndexFieldDataService instance. IndexFieldDataService is the Service that each Index maintains for caching Field data.
//IndexService.IndexService(...)
this.indexFieldData = new IndexFieldDataService(indexSettings, indicesFieldDataCache, circuitBreakerService, mapperService);
The main entry point for cache operations on IndexFieldDataService is its getForField method:
//IndexFieldDataService
public <IFD extends IndexFieldData<?>> IFD getForField(MappedFieldType fieldType) {
    return getForField(fieldType, index().getName());
}
@SuppressWarnings("unchecked")
public <IFD extends IndexFieldData<?>> IFD getForField(MappedFieldType fieldType, String fullyQualifiedIndexName) {
    final String fieldName = fieldType.name();
    IndexFieldData.Builder builder = fieldType.fielddataBuilder(fullyQualifiedIndexName);
    IndexFieldDataCache cache;
    synchronized (this) {
        cache = fieldDataCaches.get(fieldName);
        if (cache == null) {
            String cacheType = indexSettings.getValue(INDEX_FIELDDATA_CACHE_KEY);
            if (FIELDDATA_CACHE_VALUE_NODE.equals(cacheType)) {
                // Call IndicesFieldDataCache.buildIndexFieldDataCache to build the
                // actual IndexFieldDataCache instance that caches the Field data
                cache = indicesFieldDataCache.buildIndexFieldDataCache(listener, index(), fieldName);
            } else if ("none".equals(cacheType)){
                cache = new IndexFieldDataCache.None();
            } else {
                throw new IllegalArgumentException("cache type not supported [" + cacheType + "] for field [" + fieldName + "]");
            }
            fieldDataCaches.put(fieldName, cache);
        }
    }
    return (IFD) builder.build(indexSettings, fieldType, cache, circuitBreakerService, mapperService);
}
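The core of getForField is a synchronized get-or-create step: one cache object per field name, with the kind of cache chosen by a setting. The sketch below isolates that pattern. FieldCacheRegistry and its nested FieldCache interface are hypothetical, and treating "node" as the value behind FIELDDATA_CACHE_VALUE_NODE is an assumption; only "none" appears literally in the code above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry that lazily creates one cache object per field name.
class FieldCacheRegistry {
    interface FieldCache {
        String describe();
    }

    private final Map<String, FieldCache> caches = new HashMap<>();
    private final String cacheType;
    private int created = 0;

    FieldCacheRegistry(String cacheType) {
        this.cacheType = cacheType;
    }

    // Returns the existing cache for the field, or creates it on first use.
    synchronized FieldCache cacheFor(String fieldName) {
        FieldCache cache = caches.get(fieldName);
        if (cache == null) {
            if ("node".equals(cacheType)) { // assumed value of the node-backed type
                cache = () -> "node-backed cache for " + fieldName;
            } else if ("none".equals(cacheType)) {
                cache = () -> "no caching for " + fieldName;
            } else {
                throw new IllegalArgumentException(
                    "cache type not supported [" + cacheType + "] for field [" + fieldName + "]");
            }
            caches.put(fieldName, cache);
            created++;
        }
        return cache;
    }

    synchronized int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        FieldCacheRegistry registry = new FieldCacheRegistry("node");
        System.out.println(registry.cacheFor("user").describe());
        System.out.println(registry.cacheFor("user") == registry.cacheFor("user"));
    }
}
```

Repeated lookups for the same field return the same object, so the selection logic runs only once per field.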
Next, look at how IndicesFieldDataCache.buildIndexFieldDataCache builds the IndexFieldDataCache:
public IndexFieldDataCache buildIndexFieldDataCache(IndexFieldDataCache.Listener listener, Index index, String fieldName) {
    // The cache argument is a member of the Node-level IndicesFieldDataCache. So
    // although each IndexService creates its own IndexFieldDataService, the data
    // cached through the resulting IndexFieldDataCache still lives in the cache
    // member of the Node-level IndicesFieldDataCache instance.
    return new IndexFieldCache(logger, cache, index, fieldName, indicesFieldDataCacheListener, listener);
}
2.3 Request cache: IndicesRequestCache
From the official documentation:
When a search request is run against an index or against many indices, each involved shard executes the search locally and returns its local results to the coordinating node, which combines these shard-level results into a “global” result set.
The shard-level request cache module caches the local results on each shard. This allows frequently used (and potentially heavy) search requests to return results almost instantly. The requests cache is a very good fit for the logging use case, where only the most recent index is being actively updated — results from older indices will be served directly from the cache.
IndicesRequestCache is instantiated in the IndicesService constructor:
//IndicesService.IndicesService(...)
this.indicesRequestCache = new IndicesRequestCache(settings);
The relevant settings are:
- index.requests.cache.enable
Enables or disables the Request cache; it can be set per Index.
- indices.requests.cache.size
The cache is managed at the node level, and has a default maximum size of 1% of the heap (this setting caps the memory used by the Request cache).
- indices.requests.cache.expire
From the official documentation:
Also, you can use the indices.requests.cache.expire setting to specify a TTL for cached results, but there should be no reason to do so. Remember that stale results are automatically invalidated when the index is refreshed. This setting is provided for completeness' sake only.
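The behaviour described above, where shard-local results are reused until a refresh invalidates them, can be modelled with a small sketch. ShardResultCache is a hypothetical class, and clearing everything on every refresh is a simplification assumed here; it is not a description of how IndicesRequestCache tracks staleness.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of a per-shard result cache that is invalidated on refresh.
class ShardResultCache {
    private final Map<String, String> results = new HashMap<>();
    private int computations = 0;

    // Returns the cached result for the request, running the search only on a miss.
    String getOrCompute(String requestBody, Supplier<String> search) {
        String cached = results.get(requestBody);
        if (cached == null) {
            cached = search.get();
            computations++;
            results.put(requestBody, cached);
        }
        return cached;
    }

    // A refresh may expose new documents, so this sketch drops every cached result.
    void refresh() {
        results.clear();
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        ShardResultCache shard = new ShardResultCache();
        shard.getOrCompute("q=error", () -> "42 hits"); // miss: runs the search
        shard.getOrCompute("q=error", () -> "42 hits"); // hit: served from the cache
        shard.refresh();
        shard.getOrCompute("q=error", () -> "43 hits"); // miss again after the refresh
        System.out.println(shard.computations());
    }
}
```

This also suggests why a TTL is rarely needed: an index that is no longer written to has no new data to expose, so its cached results stay valid.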
3 Index-level caches
The only Index-level cache is the Filter cache BitsetFilterCache.
BitsetFilterCache is instantiated in the IndexService constructor:
//IndexService.IndexService(...)
this.bitsetFilterCache = new BitsetFilterCache(indexSettings, new BitsetCacheListener(this));
BitsetFilterCache uses the following setting:
- index.load_fixed_bitset_filters_eagerly
The following explanation comes from a GitHub issue:
FixedBitSetFilterCache is a data structure that is loaded eagerly in memory (by default) to support nested query/filter and nested aggregations. However, the problem is that it can cause it to use too much heap for it is loaded for all nested fields (regardless of whether these fields are being used). To prevent this from happening, a common configuration workaround is to set index.load_fixed_bitset_filters_eagerly: false in the yml of the nodes and restart them to prevent the nodes from running OOM when attempting to eagerly load the fixedbitsets.
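The trade-off in that issue, eager loading for every nested field versus loading on first use, can be sketched with java.util.BitSet. BitsetCacheSketch, its constructor flag and the placeholder bitset contents are all hypothetical; the real cache builds bitsets from Lucene segments.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch contrasting eager and lazy loading of per-field bitsets.
class BitsetCacheSketch {
    private final Map<String, BitSet> bitsets = new HashMap<>();

    BitsetCacheSketch(List<String> nestedFields, boolean loadEagerly) {
        if (loadEagerly) {
            // Eager mode pays the memory cost for every field up front, used or not.
            for (String field : nestedFields) {
                bitsets.put(field, buildBitset(field));
            }
        }
    }

    // In lazy mode a bitset is built the first time its field is requested.
    BitSet get(String field) {
        return bitsets.computeIfAbsent(field, BitsetCacheSketch::buildBitset);
    }

    int loadedCount() {
        return bitsets.size();
    }

    private static BitSet buildBitset(String field) {
        BitSet bits = new BitSet(8);
        bits.set(0); // placeholder content; a real bitset would mark matching documents
        return bits;
    }

    public static void main(String[] args) {
        List<String> fields = java.util.Arrays.asList("comments", "tags");
        System.out.println(new BitsetCacheSketch(fields, true).loadedCount());
        System.out.println(new BitsetCacheSketch(fields, false).loadedCount());
    }
}
```

With the flag off, heap is spent only on fields that queries or aggregations actually touch, at the price of a slower first access.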
After BitsetFilterCache is instantiated in the IndexService constructor, it is passed as an argument to build the IndexCache:
//IndexService.IndexService(...)
this.bitsetFilterCache = new BitsetFilterCache(indexSettings, new BitsetCacheListener(this));
this.indexCache = new IndexCache(indexSettings, queryCache, bitsetFilterCache);