Overview
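This post walks through the Elasticsearch snapshot feature in two parts: one using the local disk as storage, the other using remote HDFS as storage. Contents: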
0. Overview
1. Version
2. Install plugin
3. Disk
- create repo
- create snapshot
- restore
- step
4. HDFS
- create hdfs repo
- insert data
- create hdfs snapshot
- restore from hdfs
5. Restoring to a different cluster
- registering repository
- list snapshot
- starting restore from a snapshot
6. benchmark
- snapshotting speed
- restoring speed
7. plugin auto route
8. other
9. Reference
Version
- elasticsearch-5.4.3.zip
- repository-hdfs-5.4.3.zip
Install plugin
# the plugin zip must be given as an absolute file:// path
bin/elasticsearch-plugin install file:///data/mapleleaf/es_snapshot/repository-hdfs-5.4.3.zip
# check hdfs master namenode ip and port using webhdfs
curl -i "http://localhost:8081/webhdfs/v1/?op=LISTSTATUS"
# start es as a daemon
sh bin/elasticsearch -d
# stop es (force kill)
ps aux | grep elasticsearch | grep -v "grep" | awk '{print $2}' | xargs kill -9
# restart es in one line, then follow the log
ps aux | grep elasticsearch | grep -v "grep" | awk '{print $2}' | xargs kill -9 ; sleep 3 && sh bin/elasticsearch -d && ps aux | grep elasticsearch | grep -v "grep" && tail -f logs/es_snap.log
Disk
create repo
# add the line below to elasticsearch.yml, then restart es
path.repo: ["/data/mapleleaf/es_snapshot/my_backup"]
# create repo, named: my_backup
curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -H 'Content-Type: application/json' -d '{
"type": "fs",
"settings": {
"location": "/data/mapleleaf/es_snapshot/my_backup",
"compress": true
}
}'
# show the repo settings
curl -X GET "localhost:9200/_snapshot/my_backup?pretty"
# delete the repo registration
curl -X DELETE "localhost:9200/_snapshot/my_backup"
create snapshot
# create snapshot_1 and wait until it completes
curl -X PUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true&pretty"
# list all snapshots in the repo
curl -X GET "localhost:9200/_snapshot/my_backup/*?pretty"
# show the detailed status of snapshot_1
curl -X GET "localhost:9200/_snapshot/my_backup/snapshot_1/_status?pretty"
# delete snapshot_1
curl -X DELETE "localhost:9200/_snapshot/my_backup/snapshot_1?pretty"
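Snapshot names have to be unique within a repository, so a fixed name like snapshot_1 only works once until it is deleted. A minimal sketch of one way around that, assuming a timestamp suffix is acceptable. The functions `snapshot_name` and `create_dated_snapshot` and the `ES_URL` variable are hypothetical helpers introduced for illustration, not part of Elasticsearch:

```shell
# Hypothetical helper: build a lowercase snapshot name from a prefix and the current time.
snapshot_name() {
  local prefix="${1:-snapshot}"
  printf '%s_%s\n' "$prefix" "$(date +%Y%m%d_%H%M%S)"
}

# Hypothetical helper: snapshot all indices under a generated name.
# ES_URL and the default repo name are assumptions; adjust them to your cluster.
create_dated_snapshot() {
  local repo="${1:-my_backup}"
  local name
  name="$(snapshot_name snapshot)"
  curl -s -X PUT "${ES_URL:-localhost:9200}/_snapshot/${repo}/${name}?wait_for_completion=true&pretty"
}
```

Calling `create_dated_snapshot my_backup` would then take a new snapshot on every run without a name clash, as long as two runs do not fall in the same second.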
restore
# restore
curl -X POST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore?pretty"
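A plain restore only succeeds if the target index is closed or does not exist. As an alternative, the restore API accepts a request body with `indices`, `rename_pattern` and `rename_replacement`, so an index can be restored under a different name next to the live one. A rough sketch, where `restore_body`, `restore_renamed` and the `_restored` suffix are hypothetical choices made for illustration:

```shell
# Hypothetical helper: print a restore request body that restores one index
# under a new name (original name plus a suffix).
restore_body() {
  local index="$1"
  local suffix="${2:-_restored}"
  cat <<EOF
{
  "indices": "${index}",
  "include_global_state": false,
  "rename_pattern": "(.+)",
  "rename_replacement": "\$1${suffix}"
}
EOF
}

# Hypothetical helper: restore a single index from snapshot_1 in my_backup under the new name.
restore_renamed() {
  curl -s -X POST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore?pretty" \
    -H 'Content-Type: application/json' -d "$(restore_body "$1")"
}
```

For example, `restore_renamed customer` would ask Elasticsearch to restore the `customer` index as `customer_restored`, leaving the open `customer` index untouched.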
step
- check index
curl -X PUT "localhost:9200/customer" -H 'Content-Type: application/json' -d'
{
"settings" : {
"index" : {
"number_of_shards" : 5,
"number_of_replicas" : 0
}
}
}
'
curl -X GET "localhost:9200/_cat/indices?v"
curl -X DELETE "localhost:9200/customer?pretty"
- insert data
for i in {1..10000};
do
curl -s -X POST "localhost:9200/customer/external/?pretty" -H 'Content-Type: application/json' -d"
{
\"id\": ${i},
\"num\": ${i},
\"name\": \"John Doe\"
}" > /dev/null
done
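The loop above issues 10,000 separate HTTP requests, which is slow. The same documents could be sent through the `_bulk` API in a single request. This is only a sketch: `gen_bulk` and `bulk_insert` are hypothetical helper names, and the index and type (`customer/external`) are taken from the loop above.

```shell
# Hypothetical helper: emit N documents in the bulk API's newline-delimited format,
# one action line followed by one document line per document.
gen_bulk() {
  local n="$1"
  local i=1
  while [ "$i" -le "$n" ]; do
    printf '{"index":{}}\n'
    printf '{"id": %d, "num": %d, "name": "John Doe"}\n' "$i" "$i"
    i=$((i + 1))
  done
}

# Hypothetical helper: send the generated documents in one bulk request.
# --data-binary keeps the newlines that the bulk format depends on.
bulk_insert() {
  gen_bulk "${1:-10000}" | curl -s -X POST "localhost:9200/customer/external/_bulk?pretty" \
    -H 'Content-Type: application/x-ndjson' --data-binary @- > /dev/null
}
```

Running `bulk_insert 10000` should load the same data set as the loop. For much larger data sets it is probably safer to split the input into several bulk requests.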
- close index
curl -X POST "localhost:9200/customer/_close?pretty"
- restore
A snapshot had been taken earlier, when the index held only 1 doc. After inserting 10,000 docs, closing the index, and restoring, the index goes back to the state captured in that earlier snapshot.
- reinsert
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match_all": {}
}
}'
- create snapshot_2
- close & restore
HDFS
create hdfs repo
curl -X PUT "localhost:9200/_snapshot/my_hdfs_repository?pretty" -H 'Content-Type: application/json' -d'
{
"type": "hdfs",
"settings": {
"uri": "hdfs://xxxxx:xxxx",
"path": "elasticsearch/respositories/my_hdfs_repository",
"compress": true
}
}'
If this step fails with an exception, see here.
insert data
create hdfs snapshot
curl -X PUT "localhost:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_1?wait_for_completion=true&pretty"
Add the plugin's security settings to jvm.options.
restore from hdfs
- Add some arbitrary docs so that the index differs from its state at snapshot time, which makes the effect of the restore easy to see.
- close index
- restore
curl -X POST "localhost:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_1/_restore?pretty"
Restoring to a different cluster
All that is required is registering the repository containing the snapshot in the new cluster and starting the restore process.
curl -X GET "localhost:9201/_cat/indices?v"
registering repository
curl -X PUT "localhost:9201/_snapshot/my_hdfs_repository?pretty" -H 'Content-Type: application/json' -d'
{
"type": "hdfs",
"settings": {
"uri": "hdfs://xxxxx:xxxx",
"path": "elasticsearch/respositories/my_hdfs_repository",
"compress": true
}
}'
list snapshot
curl -X GET "localhost:9201/_snapshot/my_hdfs_repository/*?pretty"
starting restore
curl -X POST "localhost:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_1/_restore?pretty"
benchmark
The test data is written with esrally.
snapshotting speed
# run in the background (no wait_for_completion, so the call returns immediately)
curl -X PUT "XXX:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1" -H 'Content-Type: application/json' -d'
{
"indices": "591_etl_fuhaochen_test_2018062500",
"ignore_unavailable": true,
"include_global_state": false
}'
# check running status
curl -X GET "XXX:9200/_snapshot/my_hdfs_repository/*?pretty"
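Because the snapshot above runs in the background, its state has to be checked repeatedly until it leaves `IN_PROGRESS`. A minimal polling sketch follows. `snapshot_state` and `wait_for_snapshot` are hypothetical helpers, and the grep-based JSON parsing is a crude assumption that the response contains a `"state"` field; a real JSON tool such as jq would be more robust if it is installed.

```shell
# Hypothetical helper: read a snapshot-info JSON response on stdin and
# print the value of the first "state" field (e.g. IN_PROGRESS, SUCCESS, FAILED).
snapshot_state() {
  grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Hypothetical helper: poll a snapshot URL until the state is no longer IN_PROGRESS.
# $1 = snapshot URL, $2 = polling interval in seconds (default 10).
wait_for_snapshot() {
  local url="$1"
  local state
  while true; do
    state="$(curl -s "$url" | snapshot_state)"
    echo "$(date '+%H:%M:%S') state=${state:-unknown}"
    [ "$state" = "IN_PROGRESS" ] || break
    sleep "${2:-10}"
  done
}
```

A call such as `wait_for_snapshot "XXX:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1" 30` would print one timestamped line per poll, which also gives a rough record of how long the snapshot took.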
restoring speed
date
curl -X POST "XXX:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1/_restore?wait_for_completion=true&pretty"
date
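Reading the two `date` outputs by eye works, but the elapsed time can also be computed directly. A small sketch, where `elapsed_seconds` is a hypothetical helper that wraps any command and discards its output:

```shell
# Hypothetical helper: run the given command, discard its stdout,
# and print how many whole seconds it took.
elapsed_seconds() {
  local start end
  start="$(date +%s)"
  "$@" > /dev/null
  end="$(date +%s)"
  echo $((end - start))
}
```

For the restore above this could be used as `elapsed_seconds curl -s -X POST "XXX:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1/_restore?wait_for_completion=true"`. The result has one-second resolution, which should be enough for operations that run for minutes.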
Snapshotting takes far longer than restoring.
plugin auto route
This tests whether the plugin is auto-routed: does it have to be installed on every node (data node, master node, etc.), or is it enough to install it on one node of the ES cluster and let that node propagate the installation to the other nodes automatically?
Auto-routing is not available; the plugin has to be installed on each node.
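Since the plugin has to be installed node by node, it helps to check which nodes still lack it. The sketch below compares the output of `_cat/nodes?h=name` with `_cat/plugins?h=name,component`. `nodes_missing_plugin` and the two file names are hypothetical, and the sketch assumes node names contain no spaces.

```shell
# Hypothetical helper: print every node that does not list the given plugin.
# $1 = plugin name, $2 = file with one node name per line,
# $3 = file with "node-name plugin-name" pairs, one per line.
nodes_missing_plugin() {
  local plugin="$1" nodes_file="$2" plugins_file="$3"
  local node
  while read -r node; do
    [ -n "$node" ] || continue
    # awk exits 0 only if this node has a line for this plugin
    if ! awk -v n="$node" -v p="$plugin" \
        '$1 == n && $2 == p { found = 1 } END { exit !found }' "$plugins_file"; then
      echo "$node"
    fi
  done < "$nodes_file"
}

# Example usage against a running cluster (not executed here):
#   curl -s "localhost:9200/_cat/nodes?h=name" > nodes.txt
#   curl -s "localhost:9200/_cat/plugins?h=name,component" > plugins.txt
#   nodes_missing_plugin repository-hdfs nodes.txt plugins.txt
```

Any node name printed by the last command would still need `bin/elasticsearch-plugin install` and a restart before HDFS snapshots can work on it.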
other
- Tried to snapshot a larger index, but it failed with an error. The configuration should be fine, since the snapshot of the small index succeeded.
The "Self-suppression not permitted" error is probably caused by the Hadoop DataNode running out of free space.