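cisTarget database files are quite large (most are between 1 and 100 GB).
To avoid corrupt or incomplete downloads, the files can be downloaded with zsync_curl (which is basically rsync over HTTP(S)).
It can resume a partially downloaded database, and it only downloads the chunks that are missing or re-downloads the chunks that are corrupted.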
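To make the idea of chunk-wise repair concrete, the sketch below compares a reference file with a local copy block by block and prints the indices of the blocks that differ. It is only an illustration and not how zsync_curl is implemented: the real tool works from block checksums stored in the `.zsync` file on the server rather than from a second local file. The function name `changed_blocks`, the use of `cksum`, and the fixed block size are assumptions made for this example.

```shell
# Print the 0-based index of every fixed-size block that differs between
# a reference file and a local (possibly damaged or truncated) copy.
# Usage: changed_blocks REFERENCE LOCAL_COPY BLOCK_SIZE
changed_blocks() {
    local reference="$1"
    local local_copy="$2"
    local block_size="$3"
    local ref_size n_blocks i ref_sum local_sum

    ref_size=$(wc -c < "${reference}")
    # Number of blocks, rounding up so a short final block is included.
    n_blocks=$(( (ref_size + block_size - 1) / block_size ))

    i=0
    while [ "${i}" -lt "${n_blocks}" ]; do
        # Checksum block i of each file; a block that is missing in the
        # local copy yields the checksum of empty input and so differs.
        ref_sum=$(dd if="${reference}" bs="${block_size}" skip="${i}" count=1 2>/dev/null | cksum)
        local_sum=$(dd if="${local_copy}" bs="${block_size}" skip="${i}" count=1 2>/dev/null | cksum)
        if [ "${ref_sum}" != "${local_sum}" ]; then
            echo "${i}"
        fi
        i=$((i + 1))
    done
}
```

Only the blocks reported this way would need to be fetched again, which is why resuming a large database download is much cheaper than starting over.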
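Preferred: Download databases with zsync_curl
You can either download a statically linked binary or compile zsync_curl yourself from source:
A) Download the statically linked zsync_curl
# Download (with wget or curl):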
wget https://resources.aertslab.org/cistarget/zsync_curl
# curl -O https://resources.aertslab.org/cistarget/zsync_curl
# Make executable:
chmod a+x zsync_curl
# Display full path to zsync_curl.
ZSYNC_CURL="${PWD}/zsync_curl"
echo "${ZSYNC_CURL}"
B) Compile zsync_curl from source
Follow the build instructions at https://github.com/AppImage/zsync-curl
# Display path to zsync_curl:
ZSYNC_CURL='zsync_curl'
echo "${ZSYNC_CURL}"
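Both options end by setting `ZSYNC_CURL`, either to a binary in the current directory or to one found on `PATH`. A small helper can pick whichever is available. This is a minimal sketch: the function name `find_zsync_curl` is hypothetical, and it assumes that a compiled binary was installed somewhere on `PATH` and that a downloaded one sits in the current directory.

```shell
# Print the path to a usable zsync_curl binary.
# Prefers a zsync_curl found on PATH, then falls back to ./zsync_curl.
# Returns 1 (and prints a message to stderr) if neither exists.
find_zsync_curl() {
    if command -v zsync_curl >/dev/null 2>&1; then
        command -v zsync_curl
    elif [ -x "${PWD}/zsync_curl" ]; then
        echo "${PWD}/zsync_curl"
    else
        echo 'zsync_curl not found on PATH or in the current directory' >&2
        return 1
    fi
}
```

It could be used as `ZSYNC_CURL="$(find_zsync_curl)"` before running the download commands.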
Download a database with zsync_curl
# Specify database URL:
feather_database_url='https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/gene_based/hg19-500bp-upstream-7species.mc8nr.feather'
# Download database with zsync_curl:
"${ZSYNC_CURL}" "${feather_database_url}.zsync"
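Because zsync_curl resumes partial downloads, simply rerunning the same command after a failure continues where the previous run stopped. The sketch below wraps that in a retry loop. The function name `retry_download` and its arguments are hypothetical, and it assumes that zsync_curl exits with a non-zero status when a download fails, as command-line tools conventionally do.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry_download MAX_ATTEMPTS WAIT_SECONDS COMMAND [ARGS...]
retry_download() {
    local max_attempts="$1"
    local wait_seconds="$2"
    local attempt=1
    shift 2

    while [ "${attempt}" -le "${max_attempts}" ]; do
        # Stop as soon as the command reports success.
        if "$@"; then
            return 0
        fi
        echo "Attempt ${attempt}/${max_attempts} failed." >&2
        attempt=$((attempt + 1))
        # Wait before the next attempt, but not after the last one.
        if [ "${attempt}" -le "${max_attempts}" ]; then
            sleep "${wait_seconds}"
        fi
    done
    return 1
}

# Example (not run here):
# retry_download 5 30 "${ZSYNC_CURL}" "${feather_database_url}.zsync"
```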
Alternative: Download databases directly
# Specify database URL and derive the file name from it:
feather_database_url='https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/gene_based/hg19-500bp-upstream-7species.mc8nr.feather'
feather_database="${feather_database_url##*/}"
# Download database directly (with wget or curl):
wget "${feather_database_url}"
# curl -O "${feather_database_url}"
# Download sha256sum.txt (with wget or curl):
wget https://resources.aertslab.org/cistarget/databases/sha256sum.txt
# curl -O https://resources.aertslab.org/cistarget/databases/sha256sum.txt
# Check if sha256 checksum matches for the downloaded database:
awk -v feather_database="${feather_database}" '$2 == feather_database' sha256sum.txt | sha256sum -c -
# If you downloaded multiple databases, you can check them all at once with:
sha256sum -c sha256sum.txt
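Since sha256sum.txt lists more databases than you are likely to have downloaded, checking the whole file reports every database that is not present as a failure. The sketch below verifies only the entries whose files exist in the current directory. The function name `check_downloaded` is hypothetical, and it assumes that the checksum file lists bare file names in the usual `sha256sum` output format.

```shell
# Verify only those entries of a checksum file whose files exist in the
# current directory. The exit status is that of `sha256sum -c`, so the
# function also fails if none of the listed files is present.
# Usage: check_downloaded CHECKSUM_FILE
check_downloaded() {
    local checksum_file="$1"
    local hash name

    while read -r hash name; do
        # Strip the optional binary-mode marker in front of the file name.
        name="${name#\*}"
        if [ -f "${name}" ]; then
            printf '%s  %s\n' "${hash}" "${name}"
        fi
    done < "${checksum_file}" | sha256sum -c -
}

# Example (not run here):
# check_downloaded sha256sum.txt
```

Recent versions of GNU coreutils offer `sha256sum -c --ignore-missing` for a similar purpose; the loop above is an alternative for systems where that option is unavailable.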