python: batch download and storage

Author: 胡童远 | Published: 2021-06-30 09:27

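1 Requirements

For each row of the input table, download the file at the link (third column) into a folder named after the first column, and print the second column as a progress message.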

AF24-6AC        Streptococcus salivarius        https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/902/858/935/GCF_902858935.1_Ssal_L25/GCF_902858935.1_Ssal_L25_genomic.fna.gz
OF04-10BH       Bacteroides dorei       https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/902/387/545/GCF_902387545.1_UHGG_MGYG-HGUT-02478/GCF_902387545.1_UHGG_MGYG-HGUT-02478_genomic.fna.gz
AM33-14BH       Acidaminococcus intestini       https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/902/381/715/GCF_902381715.1_UHGG_MGYG-HGUT-01440/GCF_902381715.1_UHGG_MGYG-HGUT-01440_genomic.fna.gz
AF17-3  Parabacteroides goldsteinii     https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/902/375/575/GCF_902375575.1_MGYG-HGUT-01489/GCF_902375575.1_MGYG-HGUT-01489_genomic.fna.gz
AM17-44 Bacteroides plebeius    https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/902/374/375/GCF_902374375.1_MGYG-HGUT-01364/GCF_902374375.1_MGYG-HGUT-01364_genomic.fna.gz
AF18-27 Clostridium lavalense   https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/902/364/025/GCF_902364025.1_MGYG-HGUT-00172/GCF_902364025.1_MGYG-HGUT-00172_genomic.fna.gz
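
To make the column layout concrete, here is a minimal sketch of how one row could be split into its three fields. It assumes the columns are tab-separated and that a row without a link carries `NA` in the third column; `Row` and `parse_row` are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    folder: str   # first column: target folder name
    species: str  # second column: shown in the progress message
    url: str      # third column: download link


def parse_row(line: str) -> Optional[Row]:
    """Return a Row, or None for rows that are malformed or have no link."""
    # Split on tabs only: the species name itself contains a space.
    fields = line.strip().split("\t")
    if len(fields) != 3 or fields[2] == "NA":
        return None
    return Row(*fields)
```

Splitting on tabs rather than on any whitespace matters here, because a species name such as "Bacteroides dorei" would otherwise be broken into two fields.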

2 python3

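Approach:
Read the file line by line, split each line into columns with re.split, create the target folder, download with wget -c -P, and print a progress message.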

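#!/usr/bin/env python3
import os, re, subprocess

count = 1
with open("ref_url.txt") as infile:
    for line in infile:
        # Columns (tab-separated): folder name, species name, download URL
        fields = re.split(r'\t', line.strip())
        if len(fields) < 3 or fields[2] == "NA":
            continue  # skip malformed rows and rows without a URL
        outdir = os.path.join("ref", fields[0])
        # makedirs also creates "ref/" itself and does not fail on a rerun
        os.makedirs(outdir, exist_ok=True)
        # -c resumes a partial download, -P sets the target directory
        status = subprocess.run(["wget", "-c", fields[2], "-P", outdir]).returncode
        state = "done" if status == 0 else "FAILED"
        print("\033[32m number {} : {} {}... \033[0m".format(count, fields[1], state))
        count = count + 1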
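
If `wget` is not available, the download step could be approximated with the standard library alone. The following is only a sketch under that assumption: `fetch` is a hypothetical helper, and unlike `wget -c` it does not resume partial downloads; it only skips a file that is already present.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def fetch(url: str, outdir: str) -> str:
    """Download url into outdir and return the local path."""
    os.makedirs(outdir, exist_ok=True)
    # Use the last path component of the URL as the file name.
    name = os.path.basename(urlparse(url).path)
    dest = os.path.join(outdir, name)
    if os.path.exists(dest):
        return dest  # already downloaded on an earlier run
    tmp = dest + ".part"
    # Write to a temporary name first so an interrupted download
    # is never mistaken for a finished one.
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    os.replace(tmp, dest)
    return dest
```

It would be called with the link from the third column and `ref/<first column>` as the target folder, in place of the `wget` call.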

3 Run

python3 down.py

4 Results

5 Decompress and rename

## Decompress (run inside ref/)
for i in */;
do
    i=${i%/}
    gunzip "$i"/*.gz
    echo "$i done..."
done

## Rename (assumes each folder holds exactly one .fna file)
for i in */;
do
    i=${i%/}
    mv "$i"/*.fna "$i/${i}.fna"
    echo "$i done..."
done
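
The two shell loops can also be expressed in Python, which may be convenient when the whole workflow lives in one script. This is a sketch rather than the author's method: `unpack_and_rename` is a hypothetical helper, and it assumes each sample folder contains exactly one `.gz` file.

```python
import gzip
import os
import shutil


def unpack_and_rename(root: str) -> list:
    """Decompress the single .gz file in each subfolder to <folder>.fna."""
    written = []
    for name in sorted(os.listdir(root)):
        folder = os.path.join(root, name)
        if not os.path.isdir(folder):
            continue
        archives = [f for f in os.listdir(folder) if f.endswith(".gz")]
        if len(archives) != 1:
            continue  # nothing to do, or ambiguous: leave the folder alone
        src = os.path.join(folder, archives[0])
        dest = os.path.join(folder, name + ".fna")
        # Stream the decompressed data so large genomes are not held in memory.
        with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        os.remove(src)  # gunzip also removes the archive after unpacking
        written.append(dest)
        print("{} done...".format(name))
    return written
```

Calling `unpack_and_rename("ref")` would process the folders created by the download script.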
