KOBAS3-docker finally runs

Author: 守望一株麦穗 | Published: 2022-02-03 15:31


Preface: Docker installed successfully.
KOBAS was downloaded and loaded successfully as well.
Then one accidental step made it actually run. Heh.

Until it finally ran, I had never started the container with the volume-mapping command:

docker run -it -v /gpfs:/gpfs -v /local/dir/path/to/seq_pep:/opt/kobas-3.0/seq_pep   -v /local/dir/path/to/sqlite3/:/opt/kobas-3.0/sqlite3 -v  /home/budc:/home/budc  kobas

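-v /gpfs:/gpfs maps the host directory /gpfs to /gpfs inside the Docker container.
-v /home/budc:/home/budc maps the host directory /home/budc to /home/budc inside the container. The path before the ":" is the host directory; the path after it is the directory inside the container.
-v /local/dir/path/to/seq_pep:/opt/kobas-3.0/seq_pep maps the host seq_pep directory to seq_pep inside the container, which saves copying the files in. seq_pep is the sequence library KOBAS uses for BLAST.
-v /local/dir/path/to/sqlite3/:/opt/kobas-3.0/sqlite3 works the same way as seq_pep. The database files in sqlite3 record the various ID mappings, which can also be exported as plain text.
These mappings let data move in and out of the container easily.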
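
The mapping rule described above (host path before the ":", container path after it) can be sketched in a few lines of Python. This is only an illustration: the helper name build_docker_run is hypothetical, the paths are the placeholders from the command above, and the image name kobas is taken from that command. It builds the argument list and, optionally, checks that each host directory exists, since a mistyped host path is easy to miss.

```python
import os
import shlex


def build_docker_run(mounts, image="kobas", check_host=True):
    """Return the `docker run` argument list for host -> container mounts.

    `mounts` maps a host directory to the directory it should appear as
    inside the container (the two sides of each `-v host:container`).
    """
    cmd = ["docker", "run", "-it"]
    for host_dir, container_dir in mounts.items():
        # A wrong host path is a common mistake, so fail early on it.
        if check_host and not os.path.isdir(host_dir):
            raise FileNotFoundError(f"host directory does not exist: {host_dir}")
        cmd += ["-v", f"{host_dir}:{container_dir}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Placeholder paths copied from the command in this post.
    mounts = {
        "/gpfs": "/gpfs",
        "/local/dir/path/to/seq_pep": "/opt/kobas-3.0/seq_pep",
        "/local/dir/path/to/sqlite3/": "/opt/kobas-3.0/sqlite3",
        "/home/budc": "/home/budc",
    }
    # check_host=False so the placeholders can be printed on any machine.
    print(shlex.join(build_docker_run(mounts, check_host=False)))
```

With check_host left at its default, the function raises FileNotFoundError for the first host directory that is missing, before Docker is ever started.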

Without the mapping command above, even the test protein file example.fasta failed to run, which was baffling:

annotate.py  -i example.fasta -t fasta:pro -s hsa -o test_annot.tsv -e 1e-5 -r 1 -n 4 -y /opt/kobas-3.0/seq_pep -q /opt/kobas-3.0/sqlite3

annotate.py [-l] -i infile [-t intype] -s species [-o outfile] [-e evalue] [-r rank] [-n nCPUs] [-c coverage] [-z ortholog] [-k kobas_home] [-v blast_home] [-y blastdb] [-q kobasdb] [-p blastp] [-x blastx]

Options:
  -h, --help            show this help message and exit
  -l, --list            list available species, or list available databases
                        for a specific species
  -i INFILE, --infile=INFILE
                        input data file
  -t INTYPE, --intype=INTYPE
                        input type (fasta:pro, fasta:nuc, blastout:xml,
                        blastout:tab, id:refseqpro, id:uniprot, id:ensembl,
                        id:ncbigene, id:gene_symbol), default fasta:pro
  -s SPECIES, --species=SPECIES
                        species abbreviation (for example: ko for KEGG
                        Orthology, hsa for Homo sapiens, mmu for Mus musculus,
                        dme for Drosophila melanogaster, ath for Arabidopsis
                        thaliana, sce for Saccharomyces cerevisiae and eco for
                        Escherichia coli K-12 MG1655)
  -o OUTFILE, --outfile=OUTFILE
                        output file for annotation result, default stdout
  -e EVALUE, --evalue=EVALUE
                        expect threshold for BLAST, default 1e-5
  -r RANK, --rank=RANK  rank cutoff for valid hits from BLAST result, default
                        5
  -n NCPUS, --nCPUs=NCPUS
                        number of CPUs to be used by BLAST, default 1
  -c COVERAGE, --coverage=COVERAGE
                        subject coverage cutoff for BLAST, default 0
  -z ORTHOLOG, --ortholog=ORTHOLOG
                        whether only use orthologs for cross-species
                        annotation or not, default NO (if only use orthologs,
                        please provide the species abbreviation of your input)
  -k KOBAS_HOME, --kobashome=KOBAS_HOME
                        Optional parameter. To set path to kobas_home, which
                        is parent directory of sqlite3/ and seq_pep/ , default
                        value is read from ~/.kobasrcwhere you set before
                        running kobas. If you set this parameter, it means you
                        set "kobasdb" and "blastdb" in this following
                        directory. e.g. "-k /home/user/kobas/", means that you
                        set kobasdb = /home/user/kobas/sqlite3/ and blastdb =
                        /home/user/kobas/seq_pep/
  -v BLAST_HOME, --blasthome=BLAST_HOME
                        Optional parameter. To set parent directory of blastx
                        and blastp. If you set this parameter, it means you
                        set "blastx" and "blastp" in this following directory.
                        Default value is read from ~/.kobasrc where you set
                        before running kobas
  -y BLASTDB, --blastdb=BLASTDB
                        Optional parameter. To set path to sep_pep/, default
                        value is read from ~/.kobasrc where you set before
                        running kobas
  -q KOBASDB, --kobasdb=KOBASDB
                        Optional parameter. To set path to sqlite3/, default
                        value is read from ~/.kobasrc where you set before
                        running kobas, e.g. "-q /kobas_home/sqlite3/"
  -p BLASTP, --blastp=BLASTP
                        Optional parameter. To set path to blastp program,
                        default value is read from ~/.kobasrc where you set
                        before running kobas
  -x BLASTX, --blastx=BLASTX
                        Optional parameter. To set path to  blasx program,
                        default value is read from ~/.kobasrc where you set
                        before running kobas

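After I happened to run the mapping command, those errors disappeared and the run succeeded straight away.
In theory, even without the mappings, copying all the data into the container should also work: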

docker cp   local_file     kobasexxxxx:/opt/kobas-3.0/  #

In practice, though, it only ran once the mappings were in place, and I do not know exactly why.

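KOBAS3 gives genuinely good results, but one complaint: its help page is too brief and rather poor.
From a user's point of view, the help should look like this:
The web version is so full-featured; if each step in the web interface were matched to the corresponding command in the local version, the barrier to entry would drop considerably.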

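Postscript: after two days of fumbling, all I have achieved is getting KOBAS to run without errors. What these programs are actually for still needs exploring. I hope the online help will eventually offer more detailed usage instructions covering data formats, with figures and worked examples, so it is easier to get started.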

annotate.py  
cluster.py 
identify.py
run_gsea.py 
run_kobas.py  
run_mulmds.py

After redeploying from scratch, running the following command produced an error:

annotate.py -i /opt/kobas-3.0/test/example.fasta -t fasta:pro -s hsa -o test_annot.tsv -e 1e-5 -r 1 -n 2 -y /opt/kobas-3.0/seq_pep -q /opt/kobas-3.0/sqlite3
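Looking closely, the cause is that it cannot find out which species the database contains; that information lives in the organism.db database.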

################################
Traceback (most recent call last):
  File "/opt/kobas-3.0/scripts/annotate.py", line 3, in <module>
    from kobas.scripts import annotate
  File "/opt/kobas-3.0/src/kobas/scripts/annotate.py", line 221, in <module>
    species_name = organismdb.name_from_abbr(opt.species)
  File "/opt/kobas-3.0/src/kobas/dbutils.py", line 31, in name_from_abbr
    return self.con.execute('SELECT name FROM Organisms WHERE abbr = ?', (abbr, )).fetchone()[0]
sqlite3.OperationalError: no such table: Organisms
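
A quick way to look into this error is to check whether organism.db really contains an Organisms table. The sketch below uses only Python's standard sqlite3 module; the helper name has_table is hypothetical, and the path /opt/kobas-3.0/sqlite3/organism.db is an assumption (the -q directory from the command above plus the file name). One thing worth knowing: sqlite3.connect() silently creates an empty database when the file does not exist, so a missing or unmapped organism.db can show up as "no such table" rather than "file not found". Whether that is what happened here is only a guess.

```python
import os
import sqlite3


def has_table(db_path, table):
    """Return True if the SQLite file at db_path contains the given table."""
    # Check the file first: sqlite3.connect() would otherwise create an
    # empty database at db_path and hide the fact that it was missing.
    if not os.path.isfile(db_path):
        return False
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        return row is not None
    finally:
        con.close()


if __name__ == "__main__":
    # Assumed location: the directory passed to -q, plus organism.db.
    db_path = "/opt/kobas-3.0/sqlite3/organism.db"
    print(db_path, "exists:", os.path.isfile(db_path))
    print("has Organisms table:", has_table(db_path, "Organisms"))
```

If the file exists but is zero bytes or lacks the table, the sqlite3 directory was probably not mapped or not fully downloaded.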

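One more thing: the files in seq_pep and sqlite3 correspond to each other; take a close look.
KOBAS has a great many annotation library files, covering as many as 5,900-plus species. You will really use no more than ten of them, so download only the common ones. The rest are not worth downloading; they take up too much space.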
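
The correspondence between the two directories can be checked with a short script. This is a rough sketch under stated assumptions: it assumes each species has a file named <abbr>.pep.fasta in seq_pep and <abbr>.db in sqlite3 (for example hsa.pep.fasta and hsa.db), and that organism.db has no sequence counterpart. Adjust the suffixes if your download is named differently. The function names are hypothetical.

```python
import os


def species_in(directory, suffix):
    """Return the set of file-name prefixes in directory that end with suffix."""
    return {
        name[: -len(suffix)]
        for name in os.listdir(directory)
        if name.endswith(suffix)
    }


def compare_species(seq_pep_dir, sqlite3_dir,
                    pep_suffix=".pep.fasta", db_suffix=".db",
                    ignore=("organism",)):
    """Return (only_in_seq_pep, only_in_sqlite3) as sorted lists.

    The suffixes and the ignored name are assumptions about how the
    KOBAS downloads are named, not something taken from its documentation.
    """
    pep = species_in(seq_pep_dir, pep_suffix) - set(ignore)
    db = species_in(sqlite3_dir, db_suffix) - set(ignore)
    return sorted(pep - db), sorted(db - pep)


if __name__ == "__main__":
    seq_pep_dir = "/opt/kobas-3.0/seq_pep"
    sqlite3_dir = "/opt/kobas-3.0/sqlite3"
    if os.path.isdir(seq_pep_dir) and os.path.isdir(sqlite3_dir):
        only_pep, only_db = compare_species(seq_pep_dir, sqlite3_dir)
        print("sequence file but no database:", only_pep)
        print("database but no sequence file:", only_db)
    else:
        print("seq_pep or sqlite3 directory not found; adjust the paths")
```

Two empty lists mean every species you downloaded has both halves, which is handy when you keep only a handful of species.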