Install into a dedicated conda environment:
conda create -n buscoEnv -c bioconda -c conda-forge busco=4.1.2 augustus=3.3.3 biopython python
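# biopython is my own addition; the default command does not include it, but BUSCO needs biopython at run time.
# Below is the order I followed the first time. I did not install biopython at first, hit an error, and found in the project's issues that biopython is required.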
conda create -n buscoEnv -c bioconda -c conda-forge busco=4.1.2 augustus=3.3.3 python
source activate buscoEnv
conda install biopython
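After activating the environment, a quick sanity check along these lines can confirm that the tools landed on PATH and that biopython is importable. This is only a sketch: the helper name check_tools is made up for illustration, and the only assumption about biopython is that it is imported as the module Bio.

```shell
# Hypothetical helper: report whether each named command is on PATH.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

check_tools busco augustus python

# biopython is imported under the module name "Bio".
if command -v python >/dev/null 2>&1; then
  python -c "import Bio; print('biopython', Bio.__version__)" \
    || echo "biopython is not importable"
fi
```

If biopython shows up as not importable, running conda install biopython inside the activated environment, as above, is the fix that worked for me.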
Check that augustus works and list the species models it ships with:
(buscoEnv) @animalia:~$ augustus --species=help
usage:
augustus [parameters] --species=SPECIES queryfilename
where SPECIES is one of the following identifiers
identifier | species
-----------------------------------------|----------------------
pea_aphid | Acyrthosiphon pisum # pea aphid
aedes | Aedes aegypti # yellow fever mosquito
amphimedon | Amphimedon queenslandica # Great Barrier Reef sponge
ancylostoma_ceylanicum | Ancylostoma ceylanicum # a hookworm
adorsata | Apis dorsata # giant honey bee
honeybee1 | Apis mellifera # western honey bee
arabidopsis | Arabidopsis thaliana # thale cress
aspergillus_fumigatus | Aspergillus fumigatus # opportunistic pathogenic mold
aspergillus_nidulans | Aspergillus nidulans # model filamentous fungus
(anidulans) | Aspergillus nidulans
aspergillus_oryzae | Aspergillus oryzae # koji mold
aspergillus_terreus | Aspergillus terreus # soil-dwelling mold
bombus_impatiens1 | Bombus impatiens # common eastern bumble bee
bombus_terrestris2 | Bombus terrestris # buff-tailed bumble bee
botrytis_cinerea | Botrytis cinerea # gray mold fungus
brugia | Brugia malayi # a filarial nematode
b_pseudomallei | Burkholderia pseudomallei # causative agent of melioidosis
caenorhabditis | Caenorhabditis elegans # a model nematode
(c_elegans_trsk) | Caenorhabditis elegans
(elegans) | Caenorhabditis elegans
elephant_shark | Callorhinchus milii # elephant shark
camponotus_floridanus | Camponotus floridanus # Florida carpenter ant
candida_albicans | Candida albicans # opportunistic pathogenic yeast
candida_guilliermondii | Candida guilliermondii # yeast
candida_tropicalis | Candida tropicalis # yeast
chaetomium_globosum | Chaetomium globosum # ascomycete mold
chiloscyllium | Chiloscyllium punctatum # brownbanded bamboo shark
chlamy2011 | Chlamydomonas reinhardtii # unicellular green alga
(chlamydomonas) | Chlamydomonas reinhardtii
chlorella | Chlorella sp. # green alga
ciona | Ciona intestinalis # vase tunicate (a sea squirt)
coccidioides_immitis | Coccidioides immitis # causative agent of valley fever
Conidiobolus_coronatus | Conidiobolus coronatus # saprotrophic fungus
coprinus_cinereus | Coprinus cinereus # inky cap mushroom (now Coprinopsis cinerea)
(coprinus) | Coprinus cinereus
cryptococcus_neoformans_gattii | Cryptococcus neoformans gattii # pathogenic yeast
cryptococcus_neoformans_neoformans_B | Cryptococcus neoformans neoformans
(cryptococcus) | Cryptococcus neoformans
culex | Culex pipiens # common house mosquito
zebrafish | Danio rerio # zebrafish
debaryomyces_hansenii | Debaryomyces hansenii # salt-tolerant yeast
fly | Drosophila melanogaster # common fruit fly
(fly_exp) | Drosophila melanogaster
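encephalitozoon_cuniculi_GB | Encephalitozoon cuniculi # microsporidian parasite
eremothecium_gossypii | Eremothecium gossypii # filamentous fungus, also known as Ashbya gossypii
E_coli_K12 | Escherichia coli K12 # gut bacterium, laboratory strain K-12
fusarium_graminearum | Fusarium graminearium # wheat head blight fungus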
(fusarium) | Fusarium graminearium
galdieria | Galdieria sulphuraria # a red alga
chicken | Gallus gallus domesticus # domestic chicken
heliconius_melpomene1 | Heliconius melpomene # postman butterfly
histoplasma_capsulatum | Histoplasma capsulatum # causative agent of histoplasmosis
(histoplasma) | Histoplasma capsulatum
human | Homo sapiens # human
kluyveromyces_lactis | Kluyveromyces lactis # lactose-fermenting yeast
laccaria_bicolor | Laccaria bicolor # bicolored deceiver (a mushroom)
leishmania_tarentolae | Leishmania tarentolae # lizard-infecting Leishmania
japaneselamprey | Lethenteron camtschaticum # Arctic lamprey (Japanese lamprey)
lodderomyces_elongisporus | Lodderomyces elongisporus # yeast
magnaporthe_grisea | Magnaporthe grisea # rice blast fungus
mnemiopsis_leidyi | Mnemiopsis leidyi # warty comb jelly
nasonia | Nasonia vitripennis # jewel wasp (a parasitoid wasp)
nematostella_vectensis | Nematostella vectensis # starlet sea anemone
neurospora_crassa | Neurospora crassa # red bread mold
(neurospora) | Neurospora crassa
coyote_tobacco | Nicotiana attenuata # coyote tobacco
rice | Oryza sativa # rice
parasteatoda | Parasteatoda sp. # a spider
sealamprey | Petromyzon marinus # sea lamprey
phanerochaete_chrysosporium | Phanerochaete chrysosporium # white rot fungus
(pchrysosporium) | Phanerochaete chrysosporium # white rot fungus
pichia_stipitis | Pichia stipitis # xylose-fermenting yeast
pisaster | Pisaster ochraceus # ochre sea star
pfalciparum | Plasmodium falciparum # malaria parasite
pneumocystis | Pneumocystis jirovecii # causative agent of Pneumocystis pneumonia
rhincodon | Rhincodon typus # whale shark
rhizopus_oryzae | Rhizopus oryzae # filamentous mold
rhodnius | Rhodnius prolixus # a kissing bug
saccharomyces_cerevisiae_rm11-1a_1 | Saccharomyces cerevisiae # baker's yeast
saccharomyces_cerevisiae_S288C | Saccharomyces cerevisiae
(saccharomyces) | Saccharomyces cerevisiae
schistosoma | Schistosoma mansoni # blood fluke
(schistosoma2) | Schistosoma mansoni
schizosaccharomyces_pombe | Schizosaccharomyces pombe # fission yeast
scyliorhinus | Scyliorhinus torazame # cloudy catshark
tomato | Solanum lycopersicum # tomato
s_aureus | Staphylococcus aureus # Gram-positive pathogen
s_pneumoniae | Streptococcus pneumoniae # pneumococcus
strongylocentrotus_purpuratus | Strongylocentrotus purpuratus # purple sea urchin
sulfolobus_solfataricus | Sulfolobus solfataricus # thermoacidophilic archaeon
tetrahymena | Tetrahymena thermophila # a ciliate
cacao | Theobroma cacao # cacao tree
thermoanaerobacter_tengcongensis | Thermoanaerobacter tengcongensis # thermophilic anaerobic bacterium from Tengchong
toxoplasma | Toxoplasma gondii # causative agent of toxoplasmosis
tribolium2012 | Tribolium castaneum # red flour beetle (Coleoptera: Tenebrionidae)
trichinella | Trichinella sp. # Trichinella roundworm
wheat | Triticum sp. # wheat (genus Triticum)
ustilago_maydis | Ustilago maydis # corn smut fungus
(ustilago) | Ustilago maydis
verticillium_albo_atrum1 | Verticillium albo atrum
verticillium_longisporum1 | Verticillium longisporum # genus Verticillium
volvox | Volvox sp. # colonial green alga (genus Volvox)
Xipophorus_maculatus | Xipophorus maculatus # platyfish (swordtail genus Xiphophorus)
yarrowia_lipolytica | Yarrowia lipolytica # oleaginous yeast
maize | Zea mays # maize
(maize5) | Zea mays
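The list is long, so it can help to filter it for the organism group of interest. A minimal sketch follows; the function name find_species is hypothetical, and the 2>&1 redirect is a precaution because I have not checked whether augustus writes this table to stdout or stderr.

```shell
# Hypothetical helper: print the "identifier | species" rows read from stdin
# that contain the keyword, ignoring case.
find_species() {
  awk -F'|' -v kw="$1" 'NF == 2 && index(tolower($0), tolower(kw)) { print }'
}

# Only query augustus when it is actually installed.
if command -v augustus >/dev/null 2>&1; then
  augustus --species=help 2>&1 | find_species aspergillus
fi
```

The identifier in the first column of a matching row is the value that goes after --species=.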
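Check that BUSCO works and review its options (the help text includes the --list-datasets flag used to view the available lineage datasets):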
(buscoEnv) @animalia:~$ busco -h
usage: busco -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]
Welcome to BUSCO 4.1.2: the Benchmarking Universal Single-Copy Ortholog assessment tool.
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide.
optional arguments:
-i FASTA FILE, --in FASTA FILE
Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set.
-c N, --cpu N Specify the number (N=integer) of threads/cores to use.
-o OUTPUT, --out OUTPUT
Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. WARNING: do not provide a path
--out_path OUTPUT_PATH
Optional location for results folder, excluding results folder name. Default is current working directory.
-e N, --evalue N E-value cutoff for BLAST searches. Allowed formats, 0.001 or 1e-03 (Default: 1e-03)
-m MODE, --mode MODE Specify which BUSCO analysis mode to run.
There are three valid modes:
- geno or genome, for genome assemblies (DNA)
- tran or transcriptome, for transcriptome assemblies (DNA)
- prot or proteins, for annotated gene sets (protein)
-l LINEAGE, --lineage_dataset LINEAGE
Specify the name of the BUSCO lineage to be used.
-f, --force Force rewriting of existing files. Must be used when output files with the provided name already exist.
-r, --restart Continue a run that had already partially completed.
--limit REGION_LIMIT How many candidate regions (contig or transcript) to consider per BUSCO (default: 3)
--long Optimization mode Augustus self-training (Default: Off) adds considerably to the run time, but can improve results for some non-model organisms
-q, --quiet Disable the info logs, displays only errors
--augustus_parameters AUGUSTUS_PARAMETERS
Pass additional arguments to Augustus. All arguments should be contained within a single pair of quotation marks, separated by commas. E.g. '--param1=1,--param2=2'
--augustus_species AUGUSTUS_SPECIES
Specify a species for Augustus training.
--auto-lineage Run auto-lineage to find optimum lineage path
--auto-lineage-prok Run auto-lineage just on non-eukaryote trees to find optimum lineage path
--auto-lineage-euk Run auto-placement just on eukaryote tree to find optimum lineage path
--update-data Download and replace with last versions all lineages datasets and files necessary to their automated selection
--offline To indicate that BUSCO cannot attempt to download files
--config CONFIG_FILE Provide a config file
-v, --version Show this version and exit
-h, --help Show this help message and exit
--list-datasets Print the list of available BUSCO datasets
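Putting the options above together, a genome-mode run might look like the command printed by the sketch below. The flags and the two checks (no path in -o, one of the three valid modes) come from the help text; the file name genome.fasta, the run name my_run and the helper busco_cmd are hypothetical. The helper only prints the command, it does not run BUSCO.

```shell
# Hypothetical helper: validate a few arguments against the rules stated in
# the help text, then print (not run) the resulting busco command.
# Arguments: input FASTA, lineage dataset, output name, mode, threads (default 1).
busco_cmd() {
  input=$1 lineage=$2 out=$3 mode=$4 cpus=${5:-1}
  case $out in
    */*) echo "error: -o takes a short name, not a path: $out" >&2; return 1 ;;
  esac
  case $mode in
    geno|genome|tran|transcriptome|prot|proteins) ;;
    *) echo "error: unknown mode: $mode" >&2; return 1 ;;
  esac
  printf 'busco -i %s -l %s -o %s -m %s -c %s\n' \
    "$input" "$lineage" "$out" "$mode" "$cpus"
}

busco_cmd genome.fasta insecta_odb10 my_run genome 8
```

To put the results somewhere other than the current directory, the help text points to --out_path rather than a path in -o.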
Use this command to list the available lineage datasets:
busco --list-datasets
Datasets available to be used with BUSCOv4 as of 2019/11/27:
bacteria_odb10 # bacteria
- acidobacteria_odb10
- actinobacteria_phylum_odb10
- actinobacteria_class_odb10
- corynebacteriales_odb10
- micrococcales_odb10
- propionibacteriales_odb10
- streptomycetales_odb10
- streptosporangiales_odb10
- coriobacteriia_odb10
- coriobacteriales_odb10
- aquificae_odb10
- bacteroidetes-chlorobi_group_odb10
- bacteroidetes_odb10
- bacteroidia_odb10
- bacteroidales_odb10
- cytophagia_odb10
- cytophagales_odb10
- flavobacteriia_odb10
- flavobacteriales_odb10
- sphingobacteriia_odb10
- chlorobi_odb10
- chlamydiae_odb10
- chloroflexi_odb10
- cyanobacteria_odb10
- chroococcales_odb10
- nostocales_odb10
- oscillatoriales_odb10
- synechococcales_odb10
- firmicutes_odb10
- bacilli_odb10
- bacillales_odb10
- lactobacillales_odb10
- clostridia_odb10
- clostridiales_odb10
- thermoanaerobacterales_odb10
- selenomonadales_odb10
- tissierellia_odb10
- tissierellales_odb10
- fusobacteria_odb10
- fusobacteriales_odb10
- planctomycetes_odb10
- proteobacteria_odb10
- alphaproteobacteria_odb10
- rhizobiales_odb10
- rhizobium-agrobacterium_group_odb10
- rhodobacterales_odb10
- rhodospirillales_odb10
- rickettsiales_odb10
- sphingomonadales_odb10
- betaproteobacteria_odb10
- burkholderiales_odb10
- neisseriales_odb10
- nitrosomonadales_odb10
- delta-epsilon-subdivisions_odb10
- deltaproteobacteria_odb10
- desulfobacterales_odb10
- desulfovibrionales_odb10
- desulfuromonadales_odb10
- epsilonproteobacteria_odb10
- campylobacterales_odb10
- gammaproteobacteria_odb10
- alteromonadales_odb10
- cellvibrionales_odb10
- chromatiales_odb10
- enterobacterales_odb10
- legionellales_odb10
- oceanospirillales_odb10
- pasteurellales_odb10
- pseudomonadales_odb10
- thiotrichales_odb10
- vibrionales_odb10
- xanthomonadales_odb10
- spirochaetes_odb10
- spirochaetia_odb10
- spirochaetales_odb10
- synergistetes_odb10
- tenericutes_odb10
- mollicutes_odb10
- entomoplasmatales_odb10
- mycoplasmatales_odb10
- thermotogae_odb10
- verrucomicrobia_odb10
archaea_odb10 # archaea
- thaumarchaeota_odb10
- thermoprotei_odb10
- thermoproteales_odb10
- sulfolobales_odb10
- desulfurococcales_odb10
- euryarchaeota_odb10
- thermoplasmata_odb10
- methanococcales_odb10
- methanobacteria_odb10
- methanomicrobia_odb10
- methanomicrobiales_odb10
- halobacteria_odb10
- halobacteriales_odb10
- natrialbales_odb10
- haloferacales_odb10
eukaryota_odb10 # eukaryotes
- alveolata_odb10
- apicomplexa_odb10
- aconoidasida_odb10
- plasmodium_odb10
- coccidia_odb10
- euglenozoa_odb10
- fungi_odb10
- ascomycota_odb10
- dothideomycetes_odb10
- capnodiales_odb10
- pleosporales_odb10
- eurotiomycetes_odb10
- chaetothyriales_odb10
- eurotiales_odb10
- onygenales_odb10
- leotiomycetes_odb10
- helotiales_odb10
- saccharomycetes_odb10
- sordariomycetes_odb10
- glomerellales_odb10
- hypocreales_odb10
- basidiomycota_odb10
- agaricomycetes_odb10
- agaricales_odb10
- boletales_odb10
- polyporales_odb10
- tremellomycetes_odb10
- microsporidia_odb10
- mucoromycota_odb10
- mucorales_odb10
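- metazoa_odb10 # metazoans (animals)
- arthropoda_odb10
- arachnida_odb10 # class Arachnida (arachnids)
- insecta_odb10 # class Insecta (insects)
- endopterygota_odb10 # Endopterygota (holometabolous insects)
- diptera_odb10 # order Diptera (true flies)
- hymenoptera_odb10 # order Hymenoptera
- lepidoptera_odb10 # order Lepidoptera
- hemiptera_odb10 # order Hemiptera
- mollusca_odb10 # phylum Mollusca
- nematoda_odb10 # phylum Nematoda
- vertebrata_odb10 # subphylum Vertebrata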
- actinopterygii_odb10
- cyprinodontiformes_odb10
- tetrapoda_odb10
- mammalia_odb10
- eutheria_odb10
- euarchontoglires_odb10
- glires_odb10
- primates_odb10
- laurasiatheria_odb10
- carnivora_odb10
- cetartiodactyla_odb10
- sauropsida_odb10
- aves_odb10
- passeriformes_odb10
- stramenopiles_odb10
- viridiplantae_odb10
- chlorophyta_odb10
- embryophyta_odb10
- liliopsida_odb10
- poales_odb10
- eudicots_odb10
- brassicales_odb10
- fabales_odb10
- solanales_odb10
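Before starting a long run it may be worth checking that the lineage name passed to -l is spelled exactly as it appears in this listing. A minimal sketch, assuming the listing format shown above (one dataset per line, optionally prefixed with "- "); the helper name has_lineage is hypothetical.

```shell
# Hypothetical helper: exit 0 if the exact dataset name given as $1 occurs
# in the listing read from stdin, exit 1 otherwise.
has_lineage() {
  awk -v name="$1" '
    {
      line = $0
      sub(/#.*/, "", line)                # drop trailing annotations
      gsub(/^[ \t-]+|[ \t]+$/, "", line)  # drop indentation, bullet, trailing blanks
      if (line == name) found = 1
    }
    END { exit (found ? 0 : 1) }'
}

lineage=insecta_odb10
# Only query busco when it is actually installed.
if command -v busco >/dev/null 2>&1; then
  if busco --list-datasets | has_lineage "$lineage"; then
    echo "$lineage is available"
  else
    echo "$lineage not found in the listing"
  fi
fi
```

The match is exact on purpose, so a shortened name such as insecta is reported as not found.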