genemark

Author: 扇子和杯子 | Published: 2020-05-21 20:10

genemark for prokaryotes: GeneMarkS

genemark for eukaryotes: GeneMark-ES/ET

The GeneMark README explains in detail why the program is named gmes_petap.pl:

GeneMark.hmm  -> gm
Eukaryotic    -> e
Self-training -> s
Plus          -> p
Evidence      -> e
Transcripts   -> t
And           -> a
Proteins      -> p

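In short, the program uses a hidden Markov model to predict eukaryotic genes, either by self-training or with the help of evidence (transcripts and proteins).

1. Main parameters

GeneMark-ES: self-training

--sequence: sequence file in FASTA format
--ES: selects self-training mode; takes no value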
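
As a minimal sketch of how the ES options fit together, the snippet below assembles a gmes_petap.pl command line as an argument list. Only flags described in this article are used (--sequence, --ES, --fungus, --cores); the helper name build_es_command and the file name genome.fasta are hypothetical, and the script is assumed to be on PATH.

```python
import shlex


def build_es_command(sequence, cores=1, fungus=False, script="gmes_petap.pl"):
    """Assemble a GeneMark-ES (self-training) command as an argument list.

    `script` is assumed to be on PATH; pass a full path otherwise.
    """
    cmd = [script, "--sequence", sequence, "--ES"]  # --ES takes no value
    if fungus:
        cmd.append("--fungus")  # flag for fungal genomes
    if cores != 1:
        cmd += ["--cores", str(cores)]  # default is 1, so only add when changed
    return cmd


# genome.fasta is a placeholder file name.
cmd = build_es_command("genome.fasta", cores=8)
print(shlex.join(cmd))
# prints: gmes_petap.pl --sequence genome.fasta --ES --cores 8

# To actually run the prediction (requires a working GeneMark install):
# import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when the genome path contains spaces.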

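GeneMark-ET: transcript

--sequence: sequence file in FASTA format
--ET: file of intron coordinates obtained by spliced alignment of RNA-Seq reads to the genome, in GFF format
--et_score: minimum intron score threshold. The appropriate value depends on the RNA-Seq read aligner: 10 for TopHat2, 0.5 for UnSplicer/TrueSight. Default: 10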
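
Because the right --et_score depends on the aligner, it can help to keep the values listed above in one lookup table. The following is a sketch only: the table holds just the three aligners named in this article, and the helper build_et_command and the file names are hypothetical.

```python
# Minimum intron scores per aligner, taken from the parameter list above.
ET_SCORE_BY_ALIGNER = {"tophat2": 10, "unsplicer": 0.5, "truesight": 0.5}


def build_et_command(sequence, introns_gff, aligner=None, script="gmes_petap.pl"):
    """Assemble a GeneMark-ET command; add --et_score when the aligner is known."""
    cmd = [script, "--sequence", sequence, "--ET", introns_gff]
    if aligner is not None:
        key = aligner.lower()
        if key not in ET_SCORE_BY_ALIGNER:
            # Unknown aligner: refuse to guess a threshold.
            raise ValueError(f"no et_score listed for aligner {aligner!r}")
        cmd += ["--et_score", str(ET_SCORE_BY_ALIGNER[key])]
    return cmd


# genome.fasta and introns.gff are placeholder file names.
print(build_et_command("genome.fasta", "introns.gff", aligner="TopHat2"))
print(build_et_command("genome.fasta", "introns.gff", aligner="TrueSight"))
```

When no aligner is given, the sketch leaves --et_score off so that the program's own default applies.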

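GeneMark-EP: protein

--sequence: sequence file in FASTA format
--EP: file of intron coordinates obtained by spliced alignment of proteins to the genome, in GFF format (the output of the ProtHint pipeline)
--dbep: protein database file in FASTA format
--ep_score: intron score threshold

GeneMark.hmm

--predict_with: species-specific gene prediction parameters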

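Other parameters

--fungus: use for fungal genome prediction
--evidence: hint file for PLUS mode
--soft_mask: a number or auto; masks lowercase repeats longer than the given length. In auto mode the length is adjusted to the genome size. Default: auto
--cores: number of threads. Default: 1
--pbs: run on a PBS system
--max_contig: split the genome into contigs shorter than max_contig
--min_contig: ignore contigs shorter than min_contig during training
--max_mask: split sequences at repeats longer than max_mask. The letters x and X are interpreted as the result of hard masking
--gc_donor: probability of a GC donor, a value between 0 and 1. In auto mode the probability is estimated from the training data
--gc3: GC3 threshold used when training on grasses
--training: run only the training step; used in ES, ET and EP modes
--prediction: predict with the species-specific parameters obtained from an earlier training run; used in ES, ET and EP modes
--usr_cfg: user-defined configuration file
--ini_mod: parameter file required by the algorithm
--test_set: evaluate the predictions against the given test set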

Intron coordinate file (GFF format)

"seqname"  "source"  "feature"  "start"  "end"  "score"  "strand" "frame" "attribute"
2L     TopHat2 intron  2740    2888    25      +       .       .
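
To make the nine-column layout concrete, here is a small sketch that parses one intron line into a record and filters records by score. It assumes tab-separated fields and treats the threshold as inclusive (score >= minimum); whether gmes_petap.pl itself uses an inclusive comparison is not stated in the README excerpt above. The names Intron, parse_intron_line and filter_introns are hypothetical.

```python
from typing import NamedTuple


class Intron(NamedTuple):
    seqname: str
    source: str
    start: int
    end: int
    score: float
    strand: str


def parse_intron_line(line):
    """Parse one tab-separated, nine-column GFF intron line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, _frame, _attr = fields
    if feature != "intron":
        raise ValueError(f"not an intron feature: {feature!r}")
    return Intron(seqname, source, int(start), int(end), float(score), strand)


def filter_introns(lines, min_score):
    """Keep introns with score >= min_score; skip blank and comment lines."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        intron = parse_intron_line(line)
        if intron.score >= min_score:  # inclusive threshold: an assumption
            kept.append(intron)
    return kept


example = "2L\tTopHat2\tintron\t2740\t2888\t25\t+\t.\t."
print(parse_intron_line(example))
print(len(filter_introns([example], min_score=10)))  # prints: 1
```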

TopHat result → GFF
path_to/bed_to_gff.pl  --bed path_to_tophat_out/junctions.bed   --gff introns.gff  --label TopHat2

STAR result → GFF
path_to/star_to_gff.pl --star path_to_star_out/SJ.out.tab  --gff introns.gff  --label STAR
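
To illustrate what such a conversion involves, the sketch below turns one STAR SJ.out.tab row into an intron GFF line. It is not a reimplementation of star_to_gff.pl: the column layout follows the STAR manual (1-based intron start and end in columns 2 and 3, strand code in column 4, uniquely mapped read count in column 7), but using the unique read count as the GFF score is an assumption, and the bundled script may apply extra filtering. The function name sj_to_gff_line is hypothetical.

```python
# STAR strand codes: 0 = undefined, 1 = '+', 2 = '-'.
STRAND = {"0": ".", "1": "+", "2": "-"}


def sj_to_gff_line(sj_line, label="STAR"):
    """Convert one SJ.out.tab row into a nine-column intron GFF line."""
    cols = sj_line.split()
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    chrom, start, end, strand_code = cols[:4]
    unique_reads = cols[6]  # used as the score here (assumption)
    return "\t".join(
        [chrom, label, "intron", start, end, unique_reads,
         STRAND[strand_code], ".", "."]
    )


# A made-up SJ.out.tab row for demonstration.
row = "2L\t2740\t2888\t1\t1\t0\t25\t0\t38"
print(sj_to_gff_line(row))
```

For real data, prefer the star_to_gff.pl script shipped with GeneMark so that the scores match what --et_score expects.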

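2. Brief description of the fields

The columns of the result file are:

"seqname"  "source"  "feature"  "start"  "end"  "score"  "strand" "frame" "attribute"

3. Brief description of the output files
genemark.gtf: the prediction output, in GTF format
gmhmm.mod: the trained GeneMark model, which can be used as input to MAKER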
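
A quick way to sanity-check genemark.gtf is to count the predicted genes and the feature types. The sketch below assumes standard GTF conventions (tab-separated columns, a gene_id "..." entry in the attribute column); the function name summarize_gtf and the sample lines are hypothetical and are not real GeneMark output.

```python
import re
from collections import Counter

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def summarize_gtf(lines):
    """Return (number of distinct gene_ids, Counter of feature types)."""
    genes = set()
    features = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # skip malformed rows rather than fail
        features[fields[2]] += 1
        match = GENE_ID.search(fields[8])
        if match:
            genes.add(match.group(1))
    return len(genes), features


# Made-up lines for demonstration.
sample = [
    'seq1\tGeneMark.hmm\tCDS\t10\t90\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t1";',
    'seq1\tGeneMark.hmm\tCDS\t150\t300\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t1";',
    'seq1\tGeneMark.hmm\tCDS\t500\t800\t.\t-\t0\tgene_id "g2"; transcript_id "g2.t1";',
]
n_genes, feature_counts = summarize_gtf(sample)
print(n_genes, dict(feature_counts))  # prints: 2 {'CDS': 3}

# For a real run: with open("genemark.gtf") as fh: summarize_gtf(fh)
```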

Article link: https://www.haomeiwen.com/subject/vzntahtx.html
