参考:Users Guide for New BCsFt Tools for R/qtl
R/qtl: A QTL mapping environment
R/qtl包学习(一)
使用R/qtl进行QTL分析
R/qtl包的QTL学习
以及师兄给我的数据代码~
一、介绍
QTL图谱研究方法:回交(backcross)、同胞交配(sib-mating)、自交(selfing)、重组自交系(RI lines)以及种群中随机交配(generations of random
mating within mapping populations)
QTL比对问题可分为两个部分:缺失数据的处理和模型选择
1)缺失数据处理:重组模型
这其中最简单的是no crossover interference:recombination events in disjoint intervals are independent. 这样,染色体上就形成了一个马尔可夫链。也就是判断基因型的时候,只需要看它两侧的marker就可以,不需要考虑其他marker。
2)模型选择:
常见positive crossover interference,不倾向在相邻位置发生重组,通过使用HMM模型
二、流程
install.packegs("qtl")
library(qtl)
setwd("./xxx/xxx")
- 导入数据
函数:read.cross
用法:
read.cross(format=c("csv", "csvr", "csvs", "csvsr", "mm", "qtx",
"qtlcart", "gary", "karl", "mapqtl", "tidy"),
dir="", file, genfile, mapfile, phefile, chridfile,
mnamesfile, pnamesfile, na.strings=c("-","NA"),
genotypes=c("A","H","B","D","C"), alleles=c("A","B"),
estimate.map=FALSE, convertXdata=TRUE, error.prob=0.0001,
map.function=c("haldane", "kosambi", "c-f", "morgan"),
BC.gen=0, F.gen=0, crosstype, ...)
基因型文件:
image.png
表型文件:
image.png
示例:
a <- read.cross("csvs", ".", "gen.csv", "phe.csv", genotypes=c("A","H","B"),crosstype="riself")
##查看输入文件相关信息
summary(a)
image.png
- 统计对应信息函数
## 样本数
nind(a)
## 染色体数
nchr(a)
## 标记数
totmar(a)
## 每个染色体上的标记数
nmar(a)
## 表型数
nphe(a)
image.png
##用图来展示信息
plot(a)
##展示缺失基因型数据(黑色为缺失的基因型)
plotMissing(a)
image.png
## 绘制遗传图谱
plotMap(a)
image.png
## 绘制表型分布直方图
plotPheno(a, pheno.col=1)
image.png
- jittermap()
Jitter the marker positions in a genetic map so that no two markers are on top of each other.
jit.a<-jittermap(a)
- calc.genoprob()
Uses the hidden Markov model technology to calculate the probabilities of the true underlying genotypes given the observed multipoint marker data, with possible allowance for genotyping errors.
利用隐马尔可夫模型技术,在观察到的多点标记数据的情况下,计算真正的基本基因型的概率,并可能考虑到基因分型误差。
calc.a<-calc.genoprob(jit.a)
- sim.geno()
Uses the hidden Markov model technology to simulate from the joint distribution Pr(g | O) where g is the underlying genotype vector and O is the observed multipoint marker data, with possible allowance for genotyping errors.
sim.a<-sim.geno(jit.a)
- cim()
Composite interval mapping by a scheme from QTL Cartographer: forward selection at the markers (here, with filled-in genotype data) to a fixed number, followed by interval mapping with the selected markers as covariates, dropping marker covariates if they are within some fixed window size of the location under test.
lod_result<-cim(sim.a,pheno.col="phe",method="imp",window=1)
per<-cim(sim.a,n.perm=1000,pheno.col="phe",method="imp",window=1)
n.perm
:置换检验中置换组数
- 画图
pdf(file="./xxx.chr1.pdf",width=14)
par(mfrow=c(2,1))
plot(lod_result, bandcol="gray70", cex=0.3, pch=21, bg="slateblue",main="xxx",chr='1')
##adds one or more straight lines through the current plot.
abline(h=threshold[2+1],col="lightgray")
scan<-effectscan(sim.a,draw=TRUE,pheno.col="phe",chr='1')
dev.off()
##后面就修改一下chr参数将所有染色体的图绘制出来
##绘制总的
pdf(file="./xxx.chr1.pdf",width=14)
par(mfrow=c(2,1))
plot(lod_result, bandcol="gray70", cex=0.3, pch=21, bg="slateblue",main="xxx",chr='1')
##adds one or more straight lines through the current plot.
abline(h=threshold[2+1],col="lightgray")
scan<-effectscan(sim.a,draw=TRUE,pheno.col="phe")
dev.off()
- lodint()
Calculate a LOD support interval for a particular chromosome, using output from scanone.
计算特定染色质的LOD区间
Tips:我们常说的LOD值=log10 (L1/L0) ,L1指该位点含QTL的概率,L0指该位点不含QTL的概率。LOD值为3表示该位点含QLT的概率是不含QTL概率的1000倍。
网友评论