liftOver | Converting hg19 to hg38

Author: 苦哈哈的柠檬水 | Published: 2022-09-07 19:49

Tool: liftOver
https://gwaslab.com/2021/05/09/liftover-%E5%9F%BA%E5%9B%A0%E7%BB%84%E5%9D%90%E6%A0%87%E5%8F%98%E6%8D%A2/

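Input file

(circRNA IDs extracted from the expression profile)
A BED file. Only the first three columns are required: chr, start, end, with no header line.
Note: start must not equal end. UCSC BED files use 0-based, half-open coordinates, whereas Ensembl and similar resources use 1-based coordinates, so a 1-based start has to be decreased by 1 before the BED file is written.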
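To make the coordinate shift in the note concrete, here is a minimal shell sketch, separate from the R workflow in this post. It assumes the IDs look like `chr:start|end` with a 1-based start, which is the format the R code in this post splits on. The function name `ids_to_bed` is made up for illustration.

```shell
# Hypothetical helper: read 1-based IDs of the form chr:start|end from stdin
# and print 0-based, half-open BED lines (chr, start - 1, end), tab-separated.
ids_to_bed() {
  # Split each line on ":" or "|" and skip lines that do not have 3 fields.
  awk -F'[:|]' 'NF == 3 { printf "%s\t%d\t%d\n", $1, $2 - 1, $3 }'
}

# Only the start moves; the end is unchanged.
printf 'chr1:100001|100500\n' | ids_to_bed
# prints: chr1<TAB>100000<TAB>100500
```

Printing the start with `%d` keeps it as a plain integer, so a value such as 100000 is never written in scientific notation.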

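Processing in R
library(stringr)   # str_split
library(magrittr)  # %>%

A <- expr  # expression matrix; row names are 1-based IDs of the form "chr:start|end"

# Shift the start of each ID from 1-based to 0-based.
# Integer arithmetic stops paste0() from writing values such as "1e+05".
IDchange1_0 <- function(x){
  str_split(x,"[:,|]") %>% sapply(function(y){
    paste0(y[1], ":", as.integer(y[2]) - 1L, "|", y[3])
  })
}
rownames(A) <- IDchange1_0(rownames(A))

# Split the 0-based IDs into chr/start/end columns and write a headerless BED file
rawID <- str_split(rownames(A),"[:,|]")
rawID <- data.frame(matrix(unlist(rawID),ncol = 3,byrow = TRUE))
colnames(rawID) <- c("chr","start","end")
write.table(rawID,"GSE_hg19_0.bed",col.names = F,row.names = F,quote = F, sep = "\t")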

Coordinate conversion

Processing on Linux
chmod +x liftOver
./liftOver GSE_hg19_0.bed hg19ToHg38.over.chain GSE_hg38_0.bed unmapped.txt
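After the conversion it can be worth checking that no record went missing. The sketch below assumes liftOver was run as above, without `-multiple`, so that every input line ends up exactly once in either the mapped file or the unmapped file. It also assumes the unmapped file puts a `#` comment line before each failed record. The function name `check_liftover_counts` is hypothetical.

```shell
# Hypothetical helper: compare record counts before and after liftOver.
# Usage: check_liftover_counts INPUT.bed MAPPED.bed UNMAPPED.txt
check_liftover_counts() {
  # grep -c exits non-zero when the count is 0, hence the "|| true"
  total=$(grep -c '' "$1" || true)
  mapped=$(grep -c '' "$2" || true)
  # lines starting with "#" in the unmapped file are reasons, not records
  unmapped=$(grep -vc '^#' "$3" || true)
  echo "input=$total mapped=$mapped unmapped=$unmapped"
  # succeed only when every input record is accounted for
  [ "$total" -eq $((mapped + unmapped)) ]
}

# Example call with the file names used in this post:
# check_liftover_counts GSE_hg19_0.bed GSE_hg38_0.bed unmapped.txt
```

A non-zero exit status means the counts do not add up, for example because one of the files is stale from an earlier run.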

Output files

Processing in R
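library(data.table)  # fread

# Mapped hg38 intervals (0-based BED): shift start back to 1-based "chr:start|end" IDs
ID <- fread("GSE_hg38_0.bed",data.table = F)
ID <- paste0(ID$V1, ":", as.integer(ID$V2) + 1L, "|", ID$V3)

# Intervals that could not be lifted over, still in 0-based hg19 coordinates.
# liftOver writes a "#" comment line before each record; read.table skips
# those by default (comment.char = "#"). It errors if the file has no records.
unID <- read.table("unmapped.txt",sep = "\t",header = FALSE)
unID <- paste0(unID$V1, ":", unID$V2, "|", unID$V3)

# Drop the unmapped rows (row names of A are 0-based hg19 IDs at this point),
# then relabel the rest; this relies on the mapped records keeping input order.
A1 <- A[!rownames(A) %in% unID,]
rownames(A1) <- ID
write.csv(A1,"GSE_circRNA_counts_hg38_0.csv")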
