美文网首页简书付费文章TCGA文献
一文解决pan-cancer泛癌单基因分析问题(TCGA,GTE

一文解决pan-cancer泛癌单基因分析问题(TCGA,GTE

作者: 柳叶刀与小鼠标 | 来源:发表于2020-05-30 22:04 被阅读0次

(1)分析单个基因在TCGA多个类型肿瘤中的分布(正常/肿瘤)

  • 通过TCGAbiolinks下载表达量数据集
  • 注释将表达量矩阵转化成基于gene symbol的表达矩阵
  • 绘图展示单个基因在tcga数据库泛癌中的分布
    第一步,下载表达矩阵
#=======================================================


#=======================================================


library(GenomicDataCommons)

setwd('D:\\SCIwork\\F16DMDmeta\\review\\TCGA')

rm(list=ls())


library(dplyr)

library(TCGAbiolinks)

library(dplyr)

library(DT)

library(SummarizedExperiment)

library(stringr)

#=======================================================


#=======================================================

cancer  <- TCGAbiolinks:::getGDCprojects()$project_id

cancer <- str_subset(cancer, "TCGA")

cancer <- sort(cancer)




for (i in 1:33) {
  cancer_select <- cancer[i]
  print(cancer_select)
  #下载rna-seq的counts数据
  suppressMessages({
    query <- GDCquery(
      project = cancer_select,
      data.category = "Transcriptome Profiling",
      data.type = "Gene Expression Quantification",
      workflow.type = "HTSeq - FPKM")  })
  
  
  if (is.null(query)){
    print(paste0("No FPKM data of solid normal tissue for ", cancer_select ))
  } else{
    
    GDCdownload(query, method = "api", 
                files.per.chunk = 150)
    expdat <- GDCprepare(query = query, save = TRUE,
                         save.filename = paste0(cancer_select,".rda"))
    count_matrix=assay(expdat)
    write.csv(count_matrix,
              file = paste( cancer_select,"Counts.csv",
                            sep = "-"))}}

相关文章

网友评论

    本文标题:一文解决pan-cancer泛癌单基因分析问题(TCGA,GTE

    本文链接:https://www.haomeiwen.com/subject/dtwhzhtx.html