TCGAbiolinks: 1. Data Download

Author: 重拾生活信心 | Published: 2023-06-07 16:20

GDCquery

GDCquery(
  project,
  data.category,
  data.type,
  workflow.type,
  legacy = FALSE,
  access,
  platform,
  file.type,
  barcode,
  data.format,
  experimental.strategy,
  sample.type
)
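  • project: a list of valid project IDs (see the list with TCGAbiolinks:::getGDCprojects()$project_id)
# All project IDs known to the GDC
projects = TCGAbiolinks:::getGDCprojects()$project_id
projects
# Keep only the TCGA projects and print them in alphabetical order
TCGAs = grep("TCGA", projects, value = TRUE)
sort(TCGAs)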
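  • data.category: a data category available for the project (listed under $data_categories in the project summary)
TCGAbiolinks:::getProjectSummary(project = "TCGA-LUAD")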

$file_count
[1] 29733

$data_categories
  file_count case_count               data_category
1       4266        582            Sequencing Reads
2       2731        585                 Biospecimen
3       4553        518       Copy Number Variation
4       9967        571 Simple Nucleotide Variation
5       2334        519     Transcriptome Profiling
6       1971        579             DNA Methylation
7       1146        585                    Clinical
8       2400        517        Structural Variation
9        365        365          Proteome Profiling

$case_count
[1] 585

$file_size
[1] 1.349824e+14
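
The category names in the summary above are presumably the strings to pass as data.category. The sketch below rebuilds the printed $data_categories table by hand so that it runs offline; with network access the same table should be available as TCGAbiolinks:::getProjectSummary(project = "TCGA-LUAD")$data_categories. The variable names are illustrative.

```r
# $data_categories copied from the printed TCGA-LUAD summary, so no network is needed.
data_categories <- data.frame(
  file_count = c(4266, 2731, 4553, 9967, 2334, 1971, 1146, 2400, 365),
  case_count = c(582, 585, 518, 571, 519, 579, 585, 517, 365),
  data_category = c("Sequencing Reads", "Biospecimen", "Copy Number Variation",
                    "Simple Nucleotide Variation", "Transcriptome Profiling",
                    "DNA Methylation", "Clinical", "Structural Variation",
                    "Proteome Profiling"),
  stringsAsFactors = FALSE
)

# Candidate values for the data.category argument of GDCquery().
valid_categories <- sort(data_categories$data_category)
print(valid_categories)

# The per-category file counts add up to the $file_count reported by the summary.
total_files <- sum(data_categories$file_count)
print(total_files)
```
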
TCGA barcode
[Figure: barcode_meaning.png, the meaning of each field in a TCGA barcode]
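
A TCGA barcode is a string of hyphen-separated fields: project, tissue source site, participant, sample type plus vial, portion plus analyte, plate, and center. Because the fields have fixed widths, the patient ID and sample type can be pulled out with substr(). This is a minimal base-R sketch; the helper name parse_tcga_barcode is hypothetical, and the second barcode is made up for illustration.

```r
# Hypothetical helper: split TCGA barcodes into a few commonly used parts.
parse_tcga_barcode <- function(barcode) {
  code <- substr(barcode, 14, 15)          # two-digit sample type code
  data.frame(
    barcode = barcode,
    patient = substr(barcode, 1, 12),      # project-TSS-participant
    sample = substr(barcode, 1, 16),       # patient + sample type + vial
    sample_type_code = code,
    is_tumor = as.integer(code) < 10,      # 01-09 tumor, 10-19 normal, 20-29 control
    stringsAsFactors = FALSE
  )
}

barcodes <- c("TCGA-02-0001-01C-01D-0182-01",  # primary solid tumor (code 01)
              "TCGA-02-0001-11A-01D-0182-01")  # illustrative normal sample (code 11)
parsed <- parse_tcga_barcode(barcodes)
print(parsed)
```

Values such as parsed$sample could then be passed to the barcode argument of GDCquery() to restrict a query to specific samples.
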

GDCdownload

GDCdownload(
  query,
  token.file,
  method = "api",
  directory = "GDCdata",
  files.per.chunk = NULL
)
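
For large queries, the files.per.chunk argument in the signature above makes the API method fetch only a few files per request, which may help when a single large download fails. A minimal sketch, assuming TCGAbiolinks is installed and query comes from GDCquery(); the wrapper name download_in_chunks and the chunk size of 10 are illustrative choices, not package defaults.

```r
# Hypothetical helper: download the files matched by a query in small batches.
download_in_chunks <- function(query, directory = "GDCdata", chunk_size = 10) {
  TCGAbiolinks::GDCdownload(
    query,
    method = "api",
    directory = directory,
    files.per.chunk = chunk_size  # number of files fetched per API request
  )
}

# Usage (needs network access and an existing query object):
# download_in_chunks(query, directory = "maf")
```
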

GDCprepare

GDCprepare(
  query,
  save = FALSE,
  save.filename,
  directory = "GDCdata",
  summarizedExperiment = TRUE,
  remove.files.prepared = FALSE,
  add.gistic2.mut = NULL,
  mut.pipeline = "mutect2",
  mutant_variant_classification = c("Frame_Shift_Del", "Frame_Shift_Ins",
    "Missense_Mutation", "Nonsense_Mutation", "Splice_Site", "In_Frame_Del",
    "In_Frame_Ins", "Translation_Start_Site", "Nonstop_Mutation")
)
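
The save and save.filename arguments let GDCprepare write the prepared object to disk in addition to returning it, so later sessions can reload it instead of re-parsing the raw files. A minimal sketch, assuming TCGAbiolinks is installed and the query has already been downloaded; the wrapper name prepare_and_save and the file name prepared.rda are hypothetical.

```r
# Hypothetical helper: prepare a downloaded query and keep a copy on disk.
prepare_and_save <- function(query, directory = "GDCdata", file = "prepared.rda") {
  TCGAbiolinks::GDCprepare(
    query,
    directory = directory,  # should be the directory that was passed to GDCdownload()
    save = TRUE,
    save.filename = file
  )
}

# Usage (after GDCdownload(query, directory = "maf")):
# maf <- prepare_and_save(query, directory = "maf", file = "maf.rda")
```
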

Examples (query-download-prepare: query, download, then process and load the data)

query <- GDCquery(project = "TCGA-KIRP",
                  data.category = "Simple Nucleotide Variation",
                  data.type = "Masked Somatic Mutation",
                  workflow.type = "MuSE Variant Aggregation and Masking")
GDCdownload(query, method = "api", directory = "maf")
maf <- GDCprepare(query, directory = "maf")

# Get GISTIC values
gistic.query <- GDCquery(project = "TCGA-ACC",
                         data.category = "Copy Number Variation",
                         data.type = "Gene Level Copy Number Scores",
                         access = "open")
GDCdownload(gistic.query)
gistic <- GDCprepare(gistic.query)
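
Before downloading, it can help to check which files a query actually matched. The sketch below uses getResults() from TCGAbiolinks to pull the results table out of a query object; the wrapper name preview_query is hypothetical, and the columns of the table depend on the data type, so inspect them with colnames() rather than relying on fixed names.

```r
# Hypothetical helper: report how many files a query matched and show the first few.
preview_query <- function(query, n = 5) {
  results <- TCGAbiolinks::getResults(query)  # one row per matched file
  message("Files matched: ", nrow(results))
  head(results, n)
}

# Usage (needs an existing query object, e.g. gistic.query from the example above):
# preview <- preview_query(gistic.query)
# colnames(preview)
```
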


