Znhit1 Gene Differential Analysis - 2020-03-06

Author: 爬山小虎 | Published: 2020-03-15 22:51

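    Goal: determine whether the gene Znhit1 shows differential expression and differential methylation between acute myeloid leukemia (AML) tumor samples and normal samples.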


    The downloaded files are indexed by Ensembl ID, so the row corresponding to Znhit1 has to be extracted. According to NCBI, the matching Ensembl ID is ENSG00000106400.

    References:
    https://www.cnblogs.com/zdwu/p/9072533.html
    https://cloud.tencent.com/developer/article/1512900
    https://cloud.tencent.com/developer/article/1422044
    https://cloud.tencent.com/developer/article/1531071

    TCGA website: https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga

    Introduction to the TCGA database:
    https://blog.csdn.net/weixin_43700050/article/details/100527245
    https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/sample-type-codes

    1. Download the data

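    # Working directory; GDCdownload() writes into ./GDCdata beneath it
    setwd("/Users/asang/Desktop/TCGA_Analysis")
    
    # Install the Bioconductor package manager if it is missing
    if (!requireNamespace("BiocManager", quietly = TRUE))
      install.packages("BiocManager")
    
    BiocManager::install("TCGAbiolinks")
    BiocManager::install("Rsamtools")
    
    library(TCGAbiolinks)
    
    # RNA-seq raw counts for TCGA-LAML (acute myeloid leukemia).
    # Note: "HTSeq - Counts" was the workflow name when this was written;
    # more recent GDC data releases provide "STAR - Counts" instead.
    query <- GDCquery(project = "TCGA-LAML", 
                      data.category = "Transcriptome Profiling",
                      data.type = "Gene Expression Quantification",
                      workflow.type = "HTSeq - Counts")
    GDCdownload(query)
    
    # 450K methylation array data from the GDC legacy archive.
    # Note: recent TCGAbiolinks versions may no longer accept `legacy`.
    query.met <- GDCquery(project ="TCGA-LAML", 
                          legacy = TRUE,
                          data.category = "DNA methylation",
                          platform = "Illumina Human Methylation 450")
    GDCdownload(query.met)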
    

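    Because the downloaded files are stored in a separate folder per sample, they need to be moved into a single folder before further processing: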

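    #!/bin/bash
    # Move every file out of its per-sample subfolder into the parent folder
    cd /Users/asang/Desktop/TCGA_Analysis/GDCdata/TCGA-LAML/harmonized/Transcriptome_Profiling/Gene_Expression_Quantification || exit 1
    # Iterate over subfolders only (the trailing slash skips plain files)
    for dir in */
    do
        mv "$dir"* .
    done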
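
Once the count files sit in one folder, the row for ENSG00000106400 can be pulled out of each of them. The following is only a minimal sketch: the function name `extract_gene_counts` is hypothetical, and it assumes the files are gzipped, end in `.counts.gz`, and contain tab-separated `gene_id<TAB>count` rows in which the gene ID may carry a version suffix (for example `.10`).

```shell
#!/bin/sh
# Hypothetical helper: for one Ensembl gene ID, print "<file name><TAB><count>"
# for every gzipped count file found in a directory.
# Assumption: each file holds "<gene_id><TAB><count>" rows, and gene_id may
# end in a version suffix such as ".10", which is ignored when matching.
extract_gene_counts() {
    gene_id="$1"
    dir="$2"
    for f in "$dir"/*.counts.gz; do
        [ -e "$f" ] || continue   # the glob matched nothing
        gzip -dc "$f" | awk -F'\t' -v id="$gene_id" -v name="$(basename "$f")" '
            {
                key = $1
                sub(/\..*$/, "", key)          # strip the version suffix
                if (key == id) print name "\t" $2
            }'
    done
}
```

A possible call, run from the folder that holds the count files, would be `extract_gene_counts ENSG00000106400 . > znhit1_counts.tsv` (the output file name is arbitrary).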
    

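    The earlier download did not take the pairing between tumor and adjacent normal samples into account, so the data are being downloaded again. Reference links: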
    https://mp.weixin.qq.com/s?__biz=MzA4NDAzODkzMA==&mid=2651263330&idx=1&sn=ff440567bbacae48dd41fb3d1daa8751&chksm=841ef51fb3697c098f2c84276930d46884bcb51663f11f2c1c01d822427fed9c4a5c7fe0cf2c&scene=21#wechat_redirect
    https://cloud.tencent.com/developer/article/1481904

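    For RNA-seq, the control group is normally Solid Tissue Normal rather than blood, because the RNA profiles of blood and tumor tissue differ too much. Solid Tissue Normal samples are a minority, however, and for some tumor types there are none at all. In that case, differential expression analysis (DEA) cannot be carried out.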
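
Whether a project contains any Solid Tissue Normal samples can be checked from the sample-type code embedded in each TCGA barcode: the first two digits of the fourth hyphen-separated field (in the GDC code table, `03` is Primary Blood Derived Cancer - Peripheral Blood, `10` is Blood Derived Normal, and `11` is Solid Tissue Normal). The sketch below assumes the barcodes have already been exported to plain text, one per line; the function names are hypothetical.

```shell
#!/bin/sh
# Print the two-digit sample-type code of a TCGA barcode.
# The code is the first two characters of the fourth "-"-separated field,
# e.g. the field "03A" yields "03".
sample_type_code() {
    printf '%s\n' "$1" | cut -d- -f4 | cut -c1-2
}

# Read barcodes from stdin (one per line) and print "<code><TAB><n>"
# for each sample-type code, sorted by code.
count_sample_types() {
    while IFS= read -r barcode || [ -n "$barcode" ]; do
        if [ -n "$barcode" ]; then
            sample_type_code "$barcode"
        fi
    done | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}
```

For example, `count_sample_types < barcodes.txt` (with `barcodes.txt` a hypothetical export) lists how many samples carry each code; if no `11` row appears, the project has no Solid Tissue Normal samples.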

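    According to the literature, for a non-solid tumor like this one can 1) compare against normal tissue, or 2) compare against bone marrow aspirate results from healthy individuals.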

    Download from NCBI

    Unfinished; to be tidied up.
