【00】
GEO database. Much the same workflow as the sections below.
【PCA, KEGG, GO, volcano plot, etc.】
> getwd()
[1] "D:/R_code/follow_practice/xuetu_GEO_follow/week_practise/01_follow_practise/00_TNBC_GSE76275"
> dir()
[1] "exprSet_by_group.Rdata" "finalSet.Rdata"
[3] "GPL570.annot.gz" "GSE76275_eSet.Rdata"
[5] "GSE76275_series_matrix.txt.gz" "heatmap_top100_logFC.png"
[7] "ID2gene.Rdata" "kegg_up_down.png"
[9] "nrDEG.out" "nrDEG_by_logFC.Rdata"
[11] "pca_plot.png" "readme.txt"
[13] "Step01_getGEO.R" "step02_getmarix.R"
[15] "step03_gene_symbol.R" "step04_rm_nogene.R"
[17] "step05_PCA.R" "step06_DEG.R"
[19] "step07_pheatmap.R" "Step08_Volcano_plot.R"
[21] "step09_KEGG_GO.R" "TNBC_breastcancer.Rproj"
[23] "volcano.png"
【1】.02_GSE108565【fairly complete】
.GEO:【hclust clustering, DEG (limma package), volcano plot, heatmap, KEGG and GO enrichment analysis】
dir()
[1] "02_GSE108565.Rproj"
[2] "dotplot_gene_diff_BP.png"
[3] "dotplot_gene_diff_CC.png"
[4] "dotplot_gene_diff_MF.png"
[5] "dotplot_gene_down_BP.png"
[6] "dotplot_gene_down_CC.png"
[7] "dotplot_gene_down_MF.png"
[8] "dotplot_gene_up_BP.png"
[9] "dotplot_gene_up_CC.png"
[10] "dotplot_gene_up_MF.png"
[11] "final_exprSet.Rdata"
[12] "go_enrich_results.Rdata"
[13] "gset.Rdata"
[14] "hclust.png"
[15] "heatmap.png"
[16] "kegg_up_down.png"
[17] "nrDEG.out"
[18] "nrDEG.Rdata"
[19] "pca_plot.png"
[20] "step01_download.R"
[21] "step02_handle_data.R"
[22] "step03_DEG_heatmap_volcano.R"
[23] "step04_KEGG_GO.R"
[24] "volcano.png"
【2】.UCSCXenaTools
【Downloading and processing TCGA data with UCSCXenaTools】
dir()
[1] "01_UCSCXenaTools_download.R"
[2] "02_UCSCXenaTools_.R"
[3] "03_TCGA-BRCA.Rproj"
[4] "GDCdata"
[5] "MANIFEST.txt"
[6] "race_sample.Rdata"
[7] "step01_download_handle.R"
[8] "TCGA-BRCA.GDC_phenotype_file.Rdata"
[9] "TCGA-BRCA.GDC_phenotype_file.tsv"
[10] "TCGA-BRCA.htseq_counts.Rdata"
[11] "TCGA-BRCA.htseq_counts.tsv"
[12] "必读.txt"
getwd()
[1] "D:/R_code/follow_practice/xuetu_GEO_follow/week_practise/01_follow_practise/03_TCGA-BRCA"
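【3】Downloading and merging TCGA files
【Merging files (★), extracting lncRNA and miRNA, related annotation】
【Not yet done: ① normalize the ncRNA expression profile, then download the microRNA data separately; with both, a ceRNA network can be built. ② Sort by p-value and fold change (FC) and pick a number of genes of your own choosing to get the final gene list gene (I did not demonstrate this; it has to be done on your own).】
【The GO and KEGG analysis here seems to differ somewhat from the TCGA one; revisit later.】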
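Item ② above (sort by p-value and FC, then keep a chosen number of genes) could look roughly like the sketch below. The toy table, the cutoffs and the helper name select_genes are all hypothetical; the column names log2FoldChange and pvalue assume a DESeq2-style results table and may not match what 03_DESeq.R actually saves.

```r
# Toy DEG table (made-up values). Column names assume DESeq2-style output.
deg <- data.frame(
  gene           = c("A", "B", "C", "D", "E"),
  log2FoldChange = c(2.5, -3.1, 0.2, 1.8, -0.4),
  pvalue         = c(0.001, 0.0005, 0.6, 0.03, 0.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep genes passing both cutoffs, then rank by
# smallest p-value first, breaking ties by larger absolute fold change.
select_genes <- function(deg, n = 3, p_cut = 0.05, fc_cut = 1) {
  keep <- deg[deg$pvalue < p_cut & abs(deg$log2FoldChange) > fc_cut, ]
  keep <- keep[order(keep$pvalue, -abs(keep$log2FoldChange)), ]
  head(keep$gene, n)
}

# Final gene list; n is whatever number you decide on.
gene <- select_genes(deg, n = 3)
```

Filtering first and ranking second keeps the list limited to genes that pass both thresholds; if a fixed list length matters more than the cutoffs, the filter step could be relaxed.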
dir()
[1] "02_GTF_mRNA_ncRNA.R"
[2] "03_DESeq.R"
[3] "04_01_DIY.R"
[4] "04_02_DIY.R"
[5] "04_GO_KEGG.R"
[6] "04_test.Rproj"
[7] "BRCA_DEG.xls"
[8] "dds_DEseq.Rda"
[9] "expr_df.Rda"
[10] "expr_df_nopoint.Rda"
[11] "gdc_download_20210717_084045.768266"
[12] "gdc_download_20210717_084045.768266.tar.gz"
[13] "gdc_manifest_20210717_083623.txt"
[14] "gtf_df.Rda"
[15] "Homo_sapiens.GRCh38.104.chr.gtf"
[16] "Homo_sapiens.GRCh38.104.chr.gtf.gz"
[17] "Homo_sapiens.GRCh38.104.chr.gtf0000"
[18] "LuminalABvsNormal_FC6.TSS.pdf"
[19] "MANIFEST.txt"
[20] "metadata.cart.2021-07-17.json"
[21] "metadata.Rda"
[22] "mRNA_exprSet.Rda"
[23] "mRNA_exprSet_vst.Rda"
[24] "readme.txt"
[25] "resSig.Rdata"
[26] "result.Rda"
[27] "TCGA-BRCA.htseq_counts.tsv"
[28] "TCGA-BRCA.htseq_counts.tsv.gz"
[29] "volcano.png"
getwd()
[1] "D:/R_code/follow_practice/xuetu_GEO_follow/week_practise/01_follow_practise/04_test"
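【4】This one compares by TP53. It has points in common with 【3】, so the two can be read together.
【Highlight: the volcano plot and heatmap code is first wrapped into functions, then differential analysis is run with limma or edgeR, and the plots are drawn at the end】【KEGG and GO enrichment analysis is handled the same way】
【These wrapped functions can be reused directly elsewhere】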
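As a reminder of what "wrapping the plotting code into a function" means, here is a minimal base-R sketch. It is not the code from step03_DEG.R: the function names classify_deg and draw_volcano, the cutoffs, and the limma-style column names logFC and P.Value are assumptions for illustration only.

```r
# Hypothetical helper: label each gene UP, DOWN or NOT.
# Cutoffs are illustrative defaults, not values taken from the project.
classify_deg <- function(deg, logfc_cut = 1, p_cut = 0.05) {
  ifelse(deg$P.Value < p_cut & deg$logFC > logfc_cut, "UP",
         ifelse(deg$P.Value < p_cut & deg$logFC < -logfc_cut, "DOWN", "NOT"))
}

# Hypothetical wrapper: draw a volcano plot, optionally into a PNG file.
# Extra arguments (...) are passed on to classify_deg().
draw_volcano <- function(deg, file = NULL, ...) {
  change <- classify_deg(deg, ...)
  cols <- c(UP = "red", DOWN = "blue", NOT = "grey50")
  if (!is.null(file)) {
    png(file)
    on.exit(dev.off())  # close the device even if plotting fails
  }
  plot(deg$logFC, -log10(deg$P.Value), col = cols[change], pch = 16,
       xlab = "logFC", ylab = "-log10(P.Value)")
  invisible(table(change))
}

# Toy input (made-up values) to check the labelling step.
demo <- data.frame(logFC = c(2, -2, 0.1), P.Value = c(0.01, 0.01, 0.5))
demo_change <- classify_deg(demo)
```

Once defined, the same function can be called on any DEG table with these two columns, for example draw_volcano(demo, file = "volcano.png"), regardless of whether the table came from limma or edgeR.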
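【Problem encountered along the way:】
ggsave( volcano, filename = './fig/volcano.png' ))
### This raised an error. The same code in 【1】 ran without error; cause not yet identified. Note that the line as recorded here has an extra closing ")", which on its own is a parse error in R.
Error screenshot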
> dir()
[1] "data" "fig"
[3] "nrDEG.Rdata" "raw_data"
[5] "step01_downpackage.R" "step02_download.R"
[7] "step03_DEG.R" "step04_KEGG_GO.R"
[9] "TP53_BACR.Rproj"
> getwd()
[1] "D:/R_code/follow_practice/xuetu_GEO_follow/week_practise/01_follow_practise/01_TP53_BRCA"
【05】
【shell】cd/d D:
D:\R_code\follow_practice>cd/d xuetu_GEO_follow/week_practise/01_follow_practise/05_GBM_GSE4290/shell
【06】
【Drawing the clustering tree (dendrogram) with WGCNA】【Building the co-expression matrix】【TOM plot】
2021.7.23 22:30
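Survival analysis
The following is important
【Survival analysis】
Known problem: data downloaded through the package differs from data downloaded directly from the website, and reading it in behaves abnormally. It is therefore better to download directly from the website and then analyze.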
## Source
"D:/R_code/follow_practice/xuetu_GEO_follow/week_practise/02_follow/02_TCGA_KM_KIRC"
The analysis itself follows