Figure reproduction - ONECUT2 is a driver of neur


Author: Juan_NF | Published: 2019-04-01 00:17
    1 Terminology

    NEPC: neuroendocrine prostate cancer
    PCa: prostate cancer
    t-NEPC: treatment-emergent NEPC
    hypoxia-directed therapy

    2 Data

    PCa Beltran data set
    ---All BAM files and associated sample information are described in Supplementary Table 11; data are deposited in dbGap phs000909.v.p1 and accessible on the cBIO Portal for Cancer Genomics.

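    RNASeqV2 data from TCGA were processed and normalized with [RSEM] to produce TPM (transcripts per million).
    The raw data are too large, and it seems we generally do not have download access to dbGaP.
    So the data I used are those in data_RNA_Seq_expression_median.txt.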


    PCa Lin data set

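    ----? LTL545 | combined with a clinical cohort? ------ not processed for now

    GPL14450 data --- quantile normalization

    Because this is microarray data, the annotation file for the corresponding platform is needed; I downloaded the annotation txt from GEO.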
    https://www.ncbi.nlm.nih.gov/geo/browse/
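Mapping probe IDs to gene symbols with a downloaded annotation file can be sketched roughly as follows. The file names and the two-column layout (probe ID, gene symbol) are assumptions for illustration only; the real GPL14450 annotation has its own column layout, which needs to be inspected first.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Made-up annotation table: probe ID <TAB> gene symbol.
printf 'P1\tONECUT2\nP2\tAR\n' > demo_annot.tsv
# Made-up expression matrix: probe ID <TAB> one value per sample.
printf 'P1\t5.1\t7.3\nP2\t9.0\t2.2\nP3\t1.0\t1.1\n' > demo_expr.tsv

# First file: build a probe -> symbol lookup.
# Second file: replace the probe ID with its symbol and drop unannotated probes.
awk -F '\t' -v OFS='\t' '
    NR == FNR { symbol[$1] = $2; next }
    ($1 in symbol) { $1 = symbol[$1]; print }
' demo_annot.tsv demo_expr.tsv > demo_expr_symbols.tsv

cat demo_expr_symbols.tsv
```

Probes that map to the same gene are left as separate rows here; how to collapse them (mean, maximum, and so on) is a separate decision that this sketch does not make.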

    CCLE data

    CCLE: Lung Cancer
    CCLE: Nervous system tumor

    • SCLC = lung + small_cell + ATCC + Gender(F/M) - note (duplicate): 38
    • NSCLC = lung - small_cell - large_cell + ATCC|ECACC + Gender(F/M) - note (duplicate): 71
    • Neuroblastoma = neuroblastoma + Gender(F/M) - note (duplicate): 11
    • glioma = glioma + Gender(F/M) - note (duplicate): 33
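    # Line 3 of the GCT file is the header row (Name, Description, then one column per cell line).
    zcat CCLE_RNAseq_genes_rpkm_20180929.gct.gz | sed -n '3p' > cell_line.txt
    # Transpose the header to one name per line, so that line number = column number.
    awk '{for(i=1;i<=NF;i++) print $i}' cell_line.txt > tcell_line.txt
    # Create num.sh (paste the loop below, then press Ctrl-D).
    cat > num.sh
    # Usage: bash num.sh LIST.txt, where LIST.txt holds one cell line name per line.
    while IFS= read -r line
    do
      grep -n -- "${line}" tcell_line.txt >> "${1}_num.txt"
    done < "$1"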
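    ##### Lesson learned here: scc.txt was pasted from an Excel filter on Windows and then uploaded to the server. The file was not in Unix format, so grep kept returning nothing. Only after converting it to Unix format in Notepad++ and uploading it again did the script produce results. $1 is the txt file of cell lines that I filtered on Windows according to the description in the paper. The purpose of this step is to get the column numbers of those cell lines, so that cut can later pull out their expression columns.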
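The line-ending problem described above can also be checked and fixed directly on the server. A minimal sketch, assuming made-up file names and cell line names (the demo creates its own files and does not touch the real data):

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical list with Windows (CRLF) line endings, as pasted from Excel.
printf 'NCIH69_LUNG\r\nNCIH82_LUNG\r\n' > demo_list.txt
# Hypothetical one-name-per-line header file, like tcell_line.txt.
printf 'Name\nDescription\nNCIH69_LUNG\nA549_LUNG\nNCIH82_LUNG\n' > demo_header.txt

# With CRLF input every pattern carries a trailing carriage return,
# so nothing in the header file matches.
before=$(grep -c -F -f demo_list.txt demo_header.txt || true)

# Strip the carriage returns (same effect as converting to Unix format).
tr -d '\r' < demo_list.txt > demo_list_unix.txt

# -F: fixed strings, -x: match whole lines only, -n: print line numbers.
after=$(grep -n -x -F -f demo_list_unix.txt demo_header.txt)

echo "matches before conversion: $before"
echo "$after"
```

Using `-x` also avoids partial matches, where a short cell line name is a substring of a longer one.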
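    # Create target.sh (paste the loop below, then press Ctrl-D).
    cat > target.sh
    # Usage: bash target.sh LIST.txt_num.txt, with "column_number:cell_line" lines from num.sh.
    while IFS= read -r line
    do
      num=${line%%:*}   # text before the first ':' is the column number
      # Skip the two GCT preamble lines, extract the column, and join it into one space-separated row.
      zcat CCLE_RNAseq_genes_rpkm_20180929.gct.gz | sed '1,2d' | cut -f "${num}" | paste -s -d ' ' - >> "${1}_target.txt"
    done < "$1"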
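    ##### This step runs cut with the column numbers from the previous step. Each extracted column is flattened into a single row, which can then be appended to the output file by redirection.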
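Decompressing the file once per cell line is slow on a large matrix. As an alternative sketch (not the approach used above), all selected columns can be pulled out in a single pass. The tiny GCT-like file and the `demo_*` names below are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Tiny made-up GCT-like file: version line, dimensions line, header, two gene rows.
printf '#1.2\n2\t3\n' > demo.gct
printf 'Name\tDescription\tA_LUNG\tB_CNS\tC_LUNG\n' >> demo.gct
printf 'g1\tGENE1\t1.0\t2.0\t3.0\n' >> demo.gct
printf 'g2\tGENE2\t4.0\t5.0\t6.0\n' >> demo.gct
gzip demo.gct

# Hypothetical grep -n output from the previous step: "<column number>:<cell line>".
printf '3:A_LUNG\n5:C_LUNG\n' > demo_num.txt

# Join the column numbers into a comma-separated list, here "3,5".
cols=$(cut -d ':' -f 1 demo_num.txt | paste -s -d ',' -)

# gzip -dc is equivalent to zcat. Skip the two preamble lines, then keep
# Name, Description and the selected columns.
# Note: cut emits columns in file order, whatever the order of the list.
gzip -dc demo.gct.gz | tail -n +3 | cut -f "1,2,${cols}" > demo_target.tsv

cat demo_target.tsv
```

The result keeps genes as rows and cell lines as columns, which is the layout most downstream tools expect, whereas the loop above writes one row per cell line.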
    

    Z-score
    For mRNA and microRNA expression data, we typically compute the relative expression of an individual gene and tumor to the gene's expression distribution in a reference population. That reference population is all samples that are diploid for the gene in question (by default for mRNA), or normal samples (when specified), or all profiled samples. The returned value indicates the number of standard deviations away from the mean of expression in the reference population (Z-score). This measure is useful to determine whether a gene is up- or down-regulated relative to the normal samples or all other tumor samples.
    The z-scores are calculated using only patient data. Hence, overexpressed in this case implies higher expression than the average patient.
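The definition above amounts to z = (x - mean) / sd, computed per gene over the reference population. A minimal sketch with awk follows, using made-up values and two labeled assumptions: the reference population is all samples in the row, and the sample standard deviation (n - 1 denominator) is used. Neither choice is confirmed by the quoted text.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Made-up expression values for one gene across five samples (tab-separated).
row=$(printf 'GENE1\t2\t4\t4\t4\t6')

# z = (x - mean) / sd, with mean and sd taken over all samples in the row.
# Assumption: sample standard deviation (n - 1 denominator).
zrow=$(printf '%s\n' "$row" | awk -F '\t' '{
    n = NF - 1; sum = 0; ss = 0
    for (i = 2; i <= NF; i++) sum += $i
    mean = sum / n
    for (i = 2; i <= NF; i++) ss += ($i - mean) ^ 2
    sd = sqrt(ss / (n - 1))
    out = $1
    for (i = 2; i <= NF; i++) out = out "\t" sprintf("%.3f", ($i - mean) / sd)
    print out
}')

echo "$zrow"
```

A gene with identical values in every sample has sd = 0 and would need to be filtered out before this step, since the division is not guarded here.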

    3 R part

    A Wilcoxon test was used to calculate the p-value in every comparison, and Benjamini-Hochberg adjustment was conducted to assess the false discovery rates (FDR) of multiple comparisons. Genes co-up-regulated (fold change > 2 and FDR < 0.05) in NE vs. non-NE comparisons of all four data sets were subjected to the following network analysis.
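The Benjamini-Hochberg adjustment and the threshold filter can be sketched as below. This is only an illustration on made-up numbers for a single data set: the Wilcoxon test itself is not computed (the p-values are given as input), and the intersection across the four data sets is not shown. File names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Made-up results: gene, fold change (NE / non-NE), raw p-value.
printf 'G1\t4.0\t0.01\nG2\t3.0\t0.04\nG3\t1.5\t0.03\nG4\t2.5\t0.20\n' > demo_pvals.tsv

# Benjamini-Hochberg: sort p-values ascending, scale the i-th smallest by n / i,
# then take a running minimum from the largest p-value downwards.
sort -t $'\t' -k3,3g demo_pvals.tsv | awk -F '\t' '
    { gene[NR] = $1; fc[NR] = $2; p[NR] = $3 }
    END {
        n = NR; running = 1
        for (i = n; i >= 1; i--) {
            adj = p[i] * n / i
            if (adj < running) running = adj
            fdr[i] = running
        }
        for (i = 1; i <= n; i++)
            printf "%s\t%s\t%s\t%.4f\n", gene[i], fc[i], p[i], fdr[i]
    }' > demo_fdr.tsv

# Keep genes with fold change > 2 and FDR < 0.05 (the thresholds quoted above).
awk -F '\t' '$2 > 2 && $4 < 0.05 { print $1 }' demo_fdr.tsv > demo_hits.txt

cat demo_fdr.tsv
```

In R, the same adjustment is available as `p.adjust(p, method = "BH")`, which is the more natural choice for the R part of this workflow.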

          Article link: https://www.haomeiwen.com/subject/mewfvqtx.html