Analyzing RNA-seq data with DESe

Author: BINBINCC | Published: 2020-10-05 21:48

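    Picking up from the previous post: we have already learned how to build a DESeqDataSet from a count matrix. Today we look at another way of supplying the input data: htseq-count input.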

    Htseq-count input

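    First, a brief introduction to HTSeq: it is a Python package for analyzing sequencing data, and the htseq-count script that ships with it produces per-sample count files that DESeq2 can read. Typical uses of the package include:

    1. Getting statistical summaries of the base-call quality scores to study data quality.
    2. Calculating a coverage vector and exporting it for visualization in a genome browser.
    3. Reading in annotation data from a GFF file.
    4. Assigning aligned reads from an RNA-Seq experiment to exons and genes.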

    Tutorial for the package: https://htseq.readthedocs.io/en/master/tour.html
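
To make the input format concrete: htseq-count writes one tab-separated file per sample, with a feature ID and a read count on each line, followed by summary rows whose names start with `__` (for example `__no_feature`). The sketch below uses base R only. The feature IDs and counts are made up for illustration, and `read_htseq_file` is a hypothetical helper, not part of DESeq2 or HTSeq.

```r
# Toy htseq-count style output (all values invented for illustration).
toy_lines <- c(
  "FBgn0000003\t0",
  "FBgn0000008\t92",
  "FBgn0000014\t5",
  "__no_feature\t1200",
  "__ambiguous\t37"
)
toy_file <- tempfile(fileext = ".txt")
writeLines(toy_lines, toy_file)

# Hypothetical helper: read one count file and drop the "__" summary rows.
read_htseq_file <- function(path) {
  tbl <- read.table(path, sep = "\t", header = FALSE,
                    col.names = c("feature", "count"),
                    stringsAsFactors = FALSE)
  tbl[!startsWith(tbl$feature, "__"), ]
}

toy_counts <- read_htseq_file(toy_file)
print(toy_counts)
```

In practice `DESeqDataSetFromHTSeqCount` reads and combines the files listed in the sample table itself, so a helper like this is mainly useful for inspecting a single file by hand.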

    Analyzing RNA-seq data with DESeq2 (1)
    Analyzing RNA-seq data with DESeq2 (2)
    Analyzing RNA-seq data with DESeq2 (3)
    Analyzing RNA-seq data with DESeq2 (4)
    Analyzing RNA-seq data with DESeq2 (5)

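    # For your own data, point this at the folder with the htseq-count output files:
    directory <- "/path/to/your/files/"
    # This example instead uses the files shipped with the pasilla package:
    directory <- system.file("extdata", package="pasilla",
                             mustWork=TRUE)
    
    # Pick the count files, then derive each sample's condition from its file name
    sampleFiles <- grep("treated",list.files(directory),value=TRUE)
    sampleCondition <- sub("(.*treated).*","\\1",sampleFiles)
    sampleTable <- data.frame(sampleName = sampleFiles,
                              fileName = sampleFiles,
                              condition = sampleCondition)
    sampleTable$condition <- factor(sampleTable$condition)
    
    # Build the DESeqDataSet from the per-sample count files
    library("DESeq2")
    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                           directory = directory,
                                           design= ~ condition)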
    
    ddsHTSeq
    ## class: DESeqDataSet 
    ## dim: 70463 7 
    ## metadata(1): version
    ## assays(1): counts
    ## rownames(70463): FBgn0000003:001 FBgn0000008:001 ... FBgn0261575:001
    ##   FBgn0261575:002
    ## rowData names(0):
    ## colnames(7): treated1fb.txt treated2fb.txt ... untreated3fb.txt
    ##   untreated4fb.txt
    ## colData names(1): condition
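
The `sub()` call in the code above derives each sample's condition from its file name. The sketch below replays that step in base R on the seven file names that appear as column names of the count matrix, so it runs without pasilla installed. The `relevel()` step at the end is an optional addition, not part of the original code: DESeq2 treats the first factor level as the reference, and alphabetical ordering puts "treated" first.

```r
sampleFiles <- c("treated1fb.txt", "treated2fb.txt", "treated3fb.txt",
                 "untreated1fb.txt", "untreated2fb.txt",
                 "untreated3fb.txt", "untreated4fb.txt")

# Keep everything up to and including "treated"; drop the rest of the name.
sampleCondition <- sub("(.*treated).*", "\\1", sampleFiles)
print(sampleCondition)

# Factor levels default to alphabetical order, so "treated" comes first.
condition <- factor(sampleCondition)
print(levels(condition))

# Optional: make "untreated" the reference level for later comparisons.
condition <- relevel(condition, ref = "untreated")
print(levels(condition))
```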
    

    So what does the raw count data look like?

    > head(counts(ddsHTSeq))
                    treated1fb.txt treated2fb.txt treated3fb.txt untreated1fb.txt untreated2fb.txt untreated3fb.txt untreated4fb.txt
    FBgn0000003:001              0              0              1                0                0                0                0
    FBgn0000008:001              0              0              0                0                0                0                0
    FBgn0000008:002              0              0              0                0                0                1                0
    FBgn0000008:003              0              1              0                1                1                1                0
    FBgn0000008:004              1              0              1                0                1                0                1
    FBgn0000008:005              4              1              1                2                2                0                1
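
A quick way to get a feel for counts like these is to total them per feature and per sample. The sketch below rebuilds the six rows printed above as a plain base R matrix (so it runs without DESeq2) and sums them; `cts` is just an illustrative name.

```r
ids <- c("FBgn0000003:001", "FBgn0000008:001", "FBgn0000008:002",
         "FBgn0000008:003", "FBgn0000008:004", "FBgn0000008:005")
samples <- c("treated1fb.txt", "treated2fb.txt", "treated3fb.txt",
             "untreated1fb.txt", "untreated2fb.txt",
             "untreated3fb.txt", "untreated4fb.txt")

# The six rows shown in the head() output, entered row by row.
cts <- matrix(c(
  0, 0, 1, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 1, 0,
  0, 1, 0, 1, 1, 1, 0,
  1, 0, 1, 0, 1, 0, 1,
  4, 1, 1, 2, 2, 0, 1
), nrow = 6, byrow = TRUE, dimnames = list(ids, samples))

print(rowSums(cts))                       # total reads per feature
print(colSums(cts))                       # total reads per sample (these rows only)
print(rownames(cts)[rowSums(cts) == 0])   # features with no reads at all
```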
    

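    We have now covered the two common input formats; the next step is to start processing the data.
    By the way, when you have time, the HTSeq Python package is worth a closer look. It seems quite powerful.

    See you next time ( ^ . ^ )

    Let's learn and discuss together!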
