Analyzing RNA-seq data with DESe

Author: BINBINCC | Published 2020-10-05 21:48

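Picking up from the previous post, we have already learned how to build a DESeqDataSet from a count matrix. Today we look at a second way to supply the input data: htseq-count input.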

Htseq-count input

First, a brief introduction to HTSeq: it is a Python package for analyzing high-throughput sequencing data. Typical uses include:

1. Getting statistical summaries about the base-call quality scores to study the data quality.
2. Calculating a coverage vector and exporting it for visualization in a genome browser.
3. Reading in annotation data from a GFF file.
4. Assigning aligned reads from an RNA-Seq experiment to exons and genes.

Reference for learning the package: https://htseq.readthedocs.io/en/master/tour.html

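# In your own analysis, point this at the folder holding your htseq-count output:
# directory <- "/path/to/your/files/"
# For this demo, use the example htseq-count files shipped with the pasilla package
# (the package must be installed).
directory <- system.file("extdata", package="pasilla",
                         mustWork=TRUE)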

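# Keep only the htseq-count files; "treated" also matches the "untreated" files
sampleFiles <- grep("treated",list.files(directory),value=TRUE)
# Derive the condition ("treated" or "untreated") from each file name
sampleCondition <- sub("(.*treated).*","\\1",sampleFiles)
sampleTable <- data.frame(sampleName = sampleFiles,
                          fileName = sampleFiles,
                          condition = sampleCondition)
# Store the condition as a factor for use in the design formula
sampleTable$condition <- factor(sampleTable$condition)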
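
To see what the grep() and sub() calls do without installing pasilla, here is a small sketch run on a hypothetical file listing. The vector `files` is made up for illustration and stands in for list.files(directory); only the two pattern-matching calls are taken from the code in this post.

```r
# Hypothetical directory listing (an assumption, not the real pasilla contents)
files <- c("pasilla_gene_counts.tsv",
           "treated1fb.txt", "treated2fb.txt",
           "untreated1fb.txt", "untreated2fb.txt")

# grep() keeps every name containing "treated", which includes "untreated..."
sampleFiles <- grep("treated", files, value = TRUE)

# sub() keeps everything up to and including "treated" and drops the rest,
# so "treated1fb.txt" becomes "treated" and "untreated1fb.txt" becomes "untreated"
sampleCondition <- sub("(.*treated).*", "\\1", sampleFiles)

print(data.frame(fileName = sampleFiles, condition = sampleCondition))
```

The capture group `(.*treated)` is what keeps the optional "un" prefix, which is why one pattern is enough to label both groups.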

library("DESeq2")
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                       directory = directory,
                                       design= ~ condition)

ddsHTSeq
## class: DESeqDataSet 
## dim: 70463 7 
## metadata(1): version
## assays(1): counts
## rownames(70463): FBgn0000003:001 FBgn0000008:001 ... FBgn0261575:001
##   FBgn0261575:002
## rowData names(0):
## colnames(7): treated1fb.txt treated2fb.txt ... untreated3fb.txt
##   untreated4fb.txt
## colData names(1): condition
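
To get a feel for what happens to the per-sample files, the following base R sketch merges two toy htseq-count-style files into one count matrix. It is only a rough illustration of the idea, not DESeq2's actual implementation. It assumes each file is a tab-separated table of feature ID and count, followed by summary rows whose names start with "__" (as in recent htseq-count versions). The directory, file names, helper `readCounts`, and counts are all invented for the example.

```r
# Write two toy htseq-count-style files (made-up names and counts)
toyDir <- file.path(tempdir(), "htseq_toy")
dir.create(toyDir, showWarnings = FALSE)
writeLines(c("geneA\t10", "geneB\t0", "__no_feature\t5", "__ambiguous\t1"),
           file.path(toyDir, "treated1.txt"))
writeLines(c("geneA\t7", "geneB\t3", "__no_feature\t2", "__ambiguous\t0"),
           file.path(toyDir, "untreated1.txt"))
toyFiles <- c("treated1.txt", "untreated1.txt")

# Hypothetical helper: read one file into a named vector of counts
readCounts <- function(f) {
  tbl <- read.table(file.path(toyDir, f), sep = "\t",
                    stringsAsFactors = FALSE,
                    col.names = c("feature", "count"))
  setNames(tbl$count, tbl$feature)
}
countList <- lapply(toyFiles, readCounts)

# Every file must list the same features in the same order
sameOrder <- vapply(countList,
                    function(x) identical(names(x), names(countList[[1]])),
                    logical(1))
stopifnot(all(sameOrder))

# Bind into a features-by-samples matrix, then drop the "__" summary rows
countMatrix <- do.call(cbind, countList)
colnames(countMatrix) <- toyFiles
countMatrix <- countMatrix[!startsWith(rownames(countMatrix), "__"), , drop = FALSE]
print(countMatrix)
```

The result has one row per feature and one column per file, the same shape as the counts assay of the DESeqDataSet printed above.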

What does the raw data look like?

> head(counts(ddsHTSeq))
                treated1fb.txt treated2fb.txt treated3fb.txt untreated1fb.txt untreated2fb.txt untreated3fb.txt untreated4fb.txt
FBgn0000003:001              0              0              1                0                0                0                0
FBgn0000008:001              0              0              0                0                0                0                0
FBgn0000008:002              0              0              0                0                0                1                0
FBgn0000008:003              0              1              0                1                1                1                0
FBgn0000008:004              1              0              1                0                1                0                1
FBgn0000008:005              4              1              1                2                2                0                1

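That covers the two common input formats; the next step is to process the data.
By the way, if you have time, the HTSeq Python package is worth learning. It looks very powerful.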

See you next time ( ^ . ^ )

Let's all learn and discuss together!
