StatQuest Notes 3: DESeq2 (No.60)

Author: 为腹不为目_7a92 | Published: 2020-01-31 22:58

Two main problems in library normalization

Problem 1 Adjusting for differences in library sizes

Problem 2 Adjusting for differences in library composition

We’ll start with a small dataset to illustrate how DESeq2 scales the different samples.
The goal is to calculate a scaling factor for each sample. The scaling factor has to take read depth and library composition into account.

Step 1 Take the log of all values
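
A minimal sketch of this step in Python with NumPy. The count matrix is made up for illustration (rows are genes, columns are samples) and is not the dataset from the video. The natural log is assumed here because Step 6 later raises e to a power.

```python
import numpy as np

# Hypothetical read counts: rows are genes, columns are samples.
counts = np.array([
    [0.0, 10.0, 4.0],     # gene A: zero reads in sample 1
    [2.0, 6.0, 12.0],     # gene B
    [33.0, 55.0, 200.0],  # gene C
    [10.0, 20.0, 40.0],   # gene D
])

# Natural log of every count. log(0) is -inf, so silence NumPy's warning.
with np.errstate(divide="ignore"):
    log_counts = np.log(counts)

print(log_counts)
```

The zero count for gene A becomes negative infinity, which matters in Step 3.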

Step 2 Average Each Row

One cool thing about the average of log values is that it is not easily swayed by outliers. Averages calculated with logs are called “Geometric Averages”: raising e to the average of the logs gives the geometric mean.
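
To see why the log average resists outliers, here is a small sketch with made-up numbers. It compares the ordinary mean with the geometric mean for values that include one extreme outlier, then takes the row-wise log average of a hypothetical count matrix.

```python
import numpy as np

# Made-up values with one extreme outlier.
values = np.array([10.0, 12.0, 11.0, 1000.0])

arithmetic_mean = values.mean()
# Average the logs, then undo the log to get the geometric mean.
geometric_mean = np.exp(np.log(values).mean())
print(arithmetic_mean, geometric_mean)

# The same log average, taken per gene (row) on hypothetical counts.
counts = np.array([
    [0.0, 10.0, 4.0],     # gene A: zero reads in sample 1
    [2.0, 6.0, 12.0],     # gene B
    [33.0, 55.0, 200.0],  # gene C
    [10.0, 20.0, 40.0],   # gene D
])
with np.errstate(divide="ignore"):  # log(0) is -inf
    log_means = np.log(counts).mean(axis=1)
print(log_means)
```

The arithmetic mean is 258.25, dragged up by the outlier, while the geometric mean is roughly 34, much closer to the typical values.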

Step 3 Filter out Genes with Infinity

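A gene with zero reads in any sample has a log of negative infinity, so its average is negative infinity too. In general, this step filters out genes with zero read counts in one or more samples.
In theory, this helps focus the scaling factors on the housekeeping genes, which are transcribed at similar levels regardless of tissue type.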
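
The filter can be sketched as a boolean mask over the row averages. The counts are hypothetical, and the variable names are chosen only for this example.

```python
import numpy as np

# Hypothetical read counts: rows are genes, columns are samples.
counts = np.array([
    [0.0, 10.0, 4.0],     # gene A: zero reads in sample 1
    [2.0, 6.0, 12.0],     # gene B
    [33.0, 55.0, 200.0],  # gene C
    [10.0, 20.0, 40.0],   # gene D
])

with np.errstate(divide="ignore"):  # log(0) is -inf
    log_counts = np.log(counts)
log_means = log_counts.mean(axis=1)

# Keep only genes whose average log value is finite,
# i.e. genes with no zero count in any sample.
keep = np.isfinite(log_means)
filtered_log_counts = log_counts[keep]
filtered_log_means = log_means[keep]

print(keep)
```

Only gene A is dropped, because it is the only gene with a zero count.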

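Step 4 Subtract the average log value from the log of each count

Subtracting logs is the same as taking the log of a ratio: log(count) - log(average) = log(count / average). These are the ratios used in Step 5.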
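
In log space, the ratio of each count to its gene's geometric average is a subtraction. A sketch with hypothetical counts, assuming genes containing zeros were already filtered out:

```python
import numpy as np

# Hypothetical counts with zero-count genes already removed.
counts = np.array([
    [2.0, 6.0, 12.0],     # gene B
    [33.0, 55.0, 200.0],  # gene C
    [10.0, 20.0, 40.0],   # gene D
])

log_counts = np.log(counts)
log_means = log_counts.mean(axis=1)

# Subtract each gene's average log value from that gene's log counts.
# The new axis lets the per-gene averages broadcast across the samples.
log_ratios = log_counts - log_means[:, np.newaxis]

print(log_ratios)
```

Each row of `log_ratios` sums to zero, since a row's own average was subtracted from it.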

Step 5 Calculate the median of the ratios for each sample
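
A sketch of the per-sample median, again on hypothetical counts with zero-count genes already removed. The median is taken down each column, so every sample gets one value.

```python
import numpy as np

# Hypothetical counts with zero-count genes already removed.
counts = np.array([
    [2.0, 6.0, 12.0],     # gene B
    [33.0, 55.0, 200.0],  # gene C
    [10.0, 20.0, 40.0],   # gene D
])

log_counts = np.log(counts)
log_ratios = log_counts - log_counts.mean(axis=1)[:, np.newaxis]

# One median per sample (column), taken across all remaining genes.
medians = np.median(log_ratios, axis=0)

print(medians)
```

Using the median rather than the mean means a single gene with an extreme ratio cannot shift a sample's value.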

Step 6 Convert the medians to “normal numbers” to get the final scaling factors for each sample

The median values are natural logs, so they are exponents for e: scaling factor = e^median.
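
As a small sketch, the conversion is a single call to the exponential function. The medians below are hypothetical values on the natural-log scale, chosen only for illustration.

```python
import numpy as np

# Hypothetical per-sample medians of the log ratios.
medians = np.array([-0.77, 0.0, 0.83])

# Undo the natural log to get scaling factors on the original scale.
scaling_factors = np.exp(medians)

print(scaling_factors)
```

A median of 0 gives a scaling factor of exactly 1, negative medians give factors below 1, and positive medians give factors above 1.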

Step 7 Divide the original read counts by the scaling factors
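
A sketch of the final division, using hypothetical counts and hypothetical scaling factors picked to keep the arithmetic easy to follow. All genes are divided, including any that were filtered out while the factors were being computed.

```python
import numpy as np

# Hypothetical original read counts: rows are genes, columns are samples.
counts = np.array([
    [0.0, 10.0, 4.0],
    [2.0, 6.0, 12.0],
    [33.0, 55.0, 200.0],
])

# Hypothetical scaling factors, one per sample (column).
scaling_factors = np.array([0.5, 1.0, 2.0])

# Broadcasting divides every count in column j by scaling_factors[j].
normalized = counts / scaling_factors

print(normalized)
```

For the second gene, the counts 2, 6 and 12 become 4, 6 and 6.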

Summary of DESeq2’s Library Size Scaling Factor

Logs eliminate all genes that are only transcribed in one sample type (liver vs. spleen). They also help smooth over outlier read counts (via the Geometric Mean).
The median further downplays genes that soak up a lot of the reads, putting more emphasis on moderately expressed genes.
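
The steps can be put together in one short function. This is a sketch of the procedure as described in these notes, not DESeq2's own code; the function name `size_factors` and the data are hypothetical. In the made-up data, sample 2 is sequenced twice as deeply as sample 1, and the last gene is only transcribed in sample 2.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios scaling factors for a genes x samples matrix."""
    with np.errstate(divide="ignore"):            # log(0) is -inf
        log_counts = np.log(counts)               # Step 1
    log_means = log_counts.mean(axis=1)           # Step 2
    keep = np.isfinite(log_means)                 # Step 3
    log_ratios = log_counts[keep] - log_means[keep][:, np.newaxis]  # Step 4
    medians = np.median(log_ratios, axis=0)       # Step 5
    return np.exp(medians)                        # Step 6

# Hypothetical counts: rows are genes, columns are samples.
counts = np.array([
    [10.0, 20.0],
    [30.0, 60.0],
    [200.0, 400.0],
    [0.0, 500.0],   # only transcribed in sample 2
])

factors = size_factors(counts)
normalized = counts / factors                     # Step 7

print(factors)
print(normalized)
```

With these numbers the two factors differ by exactly a factor of two (about 0.707 and 1.414), so the first three genes end up with identical normalized counts in both samples. The sample-specific gene is ignored when the factors are computed but is still divided in Step 7.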

Article link: https://www.haomeiwen.com/subject/qwxothtx.html