课前准备-单细胞新版velocity（cellrank 2）

作者: 单细胞空间交响乐 | 来源:发表于2024-06-16 16:15 被阅读0次

单细胞之轨迹分析-2：monocle2 原理解读+实操
单细胞之轨迹分析-4：scVelo
单细胞之轨迹分析-5：slingshot
单细胞之轨迹分析-6：velocyto.R+Seurat
单细胞之轨迹分析-7：Seurat+scVelo
单细胞之轨迹分析-1：RNA velocity
RNA velocity分析练习（一）文件下载以及预处理
Supercell代码实操（三）：RNA velocity分析
民主决策：做出最佳选择（李敏）
课前准备

作者，Evil Genius

参考文章CellRank 2: unified fate mapping in multiview single-cell data(nature methods)，2024年6月。

官网示例在cellrank documentation

分析框架

CellRank 2为使用马尔可夫链研究单细胞命运决策提供了统一的框架

自动确定初始和最终状态，计算命运概率，绘制轨迹特异性基因表达趋势图表，并识别谱系相关基因
采用概率系统描述，其中每个细胞构成马尔可夫链中的一个状态，边缘表示细胞-细胞转移概率
CellRank 2提供了一组基于基因表达、RNA速率、伪时间、发育潜力、实验时间点和代谢标记数据的转换概率的不同kernels 。
引入了一种random walk-based的可视化方案

CellRank’s Key Applications

Estimate differentiation direction based on a varied number of biological priors, including pseudotime, developmental potential, RNA velocity, experimental time points, and more.
Compute initial, terminal and intermediate macrostates [Reuter et al., 2019, Reuter et al., 2022].
Infer fate probabilities and driver genes.
Visualize and cluster gene expression trends

基于velocyto

use scvelo to compute RNA velocity
set up CellRank’s VelocityKernel and compute a transition matrix based on RNA velocity.
combine the VelocityKernel with the ConnectivityKernel to emphasize gene expression similarity.
visualize the transition matrix in a low-dimensional embedding.

#######示例

import sys

if "google.colab" in sys.modules:
    !pip install -q git+https://github.com/theislab/cellrank

import numpy as np

import cellrank as cr
import scanpy as sc
import scvelo as scv

scv.settings.verbosity = 3
scv.settings.set_figure_params("scvelo")
cr.settings.verbosity = 2

import warnings

warnings.simplefilter("ignore", category=UserWarning)

adata = cr.datasets.pancreas()
scv.pl.proportions(adata)
adata

Preprocess the data

scv.pp.filter_and_normalize(
    adata, min_shared_counts=20, n_top_genes=2000, subset_highly_variable=False
)

sc.tl.pca(adata)
sc.pp.neighbors(adata, n_pcs=30, n_neighbors=30, random_state=0)
scv.pp.moments(adata, n_pcs=None, n_neighbors=None)

scv.tl.recover_dynamics(adata, n_jobs=8)

scv.tl.velocity(adata, mode="dynamical")

Combine RNA velocity with expression similarity in high dimensions

Set up the VelocityKernel

vk.compute_transition_matrix()

默认情况下，使用确定性模式来计算transiton matrix。如果要传播速率向量中的不确定性，查看随机模式和蒙特卡罗模式。随机模式使用KNN图估计速率向量上的分布，并使用分析近似将该分布传播到过渡矩阵中。

Combine with gene expression similarity

RNA velocity can be a very noise quantity; to make our computations more robust, we combine the VelocityKernel with the similarity-based ConnectivityKernel

ck = cr.kernels.ConnectivityKernel(adata)
ck.compute_transition_matrix()

combined_kernel = 0.8 * vk + 0.2 * ck

可视化

vk.plot_projection()

vk.plot_random_walks(start_ixs={"clusters": "Ngn3 low EP"}, max_iter=200, seed=0)

基于Diffusion pseudotime (DPT)

import sys

if "google.colab" in sys.modules:
    !pip install -q git+https://github.com/theislab/cellrank

import numpy as np

import cellrank as cr
import scanpy as sc
import scvelo as scv

scv.settings.verbosity = 3
scv.settings.set_figure_params("scvelo")
sc.settings.set_figure_params(frameon=False, dpi=100)
cr.settings.verbosity = 2
import warnings

warnings.simplefilter("ignore", category=UserWarning)
adata = cr.datasets.bone_marrow()

Check RNA velocity on this data

scv.pl.proportions(adata)

scv.pp.filter_and_normalize(
    adata, min_shared_counts=20, n_top_genes=2000, subset_highly_variable=False
)

sc.tl.pca(adata)
sc.pp.neighbors(adata, n_pcs=30, n_neighbors=30, random_state=0)
scv.pp.moments(adata, n_pcs=None, n_neighbors=None)

scv.tl.recover_dynamics(adata, n_jobs=8)
scv.tl.velocity(adata, mode="dynamical")

vk = cr.kernels.VelocityKernel(adata)
vk.compute_transition_matrix()

vk.plot_projection(basis="tsne")

top_genes = adata.var["fit_likelihood"].sort_values(ascending=False).index
scv.pl.scatter(adata, basis=top_genes[:10], ncols=5, frameon=False)

Use pseudotime to recover directed differentiation

Choosing the right pseudotime

sc.tl.diffmap(adata)

root_ixs = 2394  # has been found using `adata.obsm['X_diffmap'][:, 3].argmax()`
scv.pl.scatter(
    adata,
    basis="diffmap",
    c=["clusters", root_ixs],
    legend_loc="right",
    components=["2, 3"],
)

adata.uns["iroot"] = root_ixs

sc.tl.dpt(adata)
sc.pl.embedding(
    adata,
    basis="tsne",
    color=["dpt_pseudotime", "palantir_pseudotime"],
    color_map="gnuplot2",
)

Compute a transition matrix

pk = cr.kernels.PseudotimeKernel(adata, time_key="palantir_pseudotime")
pk.compute_transition_matrix()

print(pk)
pk.plot_projection(basis="tsne", recompute=True)