R:scRepertoire
Python: Scirpy
因为我用Scanpy比较多,所以先学习一下Scirpy。
----------------------------------------------------
复习一些基础知识
Data structure
Scirpy leverages the AnnData data structure which combines a gene expression matrix (.X
), gene-level annotations (.var
) and cell-level annotations (.obs
) into a single object. AnnData
forms the basis for the Scanpy analysis workflow for single-cell transcriptomics data.
Scirpy adds the following IR-related columns to AnnData.obs
:
IR_VJ_1_<attr>
/IR_VJ_2_<attr>
: columns related to the primary and secondary VJ-chain of a receptor (TRA
,TRG
,IGK
, orIGL
)
IR_VDJ_1_<attr>
/IR_VDJ_2_<attr>
: columns related to the primary and secondary VDJ-chain of a receptor (TRB
,TRD
, orIGH
)
has_ir
:True
for all cells with an adaptive immune receptor
extra_chains
: Contains non-productive chains (if not filtered out), and extra chains that do not fit into the 2VJ
+ 2VDJ
chain model encoded as JSON. Scirpy does not use this information except for writing it back to AIRR format usingscirpy.io.write_airr()
.
multi_chain
:True
for all cells with more than two productiveVJ
cells or two or more productiveVDJ
cells.
重点:
clonotype:
A clonotype designates a collection of T or B cells that descend from a common, antecedent cell, and therefore, bear the same adaptive immune receptors and recognize the same epitopes.
In single-cell RNA-sequencing (scRNA-seq) data, T or B cells sharing identical complementarity-determining regions 3 (CDR3) nucleotide sequences of both VJ and VDJ chains (e.g. both α and β TCR chains) make up a clonotype.
clonotype cluster
A higher-order aggregation of clonotypes that have different CDR3 nucleotide sequences, but might recognize the same antigen because they have the same or similar CDR3 amino acid sequence.
clonotype modularity
The clonotype modularity measures how densly connected the transcriptomics neighborhood graph underlying the cells in a clonotype is. Clonotypes with a high modularity consist of cells that are transcriptionally more similar than that of a clonotype with a low modularity.
Convergent evolution of clonotypes
Evidence of convergent evolution could be clonotypes with the same CDR3 amino acid sequence, but different CDR3 nucleotide sequences (due to synonymous codons) or clonotypes with highly similar CDR3 amino acid sequences that recognize the same antigen.
CDR3区域密码子不同,但为相通的氨基酸
有的细胞可能会有Dual IR,有的是multichain-cell,后者多可能是由于doublets或multiplets。
orphan chain多是由于sequencing inefficiencies,需要质控
non-productive chain: ypically chains are flagged as non-productive if they contain a stop codon or are not within the reading frame.
we define clonotypes based on nt-sequence identity. In a later step, we will define clonotype clusters based on amino-acid similarity.
先记录到这里,去细胞注释去了
网友评论