Cell type discovery and representation in the era of high-content single cell phenotyping
Authors and affiliations:
Trygve Bakken, Lindsay Cowell, Brian D. Aevermann, Mark Novotny, [...], Richard H. Scheuermann
Richard H. Scheuermann:
- J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, CA 92037, USA
- Department of Pathology, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
Journal and publication date:
BMC Bioinformatics 2017 18(Suppl 17):559
Published: 21 December 2017
Abstract:
Background
A fundamental characteristic of multicellular organisms is the specialization of functional cell types through the process of differentiation. These specialized cell types not only characterize the normal functioning of different organs and tissues, they can also be used as cellular biomarkers of a variety of different disease states and therapeutic/vaccine responses. In order to serve as a reference for cell type representation, the Cell Ontology has been developed to provide a standard nomenclature of defined cell types for comparative analysis and biomarker discovery. Historically, these cell types have been defined based on unique cellular shapes and structures, anatomic locations, and marker protein expression. However, we are now experiencing a revolution in cellular characterization resulting from the application of new high-throughput, high-content cytometry and sequencing technologies. The resulting explosion in the number of distinct cell types being identified is challenging the current paradigm for cell type definition in the Cell Ontology.
Results
In this paper, we provide examples of state-of-the-art cellular biomarker characterization using high-content cytometry and single cell RNA sequencing, and present strategies for standardized cell type representations based on the data outputs from these cutting-edge technologies, including “context annotations” in the form of standardized experiment metadata about the specimen source analyzed and marker genes that serve as the most useful features in machine learning-based cell type classification models. We also propose a statistical strategy for comparing new experiment data to these standardized cell type representations.
Conclusion
The advent of high-throughput/high-content single cell technologies is leading to an explosion in the number of distinct cell types being identified. It will be critical for the bioinformatics community to develop and adopt data standard conventions that will be compatible with these new technologies and support the data representation needs of the research community. The proposals enumerated here will serve as a useful starting point to address these challenges.
Selected figures
Figure 2. Cell type representations in the Cell Ontology.
a The expanded is_a hierarchy of the monocyte branch. b The expanded is_a hierarchy of the dendritic cell branch. c An example of a cell type term record for dendritic cell. Note the presence of both textual definitions in the “definition” field, and the components of the logical axioms in the “has part”, “lacks_plasma_membrane_part”, and “subClassOf” fields.
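The record structure described in the Figure 2 caption can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not the Cell Ontology's actual schema or API: the class `CellTypeTerm`, the helper `ancestors`, the `EX:` identifiers, the placeholder marker names, and the three-level hierarchy are all hypothetical. Only the field names (definition, has part, lacks_plasma_membrane_part, subClassOf) come from the caption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CellTypeTerm:
    """Hypothetical cell type record using the fields named in the caption."""
    term_id: str
    name: str
    definition: str = ""                                          # textual definition
    sub_class_of: List[str] = field(default_factory=list)         # is_a parents
    has_part: List[str] = field(default_factory=list)             # logical axiom parts
    lacks_plasma_membrane_part: List[str] = field(default_factory=list)


def ancestors(term_id: str, terms: Dict[str, CellTypeTerm]) -> List[str]:
    """Walk the is_a links breadth-first and return ancestors, nearest first."""
    found: List[str] = []
    queue = list(terms[term_id].sub_class_of)
    while queue:
        parent = queue.pop(0)
        if parent in found:
            continue  # already visited via another path
        found.append(parent)
        if parent in terms:
            queue.extend(terms[parent].sub_class_of)
    return found


# Placeholder identifiers, markers, and hierarchy (illustrative, not real ontology data).
terms = {
    "EX:0001": CellTypeTerm("EX:0001", "cell"),
    "EX:0002": CellTypeTerm("EX:0002", "leukocyte", sub_class_of=["EX:0001"]),
    "EX:0003": CellTypeTerm(
        "EX:0003",
        "dendritic cell",
        definition="(textual definition goes here)",
        sub_class_of=["EX:0002"],
        has_part=["marker A"],
        lacks_plasma_membrane_part=["marker B"],
    ),
    "EX:0004": CellTypeTerm("EX:0004", "monocyte", sub_class_of=["EX:0002"]),
}

if __name__ == "__main__":
    for parent_id in ancestors("EX:0003", terms):
        print(parent_id, terms[parent_id].name)
```

Keeping the textual definition separate from the list-valued axiom fields mirrors the caption's distinction between the human-readable definition and the machine-readable logical components.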