Reference article: The cis-regulatory dynamics of embryonic development at single-cell resolution
Overview
Regulatory mechanisms of developmental lineages at the single-cell level:
Here we investigate the dynamics of chromatin regulatory landscapes during embryogenesis at single-cell resolution
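Using their previously developed sci-ATAC technique, the authors profiled chromatin accessibility in more than 20,000 single nuclei from Drosophila embryos across three major developmental stages:
2–4 h after egg laying (predominantly stage 5 blastoderm nuclei), when the embryo consists of roughly 6,000 multipotent cells
6–8 h after egg laying (predominantly stage 10–11), when the basic lineages of the mesoderm and endoderm have been specified
10–12 h after egg laying (predominantly stage 13), when the more than 20,000 cells of each embryo are undergoing terminal differentiation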
Results
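For each developmental stage, nuclei were sampled from several hundred embryos (both male and female). Data quality was assessed with measures including the distribution of reads per barcode, fragment sizes, the average number of reads per cell, and the overlap with DHSs (DNase-hypersensitive sites) defined in previous studies.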
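Two of the quality measures above, reads per cell and overlap with known DHSs, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function names, the `(barcode, chrom, start, end)` fragment tuples and the half-open coordinate convention are all assumptions made for the example.

```python
from bisect import bisect_left
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def qc_per_cell(fragments, dhs_by_chrom):
    """Per-barcode fragment count and fraction of fragments overlapping a DHS.

    fragments: iterable of (barcode, chrom, start, end), half-open (assumed format).
    dhs_by_chrom: dict mapping chrom -> list of (start, end) DHS intervals.
    """
    merged = {c: merge_intervals(iv) for c, iv in dhs_by_chrom.items()}
    starts = {c: [s for s, _ in iv] for c, iv in merged.items()}
    total = defaultdict(int)
    in_dhs = defaultdict(int)
    for barcode, chrom, start, end in fragments:
        total[barcode] += 1
        sites = merged.get(chrom, [])
        # Index of the last DHS that starts before the fragment ends.
        i = bisect_left(starts.get(chrom, []), end) - 1
        if i >= 0 and sites[i][1] > start:
            in_dhs[barcode] += 1
    return {
        bc: {"n_fragments": n, "frac_in_dhs": in_dhs[bc] / n}
        for bc, n in total.items()
    }
```

Barcodes with very few fragments or a low DHS fraction would be the natural candidates to filter out, although the cutoffs actually used are not given in these notes.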
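The genome was divided into 2-kb windows and every window was scored in every cell. The 20,000 most frequently accessible windows were then used for an initial clustering with LSI (latent semantic indexing). Because single-cell data are sparse, signal has to be aggregated over bins rather than read at individual sites.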
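The first steps of this procedure (binning into 2-kb windows, keeping the most frequently accessible windows, and TF-IDF weighting) might look like the sketch below. Several things here are assumptions rather than details taken from the paper: scoring is treated as binary (a window is either hit or not), the IDF term uses the `log(1 + N/df)` variant, and the fragment tuple format is invented for the example. The SVD that LSI applies to the weighted matrix is not shown.

```python
import math
from collections import Counter

WINDOW = 2000  # 2-kb windows, as described in the text


def binarize_windows(fragments, window=WINDOW):
    """Map each barcode to the set of (chrom, window_index) it has a read in."""
    cells = {}
    for barcode, chrom, start, end in fragments:
        hits = cells.setdefault(barcode, set())
        # A half-open fragment [start, end) may span more than one window.
        for idx in range(start // window, (end - 1) // window + 1):
            hits.add((chrom, idx))
    return cells


def top_windows(cells, n_top):
    """Return the n_top windows that are accessible in the most cells."""
    counts = Counter(w for hits in cells.values() for w in hits)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:n_top]]


def tfidf(cells, windows):
    """TF-IDF weight the binary cell-by-window matrix.

    Returns (barcodes, rows); rows[i][j] is the weight of windows[j] in
    barcodes[i]. Every window is assumed to be hit in at least one cell,
    which holds for the output of top_windows.
    """
    barcodes = sorted(cells)
    n_cells = len(barcodes)
    doc_freq = {w: sum(w in cells[b] for b in barcodes) for w in windows}
    rows = []
    for b in barcodes:
        n_hits = sum(w in cells[b] for w in windows)
        row = []
        for w in windows:
            tf = 1.0 / n_hits if w in cells[b] else 0.0
            idf = math.log(1.0 + n_cells / doc_freq[w])
            row.append(tf * idf)
        rows.append(row)
    return barcodes, rows
```

With real data `n_top` would be 20,000, and the rows returned by `tfidf` would be passed to a truncated SVD before clustering.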
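[Figure: LSI_clade.jpg]
As shown in the figure above, cells are clustered by their similarity across windows, which yields distinct cell clades. The reads from all cells in a clade can then be merged and used as input for peak calling.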
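Processing workflow for each clade: merge the reads of the clade, call peaks, find clade-specific peaks with an approach similar to differential analysis, then use those peaks to look for peak links (for example between an enhancer and its associated promoter), and validate the links against databases of embryonic enhancer activity and gene expression.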
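As a rough illustration of the "differential analysis" step, the sketch below tests whether a peak is accessible in more cells of one clade than expected by chance. It swaps in a one-sided Fisher exact test, which is an assumption made for simplicity and not necessarily the test used in the paper; the function names and input dictionaries are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def fisher_enrichment_p(in_hits, in_total, out_hits, out_total):
    """One-sided Fisher exact P value (hypergeometric upper tail).

    Probability of seeing at least in_hits accessible cells inside the clade,
    given how many cells are accessible overall.
    """
    hits = in_hits + out_hits
    total = in_total + out_total
    upper = min(in_total, hits)
    tail = sum(
        comb(hits, k) * comb(total - hits, in_total - k)
        for k in range(in_hits, upper + 1)
    )
    return tail / comb(total, in_total)


def clade_specific_peaks(accessible, clade_of, clade, alpha=0.05):
    """Return (peak, P) pairs enriched in one clade, most significant first.

    accessible: dict peak -> set of barcodes in which the peak is open.
    clade_of: dict barcode -> clade label.
    """
    inside = {b for b, label in clade_of.items() if label == clade}
    outside = set(clade_of) - inside
    result = []
    for peak, cells in accessible.items():
        p = fisher_enrichment_p(
            len(cells & inside), len(inside), len(cells & outside), len(outside)
        )
        if p < alpha:
            result.append((peak, p))
    return sorted(result, key=lambda item: item[1])
```

With thousands of peaks, the raw P values would need correcting for multiple testing before any peak is called clade-specific.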
[Figure: clade_anno.jpg]
The specific peaks and links enriched in each clade can be used to identify and annotate the clades. For example:
mesoderm is split into myogenic mesoderm (clade 3) and non-myogenic mesoderm (such as fat body and haemocytes) combined with endoderm (clade 4). The latter indicates that non-myogenic mesoderm and endoderm exhibit similar chromatin accessibility, suggesting a shared developmental program.
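After annotation, the authors found that non-myogenic mesoderm and endoderm fall into the same clade, which suggests that the two may share a developmental program. According to prior knowledge, however, Drosophila mesoderm and endoderm do not have a common origin. The explanation the authors offer is that this may be an evolutionary remnant: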
Although, to our knowledge, Drosophila mesoderm and endoderm have not been shown to share a common origin, this is highly reminiscent of the mesendoderm lineage in Caenorhabditis elegans, sea urchins and vertebrates.
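To check the reliability of the clade assignments, the authors validated them using FACS and motif enrichment:
[Figure: FACS.jpg]
In panel e of the figure above, specific cell populations were sorted by FACS and profiled by DNase-seq, and the results were compared with those from sci-ATAC.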
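t-SNE clustering and the developmental stage of each cluster: the authors ran t-SNE on the 2–4 h cells and found that the clusters agree closely with cell types at different developmental stages. Panel b of the figure below shows the stage-specific enhancer activity found in each cluster, which indicates that developmental time is the main factor separating these cells into clusters.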
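[Figure: t-sne_pseudotime.jpg]
Trajectory (pseudotime) analysis is included as well: in panel c of the figure above, the cells end up on three branches corresponding to ectoderm, endoderm and mesoderm, which indicates that cells gradually diverge into three branches late in the 2–4 h window.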
Notably, the trajectory split cells into three major branches that were consistent with our annotations of the major germ layers (neuronal cells are rare at this time point, as expected)
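Pseudotime analysis can also identify enhancers and gene loci whose accessibility changes dynamically along the pseudotime ordering, for example the progressive closing at the slam gene: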
[Figure: pseudotime.jpg]
For example, the most significant closing site (P value = 5 × 10^−224) is within the slam locus, a gene that is essential for blastoderm cellularization during a very brief temporal window
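Panel d of the figure above shows branch-specific sites that change along pseudotime. Panels e and f show the corresponding anterior and posterior enhancers, which drive the spatially restricted expression of the gap genes knirps and giant. This indicates that: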
sci-ATAC-seq can identify regulatory regions that are specifically accessible in spatially refined subsets of cells without the need for FACS sorting.
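The idea of sites that open or close along pseudotime can be illustrated with a very simple binning scheme: order cells by pseudotime, split them into equal-sized bins, and compare the fraction of accessible cells in the first and last bin. This is only a sketch under stated assumptions. The function names, the number of bins and the `min_change` threshold are hypothetical, and it does not reproduce the statistical test behind the P values quoted above.

```python
def accessibility_by_pseudotime(pseudotime, open_cells, n_bins=4):
    """Fraction of accessible cells in n_bins equal-sized pseudotime bins.

    pseudotime: dict barcode -> pseudotime value.
    open_cells: set of barcodes in which the site is accessible.
    """
    ordered = sorted(pseudotime, key=lambda b: (pseudotime[b], b))
    n = len(ordered)
    fractions = []
    for i in range(n_bins):
        chunk = ordered[i * n // n_bins:(i + 1) * n // n_bins]
        if chunk:
            fractions.append(sum(b in open_cells for b in chunk) / len(chunk))
        else:
            fractions.append(0.0)
    return fractions


def classify_site(fractions, min_change=0.25):
    """Label a site by the change between the first and last pseudotime bin."""
    change = fractions[-1] - fractions[0]
    if change <= -min_change:
        return "closing"
    if change >= min_change:
        return "opening"
    return "static"
```

A site like the one in the slam locus would show high accessibility in the early bins and little or none in the late ones, so it would be labelled "closing".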
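Classic lineage-tracing and transplantation experiments in embryology showed that cell fate decisions are largely made at the blastoderm stage, which gave rise to the concept of the blastoderm fate map. The sci-ATAC data go further and give a clear picture of the spatial heterogeneity in chromatin accessibility during embryonic development.
The authors then explored two further periods of embryogenesis, lineage commitment (6–8 h) and differentiation (10–12 h). Applying t-SNE to the data from these later stages reveals a finer map of cell types and tissues, consistent with cells entering differentiation.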
[Figure: tissue_assignment.jpg]
Clusters were annotated based on overlaps between cluster-enriched peaks and enhancers or genes with known tissue-specific activity.
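The figure above shows how the clades correspond to, and contain, the finer cell annotations (for example, in panel a the mesendoderm splits into three branches: 8, 16 and 14).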
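Annotating a cluster from overlaps between its enriched peaks and enhancers of known tissue activity could be sketched as a simple vote. The majority-vote rule, the tie-breaking by tissue name and the tuple formats are assumptions made for illustration; the notes state only that annotation was based on such overlaps.

```python
from collections import Counter


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def annotate_cluster(cluster_peaks, known_enhancers):
    """Pick the tissue whose known enhancers overlap the most cluster peaks.

    cluster_peaks: list of (chrom, start, end).
    known_enhancers: list of (chrom, start, end, tissue).
    Returns (best_tissue or None, Counter of votes per tissue).
    """
    votes = Counter()
    for chrom, start, end in cluster_peaks:
        # Each peak votes at most once per tissue.
        tissues = {
            tissue
            for e_chrom, e_start, e_end, tissue in known_enhancers
            if e_chrom == chrom and overlaps(start, end, e_start, e_end)
        }
        votes.update(tissues)
    if not votes:
        return None, votes
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[0][0], votes
```

A cluster whose peaks overlap no annotated enhancer gets `None`, which would correspond to a cluster left unlabelled until other evidence (for example nearby genes) is checked.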
A major advantage of profiling chromatin accessibility is its potential to identify distal regulatory elements that shape gene expression.
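To confirm that the tissue-specific peaks really are tissue-specific enhancers, the authors performed transgenic embryo experiments (in vivo enhancer activity assays with a lacZ reporter gene). Roughly, candidate genomic regions were PCR-amplified, cloned upstream of an hsp70 promoter driving the lacZ reporter, then injected into embryos and integrated. The reporter is later expressed in the embryonic regions where the candidate is active.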
[Figure: transgenic embryo.jpg]
We obtained 31 transgenic lines, representing six candidate regions with specific accessibility in neurogenic ectoderm, ten in non-neurogenic ectoderm, eight in myogenic mesoderm and seven in non-myogenic mesoderm plus endoderm.
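Candidates were selected from the open peaks of the different clades. The authors found that some mesendoderm (clade 4) candidates were also active in yolk nuclei. The yolk is extra-embryonic tissue, so reporter expression there was not expected. The explanation the authors give is: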
As the yolk is extra-embryonic, this was unexpected, and suggests a potential regulatory link between the yolk and mesendodermal tissues, which is supported by the role of the GATA transcription factor serpent in both yolk and nonmyogenic mesoderm
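Summary
sci-ATAC-seq can not only reveal the dynamics of chromatin accessibility during embryonic development but also predict in vivo active enhancers at large scale. The authors also provide a web tool: http://shiny.furlonglab.embl.de/scATACseqBrowser/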
Our ability to understand how changes in the regulatory landscape underlie lineage commitment would be greatly aided by the concurrent measurement of chromatin accessibility and transcription.
In the long term, the integration of chromatin state, transcriptional output, lineage history, and spatial information at single-cell resolution has the potential to unlock how an organism’s genome encodes its development.
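Integrating chromatin state, transcriptional profiles, developmental trajectories and spatial information at the single-cell level will further help answer questions in developmental biology.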
References
The cis-regulatory dynamics of embryonic development at single-cell resolution: https://doi.org/10.1038/nature25981