Journal: Nature Methods (47.990/Q1)
Exploring genomic data coupled with 3D chromatin structures using the WashU Epigenome Browser
To the Editor — Three-dimensional (3D) genomic structures are vital for gene regulation and cell function. High-throughput technologies based on chromosome conformation capture have been used to study genome-wide physical chromosome interactions. These interactions can be visualized as 2D heatmaps or as interaction networks decorated with genomic features. In addition, computational approaches using interaction data based on models, such as constrained physical models, polymer models and population-based analysis, have been developed to predict the physical 3D structures of chromosomes. Large consortia, such as ENCODE, Roadmap, and 4D Nucleome, have generated tens of thousands of genome-wide datasets of transcription factor binding sites and epigenetic marks across numerous cell types and tissues. Biologists wish to visually explore the connections between these genome-wide profiles and 3D genome structures, which will facilitate the generation and testing of diverse hypotheses. This presents a challenge to conventional genome browsers, where most genomic data is visualized in linear genomic coordinates. The WashU Epigenome Browser was invented in 2011 as an interactive tool for exploring genomic data in a web browser. We have now expanded the browser functions to allow investigators to visually explore 1D, 2D and 3D genomic data on a single webpage. The key innovation is to thread the linear genomic coordinates onto a multi-resolution 3D model of the chromosome.
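As a rough illustration of what threading linear coordinates onto a multi-resolution model can mean, the sketch below represents one chromosome as a list of 3D points, one per fixed-size genomic bin, and derives a coarser resolution by averaging consecutive bins. This in-memory layout and the names `coarsen` and `bin_index` are assumptions made for illustration; they are not the g3d format or the browser's actual implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def coarsen(points: List[Point], factor: int) -> List[Point]:
    """Average each run of `factor` consecutive bins into one coarser bin."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    coarse: List[Point] = []
    for i in range(0, len(points), factor):
        group = points[i:i + factor]  # the last group may be shorter
        n = len(group)
        coarse.append(tuple(sum(p[axis] for p in group) / n for axis in range(3)))
    return coarse


def bin_index(position: int, resolution: int) -> int:
    """Map a 0-based linear coordinate to its bin at a given resolution."""
    return position // resolution


# Hypothetical toy model: four bins at 100 kb resolution.
fine = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
coarse = coarsen(fine, 2)  # two bins at 200 kb resolution

# The same linear coordinate resolves to a point at either resolution.
pos = 150_000
point_fine = fine[bin_index(pos, 100_000)]
point_coarse = coarse[bin_index(pos, 200_000)]
```

Because every resolution is indexed by the same linear coordinate, a viewer could switch between coarse and fine models without changing how tracks are addressed.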
We created a data format called g3d to encode the 3D model and have provided tools to convert various 3D modeling approaches. Given a 3D model of the chromosome, any genomic information anchored on the linear coordinates can be displayed in a 3D model. The 3D browser contains the general WashU Epigenome Browser functions, as well as 3D-specific functions, which enable users to intuitively examine biological information that is difficult to display on a typical browser. To facilitate a direct comparison between the linear and 3D browser views, we designed a panel system for integrated data visualization. The linear browser panel displays 1D and 2D data, while the 3D browser panel displays multiple 3D models (Fig. 1). The 3D highlighted region on the 3D browser and the linear browser are synchronized (Fig. 1b,d). Users can choose multiple non-adjacent genome regions from the 3D model and display them on the linear browser using the region-set-view mode or label the regions on the 3D model (Fig. 1c). The 3D browser also contains a thumbnail viewer that provides a synchronized global display of the main viewer (Fig. 1e).
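The synchronization between a linear region and its 3D highlight can be sketched as a lookup from genomic intervals to model bins. The example below is a minimal sketch under stated assumptions: `ChromModel`, `region_to_points` and `region_set_to_points` are hypothetical names, and the model is a plain list of points per chromosome rather than a parsed g3d file.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Region = Tuple[str, int, int]  # (chromosome, start, end), half-open, 0-based


@dataclass
class ChromModel:
    resolution: int       # bin size in base pairs
    points: List[Point]   # points[i] is the 3D position of bin i


def region_to_points(model: ChromModel, start: int, end: int) -> List[Point]:
    """Return the 3D points of every bin overlapping [start, end)."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    first = start // model.resolution
    last = (end - 1) // model.resolution
    return model.points[first:last + 1]


def region_set_to_points(
    models: Dict[str, ChromModel], regions: List[Region]
) -> Dict[Region, List[Point]]:
    """Resolve several non-adjacent regions, as in a region-set view."""
    return {
        (chrom, start, end): region_to_points(models[chrom], start, end)
        for chrom, start, end in regions
    }


# Hypothetical toy model: five bins of 100 kb on chr1.
models = {
    "chr1": ChromModel(
        resolution=100_000,
        points=[(float(i), 0.0, 0.0) for i in range(5)],
    )
}
highlight = region_to_points(models["chr1"], 150_000, 350_000)
region_set = region_set_to_points(
    models, [("chr1", 0, 100_000), ("chr1", 400_000, 500_000)]
)
```

The same lookup works in reverse for a clicked 3D segment: bin `i` corresponds to the linear interval from `i * resolution` to `(i + 1) * resolution`.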
Fig. 1 | 3D visualization integration with panels, decoration in 3D structure, viewing chromatin loops and compartmental annotations. a–e, The linear genome browser shows sequence, gene annotation, epigenetic marks and Hi-C heat map (a); two 3D structures are displayed in b,d. From the 3D structure, click a 3D segment to reveal a popup menu with options to highlight the linear browser region or label this segment (c). By default, the 3D viewer also comes with a thumbnail, which is synchronized with the main 3D viewer (e). f–i, Browser view shows cytobands, gene annotations, H3K4me3 signal and chromHMM segmental annotation for GM12878 (f). The H3K4me3 signal is used to paint the 3D model in the current region (g) and chromosome (h). Highlighting colors and scale can be customized from the legend and configuration menu. ChromHMM annotation is used to paint the 3D model in the current region (i) and chromosome. Other genomic annotation can be used to annotate the 3D in annotation painting (cytoband, Supplementary Fig. 1b,c; gene annotation, Supplementary Fig. 1d,e). j, Typical genome browser view with a ruler, compartment annotation with subtypes, bed track with loop anchor locations and Hi-C track for GM12878. Users can also customize category colors. k–q, Spatial compartment definition using five subtypes: A1, A2, B1, B2 and B3 (k). Blue dashed line indicates the domain or loop, which is highlighted in l,m,n. Orange and green rectangles indicate loop anchors. Two subdomains or loops colored purple and green in the browser view are also shown in the 3D view with the same colors. Blue spheres in the 3D model indicate the same loop anchors as in the browser view. Compartmental annotation is used to paint the 3D model in the current region (o) and chromosome (p). The color of each subtype can be customized by clicking on the subtype and selecting a color (q).
The 3D browser allows users to paint the 3D model with genomic data and features using two painting styles: annotation and numerical. In annotation painting, users can define regions on the 3D model using any genomic annotation in a segmentation format (Fig. 1i,j). In numerical painting, users can apply numerical data, such as GC content or epigenetic signals, to the 3D model. Users can configure paint thickness, background opacity, scales and color gradient (Fig. 1g,h). The 3D browser allows visualization of multiple spatial compartment definitions (Fig. 1o,p and Supplementary Fig. 2). The spatial compartments can be customized in the 3D browser and viewed as an annotation track in the linear browser (Fig. 1q,k and Supplementary Fig. 2d,e). The 3D browser makes viewing chromatin loops and topologically associated domains (TADs) intuitive. Users can view Hi-C loop anchors on the 3D model and the linear browser, where genomic segments can be custom-labeled corresponding to the ‘Domain’ track (Fig. 1k–n). Users can also load genomic locations of TADs to paint the 3D structure (Supplementary Fig. 3).
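Numerical painting can be pictured as two steps: summarize a signal track at the model's bin size, then map each bin's value onto a color gradient. The sketch below assumes a bedGraph-like list of `(start, end, value)` intervals and a white-to-red gradient; the function names, the averaging rule (uncovered bases count as zero) and the default colors are illustrative assumptions, not the browser's actual painting code.

```python
from typing import List, Tuple

Interval = Tuple[int, int, float]  # (start, end, value), half-open, 0-based
RGB = Tuple[int, int, int]


def bin_signal(intervals: List[Interval], resolution: int, n_bins: int) -> List[float]:
    """Mean signal per bin; bases not covered by any interval count as zero."""
    totals = [0.0] * n_bins
    for start, end, value in intervals:
        first = start // resolution
        last = min((end - 1) // resolution, n_bins - 1)
        for b in range(first, last + 1):
            # Number of bases of this interval that fall inside bin b.
            overlap = min(end, (b + 1) * resolution) - max(start, b * resolution)
            totals[b] += value * overlap
    return [t / resolution for t in totals]


def to_color(
    value: float,
    vmin: float,
    vmax: float,
    low: RGB = (255, 255, 255),
    high: RGB = (255, 0, 0),
) -> RGB:
    """Linearly interpolate between `low` and `high`, clamping to [vmin, vmax]."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    return tuple(round(l + t * (h - l)) for l, h in zip(low, high))


# Hypothetical toy track painted onto a two-bin model with 100 bp bins.
intervals = [(0, 50, 4.0), (50, 150, 2.0)]
signal = bin_signal(intervals, resolution=100, n_bins=2)
colors = [to_color(v, vmin=0.0, vmax=4.0) for v in signal]
```

Annotation painting would follow the same pattern with a category-to-color lookup in place of the gradient, and the `vmin`, `vmax`, `low` and `high` parameters stand in for the user-configurable scale and color gradient.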
We collected 11,045 3D models from published studies, converted them to g3d format and built corresponding data hubs, including 3D models generated from single-cell studies (Supplementary Fig. 4 and Supplementary Table 1). Investigators can upload their own models for visualization, annotation and comparison. By displaying different models in multiple browser panels, investigators can compare differences in 3D modeling tools and compare single-cell and bulk Hi-C. Recent advances in genomic technologies and computational algorithms have provided unprecedented opportunities to probe chromatin interactions and generate 3D models. These models not only facilitate investigation into the formation and function of the 3D genome, but also provide a different paradigm to display and interact with genomic data. Typical genome browsers anchor genomic data on a 1D, linear axis, emulating a process of untangling and straightening genomic DNA. This process makes it convenient and straightforward to overlay genomic data; however, it destroys the spatial configuration of the chromatin. We strived to maintain the 3D structure of the underlying genomic DNA by threading the linear coordinates through a 3D model. Our approach implements an updated coordinate system from the linear genome and the 3D spatial locations to build visualization tools into a seamless format. This enables investigators to intuitively examine 3D features, such as loops and TADs; to visualize all genomic data on 3D genome coordinates; and to explore the dynamics of the 3D genome structure.
Data availability
All data used in this study are published or public from open databases, which are listed in Supplementary Table 2.
Code availability
g3d-related source code is freely available at https://github.com/lidaof/g3d, and g3dtools documentation is available at https://g3d.readthedocs.io/en/latest/. The browser codebase is available at GitHub (https://github.com/lidaof/eg-react) and Zenodo (https://doi.org/10.5281/zenodo.6353838), and documentation can be found at https://eg.readthedocs.io/en/latest/. The repository https://github.com/lidaof/eg-3d-demo contains demo files for 3D visualization and brief instructions. Video tutorials are at https://bit.ly/eg3dtutorial. All code is open source.