PepBDB: a comprehensive structural database of biological peptide–protein interactions
Summary: A structural database of peptide–protein interactions is important for drug discovery targeting peptide-mediated interactions. Although some peptide databases, especially for special types of peptides, have been developed, a comprehensive database of cleaned peptide–protein complex structures is still not available. Such cleaned structures are valuable for docking and scoring studies in structure-based drug design. Here, we have developed PepBDB, a curated Peptide Binding DataBase of biological complex structures from the Protein Data Bank (PDB). PepBDB presents not only cleaned structures but also extensive information about biological peptide–protein interactions, and allows users to search the database with a variety of options and interactively visualize the search results.
1 Introduction
Protein–protein interactions are crucial in life activities and thus have wide application in drug discovery (Stanfield and Wilson, 1995). Peptide-mediated interactions are estimated to make up as much as 40% of all these interactions (Vanhee et al., 2009). The huge and complex peptide-mediated interaction networks have a great impact on molecular biology. Therefore, the study of peptides has a tremendous influence on investigating intracellular life activity (Petsalaki and Russell, 2008), in which computational approaches like protein–peptide docking play an important role (Lee et al., 2015). Generally, to study peptide-mediated interactions, the structures of both receptor and peptide are needed (Neduva and Russell, 2006). Therefore, a structural database of peptide–protein interactions is valuable not only for understanding existing peptide–protein interactions but also for developing new docking algorithms for peptide drug discovery (de Vries et al., 2017; Lee et al., 2015; Trellet et al., 2013; Yan et al., 2016, 2017a). There are several existing peptide databases with structural data, such as PepX (Vanhee et al., 2010), PepBind (Das et al., 2013) and PepBank (Shtatland et al., 2007). However, their structures are not well curated or cleaned and therefore cannot be directly used in peptide-related docking studies or drug design (Zhou et al., 2018) because of irrelevant elements like ligands, ions, and non-interacting chains (Yan et al., 2017a). Moreover, several databases are no longer updated, and some are even no longer online. Therefore, a functional and regularly updated peptide binding database (PepBDB) that contains curated structures of biological protein–peptide interactions is strongly needed.
2 Materials and methods
The workflow for constructing the PepBDB database is shown in Figure 1A and can be described as follows. First, we queried all the peptides from the PDB (Berman et al., 2000) based on the sequence. Then, a shell script was used to obtain the direct peptide–protein interactions, in which the biological unit of the PDB entry was used. Specifically, we obtained the cleaned peptide–protein complex structures by keeping only the peptide and its interacting protein chains, where a peptide and a protein were defined to be interacting if their minimum distance is within 5.0 Å.
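The cleaning rule described above (keep the peptide plus every protein chain whose minimum distance to it is within 5.0 Å) can be illustrated with a minimal Python sketch. This is not the authors' shell script: the function names, the chain dictionary layout, and the toy coordinates are all assumptions made for illustration, and a real pipeline would read atom coordinates from the biological unit of a PDB entry.

```python
import math
from itertools import product

CUTOFF = 5.0  # Angstrom; the interaction threshold stated in the text


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom pair of two chains."""
    return min(math.dist(a, b) for a, b in product(atoms_a, atoms_b))


def clean_complex(chains, peptide_id, cutoff=CUTOFF):
    """Keep the peptide and only the chains within `cutoff` of it.

    `chains` maps a chain ID to a list of (x, y, z) atom coordinates
    (a hypothetical in-memory layout, not a PDB parser).
    """
    peptide = chains[peptide_id]
    kept = {peptide_id: peptide}
    for chain_id, atoms in chains.items():
        if chain_id != peptide_id and min_distance(peptide, atoms) <= cutoff:
            kept[chain_id] = atoms
    return kept


# Toy coordinates, invented for illustration only.
chains = {
    "P": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],  # the peptide
    "A": [(4.0, 0.0, 0.0), (9.0, 0.0, 0.0)],  # closest atom is 2.5 A away
    "B": [(30.0, 0.0, 0.0)],                  # non-interacting chain
}
cleaned = clean_complex(chains, "P")
print(sorted(cleaned))  # ['A', 'P']
```

The all-pairs comparison is quadratic in the number of atoms, which is acceptable for a sketch; a spatial index would be the usual choice for large structures.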
Another filtering step was performed to check the peptide length (Huang and Zou, 2008). To focus on normal peptides, we chose only those complexes that include peptides with fewer than 50 amino acids. Thus, a structural database of cleaned protein–peptide complexes without irrelevant ions and non-interacting chains was generated. To make the database searchable and to facilitate visualization, all the sequences, resolution data, and interacting residues as well as the descriptions of the peptide–protein complexes were extracted and added to the database. An interaction map was also prepared with LigPlot+ (Laskowski and Swindells, 2011) for each complex. Users can directly use these cleaned structures to perform peptide-related docking, prediction (Verschueren et al., 2013) or drug design (Yin et al., 2007).
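As a rough illustration of the length filter and of the per-complex information listed above, the following Python sketch stores one record per complex and keeps only peptides with fewer than 50 residues. The `ComplexRecord` class, its field names, and the example entries are hypothetical; the actual PepBDB schema is not described in the text.

```python
from dataclasses import dataclass, field

MAX_PEPTIDE_LENGTH = 50  # exclusive bound: "fewer than 50 amino acids"


@dataclass
class ComplexRecord:
    """One cleaned complex; field names are assumptions, not the real schema."""
    entry_id: str
    peptide_seq: str
    protein_seqs: list
    resolution: float  # Angstrom
    interacting_residues: list = field(default_factory=list)
    description: str = ""


def keep_short_peptides(records, max_len=MAX_PEPTIDE_LENGTH):
    """Return only the records whose peptide is shorter than `max_len`."""
    return [r for r in records if len(r.peptide_seq) < max_len]


# Invented example entries.
records = [
    ComplexRecord("cplx1", "A" * 12, ["M" * 120], 1.8),
    ComplexRecord("cplx2", "G" * 50, ["M" * 200], 2.4),  # 50 residues: excluded
]
kept = keep_short_peptides(records)
print([r.entry_id for r in kept])  # ['cplx1']
```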
PepBDB is scheduled to be updated monthly, and provides not only curated receptor and ligand structures of protein–peptide complexes but also a search service (Fig. 1B). For a given sequence, sequence identity, number of protein chains and/or peptide length, the database runs a sequence alignment to find the most similar protein–peptide complexes according to the given parameters (Fig. 1), in which FASTA (Pearson and Lipman, 1988) is used to perform the sequence alignment. The database search yields a list of nonredundant peptide–protein complexes for browsing and visualization through Jmol (Hanson et al., 2013) and LigPlot+ (Laskowski and Swindells, 2011) (Fig. 1C).
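The search described above combines a sequence-identity threshold with simple filters such as the number of protein chains and the peptide length. The sketch below shows the idea in Python under clearly stated assumptions: PepBDB uses FASTA for alignment, whereas this example substitutes a textbook Needleman-Wunsch global alignment with unit scores, and the record layout, function names, and parameters are hypothetical.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Percent identity from a simple Needleman-Wunsch global alignment
    (a stand-in for FASTA, with illustrative unit scores)."""
    n, m = len(a), len(b)

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back, counting identical columns and total alignment columns.
    i, j, same, cols = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        cols += 1
    return 100.0 * same / cols if cols else 0.0


def search(records, query, min_identity=0.0,
           n_protein_chains=None, max_peptide_len=None):
    """Return (id, identity) pairs passing all filters, best match first."""
    hits = []
    for rec in records:
        if n_protein_chains is not None and rec["n_protein_chains"] != n_protein_chains:
            continue
        if max_peptide_len is not None and len(rec["peptide"]) > max_peptide_len:
            continue
        identity = global_identity(query, rec["peptide"])
        if identity >= min_identity:
            hits.append((rec["id"], round(identity, 1)))
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Invented records; the IDs are not real database entries.
records = [
    {"id": "cplx1", "peptide": "ACDEFGHIK", "n_protein_chains": 1},
    {"id": "cplx2", "peptide": "ACDEFGHIR", "n_protein_chains": 2},
    {"id": "cplx3", "peptide": "WWWWWWWWW", "n_protein_chains": 1},
]
hits = search(records, "ACDEFGHIK", min_identity=80.0)
print(hits)  # [('cplx1', 100.0), ('cplx2', 88.9)]
```

Identity here is the number of identical aligned positions divided by the alignment length; other tools may normalize differently, so the percentages are only comparable within this sketch.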
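Figure 1. The workflow for constructing the PepBDB database (A), the search interface (B), and an example visualization of a protein–peptide complex structure and its interaction information (C).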
3 Features and applications
PepBDB is a complete database of biological peptide-mediated complex structures derived from the PDB. Table 1 gives a comparison between PepBDB and several peptide–protein databases. The following features distinguish PepBDB from other similar databases:
- PepBDB is a comprehensive database of biological peptide-mediated complex structures with peptide lengths of up to 50 residues.
- It allows users to dynamically search and/or analyze the database with given sequences and eight other options. The complexes may also be clustered by user-provided sequence identities of protein and/or peptide.
- It provides users with both cleaned complex structures and binding information for interactive visualization and download.
- It presents extensive information about the biological interactions for both peptide and protein in the search results.
- The database is updated monthly to include new data released in the PDB.
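The clustering option mentioned in the list above can be pictured as a greedy grouping of sequences by a similarity threshold. The Python sketch below is only an illustration: the text does not say which clustering procedure PepBDB uses, and the `similarity` function here relies on `difflib.SequenceMatcher` as a convenient stand-in rather than an alignment-based sequence identity. All names and sequences are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; NOT an alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences, threshold=0.9):
    """Put each sequence into the first cluster whose representative is at
    least `threshold` similar; otherwise start a new cluster with it."""
    clusters = []  # list of (representative_id, [member_ids])
    for seq_id, seq in sequences.items():
        for rep_id, members in clusters:
            if similarity(sequences[rep_id], seq) >= threshold:
                members.append(seq_id)
                break
        else:
            clusters.append((seq_id, [seq_id]))
    return clusters


# Invented peptide sequences.
peptides = {
    "pepA": "ACDEFGHIKL",
    "pepB": "ACDEFGHIKV",  # differs from pepA in the last residue only
    "pepC": "WYWYWYWYWY",
}
clusters = greedy_cluster(peptides, threshold=0.8)
print(clusters)  # [('pepA', ['pepA', 'pepB']), ('pepC', ['pepC'])]
```

Greedy clustering depends on input order, since the first member of each cluster becomes its representative; sorting sequences by length beforehand is a common way to make the result more stable.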
PepBDB provides both sequence and structure information for the receptor and peptide, as well as the resolution and interaction data from the PDB. Therefore, PepBDB can also be used to construct training and test sets for docking and scoring studies.