Pathway enrichment analysis resources
Pathway databases
We list a selection of large, open-access and conveniently accessible pathway databases that are the most valuable
for pathway enrichment analysis. Hundreds of other pathway databases are available for many purposes82.
Gene set databases
● Gene Ontology (GO)57: GO provides a hierarchically organized set of thousands of standardized terms for
biological processes, molecular functions and cellular components, as well as curated and predicted gene
annotations based on these terms for multiple species. Biological process GO annotations are the most
commonly used resource for pathway enrichment analysis.
● Molecular Signatures Database (MSigDB)80,81: MSigDB is a database of gene sets maintained by the GSEA team
(http://www.msigdb.org). Its gene sets are based on GO, pathways, curation, individual omics studies, sequence
motifs, chromosomal position, oncogenic and immunological expression signatures, and various computational
analyses. A relatively non-redundant collection of ‘hallmark’ gene sets is available. The data can be used with
many pathway enrichment methods.
Detailed biochemical pathway databases
These databases are maintained by teams of curators who manually collect detailed pathway information,
including biochemical reactions, gene regulatory events and other gene interactions. The information can be
exported or converted to gene set format.
● Reactome58: The most actively updated general-purpose public database of human pathways
(http://www.reactome.org).
● Panther38: Human signaling pathways (http://pantherdb.org/pathway).
● NetPath60: Human signaling pathways with a focus on cancer and immunology (http://www.netpath.org/).
● HumanCyc59: Human metabolic pathways (http://humancyc.org/).
● National Cancer Institute (NCI) Pathway Interaction Database (PID): Human cancer-related signaling
pathways; this database is no longer updated.
● KEGG83: The KEGG database is most useful for its intuitive pathway diagrams. It contains multiple types of
pathways, some of which are not normal pathways but are rather disease-associated gene sets, such as
‘pathways in cancer’ (http://www.genome.jp/kegg/). Up-to-date GMT files for KEGG pathways are currently
not freely available because of data licensing restrictions.
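As an illustration of the gene set format mentioned above, the following minimal Python sketch writes pathway records to GMT text and reads them back. GMT is a tab-delimited format with one gene set per line: a name, a description, then the member genes. The function names and the toy pathway records are hypothetical and are not taken from any of the databases listed here; real exports from these resources may need extra handling (for example, identifier mapping).

```python
from typing import Dict, Iterable, List, Set, Tuple


def pathways_to_gmt(pathways: Iterable[Tuple[str, str, Iterable[str]]]) -> str:
    """Serialize (name, description, genes) records as GMT text.

    Each output line is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    Duplicate genes within a pathway are removed and genes are sorted.
    """
    lines: List[str] = []
    for name, description, genes in pathways:
        unique_genes = sorted(set(genes))
        lines.append("\t".join([name, description, *unique_genes]))
    return "\n".join(lines) + "\n"


def parse_gmt(text: str) -> Dict[str, Set[str]]:
    """Parse GMT text into a mapping from gene set name to its genes."""
    gene_sets: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        # fields[0] is the name, fields[1] the description, the rest are genes
        gene_sets[fields[0]] = {gene for gene in fields[2:] if gene}
    return gene_sets


# Hypothetical toy records, for illustration only.
toy_pathways = [
    ("TOY_PATHWAY_A", "illustrative pathway", ["TP53", "MDM2", "TP53"]),
    ("TOY_PATHWAY_B", "illustrative pathway", ["EGFR", "KRAS"]),
]

gmt_text = pathways_to_gmt(toy_pathways)
gene_sets = parse_gmt(gmt_text)
```

Gene sets loaded this way can then be passed to whichever enrichment method is being used, provided the gene identifiers match those in the input gene list.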
Pathway meta-databases
These databases collect detailed pathway descriptions from multiple originating pathway databases.
● Pathway Commons45: Collects information from other pathway databases and provides it in a standardized
format (http://www.pathwaycommons.org).
● WikiPathways48: A community-driven collection of pathways that also includes pathways from other databases
(http://www.wikipathways.org/).
Tools
● g:Profiler (https://biit.cs.ut.ee/gprofiler/)
● GSEA (http://software.broadinstitute.org/gsea/)
● Cytoscape (http://www.cytoscape.org/)
● EnrichmentMap (http://www.baderlab.org/Software/EnrichmentMap)