Combining multiple tools outperforms individual methods in gene set enrichment analyses
Author: Alhamdoosh, M; Ng, M; Wilson, NJ; Sheridan, JM; Huynh, H; Wilson, MJ; Ritchie, ME
Publisher: OXFORD UNIV PRESS
University of Melbourne Author/s: Sheridan, Julie; Ritchie, Matthew; Wilson, Nicholas
Affiliation: Medical Biology (W.E.H.I.); School of Mathematics and Statistics
Document Type: Journal Article
Citation: Alhamdoosh, M., Ng, M., Wilson, N. J., Sheridan, J. M., Huynh, H., Wilson, M. J. & Ritchie, M. E. (2017). Combining multiple tools outperforms individual methods in gene set enrichment analyses. Bioinformatics, 33 (3), pp. 414-424. https://doi.org/10.1093/bioinformatics/btw623.
Access Status: Open Access
Motivation: Gene set enrichment (GSE) analysis allows researchers to efficiently extract biological insight from long lists of differentially expressed genes by interrogating them at a systems level. In recent years, GSE analysis methods have proliferated, and it has become increasingly difficult for researchers to select an optimal GSE tool for their particular dataset. Moreover, the majority of GSE analysis methods do not allow researchers to simultaneously compare gene set level results between multiple experimental conditions.
Results: The ensemble of gene set enrichment analyses (EGSEA) is a method developed for RNA-sequencing data that combines results from twelve algorithms and calculates collective gene set scores to improve the biological relevance of the highest ranked gene sets. EGSEA's gene set database contains around 25 000 gene sets from sixteen collections. It has multiple visualization capabilities that allow researchers to view gene sets at various levels of granularity. EGSEA has been tested on simulated data and on a number of human and mouse datasets and, based on biologists' feedback, consistently outperforms the individual tools that have been combined. Our evaluation demonstrates the superiority of the ensemble approach for GSE analysis, and its utility for effectively and efficiently extrapolating biological functions and potential involvement in disease processes from lists of differentially regulated genes.
Availability and Implementation: EGSEA is available as an R package at http://www.bioconductor.org/packages/EGSEA/. The gene set collections are available in the R package EGSEAdata from http://www.bioconductor.org/packages/EGSEAdata/.
Contact: firstname.lastname@example.org email@example.com.
Supplementary information: Supplementary data are available at Bioinformatics online.
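The ensemble idea summarised in the Results can be illustrated with a small rank-aggregation sketch. This is not the EGSEA implementation or its R API: the function name `average_rank_ensemble`, the method labels and the p-values are all hypothetical, and averaging ranks is only one simple way of turning several per-method results into a collective score.

```python
from statistics import mean


def average_rank_ensemble(pvalues_by_method):
    """Combine per-method gene set p-values into one ranking by average rank.

    pvalues_by_method maps a method name to a dict of {gene_set: p_value}.
    Every method is assumed to report the same gene sets (an assumption of
    this sketch). Ties within a method are broken alphabetically by name.
    """
    gene_sets = sorted(next(iter(pvalues_by_method.values())))
    ranks = {gs: [] for gs in gene_sets}

    for pvals in pvalues_by_method.values():
        # Rank 1 goes to the gene set with the smallest p-value for this method.
        ordered = sorted(gene_sets, key=lambda gs: pvals[gs])
        for rank, gs in enumerate(ordered, start=1):
            ranks[gs].append(rank)

    # Collective score: mean rank across methods (lower is better).
    scores = {gs: mean(r) for gs, r in ranks.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical p-values from three unnamed methods for three gene sets.
toy_pvalues = {
    "method_a": {"set1": 0.01, "set2": 0.20, "set3": 0.04},
    "method_b": {"set1": 0.03, "set2": 0.50, "set3": 0.02},
    "method_c": {"set1": 0.001, "set2": 0.05, "set3": 0.30},
}

ranking = average_rank_ensemble(toy_pvalues)
for gene_set, score in ranking:
    print(gene_set, round(score, 2))
```

In this toy input no single method agrees fully with the others, yet the averaged ranks still give one consensus ordering, which is the intuition behind combining tools rather than choosing one.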