How to perform a Gene Set Enrichment Analysis (GSEA) with Blast2GO
What is an enrichment analysis?
An enrichment analysis is a bioinformatics method which identifies enriched or over-represented gene sets among a list of ranked genes. Gene sets are groups of genes that are functionally related according to current knowledge. Commonly used sets of genes are those sharing biological functions like gene ontology terms, pathways or a common relation like a disease, chromosomal location or regulation.
How does a Gene Set Enrichment Analysis (GSEA) work?
GSEA is a computational method to determine whether an a priori defined set of genes shows a statistically significant difference between biological samples. This method is used to identify classes of genes or proteins that are over-represented in a large set of genes or proteins; these classes may have an association with biological functions or disease phenotypes. The method uses statistical approaches to identify significantly enriched or depleted classes or functions.
The standard GSEA method involves three steps in the analytical process:
A gene set enrichment analysis uses specific statistics and requires the corresponding implementations to run the analysis.
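To make the core statistic more concrete, here is a minimal Python sketch of the running enrichment score as described in the original GSEA publication (a weighted Kolmogorov-Smirnov-like running sum). It is an illustration only and not the code Blast2GO or the Broad GSEA package runs; the function name `running_enrichment_score` and the toy gene IDs are hypothetical. Significance estimation by permutation and the correction for multiple testing are not shown.

```python
def running_enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Return (ES, running_sum) for one gene set over a ranked gene list.

    ranked_genes: gene IDs sorted by ranking metric, highest first.
    scores: ranking metric of each gene, in the same order as ranked_genes.
    gene_set: gene IDs that belong to the set being tested.
    p: weight exponent; p=0 gives the unweighted Kolmogorov-Smirnov statistic.
    """
    members = set(gene_set)
    in_set = [gene in members for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    # Normalising constant: summed weight of all genes that are in the set.
    hit_weight_total = sum(abs(s) ** p for s, hit in zip(scores, in_set) if hit)
    if hit_weight_total == 0:
        raise ValueError("all set members have a zero ranking metric")

    running_sum = []
    total = 0.0
    for score, hit in zip(scores, in_set):
        if hit:
            # Step up, proportionally to the gene's ranking metric.
            total += abs(score) ** p / hit_weight_total
        else:
            # Step down by a constant amount for every gene outside the set.
            total -= 1.0 / n_miss
        running_sum.append(total)

    # The ES is the largest deviation from zero, keeping its sign.
    es = max(running_sum, key=abs)
    return es, running_sum


# Toy example with made-up genes: the set sits at the top of the ranking.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es, curve = running_enrichment_score(genes, metric, {"g1", "g2"})
print(es, curve)
```

A positive ES indicates that the set is concentrated at the top of the ranked list, a negative ES that it is concentrated at the bottom.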
Blast2GO makes it very easy to perform a gene set enrichment analysis (GSEA)
Blast2GO, as a complete bioinformatics toolset, allows you to perform gene set enrichment analysis (GSEA), among many other functions. Blast2GO makes use of the GSEA software package developed by the Broad Institute of MIT and Harvard. Its integration in Blast2GO makes it easy to run the analysis and review the results, allowing you to focus on their interpretation.
The steps on how to perform a gene set enrichment analysis (GSEA) with Blast2GO are explained in this short video.
The video shows how to identify enriched functions from a tissue comparison by performing GSEA with Blast2GO. To run GSEA, a ranked list of functionally annotated genes is required. This list can be created in different ways:
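As one illustration of building such a list outside Blast2GO, the Python sketch below ranks genes from a differential expression result by a signed significance score, sign(log2 fold change) × -log10(p-value), and writes them as a two-column, tab-separated file similar to the .rnk layout used by the Broad GSEA software. The choice of ranking metric, the function names `rank_genes` and `write_ranked_list`, and the example values are assumptions made for this sketch; check which input formats your Blast2GO version accepts.

```python
import csv
import math


def rank_genes(de_results):
    """Sort genes by a signed significance score, most up-regulated first.

    de_results: iterable of (gene_id, log2_fold_change, p_value) tuples.
    """
    ranked = []
    for gene_id, log2fc, p_value in de_results:
        p_value = max(p_value, 1e-300)  # guard against log10(0)
        # Magnitude from the p-value, sign from the direction of change.
        score = math.copysign(-math.log10(p_value), log2fc)
        ranked.append((gene_id, score))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def write_ranked_list(ranked, path):
    """Write one 'gene_id<TAB>score' line per gene, in ranked order."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(ranked)


# Made-up differential expression results: (gene ID, log2FC, p-value).
example = [
    ("geneA", 2.0, 0.001),
    ("geneB", -1.5, 0.0001),
    ("geneC", 0.3, 0.5),
]
ranked = rank_genes(example)
write_ranked_list(ranked, "ranked_genes.rnk")
```

The gene IDs in the first column must be the same IDs used in the functional annotation, otherwise the ranked genes cannot be matched to their functions.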
To start the GSEA, you have to load the functional annotations of your genes/proteins, whose IDs have to match the IDs in your ranked list. Once the Blast2GO project is loaded and the ranked list is created, you are ready to run the enrichment analysis. Click on ‘Analysis - Gene set enrichment analysis (GSEA)’ and select the input file; you can choose among different formats. Then provide the analysis parameters and hit run:
Once the analysis is finished, you will obtain a result table which shows all significantly over-represented functions among the IDs at the top and bottom of your ranked list. In addition to the GO ID and GO term of each function, the results provide many details:
Right-clicking on a GO ID opens a new page with more details, such as the GO description and the GSEA result details. An “enrichment plot” provides a graphical view of the enrichment score (ES) for a gene set.
The enrichment plot shows a green line representing the running ES for a given GO term as the analysis walks down the ranked list. The value at the peak is the final ES. The middle part shows where the members of the gene set (the genes annotated with that GO term) appear in the ranked list. For a positive ES, the set members that appear at or before the peak form the Leading Edge Subset; for a negative ES, it is those that appear after the peak. The lower part shows the value of the ranking metric as it moves down the list of ranked genes.
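How the peak and the Leading Edge Subset relate to the running ES can be sketched in a few lines of Python. This is an illustrative sketch under the usual GSEA convention (set members before the peak for a positive ES, after the peak for a negative ES), not Blast2GO's own code; the function name `leading_edge` and the running-sum values are made up for the example.

```python
def leading_edge(ranked_genes, gene_set, running_sum):
    """Return (ES, leading-edge genes) from a precomputed running sum.

    ranked_genes: gene IDs in ranked order, highest ranking metric first.
    gene_set: gene IDs that belong to the set (e.g. one GO term).
    running_sum: running enrichment score at each position of the list.
    """
    members = set(gene_set)
    # Position of the largest deviation from zero: the peak of the plot.
    peak = max(range(len(running_sum)), key=lambda i: abs(running_sum[i]))
    es = running_sum[peak]
    if es >= 0:
        # Positive ES: set members at or before the peak.
        candidates = ranked_genes[: peak + 1]
    else:
        # Negative ES: set members from the peak to the end of the list.
        # The gene at the peak itself is outside the set in this case.
        candidates = ranked_genes[peak:]
    return es, [gene for gene in candidates if gene in members]


# Illustrative values only: a set whose members cluster at the top.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
curve = [0.33, 0.67, 0.33, 0.0, 0.33, 0.0]
es, edge = leading_edge(genes, {"g1", "g2", "g5"}, curve)
print(es, edge)
```

In this toy case the member g5 is not part of the leading edge, because it appears after the peak and therefore does not contribute to the final ES.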
The result page has a toolbar with several options, such as creating charts, filtering the results or saving them as a text file. The option ‘Reduce to most specific’ allows you to filter the results based on their specificity, ‘Make an enrichment graph’ generates a GO graph for each GO category selected in the wizard, and ‘Show global statistics’ generates different statistical graphs.
These visualizations will help in the interpretation of the results, to find biological meaning as well as to communicate your findings.
If you want to try all this yourself, you can download Blast2GO from here.