Blast2GO Blog


Tutorials, Quickstarts, New Features, etc.

Blast2GO Supported Project:

Study of gene expression patterns associated with reproductive plasticity in the peacock blenny Salaria pavo


  • MSc Sara D. Cardoso, Ph.D. candidate at Gulbenkian Institute for Science (IGC), Oeiras, Portugal
  • Supervisors: Prof. Rui F. Oliveira, Gulbenkian Institute for Science (IGC), Oeiras, Portugal and Prof. Adelino V. M. Canário, CCMAR - Centre of Marine Sciences, University of Algarve, Faro, Portugal

Salaria pavo nest-holder male inside its nest being courted by a female.

Background and Project Overview:

The peacock blenny Salaria pavo (family Blenniidae) is a small intertidal fish usually found on rocky shores of the Mediterranean Sea and adjacent Atlantic areas. Populations of the peacock blenny living under divergent ecological conditions (i.e. nest site availability) exhibit profound differences in their mating system, involving sex role reversal in courtship behaviour and the expression of sequential alternative reproductive tactics (ARTs).

Continue Reading

Coding-Potential Assessment

A basic evaluation of the Coding-Potential Assessment Tool in Blast2GO

RNA-seq technologies detect coding as well as multiple forms of non-coding RNA. RNA-seq can accurately measure gene and transcript abundance as well as identify known and novel features of a transcriptome. While coding transcripts are translated into effector proteins, non-coding transcripts are usually involved in regulating gene expression and in the transcription and translation machinery.

In this evaluation, we predict the coding potential of the transcripts from an RNA-seq experiment, show the different options of this tool, which is based on the 'Coding-Potential Assessment Tool', and give an overview of the results.

Continue Reading

Blast2GO Supported Project:

Study of changes in the maize transcriptome in response to Maize Iranian Mosaic Virus (MIMV) infection


  • Mr. Abozar Ghorbani, Ph.D. candidate at Plant Virology Research Center, College of Agriculture, Shiraz University, Shiraz, Iran
  • Supervisors: Prof. Keramatollah Izadpanah, Plant Virology Research Center, College of Agriculture, Shiraz University, Shiraz, Iran and A/Prof. Ralf Dietzgen, Queensland Alliance for Agriculture and Food Innovation, the University of Queensland, Australia

Background and Project Overview:

Maize Iranian mosaic virus (MIMV, genus Nucleorhabdovirus, family Rhabdoviridae) is an economically important virus in maize in Iran. In addition to maize, it infects wheat, barley, rice and several other gramineous plant species. The virus is transmitted by the planthopper Laodelphax striatellus in a persistent-propagative manner. There is no close serological relationship between MIMV and other rhabdoviruses infecting gramineous plants such as Maize mosaic virus, Barley yellow striate mosaic virus and Cynodon chlorotic streak virus. In recent years, several differential screening techniques have been devised to identify changes in the expression of host genes in response to virus infection. Next-generation deep-sequencing techniques, such as Illumina RNA-seq, have provided new approaches to study plant transcriptomes to allow insights into plant defense responses.

Continue Reading


Eukaryotic gene finding with Blast2GO

A basic evaluation of Augustus

Blast2GO supports eukaryotic de novo and RNA-seq based gene finding with Augustus. This makes it possible to discover novel, putative coding genes and their genomic positions in as yet uncharacterized genomes. Based on the Augustus algorithm, both 'ab-initio' (DNA sequences only) and RNA-seq guided (BAM files) gene predictions are supported. As shown below, the latter increases the prediction accuracy significantly.

In this evaluation, we will guide you through a typical gene finding process while comparing the results obtained with the 'ab-initio' and the RNA-seq supported approach. The run time is also compared with that of the standalone Augustus version.


Continue Reading

Time Course Expression Analysis with Blast2GO

A simple use-case comparing Blast2GO with R chunks

The Blast2GO feature “Time Course Expression Analysis” is designed to perform time-course expression analysis of count data arising from RNA-seq technology. Based on the software package 'maSigPro', which belongs to the Bioconductor project, this tool detects genomic features with significant temporal expression changes and significant differences between experimental groups by applying a two-step regression strategy. This use case shows the basic analysis workflow, comparing the results obtained with R Bioconductor and Blast2GO.

Continue Reading

Pairwise Differential Expression Analysis with Blast2GO

A simple use-case comparing Blast2GO with R chunks

The Blast2GO feature “Pairwise Differential Expression Analysis” is designed to perform differential expression analysis of count data arising from RNA-seq technology. Based on the software package ‘edgeR’, which belongs to the Bioconductor project, this tool identifies genes that are differentially expressed between two conditions. This use case shows the basic analysis workflow, comparing the results obtained with R Bioconductor and Blast2GO.

Continue Reading

Functional Analysis of Pancreatic Cancer Expression Profiles

This use case shows how to perform a functional analysis of pancreatic cancer expression data with Blast2GO. Each step is explained in more detail in a short video tutorial linked from the corresponding section.

Malignant (PDAC), benign (CP) and normal tissue (NP) gene lists are compared against each other. The human genome functional annotation data is retrieved via BioMart and used as the reference. The data set is analysed using two different enrichment-analysis strategies, Fisher's Exact Test and GSEA, and the results are visualized in several ways within Blast2GO.

Analysis Workflow

  1. Extract and review data with Blast2GO.
    • Load the complete human genome GO annotation via BioMart.
    • Generate several Project Statistics to check the coverage of annotated sequences.
    • Reduce the functional information with GO-Slim.
  2. Import differentially expressed gene lists.
    • Import two datasets from the Pancreatic Expression Database: Pancreatic Ductal Adenocarcinoma (PDAC) vs Normal Pancreas (NP) and Chronic Pancreatitis (CP) vs Normal Pancreas (NP).
  3. Enrichment analysis:
    1. Fisher's Exact Test
    2. Gene Set Enrichment Analysis
  4. Methods to visualize functional profiles.
    • Word Cloud
    • Coloured Graph
  5. Conclusions
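To make the Fisher's Exact Test step in the workflow above more concrete, here is a minimal Python sketch of a one-sided (over-representation) test for a single GO term. It is an illustration only and not Blast2GO's implementation: the function name, its parameters and the assumption that the test set and the reference set are disjoint groups of genes are ours.

```python
from math import comb


def fisher_enrichment_pvalue(test_with_term, test_size, ref_with_term, ref_size):
    """One-sided Fisher's exact test p-value for over-representation.

    Assumption (ours): the test set and the reference set are disjoint, so
    the 2x2 table is
                      with term        without term
        test set      test_with_term   test_size - test_with_term
        reference     ref_with_term    ref_size - ref_with_term
    """
    total = test_size + ref_size
    total_with_term = test_with_term + ref_with_term
    # Largest number of annotated genes the test set could contain.
    max_k = min(test_size, total_with_term)
    # Number of ways to draw the test set from all genes.
    denominator = comb(total, test_size)
    # Sum the hypergeometric probabilities of the observed count and of
    # every more extreme (larger) count.
    numerator = 0
    for k in range(test_with_term, max_k + 1):
        numerator += comb(total_with_term, k) * comb(total - total_with_term, test_size - k)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical numbers: 12 of 200 test genes carry the term,
    # compared with 150 of 18000 reference genes.
    print(fisher_enrichment_pvalue(12, 200, 150, 18000))
```

In a real enrichment analysis this test is repeated for every GO term, so the resulting p-values would normally also be corrected for multiple testing.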

Continue Reading

Brief review: Gene Finding/Prediction for Bacterial Genomes


You have: Newly assembled genome of a bacterial non-model organism.
You want: Perform functional annotation and analysis of its potential proteins.
You need: Predict all potential genes or coding regions before proceeding to the functional annotation: Gene-Finding
How can this be done?
  • Use Glimmer, a set of algorithms that use interpolated Markov models to distinguish coding from non-coding DNA in bacteria, archaea, and viruses. Glimmer was developed at the Center for Computational Biology at Johns Hopkins University, Baltimore, USA, which is also the home of TopHat, Bowtie and Cufflinks, among other popular bioinformatics tools.
  • Use GeneMark, a family of gene prediction programs that use species-specific inhomogeneous Markov chain models of protein-coding DNA sequence as well as homogeneous Markov chain models of non-coding DNA. GeneMark is developed at Georgia Institute of Technology, Atlanta, Georgia, USA.
  • Use Prodigal. Prodigal, whose name stands for Prokaryotic Dynamic Programming Genefinding Algorithm, is a microbial (bacterial and archaeal) gene finding program developed at Oak Ridge National Laboratory and the University of Tennessee, USA. Prodigal is known as a very fast and highly accurate gene finder that also performs well on high-GC-content genomes. Prodigal is based on log-likelihood functions and does not use Hidden or Interpolated Markov Models.
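To give a feeling for what "predicting potential coding regions" means, the following Python sketch scans the forward strand of a DNA sequence for open reading frames (ORFs). It is a deliberately naive baseline and not what Glimmer, GeneMark or Prodigal do: those tools score candidate genes with statistical models, whereas this sketch only looks for an ATG followed in frame by a stop codon. The function name, the `min_codons` threshold, and the restriction to ATG starts and to the forward strand are assumptions made for illustration.

```python
START_CODON = "ATG"  # assumption: alternative bacterial starts (GTG, TTG) are ignored
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_forward_orfs(sequence, min_codons=30):
    """Return sorted (start, end) positions of ORFs on the forward strand.

    Positions are 0-based, `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons from the start codon up to, but not
    including, the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk through the sequence codon by codon in this reading frame.
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == START_CODON:
                start = pos  # first ATG after the previous stop opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None  # close the ORF and look for the next start
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence: one short ORF (ATG AAA AAA AAA TAA) starting at position 2.
    toy = "CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"
    print(find_forward_orfs(toy, min_codons=4))
```

A usable gene finder would at least also scan the reverse complement and rank overlapping candidates, which is where the model-based tools listed above come in.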

A brief review of these gene finding tools: 

Here we briefly review three popular prokaryotic gene prediction tools: Glimmer, GeneMark and Prodigal. We performed gene predictions for the Gram-positive bacterium Streptococcus thermophilus. (wikipedia

Continue Reading

Amazon Web Service (AWS) NCBI Blast Search with Blast2GO

(This feature is now obsolete.)


In the following article, I will explain how to set up an EC2 instance with the Blast+ AMI.

  1. This NCBI webpage takes us to the most recent Blast+ AMI in the AWS Marketplace, which we want to configure after hitting "Continue" on the right-hand side.
  2. We can now play around with the Region and EC2 Instance Type, which will influence the estimated monthly price for our set-up (see right-hand side). Keep in mind that the NCBI states in their documentation the following: BLAST searches will not run efficiently on smaller instances. Minimally, an instance with 32 GB of memory and a minimum of 32 GiB free space is required.

Continue Reading

Reformat/adapt Blast XML results against a custom UniProt sequence database to generate a species distribution chart

Problem: Empty species distribution chart 
Solution: Reformat your SwissProt/UniProt blast XML results 

The species distribution chart is a good way to visualise the species found for all blast hits for a given dataset. It is possible to generate this chart with Blast2GO from the toolbar: charts > Blast Statistics > Species Distribution.
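As a rough illustration of where the species information comes from, the Python sketch below counts species across the hits of a BLAST XML result. It does not reproduce the reformatting Blast2GO needs. It assumes that the hit definitions (`Hit_def` elements) contain UniProt-style FASTA headers, in which the organism is given as `OS=<name>` followed by another two-letter field such as `OX=` or `GN=`. The function name and the regular expression are ours.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: UniProt-style headers, e.g.
#   sp|P12345|X_HUMAN Protein X OS=Homo sapiens OX=9606 GN=X PE=1 SV=1
# The organism name runs from "OS=" to the next " XX=" field or the end.
OS_PATTERN = re.compile(r"OS=(.+?)(?=\s+[A-Z]{2}=|$)")


def count_species(blast_xml_text):
    """Count how often each species occurs among the hits of a BLAST XML result."""
    root = ET.fromstring(blast_xml_text)
    counts = Counter()
    for hit_def in root.iter("Hit_def"):
        match = OS_PATTERN.search(hit_def.text or "")
        if match:
            counts[match.group(1).strip()] += 1
    return counts


if __name__ == "__main__":
    # Tiny hand-made example; a real BLAST XML file has many more elements.
    sample = (
        "<BlastOutput><Hit><Hit_def>"
        "sp|P12345|X_HUMAN Protein X OS=Homo sapiens OX=9606 GN=X PE=1 SV=1"
        "</Hit_def></Hit></BlastOutput>"
    )
    for species, n in count_species(sample).most_common():
        print(species, n)
```

If the hit definitions of your custom database do not contain an `OS=` field, the counter stays empty, which is one plausible way an empty chart can arise.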

Continue Reading
