
How to change sequence names coming from Prokaryotic GeneFinding

When you run Prokaryotic GeneFinding, each predicted gene receives an automatically generated sequence name.

The first part of the sequence identifier comes from the name of the genome reference sequence (de novo assembly); the suffix _orfx is then appended, where x is a number.
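To make the naming scheme concrete, here is a minimal Python sketch that splits such an identifier back into its two parts. It assumes the pattern is exactly "reference name, then _orf, then a number" as described above. The function name split_orf_id and the example ID contig_12_orf3 are hypothetical and not taken from OmicsBox.

```python
import re

# Assumed pattern: "<reference sequence name>_orf<number>".
# This is inferred from the description above, not from OmicsBox itself.
ORF_ID = re.compile(r"^(?P<reference>.+)_orf(?P<number>\d+)$")


def split_orf_id(seq_id):
    """Return (reference_name, orf_number), or None if the ID does not match."""
    match = ORF_ID.match(seq_id)
    if match is None:
        return None
    return match.group("reference"), int(match.group("number"))


# Hypothetical identifier, used only for illustration.
print(split_orf_id("contig_12_orf3"))
```

An identifier that does not end in _orf plus a number returns None, so names that have already been replaced can be told apart from the generated ones.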

Sometimes this name is not useful for downstream analysis or for comparing results with other experiments.

Is there any way to map the 4,357 gene names to more standard gene IDs, such as RefSeq gene IDs or ENSG IDs?

The approach is to replace each sequence name with the name of its top hit in a reference (.fasta) file, found by sequence similarity.

OmicsBox/Blast2GO offers a feature for this under Tools > Retrieve Blat Top-hit, which searches for similar sequences against a reference genome.

If the reference genome is available at NCBI, it can be downloaded and then used to replace the names.

    1. Download the reference genome (e.g. genes) from NCBI.
        1. This is usually done under Send to in the top right corner of the page, e.g. Gene Features (FASTA Nucleotide).
    2. Under Tools > Retrieve Blat Top-hit, choose the parameters as shown in Figure 1.

 Figure 1: Retrieve Blat Top-Hit Parameters

The result is a new project in which the sequences themselves come from the gene finding project and the sequence names come from the reference.
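For readers who want to see the renaming logic outside the application, the idea can be sketched in a few lines of Python. This is not how OmicsBox implements the feature. It assumes you already have a list of similarity hits as (query ID, reference ID, score) tuples from some search tool, and all names and example data below are hypothetical.

```python
def best_hits(hit_rows):
    """Keep, for each query, the reference ID with the highest score.

    hit_rows: iterable of (query_id, subject_id, score) tuples (assumed layout).
    """
    best = {}
    for query_id, subject_id, score in hit_rows:
        if query_id not in best or score > best[query_id][1]:
            best[query_id] = (subject_id, score)
    return {query: subject for query, (subject, _) in best.items()}


def rename_fasta(lines, new_names):
    """Yield FASTA lines with header IDs replaced where a top hit exists."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            seq_id = header.split()[0] if header else ""
            # Sequences without a hit keep their original header.
            yield ">" + new_names.get(seq_id, header)
        else:
            yield line.rstrip("\n")


# Hypothetical hits and sequences, for illustration only.
hits = [
    ("contig_1_orf1", "geneA", 250.0),
    ("contig_1_orf1", "geneB", 310.0),
    ("contig_1_orf2", "geneC", 120.0),
]
fasta = [
    ">contig_1_orf1", "ATGAAA",
    ">contig_1_orf2", "ATGCCC",
    ">contig_1_orf3", "ATGGGG",
]
renamed = list(rename_fasta(fasta, best_hits(hits)))
print("\n".join(renamed))
```

In this sketch, sequences without any hit keep their generated name, and two predicted genes that share the same top hit would end up with the same name, so duplicates may need to be checked afterwards.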

Note: The reference genome (genes) used by this feature can also be retrieved from BioMart within OmicsBox/Blast2GO (see Load Sequences/Annotation from a list of identifiers with Blast2GO) or via Load Fasta from Reference + GFF/GTF.
