Help

Overview

What is MEGANTE?

MEGANTE is an easy-to-use web service for integrated genome annotation. It has a simple interface to submit query sequences and performs genome annotation including repeat masking, transcript and protein alignments, computational gene prediction, similarity searches to known proteins, functional domain searches, and assignment of GO terms to the predicted genes. Multiple lines of evidence for the predictions are integrated, and an appropriate consensus exon-intron structure is selected for each locus. MEGANTE provides the results of the annotation in Microsoft Excel format and a graphical view based on a widely used genome browser, GBrowse. All data including the query sequences and annotation results are stored at the server end, enabling users to access their own data virtually from anywhere through web access.

What species does MEGANTE support?

Currently, MEGANTE can accept genome sequences of 31 species in 10 families of plants as follows:

Families      Species
Brassicaceae  Arabidopsis thaliana
              Brassica napus
              Brassica rapa
              Raphanus sativus
Fabaceae      Glycine max
              Lotus japonicus
              Medicago truncatula
              Vigna angularis
              Vigna unguiculata
Musaceae      Musa acuminata
Poaceae       Brachypodium distachyon
              Hordeum vulgare
              Oryza sativa
              Phyllostachys edulis
              Sorghum bicolor
              Triticum aestivum
              Zea mays
Rosaceae      Malus x domestica
              Prunus persica
Rubiaceae     Coffea arabica
              Coffea canephora
Rutaceae      Citrus clementina
              Citrus reticulata
              Citrus sinensis
              Poncirus trifoliata
Salicaceae    Populus trichocarpa
Solanaceae    Nicotiana tabacum
              Solanum lycopersicum
              Solanum melongena
              Solanum tuberosum
Vitaceae      Vitis vinifera

How to Use

Create an account

MEGANTE stores users' query sequences and analysis results on the server side, so you must create an account before first use.

Go to https://megante.dna.affrc.go.jp/signup, and submit your email address and a password in the form. We will then send a confirmation email to the address you entered. Click the link in the email to activate your account.

Upload sequences

  1. Sign in to MEGANTE at https://megante.dna.affrc.go.jp/sigin.
  2. Click the Upload sequence button to go to the Upload page.
  3. Enter a DNA sequence or upload a file in FASTA format. Multiple sequences are accepted in multi-FASTA format. Each sequence is limited to 10 Mb, and a maximum of 100 sequences can be stored on the server.
  4. Select species for the query sequence in the Select species drop-down list.
  5. Turn on the Email notification check box if you need an email at the end of annotation.
  6. Enter a description to identify the query in the Query name field. This is optional.
  7. Click the Submit button.
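Before uploading, it can help to check a FASTA file against the limits in step 3. The following is a minimal sketch, not part of MEGANTE itself. It assumes that "10 Mb" means 10,000,000 bases and treats the 100-sequence storage limit as a per-file check. The function names `read_fasta_lengths` and `check_upload` are hypothetical.

```python
import io

MAX_SEQ_LENGTH = 10_000_000  # assumption: "10 Mb" is read as 10,000,000 bases
MAX_SEQ_COUNT = 100          # storage limit quoted in step 3, applied per file here


def read_fasta_lengths(handle):
    """Return a list of (sequence name, length) pairs from a FASTA handle."""
    records = []
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, length))
            # Use the first word of the header line as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            length = 0
        elif name is not None:
            length += len(line)
    if name is not None:
        records.append((name, length))
    return records


def check_upload(handle):
    """Return a list of problems found; an empty list means none were found."""
    records = read_fasta_lengths(handle)
    problems = []
    if not records:
        problems.append("no FASTA records found")
    if len(records) > MAX_SEQ_COUNT:
        problems.append(
            f"{len(records)} sequences exceed the limit of {MAX_SEQ_COUNT}"
        )
    for name, length in records:
        if length > MAX_SEQ_LENGTH:
            problems.append(
                f"{name}: {length} bases exceed the limit of {MAX_SEQ_LENGTH}"
            )
    return problems


# Usage with an in-memory example; for a real file, pass open("query.fa").
example = io.StringIO(">seq1 test\nACGTACGT\nACGT\n>seq2\nGGCC\n")
print(check_upload(example))
```

Sequences already stored in your account also count toward the 100-sequence limit, so a file that passes this check may still be rejected by the server.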

After submitting a query, you are redirected to the List of sequences page, where you can see the states of your uploaded sequences. The initial state is waiting, which means the sequence has been placed in a queue. The state changes to running when the system starts the annotation process. It takes at least 90 minutes to complete genome annotation for a 1 Mb sequence.

Get annotation results

When annotation is completed, the state changes to finished, and download and view icons appear on the List of sequences page. Click these buttons to download or visualize the annotation results, respectively.

MEGANTE predicts two classes of genes as follows:

1. Genes supported by ESTs or cDNAs
First, a consensus gene structure is generated from multiple lines of evidence, such as ab initio gene prediction, protein alignment, and interspecies full-length cDNA alignment. The structure is then merged with intraspecies EST or full-length cDNA alignments to improve accuracy.
2. Genes without transcript evidence
This class is the same as class 1, except that the consensus gene structure is inconsistent with the transcript (EST or cDNA) alignments, or there are no transcript hits to the gene.

Class 1 genes are considered to be more accurate than class 2.
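The downloaded result is a single ZIP file that contains ORF sequences in FASTA format and gene information in Excel format. The sketch below shows one way to inspect such an archive with the Python standard library. The member names inside the ZIP are not documented here, so the code groups files by extension instead of assuming names. The function names `list_result_files` and `read_fasta_member` are hypothetical.

```python
import zipfile


def list_result_files(zip_source):
    """Group the member names of a result ZIP by lowercase file extension."""
    groups = {}
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            if member.endswith("/"):
                continue  # skip directory entries
            base = member.rsplit("/", 1)[-1]
            ext = base.rsplit(".", 1)[-1].lower() if "." in base else ""
            groups.setdefault(ext, []).append(member)
    return groups


def read_fasta_member(zip_source, member):
    """Return {header: sequence} for one FASTA member of the ZIP."""
    sequences = {}
    header = None
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            for raw_line in raw:
                line = raw_line.decode("utf-8", errors="replace").strip()
                if line.startswith(">"):
                    header = line[1:]
                    sequences[header] = ""
                elif header is not None and line:
                    sequences[header] += line
    return sequences
```

For example, `list_result_files("result.zip")` (with a hypothetical file name) returns a dictionary keyed by extension, from which a FASTA member can be chosen and passed to `read_fasta_member`.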

Manage your account

  1. Sign in to MEGANTE at https://megante.dna.affrc.go.jp/sigin.
  2. Click your email address in the top-right of any page to open the drop-down list.
  3. Select an item (change email address, change password, or delete account) from the menu.

Examples

Examples of annotation results

After the annotation process finishes, users can download the analysis result as a single ZIP file, which contains ORF sequences in FASTA format, and gene structure and function information in Excel format. The web service also provides a graphical view based on GBrowse. You can see the sample results (annotation of 1 Mb in Arabidopsis chromosome 1) at the following links:

Resources

DNA and protein sequences

Repeat elements

PGSB Repeat Database (mips-REdat 9.3p, 61,730 entries)
[http://mips.helmholtz-muenchen.de/plant/recat/index.jsp]

Functional domains

InterPro (release 66.0)
[http://www.ebi.ac.uk/interpro/]

Transcript sequences

Full-length cDNAs (FLcDNAs), ESTs, and RNA-seq data in INSDC (updated in April 2016)
[http://www.insdc.org/]

*De novo assemblies of RNA-seq data were generated using Trinity.

Families      Species                  FLcDNAs  ESTs       RNA-seq assemblies*
Brassicaceae  Arabidopsis thaliana     47,193   1,529,700  -
              Brassica napus           -        643,944    -
              Brassica rapa            -        214,482    -
              Raphanus sativus         -        150,680    -
Fabaceae      Glycine max              4,712    1,461,724  -
              Lotus japonicus          -        242,432    -
              Medicago truncatula      -        269,501    -
              Vigna angularis          -        -          242,442
              Vigna unguiculata        -        187,487    -
Musaceae      Musa acuminata           -        29,610     -
Poaceae       Brachypodium distachyon  16,079   206,255    -
              Hordeum vulgare          28,608   828,843    -
              Oryza sativa             47,216   1,255,088  -
              Phyllostachys edulis     10,607   4,510      -
              Sorghum bicolor          -        209,835    -
              Triticum aestivum        6,149    1,298,692  -
              Zea mays                 65,679   2,019,602  -
Rosaceae      Malus x domestica        -        326,941    -
              Prunus persica           -        80,797     -
Rubiaceae     Coffea arabica           -        174,275    -
              Coffea canephora         -        69,066     -
Rutaceae      Citrus clementina        -        118,365    -
              Citrus reticulata        -        56,055     -
              Citrus sinensis          -        214,598    -
              Poncirus trifoliata      -        63,080     -
Salicaceae    Populus trichocarpa      4,664    89,943     -
Solanaceae    Nicotiana tabacum        -        335,916    -
              Solanum lycopersicum     13,140   300,442    -
              Solanum melongena        -        100,211    -
              Solanum tuberosum        -        250,140    -
Vitaceae      Vitis vinifera           -        446,668    -

Protein sequences

Protein sequences in UniProtKB
[http://www.uniprot.org/]

Protein fragments are excluded from Swiss-Prot and TrEMBL, and TrEMBL proteins of PE level 4 and 5 are not used.
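The filtering rule above can be illustrated on UniProtKB FASTA headers. This is a sketch under stated assumptions, not MEGANTE's actual implementation, which is not described here. It relies on the UniProtKB header conventions: the database prefix is `sp|` for Swiss-Prot or `tr|` for TrEMBL, fragments carry "(Fragment)" or "(Fragments)" in the protein name, and the protein existence level appears as `PE=n`. The function name `keep_protein` is hypothetical.

```python
import re

# Protein existence level as written in UniProtKB FASTA headers, e.g. "PE=4".
PE_PATTERN = re.compile(r"\bPE=(\d)\b")


def keep_protein(header):
    """Return True if a UniProtKB FASTA header passes the filters described above."""
    # Fragments are excluded from both Swiss-Prot and TrEMBL.
    if "(Fragment" in header:  # matches "(Fragment)" and "(Fragments)"
        return False
    # PE levels 4 (predicted) and 5 (uncertain) are dropped for TrEMBL only.
    is_trembl = header.lstrip(">").startswith("tr|")
    match = PE_PATTERN.search(header)
    if is_trembl and match and int(match.group(1)) >= 4:
        return False
    return True
```

Applied to each header line of a UniProtKB FASTA file, the function keeps Swiss-Prot entries at any PE level unless they are fragments, and keeps TrEMBL entries only at PE levels 1 to 3.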

Section (division)           Number of entries  Release
Swiss-Prot (plant division)  39,730             2018_01
TrEMBL (plant division)      1,250,479          2018_01

Analysis tools for genome annotation

Repeat identification

Transcript alignment

Protein alignment

Gene prediction

Gene functional annotation

Genome browser

Publications

Citing MEGANTE

The MEGANTE web service is described in:

  • Numa H. and Itoh T. (2014) MEGANTE: A Web-based System for Integrated Plant Genome Annotation. Plant and Cell Physiology, 55(1):e2.
    [DOI: 10.1093/pcp/pct157] [PMID: 24253915]

Release History

The results of gene prediction may change after updates to the repeat libraries, full-length cDNAs, ESTs, RNA-seq assemblies, or UniProt proteins. Genome annotations that were already completed are not changed by an update.

2018

Release 2018-02 (February 10, 2018)

Data updates:

  • Updated UniProt from release 2017_06 to 2018_01.
  • Updated InterPro from release 63.0 to 66.0.

2017

Release 2017-07 (July 12, 2017)

Data updates:

  • Updated UniProt from release 2016_03 to 2017_06.
  • Updated InterPro from release 62.0 to 63.0.

Other changes:

  • To reduce the processing time of annotation, protein sequences of PE level 4 and 5 in TrEMBL are no longer used.

Release 2017-04 (April 10, 2017)

Data updates:

  • Updated InterPro from release 56.0 to 62.0.

Other changes:

  • Analysis tools for genome annotation were upgraded to the newer versions.

2016

Release 2016-04 (April 12, 2016)

New features:

  • Added a button to download multiple annotation results at once in the List of sequences page.

Data updates:

  • Updated full-length cDNAs and ESTs to the latest.
  • Updated UniProt from release 2015_09 to 2016_03.
  • Updated InterPro from release 53.0 to 56.0.

2015

Release 2015-10 (October 15, 2015)

New features:

  • Added two new species in the Rubiaceae family: Coffea arabica and Coffea canephora.

Data updates:

  • Updated full-length cDNAs and ESTs to the latest.
  • Updated UniProt from release 2015_05 to 2015_09.
  • Updated InterPro from release 51.0 to 53.0.

Release 2015-08 (August 25, 2015)

New features:

  • ProSplign was replaced with Spaln for protein alignment.
  • Added a new tool, PANNZER, for functional annotation.

Release 2015-05 (May 10, 2015)

New features:

  • Added a new species in the Fabaceae family: Vigna angularis.
  • De novo RNA-seq assemblies are used instead of ESTs in transcript alignment for some species.

Data updates:

  • Updated full-length cDNAs and ESTs to the latest.
  • Updated UniProt from release 2014_09 to 2015_05.
  • Updated InterPro from release 48.0 to 51.0.

2014

Release 2014-11 (November 01, 2014)

Data updates:

  • Updated full-length cDNAs and ESTs to the latest.
  • Updated UniProt from release 2014_03 to 2014_09.
  • Updated InterPro from release 42.0 to 48.0.

Release 2014-05 (May 20, 2014)

New features:

  • Added four new species in the Rutaceae family: Citrus clementina, Citrus reticulata, Citrus sinensis, and Poncirus trifoliata.

Release 2014-04 (April 03, 2014)

Data updates:

  • Updated repeat libraries from mips-REdat 9.0p to 9.3p.
  • Updated full-length cDNAs and ESTs to the latest.
  • Updated UniProt from release 2013_06 to 2014_03.

2013

Initial release (August 01, 2013)

MEGANTE was opened to the public. Details of the initial release are described in a paper (Numa and Itoh, 2014).