Help
Overview
What is MEGANTE?
MEGANTE is an easy-to-use web service for integrated genome annotation. It provides a simple interface for submitting query sequences and performs genome annotation that includes repeat masking, transcript and protein alignments, computational gene prediction, similarity searches against known proteins, functional domain searches, and assignment of GO terms to the predicted genes. Multiple lines of evidence for the predictions are integrated, and an appropriate consensus exon-intron structure is selected for each locus. MEGANTE provides the annotation results in Microsoft Excel format and as a graphical view based on GBrowse, a widely used genome browser. All data, including the query sequences and annotation results, are stored on the server, so users can access their own data from virtually anywhere over the web.
What species does MEGANTE support?
MEGANTE currently accepts genome sequences from the following 31 plant species in 10 families:
Family | Species |
---|---|
Brassicaceae | Arabidopsis thaliana |
Brassicaceae | Brassica napus |
Brassicaceae | Brassica rapa |
Brassicaceae | Raphanus sativus |
Fabaceae | Glycine max |
Fabaceae | Lotus japonicus |
Fabaceae | Medicago truncatula |
Fabaceae | Vigna angularis |
Fabaceae | Vigna unguiculata |
Musaceae | Musa acuminata |
Poaceae | Brachypodium distachyon |
Poaceae | Hordeum vulgare |
Poaceae | Oryza sativa |
Poaceae | Phyllostachys edulis |
Poaceae | Sorghum bicolor |
Poaceae | Triticum aestivum |
Poaceae | Zea mays |
Rosaceae | Malus x domestica |
Rosaceae | Prunus persica |
Rubiaceae | Coffea arabica |
Rubiaceae | Coffea canephora |
Rutaceae | Citrus clementina |
Rutaceae | Citrus reticulata |
Rutaceae | Citrus sinensis |
Rutaceae | Poncirus trifoliata |
Salicaceae | Populus trichocarpa |
Solanaceae | Nicotiana tabacum |
Solanaceae | Solanum lycopersicum |
Solanaceae | Solanum melongena |
Solanaceae | Solanum tuberosum |
Vitaceae | Vitis vinifera |
How to Use
Create an account
MEGANTE stores users' query sequences and analysis results on the server, so you must create an account before first use.
Go to https://megante.dna.affrc.go.jp/signup and submit your email address and a password in the form. We will then send a confirmation email to the address you entered. Click the link in the email to activate your account.
Upload sequences
- Sign in to MEGANTE at https://megante.dna.affrc.go.jp/sigin.
- Click the Upload sequence button to go to the Upload page.
- Enter a DNA sequence or upload a file in FASTA format. Multiple sequences are accepted in multi-FASTA format. Each sequence can be up to 10 Mb long, and a maximum of 100 sequences can be stored on the server.
- Select the species of the query sequence from the Select species drop-down list.
- Select the Email notification check box if you want to receive an email when annotation finishes.
- Enter a description to identify the query in the Query name field. This is optional.
- Click the Submit button.
After submitting a query, you are redirected to the List of sequences page, which shows the state of each uploaded sequence. The initial state is waiting, which means the sequence has been placed in a queue. The state changes to running when the system starts the annotation process. Genome annotation of a 1 Mb sequence takes at least 90 minutes.
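The upload limits above can be checked locally before submitting. The following is a minimal sketch, not part of MEGANTE: it assumes the input is a FASTA file with `>` headers and that "10 Mb" means 10,000,000 bases. The function names and constants are hypothetical. It only looks at one file, so it cannot account for sequences already stored in your account.

```python
from typing import Iterable, List, Tuple

MAX_SEQ_LEN = 10_000_000  # assumption: "10 Mb" read as 10,000,000 bases
MAX_SEQ_COUNT = 100       # maximum number of sequences stored on the server


def fasta_lengths(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (sequence ID, length) pairs for a FASTA or multi-FASTA input."""
    records: List[Tuple[str, int]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token of the header.
            tokens = line[1:].split()
            records.append((tokens[0] if tokens else "", 0))
        elif not records:
            raise ValueError("sequence data found before the first '>' header")
        else:
            name, length = records[-1]
            records[-1] = (name, length + len(line))
    return records


def check_upload(records: List[Tuple[str, int]]) -> List[str]:
    """Return a list of problems; an empty list means the limits are met."""
    problems: List[str] = []
    if len(records) > MAX_SEQ_COUNT:
        problems.append(f"{len(records)} sequences exceed the limit of {MAX_SEQ_COUNT}")
    for name, length in records:
        if length > MAX_SEQ_LEN:
            problems.append(f"{name!r} is {length} bases, over the {MAX_SEQ_LEN} limit")
    return problems
```

A typical call would be `check_upload(fasta_lengths(open("query.fa")))`, where `query.fa` stands for your own file.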
Get annotation results
When annotation is completed, the state changes to finished, and two buttons appear on the List of sequences page: one to download the annotation results and one to visualize them.
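As described in the Examples section, the download is a single ZIP file that contains ORF sequences in FASTA format and gene structure and function information in Excel format. The sketch below shows one way to see what an archive holds by grouping its members by file extension. The member names in `build_demo_archive` are made up for illustration; this page does not specify the actual file names inside a MEGANTE download.

```python
import io
import posixpath
import zipfile
from collections import defaultdict
from typing import BinaryIO, Dict, List, Union


def group_members_by_extension(archive: Union[str, BinaryIO]) -> Dict[str, List[str]]:
    """Map each lowercase file extension to the archive members that carry it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # directory entry, not a file
            ext = posixpath.splitext(name)[1].lstrip(".").lower()
            groups[ext].append(name)
    return dict(groups)


def build_demo_archive() -> io.BytesIO:
    """Build a tiny in-memory ZIP with hypothetical member names for testing."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("demo/orf.fasta", ">gene1\nATGAAATAG\n")
        zf.writestr("demo/annotation.xlsx", b"placeholder, not a real workbook")
    buf.seek(0)
    return buf
```

With a real download, pass the path of the ZIP file instead of the demo archive, for example `group_members_by_extension("result.zip")`, where `result.zip` is whatever name you saved the file under.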
MEGANTE predicts two classes of genes:
- 1. Genes supported by ESTs or cDNAs
  - First, a consensus gene structure is generated from multiple lines of evidence, such as ab initio gene prediction, protein alignment, and interspecies full-length cDNA alignment. The structure is then merged with intraspecies EST or full-length cDNA alignments to improve accuracy.
- 2. Genes without transcript evidence
  - The consensus gene structure is generated as in class 1, but either it is inconsistent with the transcript (EST or cDNA) alignments or no transcripts hit the gene.
Class 1 genes are considered more accurate than class 2 genes.
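The distinction between the two classes can be read as a simple decision rule. The sketch below is one interpretation of the description above, not MEGANTE's actual code. The `Locus` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    # True if any intraspecies EST or cDNA aligns to the gene (hypothetical field).
    has_transcript_hits: bool
    # True if those alignments agree with the consensus structure (hypothetical field).
    transcripts_consistent: bool


def gene_class(locus: Locus) -> int:
    """Return 1 for genes supported by transcripts, 2 otherwise."""
    if locus.has_transcript_hits and locus.transcripts_consistent:
        return 1
    # No transcript hits, or hits that conflict with the consensus structure.
    return 2
```

Under this reading, a gene falls into class 2 for either of two reasons, so class 2 does not necessarily mean that no transcripts aligned to the locus.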
Manage your account
- Sign in to MEGANTE at https://megante.dna.affrc.go.jp/sigin.
- Click your email address at the top right of any page to open the drop-down menu.
- Select an action from the menu: change email address, change password, or delete account.
Examples
Examples of annotation results
After the annotation process finishes, users can download the analysis result as a single ZIP file, which contains ORF sequences in FASTA format and gene structure and function information in Excel format. The web service also provides a graphical view based on GBrowse. Sample results (annotation of a 1 Mb region of Arabidopsis chromosome 1) are available at the following links:
Resources
DNA and protein sequences
Repeat elements
PGSB Repeat Database (mips-REdat 9.3p, 61,730 entries)
[http://mips.helmholtz-muenchen.de/plant/recat/index.jsp]
Functional domains
InterPro (release 66.0)
[http://www.ebi.ac.uk/interpro/]
Transcript sequences
Full-length cDNAs (FLcDNAs), ESTs, and RNA-seq data in INSDC (updated in April 2016)
[http://www.insdc.org/]
*De novo assemblies of RNA-seq data were generated using Trinity.
Family | Species | FLcDNAs | ESTs | RNA-seq assemblies* |
---|---|---|---|---|
Brassicaceae | Arabidopsis thaliana | 47,193 | 1,529,700 | - |
Brassicaceae | Brassica napus | - | 643,944 | - |
Brassicaceae | Brassica rapa | - | 214,482 | - |
Brassicaceae | Raphanus sativus | - | 150,680 | - |
Fabaceae | Glycine max | 4,712 | 1,461,724 | - |
Fabaceae | Lotus japonicus | - | 242,432 | - |
Fabaceae | Medicago truncatula | - | 269,501 | - |
Fabaceae | Vigna angularis | - | - | 242,442 |
Fabaceae | Vigna unguiculata | - | 187,487 | - |
Musaceae | Musa acuminata | - | 29,610 | - |
Poaceae | Brachypodium distachyon | 16,079 | 206,255 | - |
Poaceae | Hordeum vulgare | 28,608 | 828,843 | - |
Poaceae | Oryza sativa | 47,216 | 1,255,088 | - |
Poaceae | Phyllostachys edulis | 10,607 | 4,510 | - |
Poaceae | Sorghum bicolor | - | 209,835 | - |
Poaceae | Triticum aestivum | 6,149 | 1,298,692 | - |
Poaceae | Zea mays | 65,679 | 2,019,602 | - |
Rosaceae | Malus x domestica | - | 326,941 | - |
Rosaceae | Prunus persica | - | 80,797 | - |
Rubiaceae | Coffea arabica | - | 174,275 | - |
Rubiaceae | Coffea canephora | - | 69,066 | - |
Rutaceae | Citrus clementina | - | 118,365 | - |
Rutaceae | Citrus reticulata | - | 56,055 | - |
Rutaceae | Citrus sinensis | - | 214,598 | - |
Rutaceae | Poncirus trifoliata | - | 63,080 | - |
Salicaceae | Populus trichocarpa | 4,664 | 89,943 | - |
Solanaceae | Nicotiana tabacum | - | 335,916 | - |
Solanaceae | Solanum lycopersicum | 13,140 | 300,442 | - |
Solanaceae | Solanum melongena | - | 100,211 | - |
Solanaceae | Solanum tuberosum | - | 250,140 | - |
Vitaceae | Vitis vinifera | - | 446,668 | - |
Protein sequences
Protein sequences in UniProtKB
[http://www.uniprot.org/]
Protein fragments are excluded from both Swiss-Prot and TrEMBL, and TrEMBL proteins of PE (protein existence) levels 4 and 5 are not used.
Section (division) | Number of entries | Release |
---|---|---|
Swiss-Prot (plant division) | 39,730 | 2018_01 |
TrEMBL (plant division) | 1,250,479 | 2018_01 |
Analysis tools for genome annotation
Repeat identification
- RepeatMasker [http://www.repeatmasker.org/]
Transcript alignment
- PASA [http://pasapipeline.github.io/]
- Sim4db [http://sourceforge.net/projects/kmer/]
Protein alignment
- Spaln
Gene prediction
- AUGUSTUS [http://bioinf.uni-greifswald.de/augustus/]
- GeneZilla [http://www.genezilla.org/]
- GlimmerHMM [http://cbcb.umd.edu/software/glimmerhmm/]
- JIGSAW [http://www.cbcb.umd.edu/software/jigsaw/]
- SNAP [http://korflab.ucdavis.edu/software.html]
Gene functional annotation
- BLAST [http://blast.ncbi.nlm.nih.gov/Blast.cgi]
- InterProScan [http://www.ebi.ac.uk/interpro/interproscan.html]
- PANNZER [http://ekhidna.biocenter.helsinki.fi/pannzer]
Genome browser
- GBrowse [http://gmod.org/wiki/GBrowse]
Publications
Citing MEGANTE
MEGANTE web service is described in:
- Numa H. and Itoh T. (2014) MEGANTE: A Web-based System for Integrated Plant Genome Annotation. Plant and Cell Physiology, 55(1):e2.
[DOI: 10.1093/pcp/pct157] [PMID: 24253915]
Release History
The results of gene prediction may change after updates to the repeat libraries, full-length cDNAs, ESTs, RNA-seq assemblies, or UniProt proteins. Genome annotations that have already been completed are not affected by an update.
2018
Release 2018-02 (February 10, 2018)
Data updates:
- Updated UniProt from release 2017_06 to 2018_01.
- Updated InterPro from release 63.0 to 66.0.
2017
Release 2017-07 (July 12, 2017)
Data updates:
- Updated UniProt from release 2016_03 to 2017_06.
- Updated InterPro from release 62.0 to 63.0.
Other changes:
- To reduce annotation processing time, TrEMBL protein sequences of PE levels 4 and 5 are no longer used.
Release 2017-04 (April 10, 2017)
Data updates:
- Updated InterPro from release 56.0 to 62.0.
Other changes:
- Analysis tools for genome annotation were upgraded to the newer versions.
2016
Release 2016-04 (April 12, 2016)
New features:
- Added a button to the List of sequences page for downloading multiple annotation results at once.
Data updates:
- Updated full-length cDNAs and ESTs to the latest.
- Updated UniProt from release 2015_09 to 2016_03.
- Updated InterPro from release 53.0 to 56.0.
2015
Release 2015-10 (October 15, 2015)
New features:
- Added two new species in the Rubiaceae family: Coffea arabica and Coffea canephora.
Data updates:
- Updated full-length cDNAs and ESTs to the latest.
- Updated UniProt from release 2015_05 to 2015_09.
- Updated InterPro from release 51.0 to 53.0.
Release 2015-08 (August 25, 2015)
New features:
- ProSplign was replaced with Spaln for protein alignment.
- Added a new tool, PANNZER, for functional annotation.
Release 2015-05 (May 10, 2015)
New features:
- Added a new species in the Fabaceae family: Vigna angularis.
- De novo RNA-seq assemblies are used instead of ESTs in transcript alignment for some species.
Data updates:
- Updated full-length cDNAs and ESTs to the latest.
- Updated UniProt from release 2014_09 to 2015_05.
- Updated InterPro from release 48.0 to 51.0.
2014
Release 2014-11 (November 01, 2014)
Data updates:
- Updated full-length cDNAs and ESTs to the latest.
- Updated UniProt from release 2014_03 to 2014_09.
- Updated InterPro from release 42.0 to 48.0.
Release 2014-05 (May 20, 2014)
New features:
- Added four new species in the Rutaceae family: Citrus clementina, Citrus reticulata, Citrus sinensis, and Poncirus trifoliata.
Release 2014-04 (April 03, 2014)
Data updates:
- Updated repeat libraries from mips-REdat 9.0p to 9.3p.
- Updated full-length cDNAs and ESTs to the latest.
- Updated UniProt from release 2013_06 to 2014_03.
2013
Initial release (August 01, 2013)
MEGANTE was opened to the public. Details of the initial release are described in a paper (Numa and Itoh, 2014).