New release of the Prokaryotic Genome Annotation Pipeline now available


We have released a new version of the Prokaryotic Genome Annotation Pipeline (PGAP), available on GitHub. The new release includes an option to ignore pre-annotation validation errors (--ignore-all-errors). This option lets you produce a preliminary annotation for a draft version of the genome, even one that contains vector and adapter sequences or that is outside the expected size range for the species. This draft annotation should be helpful with your ongoing work on the genome assembly. Please keep in mind that these preliminary annotations, and assemblies with contaminants or other errors, are not suitable for submission to GenBank.
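As a rough illustration of the option described above, the following Python sketch assembles a PGAP command line that includes --ignore-all-errors. Only that flag comes from this post. The script name ./pgap.py, the -o output option, and the input.yaml file name are assumptions based on the standalone PGAP documentation, and your PGAP version may require additional options, so check its help output before running anything.

```python
import shlex


def build_pgap_command(input_yaml, output_dir, ignore_all_errors=True,
                       pgap_script="./pgap.py"):
    """Return the PGAP command as a list of arguments.

    pgap_script, the -o option, and the positional input file are
    assumptions taken from the PGAP documentation, not from this post.
    """
    cmd = [pgap_script]
    if ignore_all_errors:
        # Flag named in the release announcement: skip pre-annotation
        # validation errors to get a preliminary (draft) annotation.
        cmd.append("--ignore-all-errors")
    cmd += ["-o", output_dir, input_yaml]
    return cmd


# Print the command for inspection instead of running it.
print(shlex.join(build_pgap_command("input.yaml", "draft_results")))
```

The function only builds the argument list, so you can review the command before passing it to something like subprocess.run. Results produced this way are drafts and, as noted above, are not suitable for submission to GenBank.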

Another new feature allows you to provide the name of the consortium that generated the assembly and annotation so that this information appears in the final GenBank records. For more details, consult our guidelines on input files.

See our previous post and our documentation for details on how to obtain and run PGAP yourself.

Next on our to-do list is a module for calculating Average Nucleotide Identity (ANI) to confirm the assembly’s taxonomic assignment. Stay tuned!

Genome Workbench 3.0, now with support for preparing GenBank genome submissions


Genome Workbench version 3.0 (release notes) is now available. An important new feature is the submission preparation wizard that allows you to prepare prokaryotic and eukaryotic genome sequences for submission to GenBank. This wizard is the first step toward offering a better alternative to the Sequin submission tool.

You simply load your sequences into Genome Workbench, use the submission wizard to enter information about your submission through a set of dialog boxes, and then save a submission-ready data file. The package also includes tools for editing your sequences, annotation, and metadata.

See the tutorial video on our YouTube channel or the Genome Workbench documentation for more details on how to enable the wizard and prepare a submission.

Try our new SRA data management tools!


Have you ever needed to correct or improve SRA metadata after submitting, change the release date for your data, or share your data with reviewers? You can now perform these tasks yourself using the SRA data management features, which are live in the Submission Portal!

If you have an SRA submission and associated BioProject and BioSample, you can log into the Submission Portal, go to the Manage data tab, click into that BioProject and easily perform the following common tasks (Figure 1).

Continue reading

Prokaryotic Genome Annotation Pipeline (PGAP) now produces results suitable for submission to GenBank


We are happy to announce that you can now submit genome sequences annotated by your own local copy of the standalone Prokaryotic Genome Annotation Pipeline (PGAP) to GenBank.

How does it work? Download PGAP from GitHub, provide some basic information and the FASTA sequences for your genome, and run the pipeline on your own machine, a compute farm, or the cloud. PGAP will produce annotation consistent with that of NCBI’s internal PGAP. Submit the resulting annotated genome to GenBank through the genome submission portal, and get an accession back.

As with any other submitted assembly, PGAP-annotated genomes will be screened for foreign contaminants and vector sequences at submission. Any annotated assemblies that don’t pass this screen may need to be modified. We are developing an automated process to handle these edits!

We are also working on other improvements to standalone PGAP, such as a module for calculating Average Nucleotide Identity (ANI) to confirm the assembly’s taxonomic assignment. Stay tuned for new developments!

Proposed changes to AGP files for genome assemblies


If you are a consumer or producer of AGP (A Golden Path) files for genome assemblies, please read on. We’d like your feedback on the proposed changes described here.

As you know, AGP files are used to describe the structure of certain genome assemblies. The AGP file format has not kept up with changes in sequencing technology or International Sequence Database Collaboration (INSDC) feature usage. NCBI is therefore proposing to extend the current AGP v2.0 specification to add new linkage evidence types and a gap type of “contamination” as detailed below and described in the AGP v2.1 proposed specification.
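To make the proposed gap type concrete, here is a minimal Python sketch that reads gap rows from an AGP file. It assumes the nine tab-separated columns of the AGP v2.0 layout (object, start, end, part number, component type, then gap length, gap type, linkage, and linkage evidence for gap rows) and accepts “contamination” as an additional gap type, as proposed above. The names GapLine and parse_gap_line are hypothetical, the sample row is invented for illustration, and the new linkage evidence types are not validated because this post does not list them.

```python
from dataclasses import dataclass

# Gap types defined in AGP v2.0.
AGP_V2_0_GAP_TYPES = {
    "scaffold", "contig", "centromere", "short_arm",
    "heterochromatin", "telomere", "repeat",
}
# Gap type named in the AGP v2.1 proposal described in this post.
PROPOSED_GAP_TYPES = {"contamination"}


@dataclass
class GapLine:
    object_id: str
    object_beg: int
    object_end: int
    part_number: int
    component_type: str   # "N" = gap of known size, "U" = unknown size
    gap_length: int
    gap_type: str
    linkage: str
    linkage_evidence: list


def parse_gap_line(line):
    """Return a GapLine for a gap row, or None for comments,
    blank lines, and component (non-gap) rows."""
    if line.startswith("#") or not line.strip():
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    if cols[4] not in ("N", "U"):
        return None  # component row, not a gap
    gap_type = cols[6]
    if gap_type not in AGP_V2_0_GAP_TYPES | PROPOSED_GAP_TYPES:
        raise ValueError(f"unknown gap type: {gap_type}")
    return GapLine(
        cols[0], int(cols[1]), int(cols[2]), int(cols[3]), cols[4],
        int(cols[5]), gap_type, cols[7],
        cols[8].split(";"),  # multiple evidence types are ';'-separated
    )


# Invented example row: a 100 bp gap marked with the proposed type.
example = "chr1\t1001\t1100\t2\tN\t100\tcontamination\tno\tna"
print(parse_gap_line(example))
```

Keeping the proposed gap type in its own set makes it easy to switch between strict v2.0 validation and the proposed extension.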

Continue reading

New Norovirus GenBank Submission Service


Do you have Norovirus sequence data to submit to GenBank? Try out the newly released improvements in our submission service for Norovirus data! The new service offers the following advantages:

  • Faster processing and shorter time to accession numbers
  • Improved user interface
  • Automatic feature annotation

Figure 1. The submission portal page showing the new option for submitting Norovirus data.

Begin a new Norovirus submission or see how to get started submitting other data to GenBank.

GenBank accepts a wide range of data to support scientific discovery and analysis on sequences from all branches of life.

Update single records easily with ClinVar’s Single SCV Update


The ClinVar Team is happy to announce a new online form in the ClinVar Submission Portal, the Single SCV Update, which makes it easier for you to update a single record.

The new ClinVar Single SCV Update form, showing the sections for editing the evaluation date, clinical significance, condition, and citations.

Continue reading

Upcoming Changes to EST and GSS Databases


Update: NCBI is now in the process of merging EST and GSS records into the Nucleotide database, and we expect to complete this process in early 2019. Accession.version and GI identifiers will not change during this process.

As of December 1, 2018, all records from the databases for Expressed Sequence Tags (EST) and Genome Survey Sequences (GSS) will reside in NCBI’s Nucleotide database. This change will provide a single point of access for all GenBank sequence data with a common look and feel.
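For API users, the practical effect is that EST and GSS records are searched through the Nucleotide database. The Python sketch below builds an E-utilities ESearch URL against db=nuccore. The function name build_est_search_url is hypothetical, and the gbdiv_est[prop] term used to restrict results to EST records is an assumption that you should verify against the E-utilities documentation before relying on it.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_est_search_url(query, retmax=20, est_filter="gbdiv_est[prop]"):
    """Build an ESearch URL that looks for EST records in Nucleotide.

    est_filter is an assumed Entrez property term; pass a different
    term if the documentation recommends another filter.
    """
    term = f"({query}) AND {est_filter}"
    params = {
        "db": "nuccore",     # the Nucleotide database
        "term": term,
        "retmax": retmax,    # maximum number of IDs to return
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Print the URL for inspection; no request is sent here.
print(build_est_search_url("Homo sapiens[Organism]", retmax=5))
```

The sketch stops at building the URL so that nothing is sent to NCBI until you have checked the query. Fetching it with urllib.request.urlopen or a similar client would be the next step.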

Read more to learn about how this change affects these resources:

  • Websites (Entrez)
  • APIs (E-utilities)
  • FTP sites
  • Submission procedures
  • BLAST
  • TSA (have a look if you’re not familiar!)

Continue reading

New Influenza Virus Submission Wizard Makes Flu Sequence Submissions Easier


NCBI now offers a flu sequence submission wizard that makes submissions easier and will provide you with accession numbers sooner. To get started, sign in to NCBI, go to the Submission Portal and choose the link for “Ribosomal RNA (rRNA), rRNA-ITS or Influenza sequences” from the GenBank section.

Continue reading