Blastdbinfo: API access to a database of BLAST databases


NCBI offers extensive collections of sequences through its BLAST services (http://blast.ncbi.nlm.nih.gov) for comparing and identifying DNA, RNA and protein sequences. NCBI now deposits descriptions of these sequence collections, known as BLAST databases, in a special database called blastdbinfo that you can access through the Entrez Programming Utilities (E-Utilities). Using blastdbinfo, you can enable a program to find an appropriate database and then send BLAST searches to that database using either the BLAST URL API or standalone BLAST (installed locally).
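As a rough illustration of the programmatic access described above, the following sketch builds an E-Utilities ESearch request against the blastdbinfo database and reads back the matching record IDs. The helper names (`build_esearch_url`, `search_blastdbinfo`) and the choice of JSON output are assumptions made for this example, and the search term you pass is up to you. Treat it as a starting point under those assumptions, not as NCBI's reference client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the Entrez Programming Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=20):
    """Build an ESearch URL that queries the blastdbinfo Entrez database.

    `term` is a free-text Entrez query; `retmax` caps the number of IDs returned.
    (Hypothetical helper written for this example.)
    """
    params = {
        "db": "blastdbinfo",
        "term": term,
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def search_blastdbinfo(term, retmax=20):
    """Run the search over the network and return the list of matching record IDs."""
    with urlopen(build_esearch_url(term, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]
```

Calling `search_blastdbinfo("nr")` would return the IDs of blastdbinfo records matching that term; those IDs can then be passed to ESummary to read each database's description before choosing where to send a BLAST search.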

Continue reading

How To Format Sequence Data For GenBank Submissions


Submitting sequences to GenBank can seem complicated at first, but starting with a properly formatted file will make the process go smoothly.

Sequence data must be formatted correctly before it is submitted to GenBank, and the most common file format is FASTA. This post will show you how to create a FASTA file for submitting one or more nucleotide sequences.

Submitters can upload FASTA-formatted sequence files using NCBI’s stand-alone software Sequin, the command-line program tbl2asn, or our web-based submission tool BankIt.

The image below depicts a single sequence in FASTA format. For multiple sequences, such as those of population or phylogenetic studies, environmental samples, and batch sequences of the same gene, create the file using the steps below and put the set of sequences together in a single FASTA file.

Image
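For readers who prefer to generate the file programmatically, here is a minimal sketch of writing several sequences into one FASTA file. It assumes the usual FASTA layout: a definition line starting with ">" followed by a sequence ID without spaces and an optional description, then the sequence on the following lines. The function name, the 70-character line width, and the sample IDs and bracketed organism modifiers are illustrative assumptions, not requirements taken from this post.

```python
def format_fasta(records, width=70):
    """Return FASTA text for an iterable of (seq_id, description, sequence) tuples.

    Hypothetical helper for illustration: each record becomes a ">" definition
    line followed by the sequence wrapped at `width` characters per line.
    """
    entries = []
    for seq_id, description, sequence in records:
        if not seq_id or any(ch.isspace() for ch in seq_id):
            raise ValueError(f"sequence ID must be non-empty with no spaces: {seq_id!r}")
        # Definition line: ">" + ID, then the optional description.
        header = f">{seq_id} {description}".rstrip()
        # Remove any whitespace inside the sequence and normalize to upper case.
        cleaned = "".join(sequence.split()).upper()
        lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
        entries.append("\n".join([header] + lines))
    return "\n".join(entries) + "\n"


# Example records; the IDs and modifiers are made up for this sketch.
example_records = [
    ("Seq1", "[organism=Homo sapiens] example sequence 1", "atgc atgc atgc"),
    ("Seq2", "[organism=Homo sapiens] example sequence 2", "ggccttaa"),
]
fasta_text = format_fasta(example_records, width=10)
```

Writing `fasta_text` to a plain-text file (for example with `open("sequences.fsa", "w")`) yields one file that holds the whole set of sequences, one definition line per sequence.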

Here is how to create the FASTA file:

Continue reading