Update: NCBI is now merging EST and GSS records into the Nucleotide database, and we expect to complete this process in early 2019. Accession.version and GI identifiers will not change during this process.
As of December 1, 2018, all records from the databases for Expressed Sequence Tags (EST) and Genome Survey Sequences (GSS) will reside in NCBI’s Nucleotide database. This change will provide a single point of access for all GenBank sequence data with a common look and feel.
dbSNP build 152 is a small incremental update from build 151 provided for you to begin testing and integrating the new build products into your workflow. Build 152 uses the new system with SPDI variant notation and is now available on FTP and the new RefSNP webpage.
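For readers integrating the new build products, here is a minimal sketch of parsing an SPDI string. It assumes the commonly documented form of the notation: four colon-separated fields (sequence, position, deletion, insertion) with a 0-based position. The `SPDI` class, the `parse_spdi` helper, and the example variant are hypothetical illustrations, not part of any dbSNP product.

```python
from typing import NamedTuple


class SPDI(NamedTuple):
    """One variant in Sequence:Position:Deletion:Insertion form."""
    sequence: str   # sequence accession.version
    position: int   # assumed 0-based position on the sequence
    deletion: str   # deleted sequence (may be empty)
    insertion: str  # inserted sequence (may be empty)


def parse_spdi(text: str) -> SPDI:
    """Split an SPDI string into its four fields (hypothetical helper)."""
    parts = text.strip().split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(parts)}")
    sequence, position, deletion, insertion = parts
    if not sequence:
        raise ValueError("sequence field is empty")
    if not position.isdigit():
        raise ValueError(f"position is not a non-negative integer: {position!r}")
    return SPDI(sequence, int(position), deletion, insertion)


# Made-up example variant, for illustration only.
variant = parse_spdi("NC_000001.11:99:A:G")
print(variant.sequence, variant.position, variant.deletion, variant.insertion)
```

This sketch does not handle variants of the notation in which the deletion is given as a length rather than a sequence.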
From February 4-6, 2019, the NCBI will help with a data science hackathon at the Fred Hutchinson Cancer Research Center in Seattle. To apply, complete this form (approximately 10 minutes to complete). Initial applications are due Friday, January 11th by 11 pm ET.
The hackathon will focus on genomics as well as general data science. This event is for researchers, including students and postdocs, who have already engaged in the use of large datasets or in the development of pipelines for analyses from high-throughput experiments. Some projects are available to other non-scientific developers, mathematicians, or librarians.
BLAST+ 2.8.1 is now available for download from our FTP site. This is the first production release of standalone BLAST to support the new BLAST v5 databases (BLASTDBv5), which are also now available. The new databases have taxonomy information for the database sequences built in. This gives you the following important advantages over the v4 databases.
The ability to limit your search by taxonomic group — species level as well as higher taxa.
Improved performance when limiting BLAST search with accessions.
Retrieval of sequences by taxonomic group from a BLAST database with blastdbcmd.
There are some additional enhancements to the search program options.
A new option (-subject_besthit) culls HSPs on a per subject sequence basis by removing HSPs that are completely enveloped by another HSP. This is an experimental option and is subject to change.
Use of the -max_target_seqs option for formats 0-4 is now allowed. The number of alignments and descriptions will be set to the -max_target_seqs value.
BLAST now issues a warning about the possibility of not seeing all equivalent matches if -max_target_seqs is set to less than five.
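A wrapper script can mirror these option rules before it launches a search. The sketch below is illustrative only: `search_options` and `MIN_SAFE_TARGETS` are hypothetical names, the warning text is ours rather than BLAST's, and the code builds an option list without invoking BLAST.

```python
import warnings
from typing import List

# Threshold taken from the release note: values below five trigger a warning.
MIN_SAFE_TARGETS = 5


def search_options(max_target_seqs: int, outfmt: int = 0,
                   subject_besthit: bool = False) -> List[str]:
    """Build output-related search options (hypothetical helper)."""
    if max_target_seqs < 1:
        raise ValueError("max_target_seqs must be a positive integer")
    if max_target_seqs < MIN_SAFE_TARGETS:
        # Mirrors the intent of the BLAST warning; the wording here is our own.
        warnings.warn("max_target_seqs below 5 may hide equivalent matches")
    opts = ["-outfmt", str(outfmt), "-max_target_seqs", str(max_target_seqs)]
    if subject_besthit:
        # Experimental option according to the release note; subject to change.
        opts.append("-subject_besthit")
    return opts


options = search_options(10, outfmt=0, subject_besthit=True)
print(" ".join(options))
```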
We’ve released a new version of IgBLAST, v1.12. This new version increases the allowed distance between V gene end and J gene start positions (from 90 bp to 150 bp) as well as between V gene end and D gene start positions (from 55 bp to 120 bp) to accommodate extremely long VDJ junctions found in some antibodies.
IgBLAST 1.12 uses the 1-based sequence coordinate system that reflects the change in the new AIRR Rearrangement Schema. Also, it includes fixes for minor bugs found in previous versions.
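If downstream code stores intervals in 0-based, half-open form (as Python slices do), it needs a conversion before comparing them with 1-based coordinates. The following is a generic sketch of that arithmetic, assuming 1-based coordinates are closed intervals. The function names are hypothetical, and the sketch makes no claim about the coordinate convention used by earlier IgBLAST releases.

```python
from typing import Tuple


def to_one_based(start0: int, end0: int) -> Tuple[int, int]:
    """Convert a 0-based half-open interval [start0, end0) to 1-based closed."""
    if start0 < 0 or end0 <= start0:
        raise ValueError("expected 0 <= start0 < end0")
    return start0 + 1, end0


def to_zero_based(start1: int, end1: int) -> Tuple[int, int]:
    """Convert a 1-based closed interval [start1, end1] to 0-based half-open."""
    if start1 < 1 or end1 < start1:
        raise ValueError("expected 1 <= start1 <= end1")
    return start1 - 1, end1


# The first ten bases of a sequence in each convention.
one_based = to_one_based(0, 10)
round_trip = to_zero_based(*one_based)
print(one_based, round_trip)
```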
The NCBI will host a collaborative biodata science hackathon on the NIH Campus in Bethesda, Maryland, on February 20-22.
We are now collecting project proposals focusing on building tools and pipelines for advanced analysis of biomedical datasets including text, images, next generation sequencing data, proteomics, and metadata. Proposals for tutorial pipelines and educational tools for advanced analysis are also welcome.
Submit your project proposal here! Submissions are due January 7, 2019.
This month, the NCBI Eukaryotic Genome Annotation Pipeline annotated its 500th organism! The lucky winner is Pocillopora damicornis, a stony reef-building coral frequently used as an experimental model, whose larval dispersal and development are affected by environmental changes in the oceans.