July 25 NCBI Minute: Five Teaching Examples Using NCBI BLAST

Next Wednesday, July 25, 2018, NCBI staff will show you a set of simple teaching examples that use BLAST and related alignment tools at NCBI to explore modern biology concepts and techniques including evolution, taxonomy, homology, multiple sequence alignment, phylogenetic trees, primer design and gene expression analysis. You can easily incorporate these examples into your undergraduate biology courses.

Date and time: Wed, July 25, 2018 12:00 PM – 12:30 PM EDT

Register here.

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on our Webinars and Courses page.

The new BLAST widget seamlessly integrates your results into NCBI’s Genome Data Viewer (GDV)

Want to analyze your BLAST results in the context of a genome browser? Want to compare those results against other genome assembly annotations? The BLAST widget, a new browser feature, lets you do that. It provides direct access within GDV to execute and manage BLAST queries (blastn, tblastn) aligned to the specific assembly displayed in GDV.

To learn about this tool, keep reading or watch this short introduction video. Further details are in GDV’s help documents.


RefSeq release 89 is public

RefSeq release 89 is accessible online, via FTP, and through NCBI’s programming utilities. This full release incorporates genomic, transcript, and protein data available as of July 9, 2018. It contains 163,859,625 records, including 113,429,348 proteins, 23,029,67 RNAs and sequences from 81,345 organisms. The release is provided in several directories, both as a complete dataset and divided into logical groupings.
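As a rough illustration of the "programming utilities" route, here is a minimal Python sketch that builds an E-utilities `esearch` request restricted to RefSeq records. The helper names (`build_esearch_url`, `count_records`) and the example query are our own choices for illustration, not part of the release announcement. The `srcdb_refseq[PROP]` filter is the usual Entrez way to limit a search to RefSeq. Counts returned by the live service reflect the current database, not release 89 specifically.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=5):
    """Build an esearch URL for the given Entrez database and query term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def count_records(db, term):
    """Return the number of records matching the query (needs network access)."""
    with urlopen(build_esearch_url(db, term, retmax=0)) as response:
        data = json.load(response)
    return int(data["esearchresult"]["count"])


# Hypothetical example query: human RefSeq proteins.
url = build_esearch_url("protein", "srcdb_refseq[PROP] AND Homo sapiens[ORGN]")
print(url)
```

Calling `count_records("protein", "srcdb_refseq[PROP] AND Homo sapiens[ORGN]")` would send the request and return the current hit count. For anything beyond occasional queries, NCBI asks users to keep request rates low or supply an API key.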

NYGC NCBI-style bioinformatics hackathon August 6-8, 2018

From August 6-8, 2018, the NCBI will help with a data science hackathon at the New York Genome Center in Manhattan. The hackathon will focus on genomics as well as general data science. This event is for researchers, including students and postdocs, who have already worked with large datasets or developed analysis pipelines for high-throughput experiments. Some projects are also open to non-scientific developers, mathematicians, or librarians.


July 11 NCBI Minute: Five Teaching Examples with NCBI APIs

Next Wednesday, July 11, 2018, NCBI staff will show you a set of simple exercises that use EDirect to explore aspects of a human gene. You can easily incorporate these examples into your undergraduate biology courses.

Date and time: Wed, July 11, 2018 12:00 PM – 12:30 PM EDT

Register here: https://bit.ly/2KmH1yO


CCDS release 22 for human is public in Gene

The Consensus Coding Sequence (CCDS) update that compares NCBI’s Homo sapiens annotation release 109 to Ensembl’s release 92 is now reflected in Gene. This update adds 894 new CCDS IDs and 154 genes to the human CCDS set. CCDS release 22 includes a total of 33,397 CCDS IDs that correspond to 19,033 GeneIDs.

The CCDS project is a collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality. The long-term goal is to support convergence towards a standard set of gene annotations.

dbSNP database doubles in size twice in 13 months

In little over a year, dbSNP human data have doubled in size twice: from 150 million Reference SNP (rs) records to 325 million in Build 150, and again to more than 650 million rs records in Build 151. Of these, 580 million rs records have frequency data in Build 151. This explosive growth makes dbSNP the world’s largest public human variation database. Current trends suggest that large-scale WGS and WES projects will discover millions of new variations in the next few years.

Build 151 was released in March 2018. The data are available for web search and FTP download.

NCBI’s dbSNP houses variation and frequency data from large-scale projects including 1000Genomes, GO-ESP, ExAC, GnomAD, TOPMED and HLI, as well as focused studies like locus-specific databases (LSDB) and clinical sources. The rs records are annotated on RefSeq genomes, mRNA and protein sequences and integrated with other NCBI resources (e.g., Assembly, Gene, RefSeq, PubMed, and BioProject). The database is used worldwide in personal genomics and medical genetics, and for managing, annotating, and analyzing variation data.