Seventeen new NCBI annotations in RefSeq for cat, maize, clownfish, and more


In November and December, the NCBI Eukaryotic Genome Annotation Pipeline released new annotations in RefSeq for the following organisms:

  • Amphiprion ocellaris (clown anemonefish)
  • Centruroides sculpturatus (bark scorpion)
  • Ceratitis capitata (Mediterranean fruit fly)
  • Cucurbita maxima (winter squash)
  • Cucurbita moschata (crookneck pumpkin)
  • Drosophila hydei (fly)
  • Drosophila willistoni (fly)
  • Felis catus (domestic cat)
  • Leptinotarsa decemlineata (Colorado potato beetle)
  • Maylandia zebra (zebra mbuna)
  • Olea europaea sylvestris (wild olive)
  • Onthophagus taurus (beetle)
  • Piliocolobus tephrosceles (Ugandan red Colobus)
  • Seriola lalandi dorsalis (yellowtail amberjack)
  • Spodoptera litura (moth)
  • Xiphophorus maculatus (southern platyfish)
  • Zea mays (maize)

See more details on the Eukaryotic RefSeq Genome Annotation Status page.

ClinVar Unveils New, More Intuitive Variation Display


ClinVar, NCBI’s database of clinically relevant genetic variations with supporting evidence, has redesigned its variation display and welcomes your feedback. The new Variation in ClinVar (VCV) pages provide a better-organized, more intuitive web display that makes it easy to find the information you need quickly.

In this blog post, we’ll take you through the new design using the example of a coding region variant (VCV000256160.1) in the ABCB4 gene.

ClinVar variation page, alpha view. The accession number and the feedback tab are circled.

The redesign brings the most important information to the top of the display. There are two new fields: (1) the VCV accession number and version used to cite the record, and (2) a short description of the variation (e.g., 11.3 kb deletion, or haplotype) to make it easy to quickly see what type of variation the record represents.
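As a minimal sketch of how the accession-plus-version citation could be handled in a script, the snippet below splits a string such as VCV000256160.1 into its numeric part and version. The pattern (the letters VCV, nine digits, a dot, an integer version) is an assumption inferred from the example above, not an official format definition, and the names `parse_vcv`, `format_vcv`, and `VcvAccession` are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed pattern, inferred from the example VCV000256160.1:
# "VCV", nine digits, a dot, then an integer version.
_VCV_PATTERN = re.compile(r"^VCV(\d{9})\.(\d+)$")


class VcvAccession(NamedTuple):
    number: int   # numeric part of the accession, leading zeros dropped
    version: int  # record version that follows the dot


def parse_vcv(text: str) -> VcvAccession:
    """Split a versioned VCV accession into its number and version."""
    match = _VCV_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a versioned VCV accession: {text!r}")
    return VcvAccession(int(match.group(1)), int(match.group(2)))


def format_vcv(acc: VcvAccession) -> str:
    """Rebuild the citation string, zero-padding the number to nine digits."""
    return f"VCV{acc.number:09d}.{acc.version}"
```

For example, `parse_vcv("VCV000256160.1")` yields a number of 256160 and a version of 1, and `format_vcv` turns that pair back into the original string. An accession without a version is rejected, since the version is part of what is cited.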


NIH Data Hackathon on campus – January 22-24, 2018


From January 22-24, 2018, the NCBI will help with a data science hackathon on the NIH campus in Bethesda, MD. The hackathon will focus on general data science analyses, including text, image and sequence processing. This event is for researchers, including students and postdocs, who have already engaged in the use of large datasets or in the development of pipelines for analyses from high-throughput experiments. Some projects are available to other non-scientific developers, mathematicians, or librarians.

The event is open to anyone selected for the hackathon and willing to travel to the NIH campus (see below). Applications are due Friday, December 22nd, 2017 by 9 pm EST.


January 10 NCBI Minute: QuickBLASTP — a program for rapidly finding high-scoring protein matches in large databases


In the next NCBI Minute on Wednesday, January 10, 2018, NCBI staff will demonstrate the new QuickBLASTP service, which can search large databases at least 10 times faster than traditional protein-protein BLAST (blastp). You will learn about the strategy QuickBLASTP uses to speed up the search. You will also see how to use the new QuickBLASTP service on the NCBI web BLAST site and how to access and run the standalone kblastp demonstration release.

Date & time: Wed, Jan 10, 2018 12:00 PM – 12:30 PM EST

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Summer 2017 NCBI Hackathon Products


This blog post is for researchers, students, and postdocs, as well as non-scientific developers, mathematicians and librarians.

This summer, we were quite busy running and cohosting hackathons. These events educate participants, allow for networking among computational biologists, and produce bioinformatics software prototypes. Read on for a review of products from our Summer 2017 hackathons.


NCBI to assist in Southern California genomics hackathon in January


From January 10-12, 2018, the NCBI will help with a bioinformatics hackathon in Southern California hosted by San Diego State University. The hackathon will focus on advanced bioinformatics analysis of next-generation sequencing data, proteomics, and metadata. This event is for researchers, including students and postdocs, who have already engaged in the use of bioinformatics data or in the development of pipelines for bioinformatics analyses from high-throughput experiments. Some projects are available to other non-scientific developers, mathematicians, or librarians.

The event is open to anyone selected for the hackathon and willing to travel to SDSU (see below). Applications are due Monday, December 11th, 2017 by 3 pm PT (6 pm EST).


December 6th NCBI Minute: Keeping Current and Getting Help with NCBI Resources


In the next NCBI Minute on Wednesday, December 6th, 2017, NCBI staff will show you the most important ways to get notified of updates and changes at NCBI and the most efficient ways to find help with NCBI resources.

Date & time: Wed, Dec 6, 2017 12:00 PM – 12:30 PM EST

After registering, you will receive a confirmation email with information about attending the webinar. After the live presentation, the webinar will be uploaded to the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

November 28th NCBI Minute: An update to “API Keys for Better E-Utilities and EDirect Access to NCBI Data”


On Tuesday, November 28, 2017, NCBI will present an update to the webinar originally presented on November 8, 2017 about the new API keys. In this updated webinar, you will learn about the relationships between API keys, NCBI accounts and IP addresses and see additional details about rates of access and server messages. You will also get answers to many important questions asked in the original webinar.

Date and time: Tuesday, November 28, 2017 12:00-12:30 PM EST

After registering, you will receive a confirmation email with information about attending the webinar. After the live presentation, the webinar will be uploaded to the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
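For readers who want to see where an API key fits into a request, here is a minimal sketch, assuming the standard E-utilities ESearch endpoint and its `db`, `term`, and `api_key` query parameters. The helper names are hypothetical, and the request rates in the second function (3 per second without a key, 10 per second with one) are the commonly documented defaults, stated here as an assumption rather than taken from the webinar. The code only builds the URL and never contacts NCBI.

```python
from typing import Optional
from urllib.parse import urlencode

# ESearch endpoint of the NCBI E-utilities.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, term: str, api_key: Optional[str] = None) -> str:
    """Return an ESearch URL, appending api_key only when one is supplied."""
    params = {"db": db, "term": term}
    if api_key:
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"


def min_seconds_between_requests(api_key: Optional[str] = None) -> float:
    """Spacing between requests under the assumed default rate limits."""
    requests_per_second = 10 if api_key else 3  # assumption, see lead-in
    return 1.0 / requests_per_second
```

A script could sleep for `min_seconds_between_requests(key)` between calls to stay under the limit; the key itself comes from the settings of an NCBI account and should be kept out of shared code.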

RefSeq release 85 is now public


RefSeq release 85 is now accessible online, via FTP, and through NCBI’s programming utilities. This full release incorporates genomic, transcript, and protein data available as of November 6, 2017. It contains 146,710,309 records, including 100,043,962 proteins, 20,905,608 RNAs, and sequences from 73,996 organisms. The release is provided in several directories, both as a complete dataset and divided into logical groupings. See the RefSeq release notes for more information.
