As promised in our post this past spring, we are now announcing the scheduled release of API keys for the E-utilities API. If you’ve missed some of our original discussion of these keys, or have questions about how to get a key, you may want to check out this post.
In this post, we’ll be discussing three things:
- The current status of API keys
- Upcoming testing periods in September
- Final public release on December 1, 2018
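For readers who have not yet used a key, here is a minimal sketch of how one might be attached to an E-utilities request. It assumes the key is passed as an `api_key` query parameter on the standard E-utilities base URL. The helper name `build_esearch_url` and the placeholder `YOUR_API_KEY` are illustrative only and are not part of any NCBI library.

```python
from urllib.parse import urlencode

# Standard E-utilities base URL (assumed here; check the E-utilities docs).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, api_key=None):
    """Build an ESearch URL, adding the api_key parameter only if one is given.

    Hypothetical helper for illustration; it only constructs the URL and
    does not send any request.
    """
    params = {"db": db, "term": term}
    if api_key:
        params["api_key"] = api_key
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # "YOUR_API_KEY" is a placeholder; substitute the key from your NCBI account.
    print(build_esearch_url("pubmed", "protein nomenclature", api_key="YOUR_API_KEY"))
```

Keeping the key optional means the same helper works for requests made with and without a key.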
Applications are open for three NCBI-style hackathons:
- 12th NIH Research Festival Collaborative Data Science and Machine Learning Hackathon (September 10-12)
- Post-Biological Data Science meeting hackathon at Cold Spring Harbor Laboratory (November 10-12)
- U-HACK MED, the pre-SuperComputing hackathon at UTSW (November 9-10)
The application period for each hackathon ends this month, August 2018. See our Biohackathons GitHub page for details on each hackathon, including how to apply.
Next Wednesday, Aug 8, 2018, NCBI staff will show you how to use an NCBI account to help with research and teaching tasks including:
- Making custom collections of important records for use in coursework and research projects
- Creating lists of publications or database records to send to your courses, journal clubs and research teams
- Setting automated updates when new publications or database records are available
- Maintaining your bibliography and sharing it on your Faculty Profile
- Formatting your U.S. Gov’t BioSketch with a click of a mouse
- And keeping track of everything – right on your My NCBI dashboard!
Date and time: Wed, Aug 8, 2018 12:00 PM – 12:30 PM EDT
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
Consistent protein nomenclature is indispensable for communication, literature searching and entry retrieval. NCBI, the European Bioinformatics Institute (EMBL-EBI), the Protein Information Resource (PIR) and the Swiss Institute of Bioinformatics (SIB) revised and reorganized previous guidelines from UniProt and NCBI. This joint effort produced universal guidelines for nomenclature and protein naming to promote clarity in communication and improve consistency in data retrieval across databases.
These guidelines are exclusively focused on nomenclature, providing rules about universal formatting and protein naming choices; they do not include best practices for identifying or predicting function. They cover usage of language, abbreviations, symbols, punctuation, notation, terms and style. Sources of protein names and options for protein naming are also discussed.
During the 2018 INSDC annual meeting, the three collaborating sequence databases (DDBJ, EBI and GenBank) agreed to recommend these guidelines to their submitters. The Protein Naming Guidelines working group plans to write a peer-reviewed publication about protein naming and to track future changes to this document in GitHub.
In an effort to consolidate similar resources and make information easier to find, the National Library of Medicine will be retiring its PubMed Health website, effective October 31, 2018, and providing the same or similar content through more widely used NLM resources, namely PubMed, MedlinePlus, and Bookshelf.
PubMed Health content falls into two general categories: consumer health resources and systematic reviews/comparative effectiveness research (CER). NLM’s MedlinePlus offers a range of consumer health information similar to that in PubMed Health. The systematic reviews and CER in PubMed Health are searchable through PubMed, which links to the full text (when available) in Bookshelf, journals, and/or PubMed Central.
As of December 1, 2018, all records from the databases for Expressed Sequence Tags (EST) and Genome Survey Sequences (GSS) will reside in NCBI’s Nucleotide database. This change will provide a single point of access for all GenBank sequence data with a common look and feel.
Read more to learn about how this change affects these resources:
- Websites (Entrez)
- APIs (E-utilities)
- FTP sites
- Submission procedures
- TSA (have a look if you’re not familiar!)
In recent months, the NCBI Eukaryotic Genome Annotation Pipeline released new annotations in RefSeq for the following organisms:
- Alligator sinensis (Chinese alligator)
- Athalia rosae (coleseed sawfly)
- Bubalus bubalis (water buffalo)
- Camponotus floridanus (Florida carpenter ant)
- Canis lupus dingo (dingo)
- Harpegnathos saltator (Jerdon’s jumping ant)
- Melanaphis sacchari (aphid)
- Pelodiscus sinensis (Chinese soft-shelled turtle)
- Pogonomyrmex barbatus (red harvester ant)
- Pomacea canaliculata (gastropod)
- Sipha flava (yellow sugarcane aphid)
- Theropithecus gelada (gelada)
See more details on the Eukaryotic RefSeq Genome Annotation Status page.
In late May, we introduced a new type of search experience in NCBI Labs that uses natural language queries to make common tasks easier. The trial in NCBI Labs, where we experiment with potential new features and tools, proved successful. We’re pleased to announce that we have added this simplified search capability to NCBI’s global search page. Some natural language queries now work in the “All Databases” search from the NCBI home page!
As of March 2018, there were 141,000 prokaryotic genomes in the Assembly database. As this database grows, misassigned prokaryotic genomes become a serious problem. Taxonomic misassignment can occur through simple submission error or can accumulate as new information adds greater resolution to the taxonomic tree.
A paper in the International Journal of Systematic and Evolutionary Microbiology presents the method NCBI scientists used to verify taxonomic identities in prokaryotic genomes. The authors used an Average Nucleotide Identity (ANI) method with optimum threshold ranges for prokaryotic taxa to review all prokaryotic genome assemblies in GenBank. This method relies on type strain information and is one outcome of a 2015 workshop involving several important parties in the bacteriology community.
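To give a rough feel for what an ANI check involves, here is a toy sketch. It is not the method from the paper. It assumes ANI is taken as the length-weighted mean percent identity over aligned fragments between a genome and a type-strain assembly, and it uses a single hypothetical cutoff of 96.0, whereas the paper describes threshold ranges that vary by taxon. The function names are illustrative.

```python
def average_nucleotide_identity(fragments):
    """Length-weighted mean percent identity over aligned fragments.

    `fragments` is a list of (percent_identity, aligned_length) pairs.
    This is a simplified, assumed definition for illustration only.
    """
    total_length = sum(length for _, length in fragments)
    if total_length == 0:
        raise ValueError("no aligned bases to average over")
    weighted = sum(identity * length for identity, length in fragments)
    return weighted / total_length


def check_assignment(ani, threshold=96.0):
    """Flag a genome for review if its ANI to the type strain is below the cutoff.

    The default threshold is a hypothetical placeholder, not a value
    taken from the paper.
    """
    return "consistent" if ani >= threshold else "needs review"


if __name__ == "__main__":
    # Made-up alignment fragments against a type-strain assembly.
    example = [(100.0, 100), (90.0, 100)]
    ani = average_nucleotide_identity(example)
    print(ani, check_assignment(ani))
```

In practice the alignment step and the choice of threshold carry most of the difficulty, and both are outside the scope of this sketch.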
Since 1999, the NCBI Bookshelf has made full-text books and documents on life sciences and health freely available. The most accessed books, viewed by hundreds of thousands of people each month, are textbooks. This blog post explores the NCBI Bookshelf’s free, online textbooks and discusses how publishers, editors, and authors can contribute to this successful resource.