Welcome to PubMed Labs!
PubMed Labs is all about you. It’s a new NCBI initiative for creating innovative and relevant products by involving you, our user community, from the beginning.
PubMed Labs is about experimentation. It’s a place where you’ll find early versions of new tools, experimental content, and proposed features, as well as an opportunity to suggest ideas to us.
PubMed Labs is about learning. It’s a place where the focus is on figuring out what works, where failure is OK because it’s a learning experience, and where any idea is welcome that can improve our services for our users.
PubMed Labs is about conversation. It’s a place where we can share future plans with you, and you can tell us how we’re doing. It’s a place where we all can come together to create resources that will benefit the broader scientific community.
Join the conversation!
We’re introducing a new category on this blog called “PubMed Labs” that will facilitate this conversation. You can follow these posts by RSS. When we have a new feature for you to try out, we’ll post here with a description of it that will contain the following:
- The user need the feature is intended to serve
- How you can activate it
- What you can expect from it
- Our plans for it
Then you can try it out and let us know how it went by commenting on the posts. Like it, hate it, we want to know! Or you can propose some additional functions or ideas.
Our first new features are SmartBLAST, an enhancement to protein BLAST, and an “also-viewed” link in PubMed. Each of these is described in an accompanying blog post.
Let us know what you think!
This blog post is geared toward researchers.
In November, NIH announced a new format for biographical sketches (biosketches); the new format is required for grant applications submitted for due dates after May 24, 2015 (see NOT-OD-15-032). SciENcv, a tool available through My NCBI for creating biosketches, has been updated to reflect the format changes and to help users convert their existing NIH biosketches from the old format to the new.
What changed with the NIH Biosketch?
Differences between the old and new NIH Biosketch formats include:
- Maximum length increased from 4 to 5 pages
- Rearranged data in the table at the top of the Biosketch
- Section A, Personal Statement can now include up to 4 supporting citations
- Section C is now called “Contribution to Science” and should consist of up to 5 brief descriptions of your most significant contributions to science, each with up to 4 supporting citations. In addition, you may provide a URL to a full list of your published work as found in a publicly available digital database such as My Bibliography. This section is the most notable difference in the new format.
Figure 1. Sections A (Personal Statement) and C (Contribution to Science) in the new NIH Biosketch format.
How was SciENcv updated to support the changes?
You’ve seen it before on shopping websites: you load a page displaying an item you want and see a list of other items that people bought with the one you’re viewing.
PubMed is free, but finding the important articles on a topic can cost a lot of time. To help you keep on top of the literature – with a little help from your fellow PubMed users – we are introducing a new type of link called “Articles frequently viewed together”. For some PubMed abstracts, you may see this link in the “Related Information” section in the right column.
Figure 1. The PubMed Also-Viewed feature.
Not all abstracts will have this link; currently, only 1.3 million out of the 24 million records in PubMed do. The calculation is based on anonymous click data for the last year, so older articles will be especially underrepresented. To find all articles with these relationships, search PubMed with the query “pubmed_pubmed_alsoviewed[filter]”. Add further terms to narrow the focus to your area of interest.
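If you prefer to run this search from a script, the following Python sketch shows one way to combine the filter with your own terms and build a request URL for the E-utilities ESearch service. The helper names `also_viewed_query` and `esearch_url` are made up for this illustration, and the sketch assumes the filter behaves through ESearch as it does in the PubMed search box. It only builds the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Filter named in this post for records that have an also-viewed link.
ALSO_VIEWED_FILTER = "pubmed_pubmed_alsoviewed[filter]"


def also_viewed_query(extra_terms=""):
    """Return the filter alone, or ANDed with the caller's topic terms.

    Hypothetical helper written for this example.
    """
    extra_terms = extra_terms.strip()
    if not extra_terms:
        return ALSO_VIEWED_FILTER
    return f"({extra_terms}) AND {ALSO_VIEWED_FILTER}"


def esearch_url(term, retmax=20):
    """Build (but do not send) an ESearch request URL for PubMed."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Narrow the also-viewed set to an area of interest.
    print(esearch_url(also_viewed_query("antimicrobial peptides")))
```

Pasting the printed URL into a browser should return the matching PubMed IDs as XML, provided the assumption above holds.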
Please give it a try and let us know what you think by adding comments to this blog post.
BLAST (Basic Local Alignment Search Tool) is a popular tool for finding sequences in a given database that are similar to a query sequence. Traditionally, BLAST displays these results as a sorted list of matches between the query and each database sequence. While this display is useful for examining how each subject sequence matches the query, it treats all subject sequences the same, regardless of the quality of the sequence data or its annotation, and also does not allow easy comparisons between different subject sequences. For example, the subject sequences may fall into multiple groups of similar sequences, or all of the subject sequences may be more similar to each other than to the query. A common way to obtain this information is to construct a multiple sequence alignment of the query and some or all of the subject sequences, but to this point, BLAST has not provided such alignments directly.
Enter SmartBLAST! SmartBLAST is a new and experimental NCBI tool that makes it easier to answer common sequence analysis tasks, such as finding a candidate protein name for a sequence, locating regions of high sequence conservation, or identifying regions covered by database sequences but missing from the query. To do this, SmartBLAST performs the following tasks in much less time than it takes to run a typical BLASTp search: Continue reading
This blog post is geared toward genomics professionals.
From January 5th-7th, 2015, NCBI, in conjunction with the NIH Office of Data Science, held a genomics hackathon, where genomics professionals gathered to write useful, efficient pipelines for people new to genomics.
After we announced the hackathon, over 130 qualified applicants expressed interest in attending. Four team leads chose 23 attendees from this pool, then assigned initial predefined roles and provided biological guidance for a product in one of four subject areas: DNA-Seq, RNA-Seq, Epigenomics and Metagenomics. Continue reading
This blog post is geared toward biomedical researchers.
Antibiotic-resistant bacterial infections account for the deaths of tens of thousands of Americans every year. Over the past twenty years, these difficult-to-treat infections have become more common. Since traditional antibiotics are ineffective in these cases, biomedical researchers are looking for alternatives. NCBI’s RefSeq project has created a new indexed field, “Protein has antimicrobial activity [prop]”, to assist in this search by retrieving useful sequence annotation showing naturally occurring antimicrobial peptides, or AMPs.
Antimicrobial peptides are naturally occurring peptides from a diverse array of species that are a part of an organism’s innate immune system. The RefSeq team recently gathered a list of over 130 human genes encoding one or more experimentally proven AMPs. These peptides are typically less than 100 amino acids long and can display bactericidal, antiviral, antifungal, and even antitumor activities, with a specific AMP usually having a subset of these activities. AMPs may be a suitable alternative to traditional antibiotics because they work quickly and efficiently and tend to have broad-spectrum activity. Moreover, since they are naturally occurring, AMPs are less likely than other compounds to be toxic to host cells or to give rise to AMP-resistant bacterial strains. Continue reading
This post is geared toward fungi researchers as well as RefSeq and BLAST users.
Fungi have unique characteristics that can make it difficult to identify and classify species based on morphology. To address these issues, Conrad Schoch, NCBI’s fungi taxonomist, and Barbara Robbertse, NCBI’s fungi RefSeq curator, in collaboration with outside mycology experts, are curating a set of fungal sequences from internal transcribed spacer (ITS) regions of the nuclear ribosomal RNA genes. This set of standard DNA sequences for fungal taxa not only addresses these difficulties in identifying and classifying fungal species by morphology, but is also essential for analyzing environmental (metagenomics) sequencing studies. The curated ITS sequences, described in a recent article in Database (PMC Free Article), all have associated specimen data and, when possible, are taken from sequences from type materials, ensuring correct species identification and tracking of name changes. This article will show you how to access these ITS sequences and search them using the specialized Targeted Loci BLAST service.
The fungal ITS sequences are a RefSeq Targeted Loci BioProject (PRJNA177353). As you may know, a BioProject is a collection of biological data related to a single initiative; in this case, the goal is to collect and curate fungal sequences from targeted loci – specific molecular markers such as protein coding or ribosomal RNA genes used for phylogenetic analysis.
If you’re reading this, you probably already know that NIH and some other institutions have public access policies that require that peer-reviewed publications resulting from their funding be made available to the public. But did you know that if you complied with your funding agency’s public access policy by depositing your author manuscript in NIH’s PubMed Central (PMC) archive via the NIH Manuscript Submission (NIHMS) system, you can easily obtain statistics on how frequently your paper is being accessed? Continue reading
The NCBI homepage now has six prominent buttons on it: Submit, Download, Learn, Develop, Analyze, and Research. Each of these buttons leads to an action page devoted to a particular set of services.
Figure 1. The NCBI homepage. The new action buttons are outlined in red.
We’ve also included a blue feedback button on the left side of the Download, Learn, Develop and Analyze pages so that you can tell us what you think.
Entrez Direct is a UNIX/LINUX command-line interface to E-utilities, the API to the NCBI Entrez system. One of Entrez Direct’s most useful features is its ability to parse and reformat complex XML data returns from EFetch. In this post, we will explore how to use these features to parse, reformat and process specific data from PubMed records downloaded in XML using EFetch. Though this post focuses on PubMed, the technique is universal and applies to any XML returned by E-utilities from any database. The example explored here is also presented briefly in the Entrez Direct documentation; here we’ll dive in a bit deeper to see how it works. Let’s get started!
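To give a feel for the kind of reshaping involved, here is a minimal Python sketch of the same idea: walking PubMed XML record by record and flattening selected elements into tab-delimited rows. It is not Entrez Direct itself and not the example from the Entrez Direct documentation. The function name `tabulate` is hypothetical, and the embedded records are placeholders that only mimic the element layout of PubMed XML (`PubmedArticle`, `MedlineCitation`, `PMID`, `ArticleTitle`, `AuthorList`).

```python
import xml.etree.ElementTree as ET

# Placeholder records, NOT real PubMed data; only the element nesting
# follows the PubMed XML layout.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>EXAMPLE1</PMID>
      <Article>
        <ArticleTitle>Placeholder title one</ArticleTitle>
        <AuthorList>
          <Author><LastName>Doe</LastName><Initials>J</Initials></Author>
          <Author><LastName>Roe</LastName><Initials>R</Initials></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>EXAMPLE2</PMID>
      <Article>
        <ArticleTitle>Placeholder title two</ArticleTitle>
        <AuthorList>
          <Author><LastName>Poe</LastName><Initials>E</Initials></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def tabulate(xml_text):
    """Return one tab-delimited row per record: PMID, title, author last names."""
    root = ET.fromstring(xml_text)
    rows = []
    # Each PubmedArticle element is treated as one output row.
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID", default="")
        title = article.findtext(
            "MedlineCitation/Article/ArticleTitle", default=""
        )
        authors = [
            author.findtext("LastName", default="")
            for author in article.iterfind(
                "MedlineCitation/Article/AuthorList/Author"
            )
        ]
        rows.append("\t".join([pmid, title, ", ".join(authors)]))
    return rows


if __name__ == "__main__":
    for row in tabulate(SAMPLE_XML):
        print(row)
```

The pattern is to pick the element that defines a record, then pull named child elements out of each one. That pattern carries over to XML from other Entrez databases, with different element names.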