Researchers identify potential alternative to CRISPR-Cas genome editing tools


An international team of CRISPR-Cas researchers has identified three new naturally occurring systems that show potential for genome editing. The discovery and characterization of these systems are expected to further expand the genome editing toolbox, opening new avenues for biomedical research. The research, published October 22nd in the journal Molecular Cell, was supported in part by the National Institutes of Health.

“This work shows a path to discovery of novel CRISPR-Cas systems with diverse properties, which are demonstrated here in direct experiments,” said Eugene Koonin, Ph.D., senior investigator at the National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), part of the NIH. “The most remarkable aspect of the story is how evolution has achieved a broad repertoire of biological activities, a feat we can take advantage of for new genome manipulation tools.”

The NCBI Minute: quick introductions to NCBI resources


For over two years, NCBI has presented webinars on a wide range of topics to a growing audience. More recently, we began offering shorter webinars in a series called The NCBI Minute.

In 5 to 10 minutes, each presentation introduces a new NCBI tool or resource or provides quick tips for using a popular one.

Figure 1. Examples of popular NCBI Minute presentations: SmartBLAST Introduction, presented September 2 (YouTube), and Connecting with PubMed Commons, presented May 2 (YouTube).

Each NCBI Minute is recorded and posted on our YouTube channel in the NCBI Minute playlist. Two of our most popular NCBI Minute presentations (Figure 1) are the introduction to the new SmartBLAST service, first described on NCBI Insights in July, and Connecting with PubMed Commons, our public commenting service for PubMed articles described in several NCBI Insights posts.

Missed a presentation? No problem!

If you missed any of The NCBI Minute, there are two ways you can catch up:

SRA Toolkit: the SRA database at your fingertips


The Sequence Read Archive (SRA), NCBI’s largest growing repository of molecular data, archives raw sequencing data and alignment information from high-throughput sequencing platforms, including Roche 454 GS Systems®, Illumina’s Genome Analyzer®, and Complete Genomics® systems.

Researchers commonly use SRA data to make discoveries by comparing data sets. Data sets can be compared through the SRA web interface, but if you want to integrate data downloads and file conversions into an existing pipeline, or you simply prefer a command-line interface, we recommend using the SRA Toolkit.
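As a minimal sketch of what pipeline integration could look like, the Python snippet below builds the command lines for downloading one run and converting it to FASTQ. It assumes the SRA Toolkit programs prefetch and fastq-dump are installed and on your PATH; the helper name sra_commands, the accession check, and the output directory are illustrative assumptions, not part of the toolkit.

```python
import shlex

def sra_commands(accession, out_dir="fastq"):
    """Build (but do not run) the commands to fetch and convert one SRA run.

    Hypothetical helper: the name and the simple accession check are
    assumptions made for illustration only.
    """
    prefix, digits = accession[:3], accession[3:]
    if prefix not in ("SRR", "ERR", "DRR") or not digits.isdigit():
        raise ValueError(f"not a run accession: {accession!r}")
    return [
        # Download the run into the local SRA cache.
        ["prefetch", accession],
        # Convert to FASTQ, writing one file per read of a spot.
        ["fastq-dump", "--split-files", "--outdir", out_dir, accession],
    ]

# Print the commands so they can be inspected before use.
for cmd in sra_commands("SRR000001"):
    print(shlex.join(cmd))
```

To execute the commands from an existing pipeline, each list could be passed to subprocess.run(cmd, check=True) so that a failed download stops the run.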

Fine-tune your web-based search results with SRA Run Selector


Run Selector is a tool available through the Sequence Read Archive (SRA) that allows you to fine-tune your web-based search results. There are over two dozen fields that can be used to filter SRA data in Run Selector. For example, if you need to look at data from a particular sequencing platform and genome assembly, you can use these fields as filters.

After running a web-based search for any keyword in the SRA database, you can send all the results (up to a maximum of 20,000 experiments) to Run Selector for fine-tuning. Run Selector also shows how many runs fall into each category even before you select a filter, so you can see what the database contains before narrowing your results.

Figure 1. After searching with SRA, click on “Send to” to open the drop-down menu. Then click on the radio button labeled “Run Selector” to send your search results to Run Selector. Note that you can already see how many runs are in each of the categories to the left.
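The two ideas described above, per-category run counts and filtering on several fields at once, can be illustrated with a short Python sketch. This is not how Run Selector is implemented; the records, run names, and field names (Platform, Assembly) are made-up placeholders standing in for a table of run metadata.

```python
from collections import Counter

# Hypothetical run metadata; names and values are placeholders.
RUNS = [
    {"Run": "RUN_A", "Platform": "ILLUMINA", "Assembly": "GRCh37"},
    {"Run": "RUN_B", "Platform": "ILLUMINA", "Assembly": "GRCh38"},
    {"Run": "RUN_C", "Platform": "LS454", "Assembly": "GRCh37"},
    {"Run": "RUN_D", "Platform": "ILLUMINA", "Assembly": "GRCh37"},
]

def facet_counts(runs, field):
    """Count how many runs fall into each value of one field."""
    return Counter(run[field] for run in runs)

def filter_runs(runs, **criteria):
    """Keep only the runs that match every field=value pair given."""
    return [
        run for run in runs
        if all(run.get(field) == value for field, value in criteria.items())
    ]

# Counts are available before any filter is applied.
print(facet_counts(RUNS, "Platform"))

# Filter on sequencing platform and genome assembly together.
selected = filter_runs(RUNS, Platform="ILLUMINA", Assembly="GRCh37")
print([run["Run"] for run in selected])
```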

Troubleshooting GenBank Submissions: Annotating the Coding Region (CDS)


This article is intended for GenBank data submitters with a basic knowledge of BLAST who submit sequence data from protein-coding genes.

One of the most common problems when submitting DNA or RNA sequence data from protein-coding genes to GenBank is omitting information about the coding region (often abbreviated as CDS) or defining the CDS incorrectly. Incomplete or incorrect CDS information will prevent accession numbers from being assigned to your submission data set. You can troubleshoot problems with the CDS feature annotation by running a BLAST analysis on your sequences before you submit your data.

Here’s how to use nucleotide BLAST (blastn) and the formatting options menu to analyze, interpret and troubleshoot your submissions:

1. To start the BLAST analysis, go to the BLAST homepage and select “nucleotide blast”.

Figure 1. Select “nucleotide blast”.
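Separately from the BLAST procedure, a quick local check can catch some obviously wrong CDS intervals before you submit. The Python sketch below is an illustration only, not a GenBank validation tool: it assumes a complete CDS on the plus strand, the standard genetic code, and 1-based inclusive coordinates. The function name check_cds is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)

def check_cds(sequence, start, end):
    """Return a list of problems found in the CDS at start..end.

    Coordinates are 1-based and inclusive. A complete, plus-strand CDS
    is assumed, so partial coding regions will be reported as problems.
    """
    problems = []
    cds = sequence[start - 1:end].upper()
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into whole codons, ignoring any trailing 1 or 2 bases.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("does not begin with ATG")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains an internal stop codon")
    return problems

example = "CCATGAAATTTTAAGG"
print(check_cds(example, 3, 14))  # interval covering ATG AAA TTT TAA
print(check_cds(example, 4, 15))  # interval shifted by one base
```

An empty list means none of these simple checks failed; it does not mean the annotation is correct, which is why the BLAST comparison remains the main troubleshooting step.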

Finding Chemical Probes and Modulators – The Hunt for New Chemical Reagents and Medicines


This blog post continues last week’s post on finding biological assay data; it is intended for researchers who use PubChem.

Your research focuses on a protein (receptor or enzyme) for which you’d like to identify a chemical probe or modulator. The probe could help to identify the subcellular location of a protein. A modulator may help to determine the biological effects of a particular protein’s activity. Additionally, finding a novel chemical that binds to your protein might assist you in exploring the use of a new class of therapeutics in drug design.

At NCBI, the PubChem BioAssay database stores biological activity assay information, which makes it possible to find experimentally measured targets for millions of chemicals. This blog post shows a simple workflow to download a table (with raw and kinetic data) of chemicals that have been determined to bind to a particular gene/protein target.

A Fourth Offering of A Librarian’s Guide to NCBI


This blog post is directed toward medical or science librarians in the United States who offer bioinformatics education and support services or are planning to offer such services in the future.

The NCBI, in partnership with the National Library of Medicine Training Center (NTC), will once again offer the Librarian’s Guide to NCBI course on the NIH campus, March 7-11, 2016 (Announcement). This will be the fourth presentation of the course, and there are now 69 graduates of the training program.

These graduates represent 61 libraries, hospitals and government agencies from 27 states and the District of Columbia. Librarian’s Guide graduates now form a core community of NCBI-trained bioinformatics support specialists who maintain collaboration and mutual support through an online forum and monthly NCBI “Office hours” videoconference discussion sessions with course faculty and students. Materials from the 2013, 2014 and 2015 courses are available now, as well as lecture videos for the expression module.

Figure 1. Participants in the March 2015 A Librarian’s Guide to NCBI course. This class included 29 biomedical and science librarians.

Identifying Chemical Targets – Finding Potential Cross-Reactions and Predicting Side Effects


This blog post is directed toward researchers using PubChem.

You’ve identified a chemical that you’d like to use in your research as a chemical probe for a receptor or as an enzyme inhibitor. However, a chemical can bind to multiple protein targets, a phenomenon commonly known as “cross-reactivity”. In biological activity assays, this can cause problems with measuring the activity of a specific protein or pathway. If the chemical is used as a medication in living organisms, interactions with molecules other than the intended target can cause “side effects”.

At NCBI, the PubChem BioAssay database stores biological activity assay information that makes it possible to find experimentally measured targets for millions of chemicals. This blog post describes a workflow to download a table of gene/protein targets for a particular chemical.

Figure 1. Tamoxifen compound page.
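This post describes a web-based workflow. As a programmatic alternative, the Python sketch below builds a request URL for a compound's assay summary table. It assumes PubChem's PUG REST interface and its assaysummary operation; the helper name assay_summary_url is hypothetical, and you should confirm the URL form and the returned columns against the PUG REST documentation before relying on it.

```python
from urllib.parse import quote

# Base address of PubChem's PUG REST service (assumption; verify in the docs).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

def assay_summary_url(compound_name, fmt="CSV"):
    """Build a URL requesting the assay summary table for a named compound.

    Hypothetical helper: it only constructs the URL and makes no
    network request.
    """
    name = quote(compound_name, safe="")  # escape spaces and slashes
    return f"{PUG_REST}/compound/name/{name}/assaysummary/{fmt}"

print(assay_summary_url("tamoxifen"))
```

Opening the printed URL in a browser, or fetching it with urllib.request.urlopen, should return a table that can then be filtered for the targets of interest.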

SciENcv Updated to Support New NIH Biosketch Format


This blog post is geared toward researchers.

In November, NIH announced a new format for biographical sketches (biosketches); the new format is required for grant applications submitted for due dates after May 24, 2015 (see NOT-OD-15-032). SciENcv, a tool available through My NCBI for creating biosketches, has been updated to reflect the format changes and to help users convert their existing NIH biosketches from the old format to the new.

What changed with the NIH Biosketch?

Differences between the old and new NIH Biosketch formats include:

  1. Maximum length increased from 4 to 5 pages
  2. Rearranged data in the table at the top of the Biosketch
  3. Section A, Personal Statement can now include up to 4 supporting citations
  4. Section C is now called “Contribution to Science” and should consist of up to 5 brief descriptions of your most significant contributions to science, each with up to 4 supporting citations. You may also provide a URL to a full list of your published work as found in a publicly available digital database such as My Bibliography. This section is the most notable difference in the new format.

PubMed Also-Viewed: Quickly find related articles


You’ve seen it before on shopping websites: you load a page displaying an item you want and see a list of other items that people bought along with the one you’re viewing.

PubMed is free, but finding the important articles on a topic can cost a lot of time. To help you keep on top of the literature – with a little help from your fellow PubMed users – we are introducing a new type of link called “Articles frequently viewed together”. For some PubMed abstracts, you may see this link in the “Related Information” section in the right column.

Figure 1. The PubMed Also-Viewed feature.
