The 2018 Nucleic Acids Research database issue features several papers from NCBI staff that cover the status and future of databases including CCDS, ClinVar, GenBank and RefSeq. These papers are also available on PubMed. To read an article, click on the PMID number listed below.
Have you ever searched the NCBI Protein database and been overwhelmed with the number of sequences returned? Have you tried searching with a protein name, thinking that would greatly limit the results, only to still be presented with many sequences (all with the same name)? It’s a common problem in this time of greatly expanding sequence databases powered by large-scale genomic sequencing of similar organisms. Redundancy in the sequence databases is high and only getting worse.
To address this, in 2013 NCBI released the WP records, which collect identical protein sequences annotated on bacterial genomes. In 2014, NCBI released the Identical Protein Reports on Protein records, which displays information about all other proteins identical to that protein. Now, we are releasing a new resource: Identical Protein Groups (IPG). IPG offers several features: