We have added a new feature to ClinVar that allows you to follow a particular variant and be notified if the overall clinical interpretation in ClinVar changes, for example from a pathogenic category to a non-pathogenic one. This service will let you know about changes that may require you to update your analysis reports and contact your patients and ordering physicians. The new feature allows you to follow a variant from the variation page (Figure 1). Simply click the “Follow” button to begin receiving notifications.
Figure 1. A ClinVar variant page (VCV000541155.1) showing the ‘Follow’ button. The text on the button changes to ‘Following’ after you add it to your followed variants. Clicking ‘Following’ presents the option to ‘Unfollow’, which removes the variant from the followed list when clicked.
RefSeq release 98 is accessible online, via FTP and through NCBI’s Entrez programming utilities, E-utilities.
This full release incorporates genomic, transcript, and protein data available as of January 6, 2020, and contains 223,560,051 records, including 161,133,441 proteins, 29,134,515 RNAs, and sequences from 98,406 organisms.
The release is provided in several directories, both as a complete dataset and divided into logical groupings.
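For readers who want to reach RefSeq records programmatically, here is a minimal sketch of building an E-utilities ESearch request. The helper name build_esearch_url, the choice of the protein database, and the example query are assumptions for illustration only and are not part of the release announcement. The sketch only constructs the URL; fetching it (for example with urllib.request) is left to you.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, db="protein", retmax=5):
    """Return an ESearch URL for the given Entrez query (hypothetical helper)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query (an assumption): RefSeq-sourced human protein records.
url = build_esearch_url("srcdb_refseq[PROP] AND Homo sapiens[ORGN]")
print(url)
```

The srcdb_refseq[PROP] filter limits results to RefSeq records; adjust the database and query to suit your own needs.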
Check out the latest videos on YouTube to learn how to best use NCBI graphical viewers, SRA, PGAP, and other resources.
Genome Data Viewer: Analyzing Remote BAM Alignment Files and Other Tips
This video shows you how to upload remote BAM files, demonstrates handy viewer settings such as Pileup display options, and highlights the helpful tooltips in the Genome Data Viewer (GDV). There’s also a brief blog post on the same topic.
The complete annotated genome sequence of the novel coronavirus associated with the outbreak of pneumonia in Wuhan, China, is now available from GenBank for free and easy access by the global biomedical community. Figure 1 shows the relationship of the Wuhan virus to selected coronaviruses.
Figure 1. Phylogenetic tree showing the relationship of Wuhan-Hu-1 (circled in red) to selected coronaviruses. Nucleotide alignment was done with MUSCLE 3.8. The phylogenetic tree was estimated with MrBayes 3.2.6 with parameters for GTR+g+i. The scale bar indicates estimated substitutions per site, and all branch support values are 99.3% or higher.
Mark your calendars! As announced last month, a new NIH Manuscript Submission (NIHMS) system is coming! We are now pleased to share that the release is scheduled for January 23, 2020. To facilitate the transition, the NIHMS system will be temporarily unavailable beginning January 21.
Researchers are encouraged to take this schedule into consideration when preparing progress reports (e.g., RPPRs) and completing other public access compliance activities. Papers deposited to NIHMS prior to January 21 will be migrated to the new system, and any action(s) required after the release will be taken in the new system. Researchers will have access to information on all submissions made over time upon logging in to the new NIHMS.
Researchers with RPPRs due in February should address any outstanding compliance issues as soon as possible to avoid delays.
We’re constantly making improvements to the NCBI genome Assembly resource. This post points out some recent advances, highlighted in Figure 1 and described in more detail below.
Figure 1. New improvements to the Assembly web pages. The results page showing the surveillance project filter (lower left), which excludes 28,220 Klebsiella pneumoniae assemblies from the Pathogen Detection Project, and the Download Assemblies button with a link to the File type description (circled in red, upper right). For other improvements in the Download Assemblies menu see our recent post.
NCBI’s genome browsers and graphical sequence viewers now allow you to view BAM alignments sorted by haplotype tag. This option is useful for analyzing variants within a sequenced sample and can help you detect or validate structural variants.
Figure 1. Remote BAM alignment data sorted by haplotype tag in the Genome Data Viewer. The remote BAM file was added through the “User Data and Track Hubs” feature in GDV. You can load the remote BAM for this example through https://go.usa.gov/xpM9c. The sorted display shows that haplotype 1 contains a significant deletion in this region relative to haplotype 2 and the reference genome assembly. Aligned reads not assigned a haplotype tag in the BAM file are grouped under the heading “haplotype not set” (not shown).
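To illustrate the idea of grouping reads by haplotype tag, here is a minimal sketch that works on SAM-format text lines. It assumes the haplotype is stored in the HP optional field, as written by common phasing tools; the function names and the example reads are hypothetical, and this is not how the viewers themselves are implemented.

```python
from collections import defaultdict


def haplotype_of(sam_line):
    """Return a group label from the HP tag (assumed tag name) of one SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
    for tag in fields[11:]:
        name, _type, value = tag.split(":", 2)
        if name == "HP":
            return f"haplotype {value}"
    return "haplotype not set"


def group_by_haplotype(sam_lines):
    """Map each haplotype label to the names of the reads assigned to it."""
    groups = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        read_name = line.split("\t", 1)[0]
        groups[haplotype_of(line)].append(read_name)
    return dict(groups)


# Made-up records for illustration only.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tHP:i:1",
    "read2\t0\tchr1\t105\t60\t10M\t*\t0\t0\tCGTACGTACG\tIIIIIIIIII\tHP:i:2\tPS:i:100",
    "read3\t0\tchr1\t110\t60\t10M\t*\t0\t0\tGTACGTACGT\tIIIIIIIIII",
]
groups = group_by_haplotype(example)
print(groups)
```

Reads without the tag fall into the “haplotype not set” group, mirroring the heading used in the sorted display.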
GenBank release 235.0 (12/11/2019) is now available on the NCBI FTP site. This release has 7 trillion bases and 1.74 billion records.
The current release has 215,333,020 traditional records containing 388,417,258,009 base pairs of sequence data. There are also 1,127,023,870 WGS records containing 6,277,551,200,690 base pairs of sequence data, 367,193,844 bulk-oriented TSA records containing 325,433,016,129 base pairs of sequence data, and 28,227,180 bulk-oriented TLS records containing 11,280,596,614 base pairs of sequence data.
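As a quick check, the rounded totals quoted for the release can be reproduced from the per-division figures above. The sketch below simply adds them up; the variable names are our own.

```python
# (records, base pairs) per division, as quoted for GenBank release 235.0.
divisions = {
    "traditional": (215_333_020, 388_417_258_009),
    "WGS": (1_127_023_870, 6_277_551_200_690),
    "TSA": (367_193_844, 325_433_016_129),
    "TLS": (28_227_180, 11_280_596_614),
}

total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

print(f"{total_records:,} records (about {total_records / 1e9:.2f} billion)")
print(f"{total_bases:,} bases (about {total_bases / 1e12:.1f} trillion)")
```

The sums come to 1,737,777,914 records and 7,002,682,071,442 base pairs, consistent with the rounded figures of 1.74 billion records and 7 trillion bases.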
GenBank submitters, you can now submit mitochondrial COX1 (cytochrome oxidase subunit I; COI) sequence data from multicellular animals (metazoa) using a new workflow (Figure 1) with an improved interface, enhanced validation, and automatic COX1 CDS feature annotation. Once you have submitted mitochondrial COX1 data using this tool, you’ll have a single, helpful page to reference your submission information: accession number(s), COX1 submission status, relevant files, and more. You can also fix any errors from this page.
Figure 1. Submission Portal page with the mitochondrial COX1 submission option selected (boxed in red). The service has options for other targeted submissions including ribosomal RNA (rRNA), rRNA-ITS, Influenza virus, and Norovirus sequences.
ClinVar is proud to announce the submission of the one millionth record to its database.
The millionth submission was published on Friday, December 20, 2019, a milestone in providing the clinical genetics and research communities with open access to human variant data with asserted consequence.
ClinVar extends its thanks to the many laboratories, partners, and members of the community whose efforts and adoption of the practice of data-sharing paved the way for this achievement. All organizations that contributed to ClinVar’s genetics resources share in this accomplishment, with special recognition reserved for ClinGen and several of their members, including EGL Genetic Diagnostics/Eurofins Clinical Diagnostics, GeneDx, Invitae, and Laboratory for Molecular Medicine/Partners HealthCare Personalized Medicine, whose early submissions helped jump-start ClinVar’s database.