New Enhancements to PMC Website

The legacy view will no longer be available as of late March 2023

You asked, we listened! We appreciate your feedback since the launch of the updated PubMed Central (PMC) website in March 2022, and we made several improvements to help you better access PMC. These updates include: 

  • Streamlined functionality for getting formatted citation information, including the PubMed format (NBIB file). It works easily on both web and mobile and is consistent across the PubMed and PMC sites (see number 1 in the image below).
  • Updated functionality to easily add an article to your My NCBI collections through PMC’s new “Collections” button (number 2 below).  
  • An improved “Resources” section that allows easy access to articles similar to the one you are viewing, other papers that cite that article, and links to related data records in other NCBI databases (number 3 below).  

New article view in PMC. Updates are illustrated with yellow numbered squares: 1) “Cite” button provides formatted citation information; 2) “Collections” button adds the article to My NCBI collections; 3) “Resources” button finds related articles; 4) “Feedback” button lets you communicate with NCBI.

What is NCBI and who works here?

Ever wonder who is behind all the data at the National Center for Biotechnology Information (NCBI)? Who is developing and managing the NCBI website, as well as our various products, tools, and resources?

In honor of our upcoming 35th anniversary, we want to tell our story! NCBI, a division of the National Library of Medicine (NLM) at the National Institutes of Health (NIH), was founded on November 4, 1988. The late Senator Claude Pepper recognized the importance of computerized information processing methods for the conduct of biomedical research and sponsored legislation that established NCBI.

What does NCBI do?

As a national resource for molecular biology information, our mission is to develop new information technologies to aid in the understanding of fundamental molecular and genetic processes that control health and disease. We provide access to biomedical and genomic information by creating, curating, and maintaining medical and scientific databases.

Now Available! More Mammalian Cross-Species Alignments in the Comparative Genome Viewer (CGV)

In response to your feedback, we’ve made more whole genome cross-species alignments available in NCBI’s Comparative Genome Viewer (CGV). You can use these alignments to explore genome rearrangements between species. You can also zoom in to analyze regions of conserved gene synteny.

There are over 20 new cross-species alignments available, including human-mouse, mouse-rat, human-chimp, human-cattle, dog-cat, and others! These cross-species alignments provide additional opportunities to explore evolutionary relationships at the genomic and gene levels. We will add more cross-species alignments in the coming months.

The latest cross-species alignments added to CGV include imports from the UCSC Genomics Institute, as well as those generated at NCBI.

Check out two examples of cross-species whole-genome alignments in CGV below (Figure 1).

Figure 1. Whole genome alignments between (A) mouse and human (GRCm39 vs. GRCh38.p14) and (B) cat and dog (F.catus_Fca126_mat1.0 vs. ROS_Cfam_1.0). Colored bands connect aligned regions; green indicates same orientation, blue indicates opposite orientation.

When you zoom in on an alignment (Figure 2), you can compare gene annotation on the two assemblies and see the extent of conservation of synteny. You can also see which genes are missing from one or the other assembly, indicating changes in sequence or differences in annotation.

Upcoming changes to influenza virus names in NCBI Taxonomy

To reflect changes to the International Code of Virus Classification and Nomenclature (ICVCN) made by the International Committee on Taxonomy of Viruses (ICTV), NCBI will introduce new binomial influenza species names such as ‘Alphainfluenzavirus influenzae’. The changes are expected to be in place around summer 2023.

We recognize that traditional influenza virus names like ‘Influenza A virus’ and ‘Influenza B virus’ are broadly used in public health, educational institutions, and research. To minimize the impact of this change on those who use NCBI resources, the taxonomy schema will keep the former names in the lineages for each species; however, they will be moved below the (new) species taxa in the hierarchy. See example below.

Continue reading “Upcoming changes to influenza virus names in NCBI Taxonomy”

New annotations in RefSeq!

In December and January, the NCBI Eukaryotic Genome Annotation Pipeline released twenty-nine new annotations in RefSeq for the following organisms:

  • Acinonyx jubatus (cheetah)
  • Anopheles cruzii (mosquito)
  • Anopheles moucheti (mosquito)
  • Bicyclus anynana (squinting bush brown)
  • Budorcas taxicolor (takin)
  • Carassius gibelio (silver crucian carp)
  • Citrus sinensis (sweet orange)
  • Crassostrea angulata (Portuguese oyster)
  • Culex pipiens pallens (northern house mosquito)
  • Drosophila gunungcola (fruit fly)
  • Galleria mellonella (greater wax moth)
  • Gossypium arboreum (tree cotton)
  • Gossypium raimondii (Peruvian cotton)
  • Harpia harpyja (harpy eagle)
  • Hemicordylus capensis (graceful crag lizard)
  • Lactuca sativa (garden lettuce)
  • Mercenaria mercenaria (northern quahog)
  • Mya arenaria (softshell)
  • Octopus bimaculoides (California two-spot octopus)
  • Oncorhynchus keta (chum salmon)
  • Pangasianodon hypophthalmus (striped catfish)
  • Panonychus citri (citrus red mite)
  • Panthera uncia (snow leopard) (pictured)
  • Peromyscus californicus insignis (California mouse)
  • Podarcis raffonei (Aeolian wall lizard)
  • Populus trichocarpa (black cottonwood)
  • Scomber japonicus (chub mackerel)
  • Tympanuchus pallidicinctus (lesser prairie-chicken)
  • Vigna angularis (adzuki bean)

Announcing New Names for Eukaryotic Genome Annotations in RefSeq!

The RefSeq eukaryotic genome annotation pipeline (EGAP) is moving to a new annotation naming format that can be used to unambiguously reference both the genome assembly and the RefSeq annotation. This will improve clarity when reporting the data you use and make the data more FAIR (Findable, Accessible, Interoperable, and Reusable). The new naming convention applies to all eukaryotic annotations released after December 15, 2022.

Historically, RefSeq EGAP has used an integer to identify a particular annotation release, such as Homo sapiens Annotation Release 110. This method provides no information on the assembly used for the annotation. In the new RefSeq naming system, annotation releases are designated by a combination of the assembly identifier (e.g., GCF_000001405.40) and an annotation name (e.g., RS_2022_04). The annotation name consists of an RS prefix to indicate RefSeq annotation, followed by the year and month in which it was generated, in the form RS_YYYY_MM. You should always use the annotation name in combination with the corresponding assembly accession.version, for example, GCF_026419915.1-RS_2022_12 (as shown in Figure 1). This ensures that you’re always using the name that defines a specific annotation for a specific genome assembly. If you use only part of the name, it will be ambiguous.

Figure 1. The annotation section of the Datasets Genome page for the assembly bHarHar1 for the harpy eagle (Harpia harpyja) showing the new annotation release GCF_026419915.1-RS_2022_12.
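
The naming scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not an NCBI tool: the function and class names are hypothetical, and the pattern assumes that the assembly accession always has the shape GCF_ plus nine digits plus a version, which is inferred from the two examples in this post.

```python
import re
from typing import NamedTuple


class AnnotationName(NamedTuple):
    """Hypothetical container for the parts of a full annotation name."""
    assembly_accession: str  # accession.version, e.g. "GCF_026419915.1"
    year: int                # year the annotation was generated
    month: int               # month the annotation was generated (1-12)


# Assumption: accession shape inferred from the examples in the post.
_FULL_NAME = re.compile(r"^(GCF_\d{9}\.\d+)-RS_(\d{4})_(\d{2})$")


def parse_annotation_name(full_name: str) -> AnnotationName:
    """Split '<assembly accession.version>-RS_YYYY_MM' into its parts."""
    match = _FULL_NAME.match(full_name)
    if match is None:
        # A partial name such as 'RS_2022_04' is ambiguous and is rejected.
        raise ValueError(f"not a full annotation name: {full_name!r}")
    accession, year, month = match.groups()
    if not 1 <= int(month) <= 12:
        raise ValueError(f"invalid month in annotation name: {full_name!r}")
    return AnnotationName(accession, int(year), int(month))


def format_annotation_name(name: AnnotationName) -> str:
    """Rebuild the full name from its parts."""
    return f"{name.assembly_accession}-RS_{name.year:04d}_{name.month:02d}"


parsed = parse_annotation_name("GCF_026419915.1-RS_2022_12")
print(parsed.assembly_accession, parsed.year, parsed.month)
```

Rejecting a bare RS_YYYY_MM string mirrors the advice above: the annotation name alone does not identify which assembly it belongs to.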

NCBI at ACMG 2023

Join us March 14-18 in Salt Lake City, Utah 

We are excited to celebrate ClinVar’s 10th anniversary and look forward to seeing you in person at the 2023 ACMG Annual Clinical Genetics Meeting, March 14-18, 2023, in Salt Lake City, Utah. We will participate in a variety of events and activities featuring our clinical and human genetic resources.

Check out NCBI’s schedule: 

Continue reading “NCBI at ACMG 2023”

Now Available! Add your favorite organism(s) to your BLAST ClusteredNR searches

Do you currently add one or more organism names to focus your searches when using the BLAST standard nr database? You can now focus your searches by organism with the BLAST ClusteredNR database and get faster results with a better overview of protein homologs in a wider range of organisms. Your searches will be restricted to protein clusters that contain one or more sequences from the organism(s) you add.
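
The restriction rule in the last sentence can be illustrated with a toy sketch. This is not how BLAST is implemented, and the data structure, cluster IDs, and function name are all hypothetical. It also matches organism names exactly, whereas a real search by a group such as Cetacea would rely on NCBI Taxonomy to include every species in that group.

```python
from typing import Dict, List, Set


def clusters_with_organisms(
    clusters: Dict[str, List[str]], organisms: Set[str]
) -> List[str]:
    """Return IDs of clusters holding at least one sequence from `organisms`.

    `clusters` maps a hypothetical cluster ID to the organism name of each
    member sequence. A cluster is kept when any member matches, even if
    most of its members come from other organisms.
    """
    return [
        cluster_id
        for cluster_id, member_organisms in clusters.items()
        if organisms.intersection(member_organisms)
    ]


# Toy data: cluster IDs and memberships are invented for illustration.
toy_clusters = {
    "cluster_A": ["Physeter catodon", "Homo sapiens"],
    "cluster_B": ["Homo sapiens", "Mus musculus"],
    "cluster_C": ["Tursiops truncatus"],
}
print(clusters_with_organisms(toy_clusters, {"Physeter catodon", "Tursiops truncatus"}))
```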

ClusteredNR results

A search of the ClusteredNR database (results) using human myoglobin (NP_005359.1) as a query and limited to Cetacea (whales & dolphins) returns clusters containing all the whale myoglobin matches present in a search of standard nr, as well as matches to clusters containing cytoglobin (Figure 1 A). These significant cytoglobin matches are not shown in the standard nr results with the Cetacea limit, which are dominated by matches to proteins from a single species, Physeter catodon (sperm whale) (Figure 1 B).

Scrubbing human sequence contamination from Sequence Read Archive (SRA) submissions

Do you work with human-derived sequence data? Do you often struggle with the need to determine if your data is free of human sequence and therefore suitable for public distribution? We encourage submitters to screen for and remove contaminating human reads from data files prior to submission to SRA. To support investigators in this effort, we offer a tool to remove human sequence contamination from your SRA submissions!

Human Read Removal Tool (HRRT)

The Human Read Removal Tool (HRRT; also known as the Human Scrubber) is available on GitHub and DockerHub. The HRRT is based on the SRA Taxonomy Analysis Tool (STAT). It takes a fastq file as input and produces a fastq.clean file as output, in which all reads identified as potentially of human origin are masked with ‘N’.
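
To make the masking step concrete, here is a minimal sketch of what replacing flagged reads with ‘N’ could look like. It is not the HRRT implementation and does not identify human reads; it assumes that a set of flagged read IDs has already been produced by some other step, and that the input is plain four-line fastq records. The function name and the read IDs are hypothetical.

```python
from typing import Iterable, Iterator, Set


def mask_flagged_reads(
    fastq_lines: Iterable[str], flagged_ids: Set[str]
) -> Iterator[str]:
    """Yield fastq lines with the sequence of each flagged read set to 'N'.

    Assumes four lines per record (header, sequence, separator, quality).
    The read ID is taken as the first whitespace-delimited token after '@'.
    """
    lines = [line.rstrip("\n") for line in fastq_lines]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per fastq record")
    for start in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[start:start + 4]
        read_id = header[1:].split()[0]
        if read_id in flagged_ids:
            # Keep the read length so the quality string still lines up.
            sequence = "N" * len(sequence)
        yield from (header, sequence, separator, quality)


# Toy input: read names and sequences are invented for illustration.
records = [
    "@read1", "ACGT", "+", "IIII",
    "@read2 sample=x", "GGCC", "+", "IIII",
]
for line in mask_flagged_reads(records, {"read2"}):
    print(line)
```

Masking rather than deleting a read keeps the record count and read lengths unchanged, which is one plausible reason to prefer it, although the post does not state the tool's rationale.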

ClinVar to offer improved support for somatic data

We need your input! 

ClinVar is NCBI’s archive of reports of the relationships among human genetic variations and diseases, with supporting evidence. To make ClinVar data more accurate and useful, we are introducing an enhanced data model to better accept and support classifications of somatic variants. 

How you can help 

Do you have somatic variant classifications to submit to ClinVar? We want to hear from you! We are now testing ClinVar’s enhanced data model and support for classifications of somatic variants.