Category: What’s New

April 7 Webinar: Recent and upcoming enhancements to NCBI BLAST and Primer-BLAST services!

Join us on April 7, 2021 at 12 PM Eastern Time to learn about new web BLAST and Primer-BLAST enhancements that improve your BLAST experience. You’ll also see a preview of some planned improvements to the databases that make it easier to find relevant matches.

Recent changes to web BLAST include added data columns on the descriptions table, so you can quickly find and sort your matches. Primer-BLAST now offers direct links from genome assembly pages, so you can easily select the specificity database. Primer-BLAST also now accepts multiple target templates, making it easy to design primers that can amplify several similar sequences, such as all splice variants of a gene or the same target (16S, COI) from different strains or species.

  • Date and time: Wed, April 7, 2021 12:00 PM – 12:45 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Conserved Domain Database version 3.19 is available!

The Conserved Domain Database (CDD) version 3.19 is now available. This version contains 3,148 new or updated NCBI-curated domains and now mirrors Pfam version 33.1 as well as models from the NCBIfam collection. We also included fine-grained classifications of the immunoglobulin, RRM, cytochrome P450, 7-transmembrane GPCRs, KH, calponin homology, and C1 domain superfamilies.

Continue reading “Conserved Domain Database version 3.19 is available!”

January-February 2021 RefSeq annotations include dog, fly, rat

Figure 1. This is Tasha, the female boxer used for one of the assemblies annotated for dog (GCF_000002285.5). Image courtesy of the National Human Genome Research Institute.

This January and February, the NCBI Eukaryotic Genome Annotation Pipeline released new annotations in RefSeq for the following organisms:

  • Benincasa hispida (wax gourd)
  • Canis lupus familiaris (dog)
  • Corvus cornix cornix (hooded crow)
  • Crotalus tigris (tiger rattlesnake)
  • Culex pipiens pallens (northern house mosquito)
  • Dioscorea cayenensis subsp. rotundata (Guinea yam)
  • Drosophila santomea (fly)
  • Drosophila simulans (fly)
  • Drosophila yakuba (fly)
  • Eucalyptus grandis (rose gum)
  • Hibiscus syriacus (Rose-of-Sharon)
  • Hyaena hyaena (striped hyena)
  • Maniola hyperantus (ringlet)
  • Mauremys reevesii (Reeves’s turtle)
  • Nilaparvata lugens (brown planthopper)

Continue reading “January-February 2021 RefSeq annotations include dog, fly, rat”

RefSeq Release 205 is available!

RefSeq release 205 is now available online, from the FTP site and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of March 1, 2021, and contains 269,975,565 records, including 197,232,209 proteins, 36,514,168 RNAs, and sequences from 108,257 organisms. The release is provided in several directories as a complete dataset and also as divided by logical groupings.
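As a rough illustration of the E-utilities route mentioned above, the sketch below builds an EFetch request URL for a single RefSeq record. The accession `NM_007294.4` (assumed here to be a BRCA1 transcript), the helper name `build_efetch_url`, and the default parameter values are illustrative choices, not part of the release announcement. The script only constructs the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_efetch_url(accession, db="nuccore", rettype="fasta", retmode="text"):
    """Return an EFetch URL for one record (hypothetical helper).

    `accession` is any accession.version string; the defaults request
    a nucleotide record as plain-text FASTA.
    """
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Example accession chosen for illustration only.
    print(build_efetch_url("NM_007294.4"))
```

Opening the printed URL in a browser, or passing it to a tool such as `curl`, should return the record if the accession exists in the chosen database.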

Continue reading “RefSeq Release 205 is available!”

New class value and qualifier in GenBank release 242.0 accommodate circular RNA molecules

GenBank release 242.0 (2/16/2021) is now available on the NCBI FTP site and through Entrez and BLAST. This release has 13.49 trillion bases and 2.34 billion records.

Growth between releases

During the 57 days between the close dates for GenBank Releases 241.0 and 242.0, the ‘traditional’ portion of GenBank grew by 53,287,389,099 base pairs and by 4,773,649 sequence records. During that same period, 65,699 records were updated. An average of 84,901 ‘traditional’ records were added and/or updated per day.

Between releases 241.0 and 242.0, the WGS component of GenBank grew by 439,874,781,594 base pairs and by 45,942,354 sequence records. During the same period, the TSA component of GenBank grew by 15,398,434,562 base pairs and by 16,753,622 sequence records. Finally, the TLS component of GenBank grew by 597,613,549 base pairs and by 2,091,409 sequence records.
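The daily average quoted for the ‘traditional’ portion can be reproduced from the figures above. The short sketch below assumes the average is simply new records plus updated records, divided by the 57 days between close dates; the variable and function names are illustrative.

```python
# Figures quoted in the text for the 'traditional' portion of GenBank.
DAYS_BETWEEN_RELEASES = 57
NEW_RECORDS = 4_773_649
UPDATED_RECORDS = 65_699


def average_per_day(new_records, updated_records, days):
    """Average number of records added and/or updated per day."""
    return (new_records + updated_records) / days


if __name__ == "__main__":
    avg = average_per_day(NEW_RECORDS, UPDATED_RECORDS, DAYS_BETWEEN_RELEASES)
    # Rounds to the per-day figure reported in the release notes.
    print(f"{round(avg):,} records per day")
```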

Continue reading “New class value and qualifier in GenBank release 242.0 accommodate circular RNA molecules”

ClinicalTrials.gov updates the PRS Guided Tutorials, step-by-step instructions for data providers

The PRS Guided Tutorials provide step-by-step instructions to help data providers submit information to ClinicalTrials.gov and aim to reduce the number of quality-control reviews needed. The ClinicalTrials.gov team has updated the PRS Guided Tutorials to make them more useful in response to user feedback obtained through focus groups and survey responses over the past year.

Continue reading “ClinicalTrials.gov updates the PRS Guided Tutorials, step-by-step instructions for data providers”

March 10 Webinar: Where to find data for your research organism!

Do you work with data from organisms outside the traditional set of model organisms? Join us on March 10, 2021 to learn how to use NCBI resources, including NCBI’s Taxonomy and BLAST, that can help you find information from your organism and closely related taxa. You will see an example that shows you how to retrieve and download gene sequences for a set of species, generate multiple sequence alignments, and design primers using Primer-BLAST.

  • Date and time: Wed, March 10, 2021 12:00 PM – 12:45 PM EST
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Important Update About How You Log Into your NCBI Accounts

Update: Based on feedback we have received, we are moving the date when creating new accounts with 3rd-party credentials is required to June 2021. We will provide more details in the coming weeks.

As mentioned in a previous blog post, we are transitioning to using 3rd party logins for all My NCBI accounts. We are doing this because NIH, NLM, and NCBI take your privacy and security very seriously. Transitioning to 3rd parties who have modern and industry-standard security practices ensures that you have the highest level of security and enables us to focus our resources on improving your experience once you log in.

The next step in this transition process is to disable the ability to create usernames and passwords directly in NCBI. Beginning in June 2021, new accounts must be created with 3rd-party credentials, but don’t worry! We are working hard to make sure you have a variety of sign-in options. In the past few months, we’ve added nearly 4,000 InCommon sign-in options along with Microsoft and Facebook. This is in addition to the options we had previously, like ORCID, eRA Commons, and Login.gov.

If you have questions about this transition or using 3rd-party usernames and passwords for your NCBI account, you can:

NIH’s Sequence Read Archive to be made available on AWS’s Open Data Sponsorship Program

The National Library of Medicine’s (NLM) National Center for Biotechnology Information (NCBI) and Amazon Web Services (AWS) are happy to announce that the controlled- and public-access Sequence Read Archive (SRA), one of the world’s largest repositories of raw next-generation sequencing data, will be freely accessible from Amazon S3 via the Open Data Sponsorship Program (ODP) as of January 2021. The SRA is currently hosted by NLM at the National Institutes of Health (NIH).

Continue reading “NIH’s Sequence Read Archive to be made available on AWS’s Open Data Sponsorship Program”

The Datasets command-line tool now provides ortholog data

Important Note: Please see our latest documentation on how to download gene ortholog data. The commands below have been deprecated in the latest version of the NCBI Datasets command-line tools.

You can now get gene ortholog data with the NCBI Datasets command-line tool by providing a gene ID, gene symbol, or RefSeq nucleotide or protein accession. Data are available for vertebrates and insects. The vertebrate orthologs include a specialized set for fish. (See our recent post for more information on the orthologs for fish and insects.)

You can retrieve metadata for gene orthologs in JSON format, or you can download a compressed (zip) archive containing both metadata and sequences (Figure 1).

Figure 1. Command lines that use a gene symbol (BRCA1) to retrieve mammalian ortholog metadata (top, JSON metadata shown in part in the image) and sequences (bottom).

Continue reading “The Datasets command-line tool now provides ortholog data”