Do you need a smaller dataset for your analyses of virus data? In response to your feedback, NCBI Virus now allows you to download a randomized subset of your results for nucleotide, protein, or RefSeq genome sequences from any supported virus (Figure 1). This option is useful for viruses such as SARS-CoV-2 or Influenza A that have very large numbers of records, where working with the entire dataset can be a challenge. In such cases, a smaller representative sample is easier to analyze. You can also reduce the bias in a dataset by getting a representative number of records for each country or host (Figure 2). Continue reading “Download Randomized Data Subset from NCBI Virus”
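To illustrate why a fixed number of records per country or host reduces bias, here is a minimal Python sketch of per-group random sampling on a local list of records. It is not how NCBI Virus implements the feature; the function name `sample_per_group`, the record fields, and the accessions and counts are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def sample_per_group(records, key, per_group, seed=0):
    """Return up to `per_group` randomly chosen records for each value of `key`."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    subset = []
    for value in sorted(groups):
        members = groups[value]
        # Small groups are kept whole; large groups are cut down to `per_group`.
        subset.extend(rng.sample(members, min(per_group, len(members))))
    return subset


# Made-up metadata: one heavily sampled country and two sparsely sampled ones.
records = (
    [{"accession": f"A{i}", "country": "USA"} for i in range(1000)]
    + [{"accession": f"B{i}", "country": "Kenya"} for i in range(5)]
    + [{"accession": f"C{i}", "country": "Peru"} for i in range(40)]
)

subset = sample_per_group(records, key="country", per_group=10)
counts = Counter(record["country"] for record in subset)
print(dict(counts))  # {'Kenya': 5, 'Peru': 10, 'USA': 10}
```

In the full list the most heavily sampled country accounts for about 96% of the records; in the subset no country contributes more than 10 records.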
New NCBI Summer Virtual Workshop Series!
Apply to attend interactive, hands-on workshops
Calling all biology students and educators! Want to learn more about NCBI resources and how to use our high-quality data and cutting-edge tools for your research projects or curricula?
We are excited to announce our upcoming virtual workshop series for Summer 2023. Our interactive, hands-on workshops are taught by experienced NCBI Education faculty. Events are free to attend, and applications are open to the public; however, each workshop will accept a limited number of participants to facilitate the best possible educational experience. Continue reading “New NCBI Summer Virtual Workshop Series!”
Updated Design! NCBI Datasets Homepage
The updated NCBI Datasets homepage has a fresh new look and feel, making it easier for you to use. Search is now more prominent at the top of the page: you can enter and select the scientific or common name of the species you’re interested in and go directly to the NCBI Datasets Taxonomy page for that species.
We added a “How to use NCBI Datasets” section, giving you an overview of what’s available in NCBI Datasets. You can see example species with links to NCBI Datasets pages relevant to that species. For example, for Ursus arctos (brown bear), we include links to the Taxonomy page, the genome table showing all available genomes, the reference genome page for UrsArc1.0, as well as connections to BLAST and the Ursus arctos gene table.
You can still use the tab bar at the top of the homepage to easily navigate to our genome and gene tables or check out our documentation. Continue reading “Updated Design! NCBI Datasets Homepage”
Navigating Between BLAST and iCn3D
Explore protein structures and sequences quickly and easily
Have you ever come across an unfamiliar protein in your BLAST results? With the newly added ‘AlphaFold Structure’ link (Figure 1), you can now explore its structure as predicted by AlphaFold in iCn3D. The iCn3D Structure Viewer is not only a web-based 3D viewer, but also a structure analysis tool with interactive displays of 3D structure, 2D topology, 1D sequence and annotation.
Features & Benefits
- Upload AlphaFold structures to iCn3D directly
- Use the structure search feature to find structures of interest
- Understand important features of the structures, such as disease-associated variations (ClinVar), genetic variations (dbSNP), or chemical modifications (PTM)
- Identify similarities and differences between AlphaFold predictions and experimentally determined structures
- Gain insights into the structural characteristics and properties of the molecules
- Use iCn3D on different platforms (Jupyter Notebook, Virtual Reality, and Augmented Reality)
- Easily integrate iCn3D using scripted workflows (Node.js, Python) to analyze large sets of structures
Now Available! Faster BLAST Searches with New Nucleotide Databases
NEW in BLAST! We made smaller nucleotide databases to help you find the sequences you need faster and more easily. You can now find these databases on the main nucleotide BLAST search page (Figure 1) and even download them (Databases: nt_euk, nt_prok, nt_viruses, nt_others). They are separated by organism type: eukaryotes, prokaryotes, viruses, and others (including synthetic sequences).
Figure 1. The database selection section of the main nucleotide BLAST page with the ‘Experimental databases’ radio button selected. You can choose one or more of the organism database subsets for your search. Continue reading “Now Available! Faster BLAST Searches with New Nucleotide Databases”
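For readers who run BLAST locally rather than on the web page, the following Python sketch assembles a `blastn` command line that targets one or more of the four subsets named above. It assumes a local BLAST+ installation and that the chosen databases have already been downloaded; the helper `build_blastn_command` and the file names `query.fasta` and `hits.tsv` are hypothetical. The sketch only builds and prints the command and does not run a search.

```python
import shlex

# Database names as listed in the announcement.
SUBSET_DATABASES = {"nt_euk", "nt_prok", "nt_viruses", "nt_others"}


def build_blastn_command(databases, query_path, out_path):
    """Build a blastn argument list for one or more of the subset databases."""
    unknown = set(databases) - SUBSET_DATABASES
    if unknown:
        raise ValueError(f"Unknown database subset(s): {sorted(unknown)}")
    return [
        "blastn",
        "-db", " ".join(databases),  # several databases go in one space-separated value
        "-query", query_path,
        "-outfmt", "6",              # tabular output
        "-out", out_path,
    ]


command = build_blastn_command(["nt_viruses"], "query.fasta", "hits.tsv")
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would start the search on a machine where BLAST+ and the database are available.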
Join NCBI at ASM Microbe 2023
Houston, TX, June 15-19, 2023
NCBI is looking forward to seeing you in person at the American Society for Microbiology Annual Meeting (ASM Microbe 2023). NCBI staff will participate in a variety of activities and events and will also be available at our booth (#2410) to address your questions. We’re especially excited to share our recent efforts on the NCBI Pathogen Detection Project, which integrates bacterial and fungal pathogen genomic sequences from numerous ongoing foodborne illness and environmental surveillance and research efforts.
Check out our schedule of activities and events below (and on our conference webpage). All times are in CST. Continue reading “Join NCBI at ASM Microbe 2023”
Gene Ontology (GO) Terms on 100M+ RefSeq Prokaryotic Protein Sequence Records
Do you work with or study prokaryotic proteins? As previously announced, we’ve been adding Gene Ontology (GO) terms to RefSeq prokaryotic protein sequence records (example below) to standardize the language when describing the functions of genes and their products. Over 100 million RefSeq proteins from prokaryotes now have at least one GO term, a 55% increase since we started propagating GO terms from Conserved Domain Database (CDD) architectures in March. Continue reading “Gene Ontology (GO) Terms on 100M+ RefSeq Prokaryotic Protein Sequence Records”
NCBI’s Remap Tool to Retire in November 2023
As of November 2023, NCBI’s Remap tool will no longer be available. Due to low usage of Remap, a tool that projects annotation data from one coordinate system to another, we are focusing our development efforts on our more popular resources and tools.
We encourage you to check out our newest, easy-to-use visualization tool, the Comparative Genome Viewer (CGV), which displays assembly-assembly whole genome alignments to help you quickly compare eukaryotic genome assemblies and easily identify genomic changes that may be significant to biology and evolution.
Stay up to date
Follow us on Twitter @NCBI and join our mailing list to keep up to date with our visualization tools and other NCBI news.
Questions?
Feel free to contact our help desk at info@ncbi.nlm.nih.gov if you have any questions or concerns.
GenBank Release 255.0 is Available!
GenBank release 255.0 (4/21/2023) is now available on the NCBI FTP site. This release has 23.44 trillion bases and 3.48 billion records.
The current release has:
- 242,554,936 traditional records containing 1,826,746,318,813 base pairs of sequence data
- 2,440,470,464 WGS records containing 20,926,504,760,221 base pairs of sequence data
- 678,332,682 bulk-oriented TSA records containing 636,291,358,227 base pairs of sequence data
- 121,186,672 bulk-oriented TLS records containing 46,567,924,833 base pairs of sequence data
Growth between releases
During the 58 days between the close dates for GenBank Releases 254.0 and 255.0, the traditional portion of GenBank grew by 95,444,070,395 base pairs and by 724,301 sequence records. We updated 172,014 records during that same period. We added and/or updated an average of 15,453 traditional records per day! Continue reading “GenBank Release 255.0 is Available!”
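The headline figures for this release can be cross-checked from the per-division counts listed above. The short Python sketch below sums the four divisions and recomputes the daily average; it assumes the average counts both added and updated traditional records over the 58-day window, an interpretation that reproduces the stated figure.

```python
# Per-division counts copied from the release summary.
records = {
    "traditional": 242_554_936,
    "WGS": 2_440_470_464,
    "TSA": 678_332_682,
    "TLS": 121_186_672,
}
base_pairs = {
    "traditional": 1_826_746_318_813,
    "WGS": 20_926_504_760_221,
    "TSA": 636_291_358_227,
    "TLS": 46_567_924_833,
}

total_records = sum(records.values())
total_bases = sum(base_pairs.values())
print(f"{total_records / 1e9:.2f} billion records")  # 3.48 billion records
print(f"{total_bases / 1e12:.2f} trillion bases")    # 23.44 trillion bases

# Traditional records added plus records updated, averaged over 58 days.
added, updated, days = 724_301, 172_014, 58
print((added + updated) // days)  # 15453 whole records per day
```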
RefSeq Release 218
RefSeq release 218 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.
What’s included in this release?
As of May 1, 2023, this full release incorporates genomic, transcript, and protein data containing: