To provide a more efficient BLAST experience for everyone, we’re changing some parameters and limits on the web BLAST service on September 8, 2020. The new settings, listed below, will improve overall performance and make search times more consistent.
The Expect Value Threshold default setting will be reduced to 0.05.
The Max target sequences setting will be capped at 5,000.
The maximum allowed query length will be 1,000,000 for nucleotide queries (blastn, blastx, and tblastx) and 100,000 for protein queries (blastp and tblastn).
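If you submit searches from scripts, it may help to check a request against the new limits before sending it. The following is a minimal sketch, not part of any NCBI tool: the function and constant names are hypothetical, and it assumes that query_length is the length of the query you intend to submit (bases or residues).

```python
# Limits copied from the announcement above; all names here are illustrative.
DEFAULT_EXPECT_THRESHOLD = 0.05
MAX_TARGET_SEQUENCES = 5_000
MAX_QUERY_LENGTH = {
    "blastn": 1_000_000,
    "blastx": 1_000_000,
    "tblastx": 1_000_000,
    "blastp": 100_000,
    "tblastn": 100_000,
}


def check_web_blast_request(program, query_length, max_target_seqs):
    """Return a list of problems; an empty list means the request fits the limits."""
    if program not in MAX_QUERY_LENGTH:
        raise ValueError(f"unknown BLAST program: {program!r}")
    problems = []
    limit = MAX_QUERY_LENGTH[program]
    if query_length > limit:
        problems.append(
            f"query length {query_length:,} exceeds the {limit:,} limit for {program}"
        )
    if max_target_seqs > MAX_TARGET_SEQUENCES:
        problems.append(
            f"max target sequences {max_target_seqs:,} exceeds {MAX_TARGET_SEQUENCES:,}"
        )
    return problems


if __name__ == "__main__":
    # A protein query that is too long for blastp under the new settings.
    for problem in check_web_blast_request("blastp", 250_000, 100):
        print(problem)
```

A query exactly at the limit passes the check; only values above the limit are reported.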
These changes will help keep the BLAST service running smoothly as the already very large databases continue to grow rapidly. If you have any questions or concerns, please email us at email@example.com.
You can now download publication-quality graphic images of the alignment displayed in the NCBI Multiple Sequence Alignment Viewer (Figure 1). Load sequence alignments into the viewer from BLAST or COBALT results or upload alignment files directly. Once you have the alignment set in the viewer, choose the “Printer-friendly PDF/SVG” option in the Download menu on the toolbar to save the image. The PDF and SVG files contain vector graphics suitable for presentation and publication.
Figure 1. The image download options in the MSAV. You can adjust the desired coordinate range and choose to download a PDF or SVG image. You can also preview the PDF download. Choose simplified color shading to improve compatibility with some graphics programs.
The downloaded image will show the coordinate range you requested and will include all the rows in the alignment.
Please contact us through the Feedback link on the MSA Viewer or write to the NCBI Help Desk to provide feedback and let us know how we can make the NCBI Multiple Sequence Alignment Viewer work better for you.
We’ve released a new version (1.16.0) of IgBLAST, the popular NCBI package for classifying and analyzing immunoglobulin (IG) and T cell receptor (TCR) variable domain sequences. Version 1.16.0 has three new improvements.
Added the ability to extend the J gene alignment at the 3’ end of the region (Figure 1). This allows you to view the unaligned bases that otherwise would not be included because of low sequence similarity.
We have updated the collection of representative and reference assemblies for Bacteria and Archaea to better reflect the taxonomic breadth of the prokaryotes in RefSeq. We chose the 11,478 representative assemblies in the new collection from the 180,000+ prokaryotic assemblies in RefSeq today. We have selected one representative or reference assembly for every species based on several criteria, including contiguity, completeness, and whether the assembly is from type material. We have also updated the reference and representative microbial BLAST database to reflect these changes. This reference and representative set will be updated three times a year to reflect changes in RefSeq. In addition, as we announced on Feb 14, we have reduced the number of reference genome assemblies (the subset of representative assemblies with annotation provided by outside experts) to 15. See the list in our previous post. We have re-annotated the 104 assemblies that are no longer reference with our Prokaryotic Genome Annotation Pipeline (PGAP).
We have a curated set of ribosomal RNA (rRNA) reference sequences (Targeted Loci) with verifiable organism sources and current names. This set is critical for correctly identifying and classifying prokaryotic (bacteria and archaea) and fungal samples (Table 1). To provide easy access to these sequences, we recently added a separate rRNA/ITS databases section on the nucleotide BLAST page that makes it convenient to quickly identify source organisms (Figure 1).
As we announced, the new default database version for BLAST+ is dbV5. To complete the transition to the new version, we will modify the directory structure and naming conventions on the BLAST FTP database directory. We expect to make this change around February 4th, 2020.
Here is a list of what we will change:
All databases at the base of the blastdb directory (/blast/db/) will be the dbV5 versions.
The version 5 databases will no longer have “_v5” as part of the archive or database names.
We will move the dbV4 databases to a v4 subdirectory (/blast/db/v4/).
The now legacy dbV4 database archives will have “_v4” in their names (e.g., nr_v4.00.tar.gz); we will not rename the files within the archive.
We will no longer update the dbV4 databases.
We will freeze the cloud directory (/blast/db/cloud/) with no new entries after January 13, 2020.
We will provide only nr, nt, swissprot, and pdbaa files in the FASTA directory (/blast/db/FASTA/).
Please adjust your scripts or procedures to accommodate the changes!
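As a starting point for updating download scripts, here is a minimal sketch of how archive paths could be built under the new layout. The helper names are hypothetical. The two-digit volume format follows the nr_v4.00.tar.gz example above; the v5 file name pattern and the handling of single-archive databases are assumptions inferred from the list, not something NCBI has specified here.

```python
import posixpath

BLAST_DB_ROOT = "/blast/db"  # base directory named in the announcement


def strip_version_suffix(db_name):
    """Drop a trailing "_v5" or "_v4" from a database name, e.g. "nr_v5" -> "nr"."""
    for suffix in ("_v5", "_v4"):
        if db_name.endswith(suffix):
            return db_name[: -len(suffix)]
    return db_name


def archive_path(db_name, volume=None, version=5):
    """Build the expected FTP path of a database archive after the change.

    volume is the volume number of a multi-volume database, or None for a
    database shipped as a single archive (an assumption of this sketch).
    """
    if version not in (4, 5):
        raise ValueError("version must be 4 or 5")
    base = strip_version_suffix(db_name)
    if version == 4:
        # Legacy archives carry "_v4" and live in the v4 subdirectory.
        base += "_v4"
        directory = posixpath.join(BLAST_DB_ROOT, "v4")
    else:
        # dbV5 archives sit at the base directory without a version suffix.
        directory = BLAST_DB_ROOT
    name = base if volume is None else f"{base}.{volume:02d}"
    return posixpath.join(directory, name + ".tar.gz")
```

For example, archive_path("nr_v5", 0) gives /blast/db/nr.00.tar.gz, and archive_path("nr", 0, version=4) gives /blast/db/v4/nr_v4.00.tar.gz.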
If you have any questions or concerns, please contact us.
IgBLAST is a popular NCBI package for classifying and analyzing immunoglobulin (IG) and T cell receptor (TCR) variable domain sequences. We’ve released a new version (1.15.0) of IgBLAST with four new improvements / bug fixes:
Support for the new framework region 4 (FWR4) annotation feature in the standard alignment formats and AIRR format.
Renamed the previous “-penalty” parameter to “-V_penalty” to be consistent with other IgBLAST penalty options.
Restored constant internal BLAST search parameters for domain annotation (i.e., FWR/CDR) so that this process is not influenced by user-provided parameters.
Corrected FWR/CDR annotations for certain mouse VK and rat VH germline genes.
IgBLAST 1.15 is available for download from the BLAST FTP area. See the manual on GitHub for information about setting up and running IgBLAST.
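If you have wrapper scripts that still pass the old option name, a small helper like the one below can update the argument list before calling IgBLAST. This is only a sketch: the function name and the example file name are hypothetical, and it assumes your script builds the command as a list of strings.

```python
def migrate_igblast_args(args):
    """Return a copy of args with the old "-penalty" flag renamed to "-V_penalty"."""
    return ["-V_penalty" if arg == "-penalty" else arg for arg in args]


if __name__ == "__main__":
    # "reads.fasta" is a placeholder query file.
    old_command = ["igblastn", "-query", "reads.fasta", "-penalty", "-1"]
    print(migrate_igblast_args(old_command))
```

The helper only renames the flag itself, so the value that follows it is passed through unchanged.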
The BLAST+ 2.10.0 release is now available from our FTP site. The new version offers the following improvements:
updated composition-based statistics for protein-protein (including translated BLAST) comparisons to provide stable results when you request fewer than the default number of results
an experimental Adaptive Composition Based Statistics option that increases the likelihood of finding novel results. To enable this option set the environment variable ADAPTIVE_CBS to 1. We welcome your feedback on this new option.
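One way to try the experimental option without changing your shell configuration is to set ADAPTIVE_CBS only for the process that runs the search. The sketch below assumes blastp from BLAST+ 2.10.0 is on your PATH; the function names and the file and database arguments are hypothetical placeholders.

```python
import os
import subprocess


def adaptive_cbs_env(base_env=None):
    """Return a copy of the environment with ADAPTIVE_CBS set to "1"."""
    env = dict(os.environ if base_env is None else base_env)
    env["ADAPTIVE_CBS"] = "1"
    return env


def run_blastp_with_adaptive_cbs(query_path, db_name, out_path):
    """Run blastp with the experimental option enabled for this process only."""
    cmd = ["blastp", "-query", query_path, "-db", db_name, "-out", out_path]
    # check=True raises CalledProcessError if blastp exits with an error.
    subprocess.run(cmd, env=adaptive_cbs_env(), check=True)
```

Because the variable is set on a copy of the environment, other BLAST runs started from the same script keep the default behavior.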
See the release notes for details on more improvements and bug fixes with this release.
The new version fully supports the version 5 (v5) databases with built-in taxonomy and other improvements. For more information on v5 databases (download), see the previous NCBI Insights article and the recording of our webinar. If you are still using the older version 4 (v4) databases, we recommend you begin using the v5 version as soon as possible. We will discontinue updates to the older v4 databases in early 2020.
Read the recent publication (PMID: 31427293) on the AMRFinder, a tool that identifies antimicrobial resistance (AMR) genes in bacterial genome sequences using a high-quality curated AMR gene reference database. We use the AMRFinder to identify AMR genes in the hundreds of bacterial genomes that NCBI receives every day, and the results of AMRFinder are used in NCBI’s Isolates Browser to provide accurate assessments of AMR gene content. You can install AMRFinder locally and run it yourself. Follow the instructions on our GitHub site.
Since the publication, we have upgraded AMRFinder to AMRFinderPlus. The enhanced tool now:
supports searches based on protein annotations, nucleotide sequences, or both for best results
identifies point mutations in Campylobacter, E. coli, Shigella, and Salmonella
optionally identifies many genes involved in biocide, heat, metal, and stress resistance, as well as many antigenicity and virulence genes
provides information about gene function, including resistance to individual antibiotics and other phenotypes
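The features above map onto command-line options of the amrfinder executable. The following sketch builds such a command from Python. The helper name and file names are hypothetical, and the flags (-p, -n, -g, -O, --plus) reflect the AMRFinderPlus documentation as best understood at the time of writing, so confirm them with amrfinder --help for your installed version.

```python
def build_amrfinder_command(protein=None, nucleotide=None, gff=None,
                            organism=None, plus=False):
    """Build an amrfinder argument list (flags assumed from the AMRFinderPlus docs)."""
    if protein is None and nucleotide is None:
        raise ValueError("provide a protein file, a nucleotide file, or both")
    cmd = ["amrfinder"]
    if protein is not None:
        cmd += ["-p", protein]        # annotated protein sequences
    if gff is not None:
        cmd += ["-g", gff]            # annotation linking proteins to the assembly
    if nucleotide is not None:
        cmd += ["-n", nucleotide]     # assembled nucleotide sequences
    if organism is not None:
        cmd += ["-O", organism]       # enables organism-specific point mutation screening
    if plus:
        cmd.append("--plus")          # adds stress, virulence, and similar genes
    return cmd


if __name__ == "__main__":
    # "genome.fna" is a placeholder assembly file.
    print(build_amrfinder_command(nucleotide="genome.fna",
                                  organism="Salmonella", plus=True))
```

You can pass the returned list to subprocess.run once AMRFinderPlus is installed. Combined protein and nucleotide searches typically also need the GFF file so that hits can be matched between the two inputs.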