Tag: BLAST

New BLAST default parameters and search limits coming in September

To provide a more efficient BLAST experience for everyone, we’re changing some parameters and limits on the web BLAST service on September 8, 2020. The new settings, listed below, will improve overall performance and make search times more consistent.

  1. The Expect Value Threshold default setting will be reduced to 0.05.
  2. The Max target sequences setting will be limited to 5,000.
  3. The maximum allowed query length will be 1,000,000 for nucleotide queries (blastn, blastx, and tblastx) and 100,000 for protein queries (blastp and tblastn).

These changes will help keep the BLAST service running smoothly as the already very large databases continue to grow rapidly. If you have any questions or concerns, please email us at blast-help@ncbi.nlm.nih.gov
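If you submit searches from scripts, a small pre-flight check can catch requests that would exceed the new limits. The sketch below is only an illustration: the function name and return format are hypothetical, and the numbers are copied from the list above, not read from any NCBI API.

```python
# Limits quoted in the announcement; the grouping of programs follows the post.
MAX_QUERY_LENGTH = {
    "blastn": 1_000_000,
    "blastx": 1_000_000,
    "tblastx": 1_000_000,
    "blastp": 100_000,
    "tblastn": 100_000,
}
MAX_TARGET_SEQS = 5_000
DEFAULT_EXPECT_THRESHOLD = 0.05  # new default, shown here for reference only


def check_web_blast_request(program, query_length, max_target_seqs):
    """Hypothetical helper: return a list of problems (empty if within limits)."""
    if program not in MAX_QUERY_LENGTH:
        return [f"unknown program: {program}"]
    problems = []
    limit = MAX_QUERY_LENGTH[program]
    if query_length > limit:
        problems.append(
            f"{program} query length {query_length} exceeds the limit of {limit}"
        )
    if max_target_seqs > MAX_TARGET_SEQS:
        problems.append(
            f"max target sequences {max_target_seqs} exceeds {MAX_TARGET_SEQS}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_web_blast_request("blastp", 250_000, 10_000):
        print(problem)
```

An empty list means the request fits within the announced limits. Anything else is worth fixing before you submit.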

Download high-quality graphics from the NCBI Multiple Sequence Alignment Viewer (MSAV)

You can now download publication-quality graphic images of the alignment displayed in the NCBI Multiple Sequence Alignment Viewer (Figure 1). Load sequence alignments into the viewer from BLAST or COBALT results, or upload alignment files directly. Once you have the alignment set in the viewer, choose the “Printer-friendly PDF/SVG” option in the Download menu on the toolbar to save the image. The PDF and SVG files contain vector graphics suitable for presentation and publication.

Figure 1. The image download options in the MSAV. You can adjust the desired coordinate range and choose to download a PDF or SVG image. You can also preview the PDF download. Choose simplified color shading to improve compatibility with some graphics programs.

The downloaded image will show the coordinate range you requested and will include all the rows in the alignment.

Please contact us through the Feedback link on the MSA Viewer or write to the NCBI Help Desk to let us know how we can make the NCBI Multiple Sequence Alignment Viewer work better for you.

A new version of IgBLAST (1.16.0) is here!

We’ve released a new version (1.16.0) of IgBLAST, the popular NCBI package for classifying and analyzing immunoglobulin (IG) and T cell receptor (TCR) variable domain sequences. Version 1.16.0 has three improvements.

  1. Added the ability to extend the J gene alignment at the 3′ end of the region (Figure 1). This allows you to view the unaligned bases that otherwise would not be included because of low sequence similarity.

Figure 1. The new “extend alignment at the 3′ end” option on the IgBLAST web form. The command line option is ‘-extend_align3end’.
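For command-line users, the option above might be added to an igblastn call roughly as follows. This is a sketch under assumptions: the builder function and the file and database names are hypothetical, and it treats ‘-extend_align3end’ as a switch that takes no value.

```python
def igblastn_command(query, germline_v, germline_d, germline_j, extend_3prime=True):
    """Hypothetical helper: build an igblastn argument list.

    The germline database arguments are placeholders for databases you have
    already set up locally.
    """
    cmd = [
        "igblastn",
        "-query", query,
        "-germline_db_V", germline_v,
        "-germline_db_D", germline_d,
        "-germline_db_J", germline_j,
    ]
    if extend_3prime:
        # Assumed to be a boolean switch, as described in the figure caption.
        cmd.append("-extend_align3end")
    return cmd


if __name__ == "__main__":
    # Placeholder file names; print the command instead of running it.
    print(" ".join(igblastn_command("reads.fasta", "db_V", "db_D", "db_J")))
```

Printing the command before running it makes it easy to confirm the switch is present.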

Recalculation of prokaryotic reference and representative genome assemblies

We have updated the collection of representative and reference assemblies for Bacteria and Archaea to better reflect the taxonomic breadth of the prokaryotes in RefSeq. We chose the 11,478 representative assemblies in the new collection from the 180,000+ prokaryotic assemblies in RefSeq today. We have selected one representative or reference assembly for every species based on several criteria, including contiguity, completeness, and whether the assembly is from type material. We have also updated the reference and representative microbial BLAST database to reflect these changes. This reference and representative set will be updated three times a year to reflect changes in RefSeq.

In addition, as we announced on Feb 14, we have reduced the number of reference genome assemblies (the subset of representative assemblies with annotation provided by outside experts) to 15. See the list in our previous post. We have re-annotated the 104 assemblies that are no longer reference with our Prokaryotic Genome Annotation Pipeline (PGAP).

New ribosomal RNA BLAST databases available on the web BLAST service and for download

We have a curated set of ribosomal RNA (rRNA) reference sequences (Targeted Loci) with verifiable organism sources and current names. This set is critical for correctly identifying and classifying prokaryotic (bacteria and archaea) and fungal samples (Table 1). To provide easy access to these sequences, we recently added a separate rRNA/ITS databases section on the nucleotide BLAST page that makes it convenient to quickly identify source organisms (Figure 1).

Database | BioProjects | Sequences
16S ribosomal RNA (Bacteria and Archaea) | PRJNA33317, PRJNA33175 | 20,845
18S ribosomal RNA sequences (SSU) from Fungi type and reference material | PRJNA39195 | 2,337
28S ribosomal RNA sequences (LSU) from Fungi type and reference material | PRJNA51803 | 5,185
Internal transcribed spacer region (ITS) from Fungi and Oomycete type and reference material | PRJNA177353, PRJNA362621 | 10,874

Table 1. NCBI curated targeted rRNA sequences now available as BLAST databases.

Adjust your scripts: new arrangement and naming for BLAST databases on the FTP site!

As we announced, the new default database version for BLAST+ is dbV5.  To complete the transition to the new version, we will modify the directory structure and naming conventions on the BLAST FTP database directory.  We expect to make this change around February 4th, 2020.

Here is a list of what we will change:

  1. All databases at the base of the blastdb directory (/blast/db/) will be the dbV5 versions.
  2. The version 5 databases will no longer have “_v5” as part of the archive or database names.
  3. We will move the dbV4 databases to a v4 subdirectory (/blast/db/v4/).
  4. The now legacy dbV4 database archives will have “_v4” in their names (e.g., nr_v4.00.tar.gz); we will not rename the files within the archive.
  5. We will no longer update the dbV4 databases.
  6. We will freeze the cloud directory (/blast/db/cloud/) with no new entries after January 13, 2020.
  7. We will provide only nr, nt, swissprot, and pdbaa files in the FASTA directory (/blast/db/FASTA/).

Please adjust your scripts or procedures to accommodate the changes!
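As one way to adjust a download script, the renaming rules in the list above can be expressed as a small mapping function. This is a sketch, not an official tool: the function name is hypothetical, and it assumes the old v5 archives were named on the pattern `<db>_v5.<NN>.tar.gz`, mirroring the “_v4” example in item 4.

```python
import re

# <db>, an optional "_v5" marker, an optional volume number, then ".tar.gz".
_ARCHIVE = re.compile(r"^(?P<db>.+?)(?P<v5>_v5)?(?P<vol>\.\d+)?\.tar\.gz$")


def new_archive_path(old_name):
    """Hypothetical helper: map a pre-change archive name to its new FTP path.

    Old v5 archives lose the "_v5" suffix and sit at the base of /blast/db/.
    Old v4 archives (no suffix) gain "_v4" and move to /blast/db/v4/.
    """
    match = _ARCHIVE.match(old_name)
    if match is None:
        raise ValueError(f"not a BLAST database archive name: {old_name!r}")
    db = match.group("db")
    vol = match.group("vol") or ""
    if match.group("v5"):
        return f"/blast/db/{db}{vol}.tar.gz"
    return f"/blast/db/v4/{db}_v4{vol}.tar.gz"


if __name__ == "__main__":
    for name in ("nr_v5.00.tar.gz", "nr.00.tar.gz", "swissprot.tar.gz"):
        print(name, "->", new_archive_path(name))
```

Only the archive names change. As item 4 notes, the files inside the dbV4 archives keep their original names.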

If you have any questions or concerns, please contact us.

A new version of IgBLAST (1.15.0) is here!

IgBLAST is a popular NCBI package for classifying and analyzing immunoglobulin (IG) and T cell receptor (TCR) variable domain sequences. We’ve released a new version (1.15.0) of IgBLAST with four improvements and bug fixes:

  1. Support for the new framework region 4 (FWR4) annotation feature in the standard alignment formats and AIRR format.
  2. Renamed the previous “-penalty” parameter to “-V_penalty” to be consistent with other IgBLAST penalty options.
  3. Restored constant internal BLAST search parameters for domain annotation (i.e., FWR/CDR) so that this process is not influenced by user-provided parameters.
  4. Corrected FWR/CDR annotations for certain mouse VK and rat VH germline genes.

IgBLAST 1.15 is available for download from the BLAST FTP area. See the manual on GitHub for information about setting up and running IgBLAST.
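If you have wrapper scripts that still pass the old parameter name, the rename in item 2 can be handled with a small shim like the one below. The function name is hypothetical, and the sketch assumes your arguments are held as a list of separate tokens.

```python
def migrate_igblast_args(args):
    """Hypothetical helper: replace the pre-1.15 "-penalty" flag with "-V_penalty".

    Only exact matches of the flag token are rewritten; values and other
    options pass through unchanged. The input list is not modified.
    """
    return ["-V_penalty" if arg == "-penalty" else arg for arg in args]


if __name__ == "__main__":
    old_args = ["igblastn", "-query", "reads.fasta", "-penalty", "-1"]
    print(migrate_igblast_args(old_args))
```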

BLAST+ 2.10.0 now available with improved composition-based statistics

The BLAST+ 2.10.0 release is now available from our FTP site.  The new version offers the following improvements:

  • updated composition-based statistics for protein-protein (including translated BLAST) comparisons to provide stable results when you request fewer than the default number of results
  • an experimental Adaptive Composition Based Statistics option that increases the likelihood of finding novel results.  To enable this option set the environment variable ADAPTIVE_CBS to 1.  We welcome your feedback on this new option.
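To try the experimental option from a script, you could set the environment variable for the BLAST+ process only, as in the sketch below. The helper names and the query, database, and output file names are hypothetical. The only detail taken from the release announcement is that ADAPTIVE_CBS should be set to 1.

```python
import os
import subprocess


def adaptive_cbs_env(base_env=None):
    """Return a copy of the environment with ADAPTIVE_CBS set to "1"."""
    env = dict(os.environ if base_env is None else base_env)
    env["ADAPTIVE_CBS"] = "1"
    return env


def run_blastp_adaptive(query, db, out):
    """Hypothetical wrapper: run blastp with the experimental option enabled.

    Requires a local BLAST+ 2.10.0 installation with blastp on PATH.
    """
    cmd = ["blastp", "-query", query, "-db", db, "-out", out]
    return subprocess.run(cmd, env=adaptive_cbs_env(), check=True)


if __name__ == "__main__":
    # Show the variable that would be passed to blastp, without running it.
    print(adaptive_cbs_env({"PATH": "/usr/bin"}))
```

Because the variable goes only into the child process environment, other BLAST runs in the same session are unaffected.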

See the release notes for details on more  improvements and bug fixes with this release.

The new version fully supports the version 5 (v5) databases with built in taxonomy and other improvements. For more information on v5 databases (download), see the previous NCBI Insights article and the recording of our webinar.  If you are still using the older version 4 (v4) databases, we recommend you begin using the v5 version as soon as possible.  We will discontinue updates to the older v4 databases in early 2020.

New publication on AMRFinder, a tool that identifies resistance genes in pathogen genomes!

Read the recent publication (PMID: 31427293) on the AMRFinder, a tool that identifies antimicrobial resistance (AMR) genes in bacterial genome sequences using a high-quality curated AMR gene reference database.  We use the AMRFinder to identify AMR genes in the hundreds of bacterial genomes that NCBI receives every day, and the results of AMRFinder are used in NCBI’s Isolates Browser to provide accurate assessments of AMR gene content. You can install AMRFinder locally and run it yourself. Follow the instructions on our GitHub site.

Since the publication we have upgraded AMRFinder to AMRFinderPlus. The enhanced tool now

  • supports searches based on protein annotations, nucleotide sequences, or both for best results
  • identifies point mutations in Campylobacter, E. coli, Shigella, and Salmonella
  • optionally identifies many genes involved in biocide, heat, metal, and stress resistance, as well as many antigenicity and virulence genes
  • provides information about gene function, including resistance to individual antibiotics and other phenotypes

You can learn more about NCBI’s role in helping to combat antimicrobial resistance at the National Database of Antibiotic Resistant Organisms.

Magic-BLAST version 1.5.0 is here!

We’ve just released a new version of Magic-BLAST with several new, user-driven enhancements like:

  • Nanopore sequence alignment
  • Improved multithreading performance
  • Support for the new BLAST database version, BLASTDBv5, that allows you to limit your search by taxonomy
  • More reliable placements of reads

The new executables are available on the NCBI FTP site.

A new paper (PMID: 31345161), published in July 2019 in BMC Bioinformatics, describes the use and accuracy of Magic-BLAST.

Magic-BLAST aligns next generation DNA- and RNA-Seq sequencing reads. Read more about the latest version of Magic-BLAST in the release notes.