Tag: Basic Local Alignment Search Tool (BLAST)

Try out ElasticBLAST at the BOSC2021 CoFest!

Join the BLAST team at the virtual CollaborationFest (July 31 - August 1, 2021) after the BOSC 2021 conference to help test and improve ElasticBLAST, a new cloud-based tool designed to speed up high-throughput BLAST searches. We would love to have your help with real-world testing of our alpha release of ElasticBLAST with your own workflows and data. You may sign up for the CoFest even if you aren’t registered for BOSC 2021.

Here are suggestions for how you can participate. See the FAQs below for additional information.

  1. Try it out and let us know how well it works. You can be blunt.
  2. Help us improve the documentation.
  3. Write a script to make ElasticBLAST part of your workflow.
  4. Try to process ElasticBLAST results with cloud-native tools. Here is an example.
  5. Bring your own high throughput BLAST search problem to use with ElasticBLAST!  Please discuss it with us first to make sure you don’t blow our budget and get the ElasticBLAST team in trouble!

Continue reading “Try out ElasticBLAST at the BOSC2021 CoFest!”

Introducing the new NCBI Datasets Genomes page

The updated NCBI Datasets Genomes page now has genome data for all domains of life, including bacterial and viral genomes.

The genomes table (Figure 1) now offers filters for:

  • Reference genomes — switch it on to only show reference or representative genomes
  • Annotated — switch it on to only show annotated genomes
  • Assembly level — use the assembly level slider to select higher-quality genomes
  • Year released — use the slider to limit your results to recent genomes

In addition, the new Actions column connects you to NCBI’s Genome Data Viewer, BLAST, and Assembly. The Text filter box lets you search by the name of the assembly, species/infraspecies, or submitter.

Figure 1. The new Datasets Genomes page with primate assemblies showing the STATUS switches (reference genomes, annotated); expanded filters section with ASSEMBLY LEVEL and YEAR RELEASED sliding selectors; and the Actions column menu with access to Assembly details, BLAST, the Genome Data Viewer, and Download options.

Continue reading “Introducing the new NCBI Datasets Genomes page”

BLAST+ 2.12.0 now available with more efficient multithreaded searches

BLAST+ 2.12.0 programs feature better multithreaded searches and support a different threading model, threading by query, that can be more efficient in some situations. The new release is also fully compatible with the increase in the numeric range for the GI identifier, which will take effect in the nucleotide database later this year. The list below shows details of the new features and bug fixes. You can download the new BLAST release from the FTP site.
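As a rough illustration of how a script might opt in to threading by query, the sketch below only assembles a blastn argument list and does not run a search. It assumes that the threading model is selected with `-mt_mode 1`, which is not stated in the post; confirm this against `blastn -help` for your installed version. The file names, database name, and the helper `blastn_command` are hypothetical.

```python
import shlex


def blastn_command(query, db, out, threads=4, thread_by_query=True):
    """Build (but do not run) a blastn argument list."""
    cmd = [
        "blastn",
        "-query", query,
        "-db", db,
        "-out", out,
        "-num_threads", str(threads),
    ]
    if thread_by_query:
        # Assumption: "-mt_mode 1" selects threading by query in BLAST+ 2.12.0.
        # Verify with `blastn -help` before relying on it.
        cmd += ["-mt_mode", "1"]
    return cmd


# Hypothetical inputs, for illustration only.
cmd = blastn_command("queries.fa", "nt", "results.txt", threads=8)
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would launch the search. Whether threading by query actually helps depends on the search, so it is worth timing both modes on your own data.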

Continue reading “BLAST+ 2.12.0 now available with more efficient multithreaded searches”

June 2 Webinar: Quickly upload and view your own data in genomic context at NCBI

Join us on June 2, 2021 at 12 PM Eastern Time to learn how to upload and display your own genomic data in the context of annotated genome assemblies. You will use the Genome Data Viewer and the Sequence Viewer to visualize your own uploaded data (indexed BAM, VCF, BED, wig, GFF formats), data from public track hubs, and your BLAST and Primer-BLAST results. You will also learn to take advantage of features of the viewers including optimizing display settings, sharing a view with collaborators, exporting images, and downloading genes or other features in the view.

  • Date and time: Wed, June 2, 2021 12:00 PM – 12:45 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Magic-BLAST version 1.6.0 is here!

We’ve just released a new version (1.6.0) of Magic-BLAST, the BLAST-based next-gen alignment tool, with these improvements:

  • Usage reporting — you can help improve Magic-BLAST by sharing limited information about your search. The BLAST User Manual has details on the information collected, how it is used, and how to opt-out.
  • Magic-BLAST can access NCBI SRA next-gen reads from the cloud when you use the -sra or -sra_batch options. See the Magic-BLAST cookbook for more details.
  • NCBI taxonomy IDs are reported in SAM output if they are present in the target BLAST database.
  • You can get unaligned reads reported separately from the aligned ones by using the -out_unaligned <file name> option. You can also select the format (SAM, tabular, or FASTA) with the -unaligned_fmt option. The default format is the same as the one used for the main report.

The version 1.6.0 executables are available from the NCBI FTP site. See the release notes, the NCBI GitHub site, and the Magic-BLAST publication for more information.
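The options listed above can be combined in a small wrapper. The sketch below only builds a magicblast argument list; it is a minimal illustration, not a reference for the tool. The `-sra`, `-out_unaligned`, and `-unaligned_fmt` options come from the list above, while `-db`, `-query`, and `-out`, the lowercase format values, the helper name, and the accession and file names are assumptions to check against `magicblast -help`.

```python
def magicblast_command(db, out, sra=None, query=None,
                       out_unaligned=None, unaligned_fmt=None):
    """Build (but do not run) a magicblast argument list."""
    # Reads come either from an SRA accession or from a local query file.
    if (sra is None) == (query is None):
        raise ValueError("give exactly one of 'sra' or 'query'")

    cmd = ["magicblast", "-db", db, "-out", out]
    cmd += ["-sra", sra] if sra is not None else ["-query", query]

    if out_unaligned is not None:
        # Write unaligned reads to their own file.
        cmd += ["-out_unaligned", out_unaligned]
        if unaligned_fmt is not None:
            # Assumption: values are spelled "sam", "tabular", or "fasta".
            cmd += ["-unaligned_fmt", unaligned_fmt]
    return cmd


# Hypothetical accession, database, and file names.
cmd = magicblast_command(
    db="my_reference_db",
    out="aligned.sam",
    sra="SRR0000000",
    out_unaligned="unaligned.fa",
    unaligned_fmt="fasta",
)
print(" ".join(cmd))
```

Leaving `unaligned_fmt` unset keeps the default behavior described above, where unaligned reads use the same format as the main report.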

NCBI on YouTube: Tips for My Bibliography, Genome Data Viewer and more

Missed our latest YouTube videos? Scroll down to see what we’ve been up to.

Add Preprint Citations in My Bibliography

The National Institutes of Health encourages investigators to post preprints to public repositories in order to speed the dissemination and enhance the rigor of their work. This video demonstrates how to add preprint citations to My Bibliography.

Continue reading “NCBI on YouTube: Tips for My Bibliography, Genome Data Viewer and more”

Remembering Mark Boguski

It is with much sadness that we recently learned of the passing of Mark Boguski, MD, PhD, a former Senior Investigator in the Computational Biology Branch at NCBI. Mark worked at NCBI from 1989 to 2000 and made a lasting impression on the staff who overlapped with his time here and are still with NCBI. Many of them have commented on social media about their personal interactions and fond memories of Mark.

Figure 1. Part of an alignment from a translating BLAST (blastx) search of a modified chicken translation factor sequence that Mark provided to Michael Crichton for The Lost World. Mark had edited the sequence by inserting DNA codons that BLAST translates to ‘MARK WAS HERE NIH’, thus leaving his autograph.

Continue reading “Remembering Mark Boguski”
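To see how a message can be hidden in DNA this way, the sketch below translates a short stretch of codons with the standard genetic code. The codons are chosen here purely for illustration and are not the sequence Mark actually inserted; the one-letter amino acid code has no character for a space, so the words run together.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Illustrative codons only: one arbitrary codon per letter of the message.
codons = ["ATG", "GCT", "CGT", "AAA",         # M A R K
          "TGG", "GCT", "TCT",                # W A S
          "CAT", "GAA", "CGT", "GAA",         # H E R E
          "AAT", "ATT", "CAT"]                # N I H
message = translate("".join(codons))
print(message)  # MARKWASHERENIH
```

A blastx search performs this kind of translation in all six reading frames before comparing the result with protein sequences, which is why the inserted letters show up in the alignment.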

April 7 Webinar: Recent and upcoming enhancements to NCBI BLAST and Primer-BLAST services!

Join us on April 7, 2021 at 12 PM Eastern Time to learn about new web BLAST and Primer-BLAST enhancements that improve your BLAST experience. You’ll also see a preview of some planned improvements to the databases that make it easier to find relevant matches.

Recent changes to web BLAST include added data columns on the descriptions table, so you can quickly find and sort your matches. Primer-BLAST now offers direct links from genome assembly pages, so you can easily select the specificity database. Primer-BLAST also now accepts multiple target templates, making it easy to design primers that can amplify several similar sequences, such as all splice variants of a gene or the same target (16S, COI) from different strains or species.

  • Date and time: Wed, April 7, 2021 12:00 PM – 12:45 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.

March 10 Webinar: Where to find data for your research organism!

Do you work with data from organisms outside the traditional set of model organisms? Join us on March 10, 2021 to learn how to use NCBI resources, including NCBI’s Taxonomy and BLAST, that can help you find information about your organism and closely related taxa. You will see an example that shows you how to retrieve and download gene sequences for a set of species, generate multiple sequence alignments, and design primers using Primer-BLAST.

  • Date and time: Wed, March 10, 2021 12:00 PM – 12:45 PM EST
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

The Protein Family Model resource is now available!

The new Protein Family Model resource (Figure 1) provides a way for you to search across the evidence used by the NCBI annotation pipelines to name and classify proteins. You can find protein families by gene symbol, protein function, and many other terms. You have access to related proteins in the family and publications describing members. Protein Family Models includes protein profile hidden Markov models (HMMs) and BlastRules for prokaryotes, and conserved domain architectures for prokaryotes and eukaryotes. The HMMs in the collection include Pfam models and TIGRFAMs, as well as models developed at NCBI either de novo or from NCBI protein clusters. Each of the BlastRules (PMCID: 5753331) consists of one or more model proteins of known biological function with BLAST identity and coverage cutoffs. The conserved domain architectures are based on BLAST-compatible Position Specific Score Matrices (PSSMs) that constitute the NCBI Conserved Domain database.

Figure 1. Protein Family Model resource pages. Top panel, home page. Middle panel, selected results summaries from a fielded search for the DnaK gene product (DnaK[Gene Symbol]). Bottom panel, a portion of an HMM record for DnaK derived from NCBI Protein Clusters (NF009946). The record also includes PubMed citations and HMMER analyses showing the RefSeq proteins named by this method.
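The idea of a rule built from identity and coverage cutoffs can be sketched in a few lines. This is a simplified illustration under stated assumptions, not NCBI's implementation: the class names, the single coverage value, and all cutoff numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastRule:
    """Hypothetical rule: a protein name plus identity and coverage cutoffs."""
    name: str
    min_identity: float   # percent identity required, 0 to 100
    min_coverage: float   # percent of the model protein covered, 0 to 100


@dataclass(frozen=True)
class Hit:
    """Hypothetical summary of one BLAST hit against the rule's model protein."""
    identity: float
    coverage: float


def passes(rule, hit):
    """Return True when the hit meets both cutoffs of the rule."""
    return hit.identity >= rule.min_identity and hit.coverage >= rule.min_coverage


# Illustrative numbers only; real cutoffs are defined per rule.
rule = BlastRule(name="example chaperone", min_identity=90.0, min_coverage=95.0)
close_match = Hit(identity=97.2, coverage=99.0)
partial_match = Hit(identity=97.2, coverage=60.0)

print(passes(rule, close_match))    # True
print(passes(rule, partial_match))  # False
```

Requiring both cutoffs means that a short but highly similar fragment, like `partial_match`, does not inherit the name of the full-length model protein.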

Continue reading “The Protein Family Model resource is now available!”