Join us on June 2, 2021 at 12PM eastern time to learn how to upload and display your own genomic data in the context of annotated genome assemblies. You will use the Genome Data Viewer and the Sequence Viewer to visualize your own uploaded data (indexed BAM, VCF, BED, wig, GFF formats), data from public track hubs, and your BLAST and Primer-BLAST results. You will also learn to take advantage of features of the viewers including optimizing display settings, sharing a view with collaborators, exporting images, and downloading genes or other features in the view.
Date and time: Wed, June 2, 2021 12:00 PM – 12:45 PM EDT
We’ve just released a new version (1.6.0) of Magic-BLAST, the BLAST-based next-gen alignment tool, with these improvements:
Usage reporting — you can help improve Magic-BLAST by sharing limited information about your search. The BLAST User Manual has details on the information collected, how it is used, and how to opt-out.
Magic-BLAST can access NCBI SRA next-gen reads from the cloud when you use the -sra or -sra_batch options. See the Magic-BLAST cookbook for more details.
NCBI taxonomy IDs are reported in SAM output if they are present in the target BLAST database.
You can get unaligned reads reported separately from the aligned ones by using the -out_unaligned <file name> option. You can also select the format (SAM, tabular, or FASTA) with the -unaligned_fmt option. The default format is the same as the one used for the main report.
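As a rough illustration of how the options above fit together, here is a minimal Python sketch that assembles (but does not run) a Magic-BLAST command line. The -sra, -out_unaligned, and -unaligned_fmt options come from the release notes above; -db and -out are the usual Magic-BLAST database and output options. The accession SRR0000000, the database name, and the file names are hypothetical placeholders, and the lowercase format names are an assumption, so check magicblast -help on your installation before relying on them.

```python
import shlex

# Assumed spellings of the format names accepted by -unaligned_fmt.
UNALIGNED_FORMATS = {"sam", "tabular", "fasta"}


def build_magicblast_command(sra_accession, database, out_path,
                             unaligned_path, unaligned_fmt="fasta"):
    """Return the argument list for a Magic-BLAST run that writes
    unaligned reads to a separate file."""
    if unaligned_fmt not in UNALIGNED_FORMATS:
        raise ValueError(f"unsupported unaligned format: {unaligned_fmt}")
    return [
        "magicblast",
        "-sra", sra_accession,            # read the run from SRA
        "-db", database,                  # target BLAST database
        "-out", out_path,                 # main report
        "-out_unaligned", unaligned_path, # unaligned reads go here
        "-unaligned_fmt", unaligned_fmt,  # format of the unaligned report
    ]


# Hypothetical placeholder values, for illustration only.
command = build_magicblast_command(
    "SRR0000000", "my_reference_db", "aligned.sam", "unaligned.fa"
)
print(shlex.join(command))
```

Building the command as a list keeps each option next to its value and makes it straightforward to hand to subprocess.run once the placeholders are replaced with real inputs.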
Missed our latest YouTube videos? Scroll down to see what we’ve been up to.
Add Preprint Citations in My Bibliography
The National Institutes of Health encourages investigators to post preprints to public repositories in order to speed the dissemination and enhance the rigor of their work. This video demonstrates how to add preprint citations to My Bibliography.
It is with much sadness that we recently learned of the passing of Mark Boguski, MD, PhD, a former Senior Investigator in the Computational Biology Branch at NCBI. Mark worked at the NCBI from 1989-2000 and made a lasting impression on the staff who are still with NCBI and who overlapped with his time here. Many of them have commented on social media about their personal interactions and fond memories of Mark.
Figure 1. Part of an alignment from a translating BLAST (blastx) search of a modified chicken translation factor sequence that Mark provided to Michael Crichton for The Lost World. Mark had edited the sequence by inserting DNA codons that BLAST translates to ‘MARK WAS HERE NIH’ thus leaving his autograph.
Join us on April 7, 2021 at 12PM eastern time to learn about new web BLAST and Primer-BLAST enhancements that improve your BLAST experience. You’ll also see a preview of some planned improvements to the databases that make it easier to find relevant matches.
Recent changes to web BLAST include added data columns on the descriptions table, so you can quickly find and sort your matches. Primer-BLAST now offers direct links from genome assembly pages, so you can easily select the specificity database. Primer-BLAST also now accepts multiple target templates, making it easy to design primers that can amplify several similar sequences, such as all splice variants of a gene or the same target (16S, COI) from different strains or species.
Date and time: Wed, April 7, 2021 12:00 PM – 12:45 PM EDT
Do you work with data from organisms outside the traditional set of model organisms? Join us on March 10, 2021 to learn how to use NCBI resources including NCBI’s Taxonomy and BLAST that can help you find information from your organism and closely related taxa. You will see an example that shows you how to retrieve and download gene sequences for a set of species, generate multiple sequence alignments, and design primers using Primer-BLAST.
Date and time: Wed, March 10, 2021 12:00 PM – 12:45 PM EST
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
The new Protein Family Model resource (Figure 1) provides a way for you to search across the evidence used by the NCBI annotation pipelines to name and classify proteins. You can find protein families by gene symbol, protein function, and many other terms. You have access to related proteins in the family and publications describing members. Protein Family Models includes protein profile hidden Markov models (HMMs) and BlastRules for prokaryotes, and conserved domain architectures for prokaryotes and eukaryotes. The HMMs in the collection include Pfam models, TIGRFAMs as well as models developed at NCBI either de novo, or from NCBI protein clusters. Each of the BlastRules (PMCID: 5753331) consists of one or more model proteins of known biological function with BLAST identity and coverage cutoffs. The conserved domain architectures are based on BLAST-compatible Position Specific Score Matrices (PSSMs) that constitute the NCBI Conserved Domain database.
Figure 1. Protein Family Model resource pages. Top panel. Home page. Middle panel, selected results summaries from a fielded search for the DnaK gene product (DnaK[Gene Symbol]). Bottom panel, a portion of an HMM record for DnaK derived from NCBI Protein Clusters (NF009946). The record also includes PubMed citations and HMMER analyses showing the RefSeq proteins named by this method.
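To make the idea of "identity and coverage cutoffs" concrete, here is a small Python sketch of testing one BLAST hit against one rule. It is only an illustration, not the annotation pipeline's implementation: the class, the function, and the cutoff values are hypothetical, and real BlastRules may define coverage in more detail (for example, separately for the query and the model protein).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastRuleCutoffs:
    """Hypothetical container for one rule's thresholds (in percent)."""
    min_identity: float
    min_coverage: float


def hit_satisfies_rule(percent_identity, percent_coverage, cutoffs):
    """Return True when a hit to the rule's model protein meets both
    the identity cutoff and the coverage cutoff."""
    return (percent_identity >= cutoffs.min_identity
            and percent_coverage >= cutoffs.min_coverage)


# Made-up cutoffs and hits, for illustration only.
example_rule = BlastRuleCutoffs(min_identity=90.0, min_coverage=95.0)
print(hit_satisfies_rule(93.5, 98.0, example_rule))  # meets both cutoffs
print(hit_satisfies_rule(93.5, 80.0, example_rule))  # coverage too low
```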
Primer-BLAST now has a “Primers common for a group of sequences” submission tab that allows you to design primers for a group of highly similar sequences. For example, you may want to test for expression of any transcript of a gene rather than a specific splice variant, so you want to design primers that cover all transcript variants. Or you may want to design primers that will amplify the same gene in closely related bacterial strains. To find primers for a group of related sequences, Primer-BLAST aligns the longest sequence to the rest to find common regions. It uses these regions to limit the locations of primers. The longest sequence is also used as the representative template sequence in the results. Figure 1 shows an example search for primers that will amplify all of the 15 splice variants for the human TP53 gene.
Figure 1. Primer-BLAST submission page and results for primers designed for the human TP53 transcripts. Top panel: The submission form with the “Primers common for a group of sequences” selected and the 15 RefSeq transcript accessions for TP53. Middle panel: The graphical results showing the longest sequence (NM_001126114.3) as the representative template, the locations of the primer pairs, and the alignment of the other template sequences. Bottom panel: An individual primer pair showing the locations on each of the template sequences.
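The "longest sequence as template, then common regions" idea described above can be sketched in a few lines of Python. This is a simplified illustration, not Primer-BLAST's actual algorithm: it assumes the alignments have already been computed and are given as half-open (start, end) blocks on the representative template, and every name and coordinate below is hypothetical.

```python
def pick_representative(sequences):
    """Return the name of the longest sequence (first one wins ties)."""
    return max(sequences, key=lambda name: len(sequences[name]))


def intersect_intervals(a, b):
    """Intersect two sorted lists of non-overlapping half-open intervals."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def common_regions(template_length, aligned_blocks):
    """Regions of the template covered by every other sequence.

    aligned_blocks maps a sequence name to the (start, end) blocks on the
    template where that sequence aligns.
    """
    common = [(0, template_length)]
    for blocks in aligned_blocks.values():
        common = intersect_intervals(common, sorted(blocks))
    return common


# Toy data: the sequence content does not matter here, only the lengths.
sequences = {"variant_1": "A" * 100, "variant_2": "A" * 80, "variant_3": "A" * 70}
template = pick_representative(sequences)
blocks = {
    "variant_2": [(0, 40), (60, 100)],  # skips positions 40-59 of the template
    "variant_3": [(10, 80)],
}
print(template, common_regions(len(sequences[template]), blocks))
# variant_1 [(10, 40), (60, 80)]
```

Primers placed entirely inside the returned regions would lie in sequence shared by every member of the group, which is the constraint the paragraph above describes.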
Please try out this new feature and let us know what you think!
To provide a more efficient BLAST experience for everyone, we’re changing some parameters and limits on the web BLAST service on September 8, 2020. The new settings, listed below, will improve overall performance and make search times more consistent.
The Expect Value Threshold default setting will be reduced to 0.05.
The maximum number of target sequences (Max target sequences) limit will be no more than 5,000.
The maximum allowed query length will be 1,000,000 for nucleotide queries (blastn, blastx, and tblastx) and 100,000 for protein queries (blastp and tblastn).
These changes will help keep the BLAST service running smoothly as the already very large databases continue to grow rapidly. If you have any questions or concerns, please email us at firstname.lastname@example.org
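If you submit searches from scripts, a quick pre-check against the new limits can save a rejected request. The sketch below encodes only the numbers listed above; the function and constant names are hypothetical, and it assumes the query length is a single total, since the announcement does not say how multi-sequence queries are counted.

```python
# Limits taken from the announcement above.
NUCLEOTIDE_QUERY_PROGRAMS = {"blastn", "blastx", "tblastx"}
PROTEIN_QUERY_PROGRAMS = {"blastp", "tblastn"}
MAX_NUCLEOTIDE_QUERY_LENGTH = 1_000_000
MAX_PROTEIN_QUERY_LENGTH = 100_000
MAX_TARGET_SEQUENCES = 5_000
DEFAULT_EXPECT_THRESHOLD = 0.05


def check_web_blast_request(program, query_length, max_target_seqs):
    """Return a list of human-readable problems (empty if none)."""
    if program in NUCLEOTIDE_QUERY_PROGRAMS:
        length_limit = MAX_NUCLEOTIDE_QUERY_LENGTH
    elif program in PROTEIN_QUERY_PROGRAMS:
        length_limit = MAX_PROTEIN_QUERY_LENGTH
    else:
        raise ValueError(f"unknown BLAST program: {program}")

    problems = []
    if query_length > length_limit:
        problems.append(
            f"query length {query_length} exceeds the {length_limit} limit for {program}"
        )
    if max_target_seqs > MAX_TARGET_SEQUENCES:
        problems.append(
            f"max target sequences {max_target_seqs} exceeds {MAX_TARGET_SEQUENCES}"
        )
    return problems


print(check_web_blast_request("blastp", 250_000, 100))
print(check_web_blast_request("blastn", 250_000, 100))
```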
You can now download publication-quality graphic images of the alignment displayed in the NCBI Multiple Sequence Alignment Viewer (Figure 1). Load sequence alignments into the viewer from BLAST or COBALT results or upload alignment files directly. Once you have the alignment set in the viewer, choose the “Printer-friendly PDF/SVG” option in the Download menu on the toolbar to save the image. The PDF and SVG files contain vector graphics suitable for presentation and publication.
Figure 1. The image download options in the MSAV. You can adjust the desired coordinate range and choose to download a PDF or SVG image. You can also preview the PDF download. Choose simplified color shading to improve compatibility with some graphics programs.
The downloaded image will show the coordinate range you requested and will include all the rows in the alignment.
Please contact us through the Feedback link on the MSA Viewer or write to the NCBI Help Desk to provide feedback and let us know how we can make the NCBI Multiple Sequence Alignment Viewer work better for you.