IgBLAST 1.7.0 release
A new version of IgBLAST is now available on FTP, with the following new features:
- Specify whether overlapping nucleotides at VDJ junctions are allowed in matching V, D, and J genes.
- Set a custom J gene mismatch penalty.
- Report the CDR3 start and stop positions in the sub-region table.
- Use alignment length instead of percent identity as the tie-breaker for hits with identical BLAST scores, improving accuracy in the V, D, and J gene assignment.
IgBLAST was developed at the NCBI to facilitate the analysis of immunoglobulin and T cell receptor variable domain sequences.
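To illustrate the new tie-breaking rule, here is a minimal Python sketch. It is not IgBLAST's actual implementation: the `Hit` class, the `rank_hits` function, and the gene names are hypothetical and chosen only for this example. It ranks hits by score and, when scores are identical, prefers the longer alignment rather than the higher percent identity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A hypothetical, simplified alignment hit (not an IgBLAST data structure)."""
    gene: str            # germline gene name (made-up values below)
    score: float         # BLAST score of the alignment
    align_length: int    # number of aligned positions
    pct_identity: float  # percent identity over the alignment


def rank_hits(hits):
    """Sort hits by score (highest first); break ties by alignment length (longest first)."""
    return sorted(hits, key=lambda h: (-h.score, -h.align_length))


hits = [
    Hit("V-gene-A", 250.0, 280, 99.0),  # same score, shorter alignment, higher identity
    Hit("V-gene-B", 250.0, 295, 98.0),  # same score, longer alignment, lower identity
    Hit("V-gene-C", 240.0, 296, 97.0),  # lower score, so it ranks last
]

ranked = rank_hits(hits)
print([h.gene for h in ranked])
```

In this sketch, `V-gene-B` ranks ahead of `V-gene-A` even though its percent identity is lower, because both have the same score and `V-gene-B` has the longer alignment. How IgBLAST orders hits that also tie on alignment length is not stated in the announcement; the sketch simply keeps their input order.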
Sequence Viewer 3.21
Sequence Viewer 3.21 has several new features, improvements and bug fixes, including improved track hiding and parameters to define the pile-up graph size in BAM files (demo). For a full list of changes, see the Sequence Viewer release notes.
Sequence Viewer is a graphical view of sequences and color-coded annotations on regions of sequences stored in the Nucleotide and Protein databases.
dbGaP (the NIH database of Genotypes and Phenotypes) is celebrating its 10th Anniversary this year! We are proud to support over 850 studies and 1.6 million samples.
We invite you to join us at the dbGaP 10th Anniversary Symposium, to be held on June 9, 2017, from 1:30 to 3:00 PM in Wilson Hall, Building 1, on the NIH Bethesda campus. The symposium includes six lightning talks highlighting the past, present and future of the dbGaP resource, followed by a social hour.
Feel free to distribute this flyer. We hope to see you at the Symposium!
Please contact firstname.lastname@example.org if you have any questions.
Reasonable accommodations will be provided for individuals with disabilities.
Read on for a list of speakers and abstracts.
NCBI is discontinuing the BLink protein similarity service effective immediately. BLink provided graphical access to related proteins from protein records in the Entrez system. Because of the increasing volume of data in the protein database, BLink has become less useful as a tool for finding related sequences and is no longer maintainable.
Temporary replacement for BLink
The BLink service will redirect to a live protein-protein BLAST search against the Landmark database used by SmartBLAST. The Landmark database, described in the SmartBLAST documentation, provides matches from 27 selected cellular organisms with well-annotated complete genomes representing a broad taxonomic range. The results from the redirected BLink search will be shown as a Tax BLAST report, as shown in the figure below. Like the BLink output, the Tax BLAST report emphasizes the taxonomic source of the protein matches. From this new starting point, you can explore additional protein similarities through the BLAST service by re-submitting the search against other BLAST databases, including the non-redundant (nr) database.
Figure 1. The Tax BLAST report for proton ATPase A.
Figure 1. The QuickBLASTP option is available under “Program Selection”.
QuickBLASTP, an accelerated version of BLASTP, adds a new pre-processing step to the non-redundant (nr) protein database. In a matter of seconds, QuickBLASTP will find approximately 97% of the database sequences with 70% or more identity to your query and around 98% of the database sequences with 80% or more identity to your query.
Currently, QuickBLASTP will only accept searches with a total query length less than 10,000 residues. You may only search the nr database with QuickBLASTP.
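If you prepare queries in bulk, it may help to check the length limit before submitting. The Python sketch below is only an illustration: the function names and the constant are our own, and it assumes the limit applies to the summed number of residues across all sequences in a FASTA input. It counts every non-whitespace character on sequence lines, which may differ slightly from how the BLAST service itself counts.

```python
# Limit stated in the announcement: total query length must be LESS than this.
MAX_TOTAL_QUERY_LENGTH = 10_000


def total_query_length(fasta_text: str) -> int:
    """Sum the residues over all sequences in a FASTA-formatted string."""
    total = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and definition lines
        total += len("".join(line.split()))  # ignore any internal whitespace
    return total


def fits_quickblastp(fasta_text: str) -> bool:
    """Return True if the total query length is under the stated limit."""
    return total_query_length(fasta_text) < MAX_TOTAL_QUERY_LENGTH


example = ">query1\nMKTAYIAKQR\nQISFVKSHFS\n>query2\nMADEEKLPPG\n"
print(total_query_length(example), fits_quickblastp(example))
```

Queries that fail this check would need to be split into smaller batches or run with standard BLASTP instead.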
RefSeq release 82 is accessible online, via FTP and through NCBI’s programming utilities. This full release incorporates genomic, transcript, and protein data available as of May 8, 2017 and contains 127,098,289 records, including 84,756,971 proteins, 18,901,573 RNAs, and sequences from 69,035 organisms. The release is provided in several directories as a complete dataset and also as divided by logical groupings.
This blog post is directed toward people who use dbSNP and dbVar, particularly those who submit non-human data to the two databases.
dbSNP and dbVar archive, process, display and report information related to germline and somatic variations from multiple species. These two databases have grown rapidly as sequencing and other discovery technologies have evolved, and now contain nearly two billion variants from over 360 species.
Based on projected growth and the resources required to archive and distribute the data, continued support for all organisms will become unsustainable for NCBI in the near future. Therefore, NCBI will phase out support for all non-human organisms in dbSNP and dbVar, and will support only human variation.
NCBI will phase out support for non-human organisms in dbSNP and dbVar following this timeline:
- September 1, 2017 – dbSNP and dbVar stop accepting non-human variant data submissions
- November 1, 2017 – dbSNP and dbVar interactive websites and related NCBI services stop presenting non-human variant data. The data will, however, continue to be available for download on the dbSNP and dbVar FTP sites.
Any non-human data that is already in the databases or that is submitted before September 1, 2017 will continue to be available via the dbSNP and dbVar FTP download sites.
If you want to submit non-human variation data now or after September 1, 2017, the European Bioinformatics Institute (EBI), one of our partners in the International Nucleotide Sequence Database Collaboration (INSDC), is accepting these data in the European Variation Archive.
Finally, we would like to thank all the submitters and users who have supported dbSNP and dbVar throughout the years.
This blog post is directed toward Assembly users.
A new “Download assemblies” button is now available in the Assembly database. This makes it easy to download data for multiple genomes without having to write scripts.
For example, you can run a search in Assembly and use check boxes (see left side of screenshot below) to refine the set of genome assemblies of interest. Then, just open the “Download assemblies” menu, choose the source database (GenBank or RefSeq), choose the file type, and start the download. An archive file will be saved to your computer that can be expanded into a folder containing your selected genome data files.
Figure 1. The “Download Assemblies” button is at the top right of the Assembly page. When you click on it, you will see options for source database and file type, and a download button. There are several options for file type, including Genomic GFF.
dbSNP’s Human Build 150 includes a large number of new submissions from Human Longevity, Inc. (HLI) and TopMed, increasing the total number of Human RefSNPs in the database from 154 million to 324 million. TopMed has also provided new allele frequency data for 163 million RefSNPs.
Central Bearded Dragon (Pogona vitticeps)
(Credit: Mark Sum, USGS. Public domain.)
In April, the NCBI Eukaryotic Genome Annotation Pipeline released new annotations in RefSeq for the following eleven organisms:
- Bombus terrestris (buff-tailed bumblebee)
- Ceratitis capitata (Mediterranean fruit fly)
- Athalia rosae (coleseed sawfly)
- Dendrobium catenatum (a monocot)
- Phalaenopsis equestris (a monocot)
- Orbicella faveolata (stony coral)
- Pogona vitticeps (central bearded dragon)
- Oryzias latipes (Japanese medaka)
- Sesamum indicum (sesame)
- Jatropha curcas (a eudicot)
- Amborella trichopoda (a flowering plant)
See more details on the Eukaryotic RefSeq Genome Annotation Status page.
From June 19-21, 2017, the NCBI will assist in a bioinformatics hackathon at the New York Genome Center (NYGC). This hackathon will focus on advanced bioinformatics analysis of next-generation sequencing (NGS) data, proteomics and metadata. To apply for this hackathon, complete this application (which takes approximately 10 minutes). Applications are due Monday, May 22, 2017 by 5 PM ET.
This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for bioinformatics analyses from high-throughput experiments. Some projects are also open to non-scientific developers, mathematicians or librarians.
The event is open to anyone selected for the hackathon and able to travel to the NYGC (see address below).