NCBI will be attending and presenting at ASM Microbe 2018 this June. Read on for all NCBI activities.
Continue reading “NCBI attending ASM Microbe 2018 June 7-11”
Month: May 2018
RefSeq release 88 is now accessible online, via FTP and through NCBI’s programming utilities. This full release incorporates genomic, transcript, and protein data available as of May 14, 2018. It contains 160,224,355 records, including 110,333,800 proteins, 22,461,378 RNAs, and sequences from 79,448 organisms. The release is provided in several directories, both as a complete dataset and divided into logical groupings.
This release incorporates dbSNP release 151, which nearly doubles the number of SNPs annotated on the human GRCh38 genome, with matching increases in the size of the human nucleotide flatfile (.gbff) records.
Starting in November 2018, SNP variation features will no longer be in RefSeq genome assembly records. The RefSeq release notes have information about this change.
IgBLAST 1.9.0 supports the adaptive immune receptor repertoire (AIRR) standard for sequence analysis results. The AIRR format is available on web IgBLAST as well as in the standalone IgBLAST tool, with the -outfmt 19 option.
Supporting this new schema in IgBLAST will benefit the growing number of repertoire studies that use next-generation sequencing technology to generate very large sets of Ig/T-cell receptor rearrangement analysis data.
IgBLAST facilitates the analysis of immunoglobulin and T cell receptor variable domain sequences. Get IgBLAST on FTP. A new manual is on GitHub.
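As a rough illustration of the -outfmt 19 option mentioned above, the Python sketch below builds a standalone igblastn command line and reads AIRR-style tab-separated output. The germline database paths, file names, and the sample row are placeholders made up for this example, and the handful of columns shown is only a small subset of what the AIRR format defines.

```python
import csv
import io


def build_igblast_command(query_fasta, out_tsv, organism="human"):
    """Return an igblastn argument list requesting AIRR output (-outfmt 19).

    The germline database paths are placeholders; point them at the
    databases you actually built for your organism.
    """
    return [
        "igblastn",
        "-query", query_fasta,
        "-organism", organism,
        "-germline_db_V", "database/placeholder_V",
        "-germline_db_D", "database/placeholder_D",
        "-germline_db_J", "database/placeholder_J",
        "-outfmt", "19",
        "-out", out_tsv,
    ]


def read_airr(handle):
    """Read AIRR tab-separated rows into a list of dictionaries."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical query and output file names.
cmd = build_igblast_command("reads.fasta", "reads.airr.tsv")
print(" ".join(cmd))

# Made-up sample content with a few AIRR column names, for illustration only.
sample = (
    "sequence_id\tv_call\tj_call\tproductive\n"
    "seq1\tIGHV1-2*02\tIGHJ4*02\tT\n"
)
rows = read_airr(io.StringIO(sample))
for row in rows:
    print(row["sequence_id"], row["v_call"], row["j_call"], row["productive"])
```

To run the search itself, the argument list could be passed to subprocess.run(cmd, check=True) on a machine where igblastn and the germline databases are installed.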
We know it’s not always easy to find the sequence data you’re after at NCBI. Maybe it’s because you’re no expert at constructing queries, and you end up with no results or too many results. Or maybe you’re an Entrez wizard, but creating a query full of Booleans and filters seems like overkill when you could just write a short natural language query, like you’re used to doing in Google. The next time you search for a gene, transcript or genome assembly for a given organism, try the new search experience we’re piloting in NCBI Labs.
In NCBI Labs, you can now search for sequences using natural language and get the best results.
The improved search experience now available in NCBI Labs addresses 3 types of queries that commonly fail in searches at NCBI: organism-gene (e.g. human BRCA1), organism-transcript (e.g. Mouse p53 transcripts) and organism-assembly (e.g. dog reference genome). For each of these query types in NCBI Labs, we now return NCBI’s highest quality sequence sets or reference and representative assemblies in an easy-to-view panel.
Example queries are shown below to get you started.
Continue reading “Test drive a new sequence search experience at NCBI Labs”
GenBank release 225.0 (4/14/2018) has 208,452,303 traditional records (including non-bulk-oriented TSA) containing 260,189,141,631 base pairs of sequence data. In addition, there are 621,379,029 WGS records containing 2,784,740,996,536 base pairs of sequence data, 227,364,990 TSA records containing 205,232,396,043 base pairs of sequence data, and 14,782,654 TLS records containing 5,612,769,448 base pairs of sequence data.
During the 60 days between the close dates for GenBank releases 224.0 and 225.0, the traditional portion of GenBank grew by 6,558,433,533 base pairs and by 1,411,748 sequence records. During that same period, 86,960 records were updated – an average of 24,978 records added or updated per day.
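The figures above can be checked with a few lines of Python. This sketch only re-adds the numbers quoted in this post; the variable names are chosen for the example.

```python
# Record counts quoted for GenBank release 225.0.
traditional = 208_452_303
wgs = 621_379_029
tsa = 227_364_990
tls = 14_782_654

# Total records across all four divisions.
total_records = traditional + wgs + tsa + tls
print(f"Total records: {total_records:,}")

# Growth of the traditional portion between releases 224.0 and 225.0.
records_added = 1_411_748
records_updated = 86_960
days_between_releases = 60

# Whole records added or updated per day (integer division drops the fraction).
per_day = (records_added + records_updated) // days_between_releases
print(f"Added or updated per day: {per_day:,}")
```

The total comes to 1,071,978,976 records, which is where the "over 1 billion" in the post title comes from, and the daily average works out to the 24,978 stated above.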
Continue reading “GenBank release 225: Over 1 billion sequence records stored!”
From July 11-13, 2018, NCBI will help with a data science hackathon at the Northwestern Feinberg School of Medicine campus in downtown Chicago. This hackathon focuses on genomics and general data science analyses including text, image, and sequence processing. The event is for researchers, including students and postdocs, who already use large datasets or develop pipelines for analyses from high-throughput experiments. Some projects are also open to non-scientific developers, mathematicians, or librarians. The hackathon is open to anyone who is selected to participate and is willing to travel to Chicago.
Next Wednesday, May 16, 2018, we’ll show you how to download and use the latest standalone BLAST databases, dbv5. You’ll learn how to use BLASTdbv5 and the new BLAST programs to limit searches to taxonomic groups and to retrieve sequences from the database by taxonomy.
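As a rough preview of the kind of taxonomy-limited commands involved, the Python sketch below assembles a blastn search restricted to given taxonomy IDs and a blastdbcmd call that retrieves sequences by taxonomy ID. It assumes a BLAST+ build with dbv5 support that accepts the -taxids option; the database name nt_v5 and the file names are placeholders, so check -help on your installed version before relying on it.

```python
def join_taxids(taxids):
    """Format taxonomy IDs as the comma-separated string BLAST+ expects."""
    return ",".join(str(t) for t in taxids)


def blastn_taxid_command(query, db, taxids, out):
    """Build a blastn command limited to the given taxonomy IDs.

    Assumes a dbv5-aware BLAST+ release that supports -taxids.
    """
    return [
        "blastn",
        "-query", query,
        "-db", db,
        "-taxids", join_taxids(taxids),
        "-outfmt", "6",  # tabular output
        "-out", out,
    ]


def blastdbcmd_taxid_command(db, taxids, out):
    """Build a blastdbcmd command that dumps sequences for the given taxonomy IDs."""
    return [
        "blastdbcmd",
        "-db", db,
        "-taxids", join_taxids(taxids),
        "-out", out,
    ]


# 9606 is the NCBI Taxonomy ID for human, 10090 for house mouse.
search_cmd = blastn_taxid_command("query.fasta", "nt_v5", [9606, 10090], "hits.tsv")
fetch_cmd = blastdbcmd_taxid_command("nt_v5", [9606], "human_sequences.fasta")

print(" ".join(search_cmd))
print(" ".join(fetch_cmd))
```

The sketch only prints the commands. On a machine with BLAST+ and a version 5 database installed, either list could be passed to subprocess.run.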
Date and time: Wednesday, May 16, 2018 12:00 – 12:30 PM EDT
Register here: https://bit.ly/2qW7LLy
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
Next Wednesday, May 9, 2018, NCBI staff will show you how to use PubChem as a cheminformatics education resource. In addition to learning about tools and services for chemical information search, analysis, and download, you will also see examples of how instructors incorporate PubChem in Cheminformatics OLCC (On-Line Chemistry Courses), an intercollegiate hybrid course.
Date and time: Wednesday, May 9, 2018 12:00 – 12:30 PM EDT
Register here: https://bit.ly/2q5wtsF
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
We recently updated the BLAST Amazon Machine Image (AMI) on Amazon Web Services (AWS). The AMI is preconfigured with BLAST+ 2.7.1 and supports a subset of the NCBI BLAST URL API. The latest version also fixes long download times and preserves BLAST databases and results between reboots.
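For readers unfamiliar with the NCBI BLAST URL API, the Python sketch below shows the general shape of its two basic requests: a Put request that submits a search and a Get request that retrieves results by request ID (RID). The host name and endpoint path are placeholders, the RID is made up, and which parameters the AMI's subset accepts is not stated here, so treat this as an illustration rather than a description of the AMI.

```python
from urllib.parse import urlencode

# Placeholder endpoint; substitute the address of your own instance.
BASE_URL = "http://example.invalid/cgi-bin/blast.cgi"


def put_url(base, program, database, query):
    """Build a URL that submits a search (CMD=Put)."""
    params = {
        "CMD": "Put",
        "PROGRAM": program,
        "DATABASE": database,
        "QUERY": query,
    }
    return base + "?" + urlencode(params)


def get_url(base, rid, format_type="Text"):
    """Build a URL that fetches results for a request ID (CMD=Get)."""
    params = {
        "CMD": "Get",
        "RID": rid,
        "FORMAT_TYPE": format_type,
    }
    return base + "?" + urlencode(params)


submit = put_url(BASE_URL, "blastn", "nt", "ACGT")
fetch = get_url(BASE_URL, "HYPOTHETICALRID")  # made-up RID for illustration

print(submit)
print(fetch)
```

In a real session the RID would be parsed from the response to the Put request and polled with Get until the results are ready.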
The NCBI Eukaryotic Genome Annotation Pipeline has recently released new annotations in RefSeq for the following organisms:
See more details on the Eukaryotic RefSeq Genome Annotation Status page.