We’ve released a new version of IgBLAST, v1.12. This new version increases the allowed distance between the V gene end and J gene start positions (from 90 bp to 150 bp), as well as between the V gene end and D gene start positions (from 55 bp to 120 bp), to accommodate the extremely long VDJ junctions found in some antibodies.
IgBLAST 1.12 uses the 1-based sequence coordinate system that reflects the change in the new AIRR Rearrangement Schema. Also, it includes fixes for minor bugs found in previous versions.
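If your downstream scripts slice sequences in a 0-based language such as Python, 1-based coordinates need a small adjustment. The sketch below assumes the reported start and end positions form a 1-based closed interval (both ends inclusive); the function name is hypothetical and is not part of IgBLAST or any AIRR library.

```python
def closed_1based_to_slice(start, end):
    """Convert a 1-based closed interval [start, end] to a Python slice.

    Assumption: both coordinates are inclusive and 1-based.
    """
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based closed interval: {start}..{end}")
    # Shift the start down by one; the inclusive end becomes the
    # exclusive stop of a 0-based half-open slice unchanged.
    return slice(start - 1, end)


sequence = "ACGTACGT"
# Positions 2 through 4 (1-based, inclusive) cover the bases C, G, T.
region = sequence[closed_1based_to_slice(2, 4)]
```

The length of the region is `end - start + 1`, which is a quick sanity check when migrating older scripts.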
The NCBI will host a collaborative biodata science hackathon on the NIH Campus in Bethesda, Maryland, on February 20-22.
We are now collecting project proposals focusing on building tools and pipelines for advanced analysis of biomedical datasets including text, images, next generation sequencing data, proteomics, and metadata. Proposals for tutorial pipelines and educational tools for advanced analysis are also welcome.
Submit your project proposal here! Submissions are due January 7, 2019.
This month, the NCBI Eukaryotic Genome Annotation Pipeline annotated its 500th organism! The lucky winner is Pocillopora damicornis, a stony reef-building coral frequently used as an experimental model, whose larval dispersal and development are affected by environmental changes in the oceans.
Going to the ASCB | EMBO meeting? Stop by the NCBI booth (#327) to learn about all that NCBI has to offer, ask questions, and provide feedback on how we can better meet your needs for research and teaching.
Booth #327, Exhibit Hall:
Sunday, December 9, 9:30 AM – 4:00 PM
Monday, December 10, 9:30 AM – 4:00 PM
Tuesday, December 11, 9:30 AM – 4:00 PM
Visit the booth anytime during exhibit hours to discuss any topic or just to say hello. We’re also offering specific times at the booth for focused conversations about using specific sets of NCBI resources in your research and teaching.
12:30 PM NCBI BLAST in research and teaching
12:30 PM Jupyter notebooks to teach scripting and NCBI resources
12:30 PM EDirect for command-line access to NCBI databases
2:00 PM Jupyter notebooks to teach scripting and NCBI resources
To stay up-to-date about NCBI at ASCB or in general, follow us on Twitter at @NCBI.
As previously announced, GenBank and other INSDC members will expand the accession formats used for sequencing projects by the end of this year. We’re introducing these new formats to accommodate the growth of Whole Genome Shotgun (WGS), Transcriptome Shotgun Assembly (TSA), and Targeted Locus Study (TLS) sequencing projects. More details about those changes are available on NCBI Insights.
You may have to adjust your code and databases to accommodate the new formats’ longer length. In particular, the first line of the flatfile format, referred to as the LOCUS line, includes the “Locus Name” (usually identical to the accession number), which may now grow to as long as 20 characters. See section 3.4.4 of the GenBank release notes for examples of how the LOCUS line might change.
Since 2003, the GenBank release notes have recommended that flatfile parsers split each line into whitespace-separated tokens, which accommodates changes like the one described in section 3.4.4. If your flatfile parsers rely solely on fixed column positions, you may have to modify them. In our internal testing, BioPython and BioPerl properly handled most of the examples shown in section 3.4.4; they only had issues with the last theoretical examples, where the sequence length no longer ends at position 40. We recommend adjusting code to handle those theoretical examples as well, for future-proofing.
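As a rough illustration of the token-based approach, here is a minimal Python sketch that reads a LOCUS line without relying on column positions. The function name, the returned field names, and the sample line (including its 18-character locus name) are made up for illustration and are not taken from the release notes. The sketch assumes the sequence length is the token immediately before a "bp" or "aa" unit token.

```python
def parse_locus_line(line):
    """Parse a GenBank LOCUS line by whitespace-separated tokens.

    Hypothetical helper: it locates the length via the unit token
    ("bp" or "aa") rather than by fixed column positions.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line")

    # The unit token can appear no earlier than index 3:
    # LOCUS, locus name, length, unit.
    unit_index = None
    for i, token in enumerate(tokens):
        if i >= 3 and token in ("bp", "aa"):
            unit_index = i
            break
    if unit_index is None:
        raise ValueError("no sequence length unit found on LOCUS line")

    return {
        "name": tokens[1],                        # locus name, any length
        "length": int(tokens[unit_index - 1]),    # token before the unit
        "unit": tokens[unit_index],
        "rest": tokens[unit_index + 1:],          # molecule type, topology, etc.
    }


# Made-up example with a long locus name that would break column-based parsing.
sample = "LOCUS       XXXXXX010000000001 1234567 bp    DNA     linear   BCT 01-JAN-2019"
record = parse_locus_line(sample)
```

Because the line is split on whitespace, the locus name can grow without shifting the fields that follow it. Any remaining fields are returned as a list so that callers can interpret them as they see fit.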
Please write to the helpdesk with any questions about the new formats.