An Introduction to Molecular Evolutionary Analysis with NCBI Datasets and Python
As species diverge from a common ancestor, they accumulate differences in their DNA sequences. Differences within a protein-coding region are classified into two types: non-synonymous substitutions change the amino acid sequence of the protein, while synonymous substitutions do not. Synonymous substitutions are largely invisible to natural selection and tend to accumulate at a roughly constant rate. Non-synonymous substitutions, on the other hand, accumulate faster when their effects are beneficial and more slowly when their effects are deleterious, because selection tends to remove the latter. By comparing the rates of non-synonymous and synonymous substitutions, we can infer whether natural selection has primarily acted to conserve the protein sequence or to adapt it to a new environment or function.
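To make the distinction concrete, here is a minimal sketch in plain Python that classifies codon differences between two aligned coding sequences. It is an illustration only, not necessarily the method taught in the workshop. The function name `classify_substitutions` and the example sequences are hypothetical, and the sketch assumes the standard genetic code, gap-free sequences aligned in frame, and at most one changed base per codon (codons with more changes are counted as skipped).

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitutions(seq_a, seq_b):
    """Count synonymous and non-synonymous codon differences.

    Hypothetical helper: assumes both sequences are in frame, the same
    length, and gap-free. Codons that differ at more than one position,
    or that contain characters other than A, C, G, T, are skipped because
    classifying them requires assumptions about the mutational path.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")

    counts = {"synonymous": 0, "non_synonymous": 0, "skipped": 0}
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # no substitution in this codon
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diff > 1 or codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            counts["skipped"] += 1
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            counts["synonymous"] += 1
        else:
            counts["non_synonymous"] += 1
    return counts


# Made-up example: GCT->GCC keeps alanine; AAA->AGA and GAT->GAA change the residue.
print(classify_substitutions("ATGGCTAAAGAT", "ATGGCCAGAGAA"))
```

These are raw counts, not rates. Estimating substitution rates also requires normalizing by the number of synonymous and non-synonymous sites and correcting for multiple substitutions at the same site, which this sketch deliberately leaves out.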
In this workshop you will learn to compare the protein-coding sequences of two species to identify which proteins show signs of adaptation. Working in a Jupyter notebook with bash and Python, you will use the NCBI Datasets command-line interface (CLI) to download sequence data, then analyze it with a few popular Python packages. The workshop assumes basic familiarity with a scripting language such as Python or R, at a level equivalent to a semester course or programming bootcamp.
By the end of the workshop, you will know how to:
- Search for and download protein ortholog sequences with the NCBI Datasets CLI
- Parse the downloaded files with Biopython
- Identify synonymous and non-synonymous substitutions and calculate substitution rates
- Plot the results with Matplotlib
Due to curricular and technical limits, we’ve capped the number of spots to provide the best workshop experience. If you apply, you will be notified of your application status 2 weeks before the scheduled event.
This workshop requires a stable internet connection and the use of a modern web browser. We will be presenting using Zoom. For some tips about this platform, please refer to Zoom Support Documents.
Our workshops are intended to provide hands-on experience, so we encourage you to follow along and complete the practice exercises during the event. While many users are able to move back and forth between Zoom and the practice exercises on a single screen, it is helpful to have two screens available. If you only have access to one computer screen, consider viewing the Zoom session on a tablet or phone and using your computer for the practice exercises.