Monkeypox virus: Complete genome from the current outbreak now available in GenBank

The first complete genome sequence from the current monkeypox virus (MPXV) outbreak (isolate name MPXV_USA_2022_MA001) is now available under accession ON563414 in GenBank, a public database of DNA sequences hosted by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM).

Several cases of monkeypox have been identified in geographically widespread countries. Monkeypox is classified as a zoonotic disease, in which the virus is usually transmitted through animal-to-human contact. Genetically, monkeypox viruses cluster into two clades: the Congo basin clade and the west African clade. The current outbreak has been attributed to a virus from the west African clade, which is often associated with milder disease; in this outbreak, human-to-human spread is suspected.

Having viral genome data freely and widely available in GenBank enables researchers to explore how this virus differs from viruses isolated and sequenced in the past. This new genome sequence, produced from a Massachusetts isolate, was submitted to GenBank by the Division of High-Consequence Pathogens and Pathology of the US Centers for Disease Control and Prevention. It is most similar to monkeypox virus genomes collected during a small international outbreak in 2017-18 (see Figure 1), and it differs from one of these sequences, MT903343.1, at fewer than 100 of more than 197,000 nucleotide bases.
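To make the comparison concrete, here is a minimal Python sketch of how differences between two already-aligned genome sequences could be counted. It is an illustration only, not the method used to produce the figures quoted above: the function names and the toy sequences are hypothetical, and counting a gap as a difference is an assumption that real analyses may handle otherwise. By this kind of count, fewer than 100 differences across more than 197,000 bases corresponds to more than 99.9% identity.

```python
def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Count alignment columns at which two aligned sequences differ.

    Assumption: both inputs come from the same alignment, so they have
    equal length, and a gap ('-') opposite a base counts as a difference.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    return sum(1 for a, b in pairs if a != b)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns at which the two sequences match."""
    length = len(aligned_a)
    if length == 0:
        raise ValueError("sequences must not be empty")
    return 100.0 * (length - count_differences(aligned_a, aligned_b)) / length


# Toy sequences for illustration only (not real MPXV data).
reference = "ATGCGTACGT"
query = "ATGCGTTCG-"
print(count_differences(reference, query))
print(percent_identity(reference, query))
```

For whole genomes, the two strings would be the corresponding rows of a multiple sequence alignment such as the one described in the Figure 1 caption.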

Figure 1: Phylogenomic tree of monkeypox virus genomes. Representative monkeypox virus genomes were selected from both the Congo basin and the west African clades. Nucleotide sequences were aligned with MAFFT (FFT-NS-2, v. 7.450), the phylogeny was built with the IQ-TREE v. 1.6.12 web server, and the consensus tree was visualized with the iTOL v. 6.5.6 web server. The tree scale is shown as the expected number of substitutions per site.

MPXV sequences can be accessed through NCBI Virus shortly after they are released to the public. Several filters are available there, and sequence data or sample-descriptor data can be downloaded. Sequences from numerous other countries have since been released as well. Unassembled sequence read data can be accessed through the Sequence Read Archive by searching for “txid10244[Organism]”.
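The same Sequence Read Archive search can also be expressed programmatically. The sketch below builds a query URL for the NCBI E-utilities ESearch endpoint using the search term given above. Using E-utilities at all is an assumption on our part (the text above describes searching the website), the helper name `build_sra_search_url` is hypothetical, and the sketch only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_search_url(term: str = "txid10244[Organism]", retmax: int = 20) -> str:
    """Build an ESearch URL that queries the SRA database for `term`.

    `retmax` caps how many record identifiers are returned; the default
    of 20 is an arbitrary choice for this illustration.
    """
    params = {
        "db": "sra",        # search the Sequence Read Archive
        "term": term,       # taxonomy-based query from the text above
        "retmax": retmax,
        "retmode": "json",  # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Print the URL; fetching it (for example with urllib.request) is left out.
print(build_sra_search_url())
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching SRA record identifiers. Anyone sending many requests should follow NCBI's usage guidelines for E-utilities.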

If you have monkeypox virus sequences to submit, please include at least the collection date and location, and the sequencing methodology in as much detail as possible. Learn more about submitting viral sequences.

Feel free to reach out to us if you need any assistance along the way.

