We have updated the bacterial and archaeal representative genome collection! The current collection contains over 13,000 assemblies selected from the 203,000 prokaryotic RefSeq assemblies to represent their respective species. The collection has increased by 11% since August 2020. We’ve included about 1,400 species for the first time, have used better assemblies for 1,177 species, and have removed 65 species because of changes in NCBI Taxonomy or uncertainty in their species assignment.
We have also updated the Representative Genomes Database on the Microbial Nucleotide BLAST page, as well as the RefSeq Representative Genome Database on basic nucleotide BLAST, to reflect these changes.
You can download the reference and representative set from the Assembly resource. If you are interested in the annotation on these genomes, you can limit searches to proteins annotated on representative genomes by adding “refseq_select[filter]” to any query in the Protein database. For example, you can find all proteins annotated on representative genomes in the genus Klebsiella by using the query: “Klebsiella[organism] AND refseq_select[filter]”.
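If you prefer to script this kind of search, here is a minimal sketch in Python that builds the same query and wraps it in an NCBI E-utilities ESearch request against the Protein database. The helper names (`representative_protein_term`, `build_esearch_url`, `count_hits`) are our own illustration and not part of any NCBI library, and the `retmax` default is an arbitrary choice.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def representative_protein_term(organism):
    """Build the Protein query described above for one organism name."""
    return f"{organism}[organism] AND refseq_select[filter]"


def build_esearch_url(term, retmax=5):
    """Return an ESearch URL for the Protein database (hypothetical helper)."""
    params = {"db": "protein", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def count_hits(term):
    """Ask ESearch how many records match. This contacts NCBI over the network."""
    with urlopen(build_esearch_url(term, retmax=0)) as response:
        payload = json.load(response)
    return int(payload["esearchresult"]["count"])


if __name__ == "__main__":
    term = representative_protein_term("Klebsiella")
    print(term)
    print(build_esearch_url(term))
    # Uncomment to run the search against NCBI:
    # print(count_hits(term))
```

Swapping "Klebsiella" for another genus or species name gives the equivalent query for that organism. If you send many requests, please follow NCBI's E-utilities usage guidelines.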
You can also now run BLAST searches against proteins annotated on representative genomes. Go to the protein-protein BLAST page and choose the ‘RefSeq Select (refseq_select)’ database. See our recent post for more information.
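For readers who submit searches programmatically, the sketch below shows how the same choice might look as form fields for the BLAST URL API. It only builds the request and does not send it. The helper names and the short query sequence are hypothetical, and we are assuming that the URL API accepts `refseq_select` as a `DATABASE` value in the same way the web page does, so please check this against the BLAST documentation before relying on it.

```python
from urllib.parse import urlencode

# BLAST URL API endpoint.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastp_submission(query, database="refseq_select"):
    """Return form fields for a protein-protein BLAST submission.

    Assumption: 'refseq_select' is accepted as a DATABASE value by the URL API.
    """
    return {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastp",   # protein-protein BLAST
        "DATABASE": database,
        "QUERY": query,
    }


def encode_submission(fields):
    """Encode the fields as a POST body (hypothetical helper)."""
    return urlencode(fields).encode("ascii")


if __name__ == "__main__":
    # Arbitrary placeholder peptide, for illustration only.
    fields = build_blastp_submission("MKTAYIAKQR")
    print(BLAST_URL)
    print(encode_submission(fields))
```

A real submission would POST this body to the endpoint and then poll for results with the returned request ID, which is left out here to keep the example small.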