A new alignment-free method for supporting bacterial identification and typing on a personal computer

The concept of bacterial species has long been defined through phenotypic characteristics, which require cultivation. As sequencing costs have decreased, comparisons based on whole bacterial genomes have become commonplace, and “average nucleotide identity” is currently the de facto standard for whole-genome comparison. However, average nucleotide identity relies on aligning the genome of interest to a known reference, and performing this alignment against all type strains is prohibitively time-consuming. We addressed this problem by representing genomes as signatures based on oligonucleotide frequencies. A comparison against recent tools showed that the proposed method performs similarly in both speed and accuracy when used on balanced datasets. Moreover, because it does not rely on the same principle as existing methods, it can be used to complement other genome analysis pipelines. Here we present results based on various datasets and demonstrate how our method can be used to infer whether two genomes belong to the same bacterial species, as well as to discriminate between strains. Additionally, we show known edge cases to illustrate some limits of the proposed method.
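
To illustrate the general idea of an alignment-free signature, the following is a minimal sketch and not the method presented here: it counts overlapping oligonucleotides (k-mers) in a sequence, normalizes the counts to frequencies, and compares two signatures by Euclidean distance. The function names, the choice of k = 4, and the distance measure are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_signature(sequence, k=4):
    """Return normalized k-mer frequencies over A, C, G, T.

    Hypothetical helper: the word length k=4 is an illustrative assumption.
    Frequencies are listed in lexicographic k-mer order (AAAA, AAAC, ...).
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def signature_distance(sig_a, sig_b):
    """Euclidean distance between two signatures (an assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(sig_a, sig_b)))


if __name__ == "__main__":
    genome_a = "ACGTACGTTGCAACGT"
    genome_b = "ACGTACGTTGCAACGA"
    d = signature_distance(kmer_signature(genome_a), kmer_signature(genome_b))
    print(f"signature distance: {d:.4f}")
```

Because each signature has a fixed length (4^k entries) regardless of genome size, comparing a genome against many references reduces to vector comparisons, which is why approaches of this kind can avoid the cost of alignment.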


Gleb Goussarov (1,2)
Ilse Cleenwerck (1)
Mohamed Mysara (2)
Rob Van Houdt (2)
Natalie Leys (2)
Peter Vandamme (1,3)
Pieter Monsieurs (2)


Universiteit Gent (1)
Belgian Nuclear Research Center SCK-CEN (2)
Belgian Coordinated Collection of Microorganisms BCCM (3)

Presenting author

Gleb Goussarov, PhD Student, Universiteit Gent