YgorClustering is a header-only C++ implementation of the DBSCAN clustering algorithm. The implementation is based on the 1996 article "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise" by Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. DBSCAN is generally regarded as a reliable clustering technique compared with techniques such as k-means (which, for example, cannot separate non-convex clusters).
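
To make the algorithm concrete, here is a minimal brute-force sketch of DBSCAN over 2D points. It is not YgorClustering's API: the names Point2D, region_query, dbscan, kNoise, and kUnclassified are made up for illustration, and the neighbour search is a plain linear scan rather than a spatial index.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type for this sketch (not part of YgorClustering).
struct Point2D { double x, y; };

constexpr int kUnclassified = -2;
constexpr int kNoise = -1;

// Indices of all points within eps of pts[i], including i itself.
// Brute force, O(n) per query; a spatial index would normally replace this.
std::vector<std::size_t> region_query(const std::vector<Point2D>& pts,
                                      std::size_t i, double eps) {
    std::vector<std::size_t> out;
    for (std::size_t j = 0; j < pts.size(); ++j) {
        if (std::hypot(pts[i].x - pts[j].x, pts[i].y - pts[j].y) <= eps) {
            out.push_back(j);
        }
    }
    return out;
}

// Returns one label per point: 0, 1, 2, ... for clusters, kNoise for noise.
std::vector<int> dbscan(const std::vector<Point2D>& pts, double eps,
                        std::size_t min_pts) {
    assert(eps >= 0.0);
    std::vector<int> label(pts.size(), kUnclassified);
    int cluster = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        if (label[i] != kUnclassified) continue;
        std::vector<std::size_t> seeds = region_query(pts, i, eps);
        if (seeds.size() < min_pts) {  // not a core point (may become a border point later)
            label[i] = kNoise;
            continue;
        }
        label[i] = cluster;
        // Expand the cluster; seeds grows whenever another core point is found.
        for (std::size_t k = 0; k < seeds.size(); ++k) {
            const std::size_t j = seeds[k];
            if (label[j] == kNoise) label[j] = cluster;  // noise becomes a border point
            if (label[j] != kUnclassified) continue;     // already handled
            label[j] = cluster;
            const std::vector<std::size_t> nbrs = region_query(pts, j, eps);
            if (nbrs.size() >= min_pts) {
                seeds.insert(seeds.end(), nbrs.begin(), nbrs.end());
            }
        }
        ++cluster;
    }
    return label;
}
```

Calling dbscan(points, 1.5, 3) on two well-separated 2x2 squares of points plus one distant point would be expected to yield two clusters and one noise label. Because clusters grow through chains of neighbouring core points, arbitrarily shaped clusters can be followed, which a centroid-based method such as k-means cannot do.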

At the moment only (vanilla) DBSCAN is implemented; implementations of other clustering techniques are planned. It uses Boost.Geometry R*-trees for fast indexing. It can cluster 20 million 2D data points in around an hour, or 20 thousand data points in seconds.


The source is available here and is released under a GPLv3 or later license. Please send questions or comments to . Or, even better, send a pull request ☺.


The included test programs perform clustering on various types of data. The fourth example uses Boost.Filesystem and Boost.DateTime to cluster a collection of photos by file modification time. Depending on the choice of tuning parameters, it can detect clusters such as vacations, rapid-fire bursts of photos, or multi-year photo-taking behaviour.
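
The idea of clustering by modification time can be sketched in one dimension without any file handling. The example below assumes the timestamps have already been read (for instance as seconds since some epoch) and sorted; it is a simplified stand-in, not the shipped test program, and the function name cluster_sorted_times and the use of plain integers for times are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 1D DBSCAN on ascending timestamps. Returns one label per timestamp:
// 0, 1, 2, ... for clusters and -1 for noise. eps is in the same unit as times.
std::vector<int> cluster_sorted_times(const std::vector<long long>& times,
                                      long long eps, std::size_t min_pts) {
    assert(eps >= 0);
    assert(std::is_sorted(times.begin(), times.end()));
    const std::size_t n = times.size();

    // Two-pointer sweep: [lo, hi) holds every timestamp within eps of times[i].
    std::vector<bool> core(n, false);
    std::size_t lo = 0, hi = 0;
    for (std::size_t i = 0; i < n; ++i) {
        while (times[i] - times[lo] > eps) ++lo;
        while (hi < n && times[hi] - times[i] <= eps) ++hi;
        core[i] = (hi - lo) >= min_pts;
    }

    // Forward pass: consecutive core points no more than eps apart share a
    // cluster; non-core points within eps of the previous core point join it.
    std::vector<int> label(n, -1);
    int cluster = -1;
    std::size_t last = n;  // index of the nearest core point seen so far (n = none)
    for (std::size_t i = 0; i < n; ++i) {
        if (core[i]) {
            if (last == n || times[i] - times[last] > eps) ++cluster;
            label[i] = cluster;
            last = i;
        } else if (last != n && times[i] - times[last] <= eps) {
            label[i] = label[last];
        }
    }

    // Backward pass: remaining non-core points may border a core point to the right.
    last = n;
    for (std::size_t i = n; i-- > 0;) {
        if (core[i]) {
            last = i;
        } else if (label[i] == -1 && last != n && times[last] - times[i] <= eps) {
            label[i] = label[last];
        }
    }
    return label;
}
```

With timestamps in seconds, a small eps (a few seconds) would tend to isolate rapid-fire bursts, while an eps of a day or more would tend to group whole trips, which is the kind of behaviour change the tuning parameters control.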

The video below shows a simple 2D clustering example using random data. Different clusters are detected as the tuning parameters are tweaked; the video transitions between epsilon (the neighbourhood radius) values of 4.5 and 6.0, in arbitrary spatial units.