# Agglomerative Clustering of Growing Squares: Theory and Practice

### Wednesday, March 6th, 2019, 16:10

### Schreiber 309

### Agglomerative Clustering of Growing Squares: Theory and Practice

### Bettina Speckmann, TU Eindhoven

### Abstract:

We study an agglomerative clustering problem motivated by interactive glyphs in geo-visualization. Consider a set of disjoint square glyphs on an interactive map. When the user zooms out, the glyphs grow in size relative to the map, possibly with different speeds. When two glyphs intersect, we wish to replace them by a new glyph that captures the information of the intersecting glyphs.

We present a fully dynamic kinetic data structure that maintains a set of n disjoint growing squares. Our data structure uses O(n (\log n \log\log n)^2) space, supports queries in worst case O(\log^2 n) time, and updates in O(\log^5 n) amortized time. This leads to an O(n\alpha(n)\log^5 n) time algorithm to solve the agglomerative clustering problem, which significantly improves upon the current best O(n^2) time algorithms.

From a practical point of view, our new algorithm does not perform better than the naive algorithms for n < 10^6. We hence also present an alternative agglomerative clustering algorithm which works efficiently in practice. Our algorithm relies on the use of quadtrees to speed up spatial computations. Interestingly, even in non-pathological datasets, we can encounter large glyphs that intersect many quadtree cells and that are involved in many clustering events. We therefore devise a special strategy to handle such large glyphs. We test our algorithm on several synthetic and real-world datasets and show that it performs well in practice.

Joint work with Thom Castermans, Frank Staals, and Kevin Verbeek