Personal tools
You are here: Home CG seminar 2020 k-Means Clustering of Lines for Big Data
« November 2019 »
Log in

Forgot your password?

k-Means Clustering of Lines for Big Data

Wednesday, November 13th, 2019, 16:10

Checkpoint 480


k-Means Clustering of Lines for Big Data

Yair Marom, University of Haifa


The input to the k-mean for lines problem is a set L of n lines in R^d, and the goal is to compute a set of k centers (points) in R^d that minimizes the sum of squared distances over every line in L and its nearest center. This is a straightforward generalization of the k-mean problem where the input is a set of n points instead of lines
We suggest and the first PTAS that computes a (1+epsilon)-approximation to this problem in time O(n log n) for any constant epsilon>0, and k, d > 1.
Streaming input using ~log(n) memory, and distributed parallel computations are also provided.

This is by proving that there is always a weighted subset (called coreset) of dk^{O(k)} log (n)/epsilon^2 lines in L that approximates the sum of squared distances from L to any given set of k points.
Experimental results, open source, and applications for anomaly detection and localization via standard (2D) cameras are also provided.

Based on a NIPS'19 paper, co-authored with Dan Feldman
Document Actions