
Scale parameter selection for the spectral clustering method

Spectral clustering is a powerful method for finding structure in data through the eigenvectors of a similarity matrix. It often outperforms traditional clustering algorithms such as k-means when the individual clusters are highly non-convex. Its accuracy depends on how the similarity between pairs of data points is defined. When a Gaussian similarity function is used, the choice of the scale parameter σ is crucial. A common suggestion is to run the spectral algorithm repeatedly for different values of σ and keep the value that gives the best clustering according to some criterion. In this paper we propose a low-cost technique for selecting a suitable σ based on the minimum spanning tree (MST) of the graph of distances between pairs of points. Numerical experiments on both artificial and real-world datasets validate the effectiveness of the proposed technique.
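
For illustration only, the ingredients named in the abstract (pairwise distances, their MST, and a Gaussian similarity matrix) can be sketched in plain Python as follows. The abstract does not state the rule used to derive σ from the MST, so the median MST edge length is used here purely as a placeholder assumption and should not be read as the report's method. The similarity form exp(-d²/(2σ²)) is a common convention and is likewise an assumption, and all function names are hypothetical.

```python
import math
import statistics


def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def mst_edge_weights(dist):
    """Prim's algorithm on the complete distance graph.

    Returns the n-1 edge weights of a minimum spanning tree.
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the tree to each vertex
    best[0] = 0.0          # start growing the tree from vertex 0
    weights = []
    for step in range(n):
        # Attach the vertex outside the tree that is closest to it.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if step > 0:       # the start vertex is not joined by any edge
            weights.append(best[u])
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return weights


def sigma_from_mst(dist):
    """ASSUMPTION: placeholder rule (median MST edge), not the report's rule."""
    return statistics.median(mst_edge_weights(dist))


def gaussian_similarity(dist, sigma):
    """Similarity w_ij = exp(-d_ij^2 / (2 sigma^2)); this form is an assumption."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return [[math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in row] for row in dist]


# Usage on a toy dataset: three nearby points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
dist = pairwise_distances(points)
sigma = sigma_from_mst(dist)
W = gaussian_similarity(dist, sigma)
```

The appeal of an MST-based choice is cost: the tree is computed once from the distance matrix, whereas the repeated-run approach pays for one eigenvector computation per candidate value of σ.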

External authors: Grazia Lotti (Dipartimento di Matematica e Informatica, University of Parma, Italy), Oriana Menchi (Dipartimento di Informatica, University of Pisa, Italy), Francesco Romani (Dipartimento di Informatica, University of Pisa, Italy)
IIT authors:

Type: Technical Report
Subject area: Mathematics
IIT TR-11/2016

File: TR 011_2016.pdf

Activity: Algorithmics for web technologies