Performance On Toy Problems

The performance of the Nuclear Potential Clustering approach has been tested on synthetic datasets with varying numbers of data points, varying cluster shapes, and feature space dimensions of up to 100. The following videos show the temporal evolution of energy and spatial location for all points on three different toy problems:

 

1. A three-dimensional test data set consisting of a ‘horseshoe’ (turquoise, 461 points) and two spheroids (red and blue with 36 and 79 points, respectively), overlaid by a background of random noise (yellow, 57 points):

Download spatial evolution video (MPEG file) and download energy evolution video (MPEG file)

 

2. A three-dimensional test data set consisting of two dense regions positioned around skew lines (green and blue with 70 and 90 points, respectively), overlaid by random noise (red, 35 points):

Download spatial evolution video (MPEG file) and download energy evolution video (MPEG file)

 

3. A 100-dimensional test data set consisting of two dense regions positioned around skew lines (200 and 320 points, respectively), overlaid by random noise (100 points). Here, the video shows only the temporal energy evolution, as it is not possible to depict the spatial distribution of points in 100-dimensional space:

Download energy evolution video (MPEG file)
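For readers who want to experiment with data of a similar kind, the skew-line problems in items 2 and 3 could be approximated roughly as follows. This is only an illustrative sketch and not the generator used for the results reported here: the function name make_skew_lines_dataset, the placement of the two lines, the Gaussian spread around each line, and the uniform noise range are all assumptions. Only the point counts are taken from the descriptions above.

```python
import numpy as np


def make_skew_lines_dataset(n_a=70, n_b=90, n_noise=35, dim=3,
                            spread=0.05, seed=0):
    """Return (points, labels) for two clusters around skew lines plus noise.

    Hypothetical helper: line positions, spread and noise range are
    illustrative assumptions, not values from the original experiments.
    Labels are 0 and 1 for the two clusters and -1 for noise points.
    """
    if dim < 3:
        raise ValueError("skew lines require at least three dimensions")
    rng = np.random.default_rng(seed)
    axes = np.eye(dim)

    def points_near_line(n, origin, direction):
        # Sample positions along the line, then add isotropic Gaussian jitter.
        t = rng.uniform(-1.0, 1.0, size=(n, 1))
        jitter = rng.normal(scale=spread, size=(n, dim))
        return origin + t * direction + jitter

    # Line A runs along the first axis through the origin. Line B runs along
    # the second axis but is shifted along the third axis, so the two lines
    # are neither parallel nor intersecting, i.e. they are skew.
    cluster_a = points_near_line(n_a, np.zeros(dim), axes[0])
    cluster_b = points_near_line(n_b, axes[2], axes[1])

    # Background noise drawn uniformly from a cube enclosing both lines.
    noise = rng.uniform(-1.5, 1.5, size=(n_noise, dim))

    points = np.vstack([cluster_a, cluster_b, noise])
    labels = np.concatenate([
        np.zeros(n_a, dtype=int),
        np.ones(n_b, dtype=int),
        np.full(n_noise, -1, dtype=int),
    ])
    return points, labels
```

Calling make_skew_lines_dataset() yields the point counts of the three-dimensional problem, while make_skew_lines_dataset(200, 320, 100, dim=100) yields those of the 100-dimensional one.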

 

As can be seen from the videos, the points do not form stable nuclei directly but rather oscillate towards a stable equilibrium state. To choose a reasonable stopping condition for our algorithm, we determined the average kinetic energy of the system (the system temperature) as a function of time. The algorithm is stopped when the system temperature falls below 1% of its maximum value.
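The stopping rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the original implementation: the names mean_kinetic_energy and TemperatureStop are hypothetical, unit masses are assumed unless given, and the maximum is tracked as a running maximum over the time steps seen so far.

```python
import numpy as np


def mean_kinetic_energy(velocities, masses=None):
    """Average kinetic energy per particle, used here as the 'temperature'.

    velocities: array of shape (n_points, n_dims).
    masses: optional array of shape (n_points,); unit masses are assumed
    if omitted (an assumption of this sketch).
    """
    v = np.asarray(velocities, dtype=float)
    speed_sq = np.sum(v ** 2, axis=1)
    if masses is None:
        masses = np.ones_like(speed_sq)
    return float(np.mean(0.5 * np.asarray(masses, dtype=float) * speed_sq))


class TemperatureStop:
    """Signal a stop once the temperature drops below a fraction of its peak."""

    def __init__(self, fraction=0.01):
        self.fraction = fraction
        self.peak = 0.0

    def update(self, temperature):
        # Track the largest temperature observed so far (running maximum).
        self.peak = max(self.peak, temperature)
        # Never stop before the system has picked up any kinetic energy.
        return self.peak > 0.0 and temperature < self.fraction * self.peak


def first_stop_index(temperatures, fraction=0.01):
    """Index of the first time step at which the rule fires, or None."""
    rule = TemperatureStop(fraction)
    for step, temperature in enumerate(temperatures):
        if rule.update(temperature):
            return step
    return None
```

In a simulation loop, update would be called once per time step with mean_kinetic_energy(velocities). Because the temperature starts near zero and first rises, the rule compares against the peak rather than the initial value, so it cannot fire during the initial heating phase.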

The final grouping of particles is shown here:

Quantitative results obtained for the test data sets are given in the following tables.