Project details for MPIKmeans

MPIKmeans 1.5

by pgehler - January 16, 2009, 15:48:47 CET

Overall: 4/5 stars
Features: 3/5 stars
Usability: 5/5 stars
Documentation: 4/5 stars
(based on 1 vote)

A K-means clustering implementation for the command line, Python, Matlab and C. The algorithm yields exactly the same solution as standard K-means, including the same intermediate result after every iteration. It uses triangle inequalities to skip distance computations that cannot change an assignment, and it stores some distances between iterations. It is therefore much faster than standard K-means, at the cost of more memory. See the corresponding paper for details.
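The pruning idea can be illustrated with a minimal Python sketch. This is not the MPIKmeans code: the function names (dist, assign_brute_force, assign_with_pruning) and the toy data are hypothetical, and only one triangle-inequality test is shown, applied to a single assignment step. It assumes Euclidean distance and checks that the pruned assignment matches the plain one while evaluating fewer point-to-center distances.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_brute_force(points, centers):
    """Standard assignment step: compare every point with every center."""
    labels = []
    for p in points:
        dists = [dist(p, c) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels, len(points) * len(centers)


def assign_with_pruning(points, centers):
    """Same assignment, skipping centers ruled out by the triangle inequality."""
    k = len(centers)
    # Center-to-center distances are computed once and stored (the extra memory).
    cc = [[dist(centers[i], centers[j]) for j in range(k)] for i in range(k)]
    labels, evaluations = [], 0
    for p in points:
        best, best_d = 0, dist(p, centers[0])
        evaluations += 1
        for j in range(1, k):
            # If d(c_best, c_j) >= 2 * d(p, c_best), then d(p, c_j) >= d(p, c_best),
            # so c_j cannot be strictly closer and its distance is never computed.
            if cc[best][j] >= 2.0 * best_d:
                continue
            d = dist(p, centers[j])
            evaluations += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, evaluations


# Toy data (hypothetical): three well-separated centers and five points.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(1.0, 1.0), (9.0, 1.0), (1.0, 9.0), (0.5, 0.0), (10.0, 0.5)]

labels_plain, n_plain = assign_brute_force(points, centers)
labels_fast, n_fast = assign_with_pruning(points, centers)

print("same labels:", labels_plain == labels_fast)
print("point-to-center distance evaluations:", n_plain, "vs", n_fast)
```

The stored table of center-to-center distances is the memory cost that buys the skipped evaluations. A full accelerated K-means would also carry bounds from one iteration to the next, which this sketch leaves out.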

Changes to previous version:

Initial Announcement on

Supported Operating Systems: Agnostic
Data Formats: None
Tags: Clustering, Kmeans


Dmitrey Kroshko (on September 2, 2008, 20:20:15)
I tried to run the makefile (on Kubuntu Linux), but it seems to require MATLAB to be installed. Is that true?
Peter Gehler (on September 3, 2008, 09:12:40)
You can also run "make shared" and use the Python version or bind your own C code.
Jerry (on October 16, 2008, 14:18:22)
"help mpi_kmeans_mex" does not work! I think there is no mpi_kmeans_mex.m file in the archive :)
wang haoxue (on November 25, 2008, 02:43:11)
this means is very good!
zzinwhu (on February 15, 2009, 19:49:23)
How can I download it?
Bill (on March 21, 2009, 04:04:41)
Hello! I need to use this function for my research experiment in Matlab, but the archive is missing the file "mpi_kmeans_mex", so I couldn't run "help mpi_kmeans_mex". Please help me! Thanks!
Peter Gehler (on April 7, 2009, 16:04:58)
There was a typo in the README. To get help in matlab simply type "help mpi_kmeans"
Janosch Peters (on June 24, 2010, 14:42:57)
mpi kmeans is causing a segfault when the number of clusters is too close to the number of features. See the gdb output below:
Program received signal SIGSEGV, Segmentation fault.
0x00007fffef49e465 in add_point_to_cluster (cluster_ind=71796994, CX=0x8f2eb70, px=0xb90beb0, nr_points=0x6c235a0, dim=128) at /home/zaphod/Code/Recognosco/src/mpi_kmeans-1.5/mpi_kmeans.cxx:100
100 if (nr_points[cluster_ind]==0)
(gdb) backtrace
#0 0x00007fffef49e465 in add_point_to_cluster (cluster_ind=71796994, CX=0x8f2eb70, px=0xb90beb0, nr_points=0x6c235a0, dim=128) at /home/zaphod/Code/Recognosco/src/mpi_kmeans-1.5/mpi_kmeans.cxx:100
#1 0x00007fffef49e68b in remove_identical_clusters (CX=0x8f2eb70, cluster_distance=0xbad9ac0, X=0xa8706b0, cluster_count=0x6c235a0, c=0x6b1ea80, dim=128, nclus=2000, npts=18853) at /home/zaphod/Code/Recognosco/src/mpi_kmeans-1.5/mpi_kmeans.cxx:138
#2 0x00007fffef49f12f in kmeans_run (CX=0x8f2eb70, X=0xa8706b0, c=0x6b1ea80, dim=128, npts=18853, nclus=2000, maxiter=0) at /home/zaphod/Code/Recognosco/src/mpi_kmeans-1.5/mpi_kmeans.cxx:373
#3 0x00007fffef49fc78 in kmeans (CX=0x8f2eb70, X=0xa8706b0, assignment=0x6b1ea80, dim=128, npts=18853, nclus=2000, maxiter=0, restarts=20) at /home/zaphod/Code/Recognosco/src/mpi_kmeans-1.5/mpi_kmeans.cxx:584
