Project details for CARP

JMLR CARP 3.3

by volmeln - November 7, 2013, 15:48:06 CET


CARP is a C package for evaluating the performance of clustering algorithms. The procedure has three phases. In the first phase, Gaussian mixture models are simulated according to prespecified levels of average and maximum pairwise overlap, where the overlap between two mixture components is defined as the sum of their two misclassification probabilities (Maitra and Melnykov, 2010). Datasets are then simulated from the realized Gaussian mixtures. The software implementing this phase is called C-MixSim and can also be invoked standalone. In the second phase, the clustering algorithm being evaluated is run on the generated datasets; the package includes an agglomerative hierarchical clustering algorithm, hierclust, as an example. In the third phase, the obtained groupings are compared with the true ones. By default, the comparison measure is the Adjusted Rand index of Hubert and Arabie (1985), but the user can supply another measure in executable form. Upon conclusion, CARP reports the distribution of the chosen performance measure for the clustering method at the chosen overlap setting, which shows how the algorithm's performance varies across the simulated datasets. CARP is released under the GNU GPL.
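To make the third phase concrete, here is a minimal Python sketch of the Adjusted Rand index computed from two label vectors. It is an illustration only, not CARP's C implementation; the function name `adjusted_rand_index` and the handling of degenerate partitions (returning 1.0 when both partitions are trivially identical) are assumptions made for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, found_labels):
    """Adjusted Rand index (Hubert and Arabie, 1985) between two partitions.

    Illustrative sketch; the name and signature are not taken from CARP.
    """
    if len(true_labels) != len(found_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)

    # Contingency counts: points in true group i that landed in found group j.
    cells = Counter(zip(true_labels, found_labels))
    rows = Counter(true_labels)
    cols = Counter(found_labels)

    # Pairs of points grouped together by both partitions, by the true
    # partition alone, and by the found partition alone.
    together_both = sum(comb(c, 2) for c in cells.values())
    together_true = sum(comb(c, 2) for c in rows.values())
    together_found = sum(comb(c, 2) for c in cols.values())

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two points: the partitions cannot disagree.
        return 1.0

    expected = together_true * together_found / total_pairs
    maximum = (together_true + together_found) / 2
    if maximum == expected:
        # Assumption: both partitions are all-in-one or all-singletons,
        # hence identical, so report perfect agreement.
        return 1.0
    return (together_both - expected) / (maximum - expected)
```

Calling this function once per simulated dataset and collecting the returned values would give an empirical distribution of the index, which is the kind of summary the description above attributes to CARP. The index equals 1 for identical groupings (up to relabeling) and is close to 0 for groupings that agree only by chance.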

Changes to previous version:

Generalized overlap error and some bugs have been fixed

Supported Operating Systems: Cygwin, Linux, Mac OS X, Windows
Data Formats: ASCII
Tags: Overlap, Clustering Algorithm, Gaussian Mixture

Other available revisions

Date                          Changelog

November 7, 2013, 15:48:06    Generalized overlap error and some bugs have been fixed
August 28, 2011, 02:14:17     Added generalized overlap, more metrics for comparing partitionings
December 6, 2010, 19:41:29    Added an option for producing homogeneous clusters
November 8, 2010, 06:41:09    Command-line interface, improved reliability, detailed manual.
August 27, 2010, 06:16:07     Command-line interface, improved reliability, detailed manual.
April 10, 2010, 02:32:44      Initial Announcement on

