About: ELKI is a framework for implementing data-mining algorithms with support for index structures; it includes a wide variety of clustering and outlier detection methods.
Changes:
Additions and Improvements from ELKI 0.6.0:
ELKI is now available on Maven: https://search.maven.org/#artifactdetails|de.lmu.ifi.dbs.elki|elki|0.7.0|jar
Please clone https://github.com/elki-project/example-elki-project for a minimal project example.
Uncertain data types, and clustering algorithms for uncertain data.
Major refactoring of distances: Distance values were removed, along with support for non-double-valued distance functions (in particular, DoubleDistance was removed). While this reduces the generality of ELKI, it let us remove about 2.5% of the codebase, because separate optimized code paths for double-valued distances are no longer needed. Generics for distances were present in almost every distance-based algorithm, and we were also happy to reduce the use of generics this way. Support for non-double-valued distances can trivially be added again, for example by adding the specialization one level higher, at the query level instead of the distance level. In this process, we also removed the generics from NumberVector. The object-based get was deprecated for good reason long ago, and accessors such as doubleValue are more efficient (even for non-DoubleVectors).
Dropped some long-deprecated classes.
EM clustering was refactored and moved into its own package. The new version is much more extensible.
Parallel computation framework, and some parallelized algorithms.
LibSVM format parser.
kNN classification (with index acceleration).
Internal cluster evaluation:
Statistical dependence measures:
Frequent Itemset Mining:
MiniGUI has two "secret" new options, -minigui.last and -minigui.autorun, to load the last saved configuration and run it, for convenience.
The logging API has been extended to make logging more convenient in a number of places (saving some lines for progress logging and timing).
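The distance refactoring described in this list (primitive double distances only, with other value types handled one level higher, at the query level) can be illustrated with a minimal sketch. This is not ELKI code: the interfaces PrimitiveDistance, IntDistance and DistanceQuery and the helpers queryFor and queryForInt are hypothetical names chosen for illustration, and the adapter only shows one way such a specialization might look.

```java
public class DoubleDistanceSketch {
  /** Hypothetical distance interface returning a primitive double (no distance value objects). */
  public interface PrimitiveDistance<O> {
    double distance(O a, O b);
  }

  /** Hypothetical integer-valued distance, standing in for a non-double-valued distance. */
  public interface IntDistance<O> {
    int distance(O a, O b);
  }

  /** Hypothetical query-level abstraction; algorithms would only talk to this interface. */
  public interface DistanceQuery<O> {
    double distance(O a, O b);
  }

  /** Euclidean distance on plain double arrays of equal length. */
  public static final PrimitiveDistance<double[]> EUCLIDEAN = (a, b) -> {
    double sum = 0.;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  };

  /** Hamming distance on strings of equal length: number of differing positions. */
  public static final IntDistance<String> HAMMING = (a, b) -> {
    int diff = 0;
    for (int i = 0; i < a.length(); i++) {
      if (a.charAt(i) != b.charAt(i)) {
        diff++;
      }
    }
    return diff;
  };

  /** Query over a double-valued distance: a direct pass-through. */
  public static <O> DistanceQuery<O> queryFor(PrimitiveDistance<O> f) {
    return f::distance;
  }

  /** Query over an int-valued distance: the specialization lives at the query level. */
  public static <O> DistanceQuery<O> queryForInt(IntDistance<O> f) {
    return (a, b) -> f.distance(a, b); // int result is widened to double
  }

  public static void main(String[] args) {
    DistanceQuery<double[]> euclid = queryFor(EUCLIDEAN);
    DistanceQuery<String> hamming = queryForInt(HAMMING);
    System.out.println(euclid.distance(new double[] { 0, 0 }, new double[] { 3, 4 }));
    System.out.println(hamming.distance("karolin", "kathrin"));
  }
}
```

In this sketch the algorithm-facing interface carries no type parameter for the distance value, so only the adapter has to know that a non-double distance exists.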
About: Hubness-aware Machine Learning for High-dimensional Data
Changes:
About: GritBot is a data cleaning and outlier/anomaly detection program.
Changes:
Initial Announcement on mloss.org.