About: The Cognitive Foundry is a modular Java software library of machine learning components and algorithms designed for research and applications. Changes:

About: This project is a C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. Changes: This release adds an implementation of spectral clustering as well as a few bug fixes and usability improvements.

About: The autoencoder-based data clustering toolkit provides a quick start for clustering with deep autoencoder networks. The toolkit clusters data in feature space using a deep nonlinear network. Changes: Initial Announcement on mloss.org.

About: Hubness-aware Machine Learning for High-dimensional Data Changes:

About: The Weka workbench contains a collection of visualization tools and algorithms for data analysis and predictive modelling, together with graphical user interfaces for easy access to this [...] Changes: In core weka:
In packages:

About: The apcluster package implements Frey and Dueck's Affinity Propagation clustering in R. The package further provides leveraged affinity propagation, exemplar-based agglomerative clustering, and various tools for visual analysis of clustering results. Changes:

About: The Accord.NET Framework is a .NET machine learning framework combined with audio and image processing libraries, written entirely in C#. It is a complete framework for building production-grade computer vision, computer audition, signal processing and statistics applications, including for commercial use. A comprehensive set of sample applications provides a fast start, and extensive online documentation fills in the details. Changes: Adding a large number of new distributions, such as Anderson-Darling, Shapiro-Wilk, Inverse Chi-Square, Lévy, Folded Normal, Shifted Log-Logistic, Kumaraswamy, Trapezoidal, U-quadratic and Beta Prime distributions, Birnbaum-Saunders, Generalized Normal, Gumbel, Power Lognormal, Power Normal, Triangular, Tukey Lambda, Logistic, Hyperbolic Secant, Degenerate and General Continuous distributions. Other additions include new statistical hypothesis tests such as Anderson-Darling and Shapiro-Wilk; support for all of LIBLINEAR's support vector machine algorithms; and format reading support for MATLAB/Octave matrices, LibSVM models, sparse LibSVM data files, and many others. For a complete list of changes, please see the full release notes on the release details page at: https://github.com/accordnet/framework/releases

About: C++ software for statistical classification, probability estimation and interpolation/nonlinear regression using variable bandwidth kernel estimation. Changes: New in Version 0.9.8:

About: STK++: A Statistical Toolkit Framework in C++ Changes: Integrating OpenMP into the current release. Many enhancements in the clustering project. Bug fixes.

About: An extensible C++ library of Hierarchical Bayesian clustering algorithms, such as Bayesian Gaussian mixture models, variational Dirichlet processes, Gaussian latent Dirichlet allocation and more. Changes: Initial Announcement on mloss.org.

About: A generalized version of spectral clustering using the graph p-Laplacian. Changes:

About: DRVQ is a C++ library implementation of dimensionality-recursive vector quantization, a fast vector quantization method in high-dimensional Euclidean spaces under arbitrary data distributions. It is an approximation of k-means that is practically constant in data size and applies to arbitrarily high dimensions, but can only scale to a few thousand centroids. As a by-product of training, a tree structure performs either exact or approximate quantization on trained centroids, the latter being not very precise but extremely fast. Changes: Initial Announcement on mloss.org.

About: ELKI is a framework for implementing data-mining algorithms with support for index structures; it includes a wide variety of clustering and outlier detection methods. Changes: Additions and Improvements from ELKI 0.5.5: Algorithms Clustering:
Outlier detection
Distances
Database Layer and Data Types
* Projection layer
* Parser for simple textual data (for use with Levenshtein distance)
* Various random projection families (including Feature Bagging, Achlioptas, and p-stable)
* Latitude+Longitude to ECEF
* Sparse vector improvements and bug fixes
* New filter: remove NaN values and missing values
* New filter: add histogram-based jitter
* New filter: normalize using statistical distributions
* New filter: robust standardization using Median and MAD
* New filter: Linear discriminant analysis (LDA)
Index Layer
Mathematics and Statistics
Visualization
Other
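
The "robust standardization using Median and MAD" filter listed above can be illustrated with a minimal Python sketch. This is not ELKI's code: the function name `robust_standardize`, the consistency constant 1.4826 (which makes the MAD comparable to a standard deviation for normally distributed data), and the handling of a zero MAD are all assumptions made for illustration.

```python
import statistics


def robust_standardize(values, scale=1.4826):
    """Center by the median and divide by the scaled median absolute deviation.

    Hypothetical helper for illustration; the scale constant and the zero-MAD
    fallback are assumptions, not taken from ELKI.
    """
    med = statistics.median(values)
    # MAD: median of absolute deviations from the median.
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # Assumption: with no spread around the median, only center the data.
        return [v - med for v in values]
    return [(v - med) / (scale * mad) for v in values]


if __name__ == "__main__":
    # The outlier 100 barely affects the median (3) or the MAD (1).
    print(robust_standardize([1, 2, 3, 4, 100]))
```

Unlike mean and standard deviation, the median and MAD are hardly moved by a single extreme value, which is why this kind of standardization is called robust.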

About: Automatic Analysis of Malware Behavior using Machine Learning Changes: Support for new version of libarchive. Minor bug fixes.

About: The Gesture Recognition Toolkit (GRT) is a cross-platform, open-source, C++ machine learning library that has been specifically designed for real-time gesture recognition. It features a large number of machine-learning algorithms for both classification and regression in addition to a wide range of supporting algorithms for preprocessing, feature extraction and dataset management. The GRT has been designed for real-time gesture recognition, but it can also be applied to more general machine-learning tasks. Changes: Added Decision Tree and Random Forests.

About: FABIA is a biclustering algorithm that clusters rows and columns of a matrix simultaneously. Consequently, members of a row cluster are similar to each other on a subset of columns and, analogously, members of a column cluster are similar to each other on a subset of rows. Biclusters are found by factor analysis where both the factors and the loading matrix are sparse. FABIA is a multiplicative model that extracts linear dependencies between samples and feature patterns. Applications include detection of transcriptional modules in gene expression data and identification of haplotypes/identity by descent consisting of rare variants obtained by next-generation sequencing. Changes: CHANGES IN VERSION 2.8.0 NEW FEATURES
CHANGES IN VERSION 2.4.0
CHANGES IN VERSION 2.3.1 NEW FEATURES
2.0.0:
1.4.0:

About: Apache Mahout is an Apache Software Foundation project with the goal of creating both a community of users and a scalable, Java-based framework consisting of many machine learning algorithm [...] Changes: Apache Mahout 0.8 contains, amongst a variety of performance improvements and bug fixes, an implementation of Streaming KMeans, deeper Lucene/Solr integration and new scalable recommender algorithms. For a full description of the newest release, see http://mahout.apache.org/.

About: This toolbox implements a novel visualization technique called Sectors on Sectors (SonS), and an extended version called Multidimensional Sectors on Sectors (MDSonS), for improving the interpretation of several data mining algorithms. The MDSonS method uses Multidimensional Scaling (MDS) to solve the main drawback of the previous method, namely its inability to represent distances between pairs of clusters. These methods have been applied to visualizing the results of hierarchical clustering, Growing Hierarchical Self-Organizing Maps (GHSOM), classification trees and several manifolds. They make it possible to extract all the existing relationships among centroids’ attributes at any hierarchy level. Changes: Initial Announcement on mloss.org.

About: Cluster quality evaluation software. Implements cluster quality metrics based on ground truths such as Purity, Entropy, Negentropy, F1 and NMI. It includes a novel approach to correct for pathological or ineffective clusterings called 'Divergence from a Random Baseline'. Changes: Initial Announcement on mloss.org.
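
Two of the ground-truth metrics named above, Purity and NMI, can be sketched in a few lines of Python. This is an illustrative sketch only, not the listed software's implementation: the function names are hypothetical, and NMI is normalized here by the arithmetic mean of the two entropies, which is one of several conventions in use. The 'Divergence from a Random Baseline' correction is not covered.

```python
import math
from collections import Counter


def purity(labels_true, labels_pred):
    """Fraction of points belonging to the majority true class of their cluster."""
    clusters = {}
    for t, p in zip(labels_true, labels_pred):
        clusters.setdefault(p, Counter())[t] += 1
    return sum(max(c.values()) for c in clusters.values()) / len(labels_true)


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_true, labels_pred):
    """Mutual information divided by the mean of the two entropies (assumed convention)."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    ct, cp = Counter(labels_true), Counter(labels_pred)
    mi = sum((nij / n) * math.log(nij * n / (ct[t] * cp[p]))
             for (t, p), nij in joint.items())
    denom = (entropy(labels_true) + entropy(labels_pred)) / 2
    # Assumption: two trivial (single-group) labelings are treated as identical.
    return mi / denom if denom > 0 else 1.0


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    print(purity(truth, [1, 1, 0, 0]), nmi(truth, [1, 1, 0, 0]))
    print(purity(truth, [0, 0, 0, 0]), nmi(truth, [0, 0, 0, 0]))
```

Placing every point in a single cluster still reaches a purity of 0.5 on this balanced two-class example while its NMI is 0, which illustrates the kind of ineffective clustering that a baseline correction is meant to expose.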

About: MLDemos is a user-friendly visualization interface for various machine learning algorithms for classification, regression, clustering, projection, dynamical systems, reward maximisation and reinforcement learning. Changes:
New Visualization and Dataset Features
* Added 3D visualization of samples and classification, regression and maximization results
* Added Visualization panel with individual plots, correlations, density, etc.
* Added Editing tools to drag/magnet data, change class, increase or decrease dimensions of the dataset
* Added categorical dimensions (indexed dimensions with non-numerical values)
* Added Dataset Editing panel to swap, delete and rename dimensions, classes or categorical values
* Several bugfixes for display, import/export of data, classification performance
New Algorithms and methodologies
* Added Projections to preprocess data (which can then be classified/regressed/clustered), with LDA, PCA, KernelPCA, ICA, CCA
* Added GridSearch panel for batch-testing ranges of values for up to two parameters at a time
* Added One-vs-All multi-class classification for non-multi-class algorithms
* Trained models can now be kept and tested on new data (training on one dataset, testing on another)
* Added a dataset generator panel for standard toy datasets (e.g. swissroll, checkerboard,...)
* Added a number of clustering, regression and classification algorithms (FLAME, DBSCAN, LOWESS, CCA, KMEANS++, GP Classification, Random Forests)
* Added Save/Load Model option for GMMs and SVMs
* Added Growing Hierarchical Self Organizing Maps (original code by Michael Dittenbach)
* Added Automatic Relevance Determination for SVM with RBF kernel (Thanks to Ashwini Shukla!)
