Projects supporting the various data format.

Apache Mahout 0.11.1

by gsingers - November 9, 2015, 16:12:06 CET [ Project Homepage BibTeX Download ] 18864 views, 4960 downloads, 3 subscriptions

About: Apache Mahout is an Apache Software Foundation project with the goal of creating both a community of users and a scalable, Java-based framework consisting of many machine learning algorithm [...]


Apache Mahout introduces a new math environment we call Samsara, for its theme of universal renewal. It reflects a fundamental rethinking of how scalable machine learning algorithms are built and customized. Mahout-Samsara is here to help people create their own math while providing some off-the-shelf algorithm implementations. At its core are general linear algebra and statistical operations along with the data structures to support them. You can use it as a library or customize it in Scala with Mahout-specific extensions that look something like R. Mahout-Samsara comes with an interactive shell that runs distributed operations on a Spark cluster. This makes prototyping or task submission much easier and allows users to customize algorithms with a whole new degree of freedom. Mahout Algorithms include many new implementations built for speed on Mahout-Samsara. They run on Spark 1.3+ and some on H2O, which means as much as a 10x speed increase. You’ll find robust matrix decomposition algorithms as well as a Naive Bayes classifier and collaborative filtering. The new spark-itemsimilarity enables the next generation of cooccurrence recommenders that can use entire user click streams and context in making recommendations.

Recur 1

by douglasbagnall - June 16, 2015, 12:06:05 CET [ Project Homepage BibTeX Download ] 1107 views, 298 downloads, 2 subscriptions

About: Recur is a collection of GStreamer plugins and language modelling tools based on recurrent neural networks.


Initial Announcement on

GESL v1.01

by bellet - May 15, 2015, 11:54:04 CET [ BibTeX BibTeX for corresponding Paper Download ] 2196 views, 877 downloads, 1 subscription

About: Learning string edit distance / similarity from data


Added datasets used in the experiments of the paper
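
As background for the GESL entry above, the following is a minimal Python sketch of a string edit distance whose substitution costs can be supplied from outside, which is the kind of quantity that could be learned from data. It is not GESL's code or its learning algorithm. The function name `weighted_edit_distance`, the `sub_cost` dictionary, and the `indel_cost` parameter are hypothetical names chosen for illustration.

```python
def weighted_edit_distance(source, target, sub_cost=None, indel_cost=1.0):
    """Edit distance between two strings via dynamic programming.

    sub_cost is a hypothetical dict mapping (a, b) character pairs to a
    substitution cost; pairs that are missing cost 1.0. Insertions and
    deletions all cost indel_cost.
    """
    sub_cost = sub_cost or {}
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] = cheapest way to turn source[:i] into target[:j]
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost  # delete every character of source[:i]
    for j in range(1, cols):
        dist[0][j] = j * indel_cost  # insert every character of target[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = source[i - 1], target[j - 1]
            substitute = 0.0 if a == b else sub_cost.get((a, b), 1.0)
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,      # deletion
                dist[i][j - 1] + indel_cost,      # insertion
                dist[i - 1][j - 1] + substitute,  # match or substitution
            )
    return dist[-1][-1]
```

With no cost table this reduces to the classic Levenshtein distance; passing a table such as `{("k", "s"): 0.2}` makes that particular substitution cheaper, which is where learned costs would plug in.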

BLOG 0.9.1

by jxwuyi - April 27, 2015, 06:52:05 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 1394 views, 302 downloads, 3 subscriptions

About: Bayesian Logic (BLOG) is a probabilistic modeling language. It is designed for representing relations and uncertainties among real-world objects.


Initial Announcement on

Accord.NET Framework 2.14.0

by cesarsouza - December 9, 2014, 23:04:04 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 23448 views, 4729 downloads, 2 subscriptions

About: The Accord.NET Framework is a .NET machine learning framework combined with audio and image processing libraries completely written in C#. It is a complete framework for building production-grade computer vision, computer audition, signal processing and statistics applications even for commercial use. A comprehensive set of sample applications provides a quick start, and extensive online documentation helps fill in the details.


Adding a large number of new distributions, such as Anderson-Darling, Shapiro-Wilk, Inverse Chi-Square, Lévy, Folded Normal, Shifted Log-Logistic, Kumaraswamy, Trapezoidal, U-quadratic and BetaPrime distributions, Birnbaum-Saunders, Generalized Normal, Gumbel, Power Lognormal, Power Normal, Triangular, Tukey Lambda, Logistic, Hyperbolic Secant, Degenerate and General Continuous distributions.

Other additions include new statistical hypothesis tests such as Anderson-Darling and Shapiro-Wilk, support for all of LIBLINEAR's support vector machine algorithms, and format reading support for MATLAB/Octave matrices, LibSVM models, sparse LibSVM data files, and many others.

For a complete list of changes, please see the full release notes at the release details page at:

MLlib 0.8

by atalwalkar - October 10, 2013, 00:56:25 CET [ Project Homepage BibTeX Download ] 2948 views, 564 downloads, 1 subscription

About: MLlib provides a distributed machine learning (ML) library to address the growing need for scalable ML. MLlib is developed in Spark, a cluster computing system designed for iterative computation. Moreover, it is a component of a larger system called MLbase that aims to provide user-friendly distributed ML functionality both for ML researchers and domain experts. MLlib currently consists of scalable implementations of algorithms for classification, regression, collaborative filtering and clustering.


Initial Announcement on

Orange 2.6

by janez - February 14, 2013, 18:15:08 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 14937 views, 2840 downloads, 1 subscription

Rating: 4 out of 5 stars (based on 1 vote)

About: Orange is a component-based machine learning and data mining software. It includes a friendly yet powerful and flexible graphical user interface for visual programming. For more advanced use(r)s, [...]


The core of the system (except the GUI) no longer includes any GPL code and can be licensed under the terms of BSD upon request. The graphical part remains under GPL.

Changed the BibTeX reference to the paper recently published in JMLR MLOSS.

MLFlex 02-21-2012-00-12

by srp33 - April 3, 2012, 16:44:43 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 2995 views, 625 downloads, 1 subscription

About: Motivated by a need to classify high-dimensional, heterogeneous data from the bioinformatics domain, we developed ML-Flex, a machine-learning toolbox that enables users to perform two-class and multi-class classification analyses in a systematic yet flexible manner. ML-Flex was written in Java but is capable of interfacing with third-party packages written in other programming languages. It can handle multiple input-data formats and supports a variety of customizations. ML-Flex provides implementations of various validation strategies, which can be executed in parallel across multiple computing cores, processors, and nodes. Additionally, ML-Flex supports aggregating evidence across multiple algorithms and data sets via ensemble learning.


Initial Announcement on

SGD 2.0

by leonbottou - October 11, 2011, 20:59:41 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 12437 views, 1990 downloads, 5 subscriptions

Rating: 4 out of 5 stars (based on 2 votes)

About: The SGD-2.0 package contains implementations of the SGD and ASGD algorithms for linear SVMs and linear CRFs.


Version 2.0 features ASGD.
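
To illustrate what SGD and ASGD mean for a linear SVM, here is a minimal pure-Python sketch. It is not the SGD-2.0 package's code. It assumes hinge loss with L2 regularization, a decaying step size, and plain uniform averaging of the iterates from the first step, which is a simplification of how averaging is usually scheduled. The function names `asgd_linear_svm` and `predict` and all parameter names are hypothetical.

```python
import random


def asgd_linear_svm(samples, labels, lam=0.01, epochs=20, eta0=0.1, seed=0):
    """Train a linear SVM with SGD; return (last weights, averaged weights).

    samples: list of equal-length feature lists (append a constant 1.0
             feature if a bias term is wanted).
    labels:  list of +1 / -1 class labels.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    w = [0.0] * dim      # current SGD iterate
    w_avg = [0.0] * dim  # running average of the iterates (the "A" in ASGD)
    step = 0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for idx in order:
            x, y = samples[idx], labels[idx]
            eta = eta0 / (1.0 + lam * eta0 * step)  # decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # Gradient step on the L2 regularizer: shrink the weights.
            w = [wi * (1.0 - eta * lam) for wi in w]
            # Gradient step on the hinge loss, active only inside the margin.
            if margin < 1.0:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
            step += 1
            # Incremental update of the uniform average over all iterates.
            w_avg = [a + (wi - a) / step for a, wi in zip(w_avg, w)]
    return w, w_avg


def predict(w, x):
    """Classify x as +1 or -1 by the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```

Calling `asgd_linear_svm(samples, labels)` returns both weight vectors so the last iterate and the averaged one can be compared on the same data.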