Projects supporting the lucene data format.

Logo Apache Mahout 0.11.1

by gsingers - November 9, 2015, 16:12:06 CET [ Project Homepage BibTeX Download ] 22718 views, 5860 downloads, 3 subscriptions

About: Apache Mahout is an Apache Software Foundation project with the goal of creating both a community of users and a scalable, Java-based framework consisting of many machine learning algorithm [...]


Apache Mahout introduces a new math environment we call Samsara, for its theme of universal renewal. It reflects a fundamental rethinking of how scalable machine learning algorithms are built and customized. Mahout-Samsara is here to help people create their own math while providing some off-the-shelf algorithm implementations. At its core are general linear algebra and statistical operations along with the data structures to support them. You can use is as a library or customize it in Scala with Mahout-specific extensions that look something like R. Mahout-Samsara comes with an interactive shell that runs distributed operations on a Spark cluster. This make prototyping or task submission much easier and allows users to customize algorithms with a whole new degree of freedom. Mahout Algorithms include many new implementations built for speed on Mahout-Samsara. They run on Spark 1.3+ and some on H2O, which means as much as a 10x speed increase. You’ll find robust matrix decomposition algorithms as well as a Naive Bayes classifier and collaborative filtering. The new spark-itemsimilarity enables the next generation of cooccurrence recommenders that can use entire user click streams and context in making recommendations.

Logo Universal Java Matrix Package 0.3.0

by arndt - July 31, 2015, 14:23:14 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 14488 views, 2749 downloads, 3 subscriptions

About: The Universal Java Matrix Package (UJMP) is a data processing tool for Java. Unlike JAMA and Colt, it supports multi-threading and is therefore much faster on current hardware. It does not only support matrices with double values, but instead handles every type of data as a matrix through a common interface, e.g. CSV files, Excel files, images, WAVE audio files, tables in SQL data bases, and much more.


Updated to version 0.3.0