Projects that are tagged with large scale.


JMLR SHOGUN 2.1.0

by sonne - March 17, 2013, 13:59:34 CET; 41171 views, 8617 downloads, 4 subscriptions

Rating: 4 out of 5 stars
(based on 5 votes)

About: The SHOGUN machine learning toolbox focuses on large scale learning methods, in particular Support Vector Machines (SVM), and provides interfaces to Python, Octave, MATLAB, R and the command line.

Changes:

This release also contains several enhancements, cleanups and bugfixes:

Features

  • The linear-time MMD two-sample test now works on streaming features, which allows tests to be performed on unbounded amounts of data. A block size may be specified for fast processing. The features below were also added. By Heiko Strathmann.
  • It is now possible to ask streaming features to produce an instance of streamed features that are stored in memory and returned as a CFeatures* object of corresponding type. See CStreamingFeatures::get_streamed_features().
  • New concept of artificial data generator classes based on streaming features. The first implemented instances are CMeanShiftDataGenerator and CGaussianBlobsDataGenerator. Use the in-memory streamed features described above to get non-streaming data if desired.
  • Accelerated projected gradient multiclass logistic regression classifier by Sergey Lisitsyn.
  • New CCSOSVM based structured output solver by Viktor Gal
  • A collection of kernel selection methods for MMD-based kernel two-sample tests, including optimal kernel choice for single and combined kernels for the linear time MMD. This finishes the kernel MMD framework and also comes with new, more illustrative examples and tests. By Heiko Strathmann.
  • Alpha version of Perl modular interface developed by Christian Montanari.
  • New framework for unit tests based on googletest and googlemock by Viktor Gal. A (growing) number of unit tests now ensures the basic functionality of our framework. Since the examples no longer have to take this role, they should become more illustrative in the future.
  • Changed the core of dimension reduction algorithms to the Tapkee library.
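
The streaming, block-wise linear-time MMD mentioned in the feature list can be sketched as follows. This is a minimal pure-Python illustration of the standard linear-time estimator, not SHOGUN's implementation or API: the function names, the Gaussian kernel with fixed width, and the sample generator are all assumptions made for the example.

```python
import math
import random
from itertools import islice


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length sequences of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def linear_time_mmd(stream_p, stream_q, num_pairs, block_size=1000, sigma=1.0):
    """Estimate the squared MMD from two sample streams in linear time.

    Consumes 2 * num_pairs samples from each stream, one block at a time,
    so memory use depends on block_size and not on num_pairs.
    """
    total, done = 0.0, 0
    while done < num_pairs:
        n = min(block_size, num_pairs - done)
        xs = list(islice(stream_p, 2 * n))
        ys = list(islice(stream_q, 2 * n))
        if len(xs) < 2 * n or len(ys) < 2 * n:
            raise ValueError("stream ended before num_pairs were consumed")
        for i in range(0, 2 * n, 2):
            x1, x2, y1, y2 = xs[i], xs[i + 1], ys[i], ys[i + 1]
            # h = k(x1, x2) + k(y1, y2) - k(x1, y2) - k(x2, y1)
            total += (gaussian_kernel(x1, x2, sigma)
                      + gaussian_kernel(y1, y2, sigma)
                      - gaussian_kernel(x1, y2, sigma)
                      - gaussian_kernel(x2, y1, sigma))
        done += n
    return total / num_pairs


def gaussian_stream(mean, rng):
    """Hypothetical endless generator of 1-D Gaussian samples."""
    while True:
        yield [rng.gauss(mean, 1.0)]


rng = random.Random(0)
same = linear_time_mmd(gaussian_stream(0.0, rng), gaussian_stream(0.0, rng),
                       num_pairs=5000, block_size=512)
shifted = linear_time_mmd(gaussian_stream(0.0, rng), gaussian_stream(2.0, rng),
                          num_pairs=5000, block_size=512)
print(same, shifted)
```

The estimate stays close to zero when both streams share a distribution and becomes clearly positive under a mean shift. Because each sample is used in only one pair, the cost is linear in the number of samples, which is what makes the streaming setting workable.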

Bugfixes

  • Fix for shallow copy of gaussian kernel by Matt Aasted.
  • Fixed a bug when using StringFeatures along with kernel machines in cross-validation which caused an assertion error. Thanks to Eric (yoo)!
  • Fix, suggested by Oksana Bayda, for training MulticlassLibSVM in the 3-class case (reported by Arya Iranmehr).
  • Fix for wrong Spectrum mismatch RBF construction in static interfaces reported by Nona Kermani.
  • Fix for wrong include in SGMatrix causing build failure on Mac OS X (thanks to @bianjiang).
  • Fixed a bug that caused kernel machines to return nonsense when using custom kernel matrices with subsets attached to them.
  • Fix for parameter dictionary creation causing null pointer dereferences with Gaussian process parameter selection.
  • Fixed a bug in exact GP regression that caused wrong results.
  • Fixed a bug in exact GP regression that produced memory errors/crashes.
  • Fix for a bug with static interfaces causing all outputs to be -1/+1 instead of real scores (reported by Kamikawa Masahisa).

Cleanup and API Changes

  • SGStringList is now based on SGReferencedData.
  • "confidences" in context of CLabel and subclasses are now "values".
  • CLinearTimeMMD constructor changes, only streaming features allowed.
  • CDataGenerator will soon be removed and replaced by new streaming-based classes.
  • SGVector, SGMatrix, SGSparseVector, SGSparseMatrix refactoring: these classes now contain load/save routines and the relevant functions from CMath, and their implementations moved to .cpp files.

UniverSVM 1.22

by fabee - October 16, 2012, 11:24:12 CET; 12161 views, 1854 downloads, 0 subscriptions

About: UniverSVM is an SVM implementation written in C/C++. Its functionality comprises large scale transduction via CCCP optimization, sparse solutions via CCCP optimization and data-dependent [...]

Changes:

Minor changes: fixed a bug in the set_alphas_b0 function (thanks to Ferdinand Kaiser - ferdinand.kaiser@tut.fi)


Linear SVM with general regularization 1.0

by rflamary - October 5, 2012, 15:34:21 CET; 865 views, 238 downloads, 1 subscription

About: This package is an implementation of a linear SVM solver with a wide class of regularizations on the SVM weight vector (l1, l2, mixed norm l1-lq, adaptive lasso). We provide solvers for the classical single-task SVM problem and for the multi-task problem with joint feature selection or a similarity-promoting term.

Changes:

Initial Announcement on mloss.org.


SGD 2.0

by leonbottou - October 11, 2011, 20:59:41 CET; 7599 views, 1213 downloads, 5 subscriptions

Rating: 4 out of 5 stars
(based on 2 votes)

About: The SGD-2.0 package contains implementations of the SGD and ASGD algorithms for linear SVMs and linear CRFs.

Changes:

Version 2.0 features ASGD.
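
For readers unfamiliar with the two algorithms named in this entry, the sketch below trains a linear SVM (hinge loss with an L2 penalty) by plain SGD and also tracks the running average of the iterates, which is the idea behind ASGD. It is an illustration only and not the SGD-2.0 code: the function name, the step-size schedule and the default hyperparameters are assumptions chosen for the example.

```python
import random


def asgd_linear_svm(data, lam=1e-2, eta0=0.1, epochs=20, seed=0):
    """Train a linear SVM with SGD and return (last iterate, averaged iterate).

    data: list of (features, label) pairs with label in {-1, +1}.
    No separate bias term is used; append a constant feature if one is needed.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * dim        # current SGD iterate
    w_avg = [0.0] * dim    # running average of all iterates (ASGD)
    t = 0
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for idx in order:
            x, y = data[idx]
            eta = eta0 / (1.0 + lam * eta0 * t)  # assumed decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # Subgradient step on lam/2 * ||w||^2 + max(0, 1 - y * <w, x>)
            for j in range(dim):
                grad = lam * w[j] - (y * x[j] if margin < 1.0 else 0.0)
                w[j] -= eta * grad
            t += 1
            # Incremental mean: w_avg <- w_avg + (w - w_avg) / t
            for j in range(dim):
                w_avg[j] += (w[j] - w_avg[j]) / t
    return w, w_avg
```

Predictions use the sign of the inner product with either weight vector. Averaging typically smooths out the noise of the final SGD iterates, at the cost of one extra vector of state.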


Universal Java Matrix Package 0.2.5

by arndt - February 9, 2010, 15:55:23 CET; 8566 views, 1508 downloads, 1 subscription

About: The Universal Java Matrix Package (UJMP) is a data processing tool for Java. Unlike JAMA and Colt, it supports multi-threading and is therefore much faster on current hardware. It supports not only matrices with double values but handles every type of data as a matrix through a common interface, e.g. CSV files, Excel files, images, WAVE audio files, tables in SQL databases, and much more.

Changes:

Initial Announcement on mloss.org.


LASVM 1.1

by leonbottou - August 3, 2009, 15:50:30 CET; 7074 views, 1195 downloads, 0 subscriptions

About: Reference implementation of the LASVM online and active SVM algorithms as described in the JMLR paper. The interesting bit is a small C library that implements the LASVM process and reprocess [...]

Changes:

Minor bug fix


LibSGDQN 1.1

by antojne - July 2, 2009, 15:02:44 CET; 5053 views, 933 downloads, 1 subscription

About: LibSGDQN proposes an implementation of SGD-QN, a carefully designed quasi-Newton stochastic gradient descent solver for linear SVMs.

Changes:

small bug fix (thx nicolas ;)


Aleph 0.6

by jiria - January 12, 2009, 20:52:12 CET; 5284 views, 1626 downloads, 1 subscription

About: Aleph is both a multi-platform machine learning framework aimed at simplicity and performance, and a library of selected state-of-the-art algorithms.

Changes:

Initial Announcement on mloss.org.


Disco 0.1

by tuulos - October 6, 2008, 11:14:48 CET; 5137 views, 1022 downloads, 1 subscription

About: Disco is an open-source implementation of the [Map-Reduce framework](http://en.wikipedia.org/wiki/MapReduce) for distributed computing. Like the original framework, Disco supports parallel [...]

Changes:

Initial Announcement on mloss.org.
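
The map-reduce model referred to in this entry can be illustrated with a single-process word count. This is a generic sketch of the programming model only; it does not use Disco's API, and the names `map_reduce`, `word_mapper` and `count_reducer` are hypothetical.

```python
from collections import defaultdict


def map_reduce(inputs, mapper, reducer):
    """Run map, group-by-key and reduce in a single process."""
    groups = defaultdict(list)
    for record in inputs:
        # The mapper turns one input record into (key, value) pairs.
        for key, value in mapper(record):
            groups[key].append(value)
    # The reducer combines all values collected for one key.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)


counts = map_reduce(["large scale learning", "Large scale SVM"],
                    word_mapper, count_reducer)
print(counts)
```

In a distributed framework the map calls and the per-key reduce calls run on different machines; the grouping step in the middle is where data is shuffled between them.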


Sleipnir 1.0

by chuttenh - June 30, 2008, 03:22:19 CET; 4854 views, 982 downloads, 1 subscription

About: The Sleipnir C++ library implements a variety of machine learning and data manipulation algorithms focusing on heterogeneous data integration and efficiency for large biological data collections.

Changes:

Initial Announcement on mloss.org.


mSplicer 0.3

by sonne - May 18, 2008, 13:07:40 CET; 4743 views, 947 downloads, 3 subscriptions

Rating: 4 out of 5 stars
(based on 2 votes)

About: For modern biology, precise genome annotations are of prime importance as they allow the accurate definition of genic regions. We employ state of the art machine learning methods to assay and [...]

Changes:

Initial Announcement on mloss.org.


GPDT Gradient Projection Decomposition Technique 1.01

by sezaza - December 21, 2007, 20:10:43 CET; 7454 views, 1274 downloads, 1 subscription

Rating: 4 out of 5 stars
(based on 1 vote)

About: This is C++ software designed to train large-scale SVMs for binary classification. The algorithm is also implemented in parallel (**PGPDT**) for distributed memory, strictly coupled multiprocessor [...]

Changes:

Initial Announcement on mloss.org.


RapidMiner 4.0

by ingomierswa - November 16, 2007, 02:31:48 CET; 13039 views, 2289 downloads, 0 comments, 0 subscriptions

Rating: 5 out of 5 stars
(based on 5 votes)

About: RapidMiner (formerly YALE) is one of the most widely used open-source data mining suites and software solutions due to its leading-edge technologies and its functional range. Applications of [...]

Changes:

Initial Announcement on mloss.org.