Projects supporting the svmlight data format.
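
For orientation, the svmlight format stores one example per line as a label followed by sparse `index:value` pairs, optionally ending in a `#` comment. The sketch below shows one way to read such a line. The function name `parse_svmlight_line` is hypothetical and not taken from any project listed here, and the handling of `qid` tokens (used in ranking mode) is a simplifying assumption.

```python
def parse_svmlight_line(line):
    """Parse one svmlight-style line into (label, {index: value}).

    Returns None for blank lines and lines that contain only a comment.
    """
    # Everything after '#' is treated as a trailing comment.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None
    tokens = body.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        if index == "qid":
            # Query ids (ranking mode) are skipped in this sketch.
            continue
        features[int(index)] = float(value)
    return label, features


example = "-1 3:0.5 10:1 42:2.25 # a negative example"
label, features = parse_svmlight_line(example)
print(label, features)
```

Only the non-zero features are stored, which is why the format suits the large sparse datasets many of the projects below target.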


JMLR dlib ml 19.11

by davis685 - May 18, 2018, 04:19:52 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 449352 views, 84195 downloads, 0 subscriptions

About: This project is a C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems.

Changes:

This release adds a bunch of new image processing routines as well as many minor usability improvements and bug fixes.


Harry 0.4.2

by konrad - April 16, 2016, 10:50:38 CET [ Project Homepage BibTeX Download ] 39263 views, 9011 downloads, 0 subscriptions

About: A Tool for Measuring String Similarity

Changes:

This release fixes the incorrect implementation of the bag distance.
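
As background on the measure named in this changelog entry: the bag distance is commonly defined on the character multisets ("bags") of two strings as the larger of the two multiset differences. The sketch below follows that common definition. It is not Harry's implementation, and the function name `bag_distance` is hypothetical.

```python
from collections import Counter


def bag_distance(x, y):
    """Bag distance between two strings, using character multisets."""
    bag_x, bag_y = Counter(x), Counter(y)
    # Counter subtraction keeps only positive counts, i.e. the multiset difference.
    only_in_x = sum((bag_x - bag_y).values())
    only_in_y = sum((bag_y - bag_x).values())
    return max(only_in_x, only_in_y)


print(bag_distance("kitten", "sitting"))
```

Because it ignores character order, the measure is cheap to compute and never exceeds the Levenshtein distance, so it is often used as a fast lower bound.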


JaTeCS 1.0.0

by aesuli - April 5, 2016, 17:23:12 CET [ Project Homepage BibTeX Download ] 8795 views, 2220 downloads, 0 subscriptions

About: Jatecs is an open source Java library focused on automatic text categorization.

Changes:

Initial Announcement on mloss.org.


JMLR Sally 1.0.0

by konrad - March 26, 2015, 17:01:35 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 117564 views, 22622 downloads, 0 subscriptions

About: A Tool for Embedding Strings in Vector Spaces

Changes:

Added support for explicit selection of granularity. Several minor bug fixes. We have reached 1.0.


JMLR SHOGUN 4.0.0

by sonne - February 5, 2015, 09:09:37 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 219545 views, 35662 downloads, 0 subscriptions

Rating: 3 out of 5 stars
(based on 6 votes)

About: The SHOGUN machine learning toolbox focuses on large-scale learning methods, in particular Support Vector Machines (SVM), and provides interfaces to Python, Octave, MATLAB, R and the command line.

Changes:

This release features the work of our 8 GSoC 2014 students [student; mentors]:

  • OpenCV Integration and Computer Vision Applications [Abhijeet Kislay; Kevin Hughes]
  • Large-Scale Multi-Label Classification [Abinash Panda; Thoralf Klein]
  • Large-scale structured prediction with approximate inference [Jiaolong Xu; Shell Hu]
  • Essential Deep Learning Modules [Khaled Nasr; Sergey Lisitsyn, Theofanis Karaletsos]
  • Fundamental Machine Learning: decision trees, kernel density estimation [Parijat Mazumdar; Fernando Iglesias]
  • Shogun Missionary & Shogun in Education [Saurabh Mahindre; Heiko Strathmann]
  • Testing and Measuring Variable Interactions With Kernels [Soumyajit De; Dino Sejdinovic, Heiko Strathmann]
  • Variational Learning for Gaussian Processes [Wu Lin; Heiko Strathmann, Emtiyaz Khan]

It also contains several cleanups and bugfixes:

Features

  • New Shogun project description [Heiko Strathmann]
  • ID3 algorithm for decision tree learning [Parijat Mazumdar]
  • New modes for PCA matrix factorizations: SVD & EVD, in-place or reallocating [Parijat Mazumdar]
  • Add Neural Networks with linear, logistic and softmax neurons [Khaled Nasr]
  • Add kernel multiclass strategy examples in multiclass notebook [Saurabh Mahindre]
  • Add decision trees notebook containing examples for ID3 algorithm [Parijat Mazumdar]
  • Add sudoku recognizer ipython notebook [Alejandro Hernandez]
  • Add in-place subsets on features, labels, and custom kernels [Heiko Strathmann]
  • Add Principal Component Analysis notebook [Abhijeet Kislay]
  • Add Multiple Kernel Learning notebook [Saurabh Mahindre]
  • Add Multi-Label classes to enable Multi-Label classification [Thoralf Klein]
  • Add rectified linear neurons, dropout and max-norm regularization to neural networks [Khaled Nasr]
  • Add C4.5 algorithm for multiclass classification using decision trees [Parijat Mazumdar]
  • Add support for arbitrary acyclic graph-structured neural networks [Khaled Nasr]
  • Add CART algorithm for classification and regression using decision trees [Parijat Mazumdar]
  • Add CHAID algorithm for multiclass classification and regression using decision trees [Parijat Mazumdar]
  • Add Convolutional Neural Networks [Khaled Nasr]
  • Add Random Forests algorithm for ensemble learning using CART [Parijat Mazumdar]
  • Add Restricted Boltzmann Machines [Khaled Nasr]
  • Add Stochastic Gradient Boosting algorithm for ensemble learning [Parijat Mazumdar]
  • Add Deep contractive and denoising autoencoders [Khaled Nasr]
  • Add Deep belief networks [Khaled Nasr]

Bugfixes

  • Fix reference counting bugs in CList when reference counting is on [Heiko Strathmann, Thoralf Klein, lambday]
  • Fix memory problem in PCA::apply_to_feature_matrix [Parijat Mazumdar]
  • Fix crash in LeastAngleRegression for the case D greater than N [Parijat Mazumdar]
  • Fix memory violations in bundle method solvers [Thoralf Klein]
  • Fix failure in library_mldatahdf5.cpp example when http://mldata.org is not working properly [Parijat Mazumdar]
  • Fix memory leaks in Vowpal Wabbit, LibSVMFile and KernelPCA [Thoralf Klein]
  • Fix memory and control flow issues discovered by Coverity [Thoralf Klein]
  • Fix R modular interface SWIG typemap (Requires SWIG >= 2.0.5) [Matt Huska]

Cleanup and API Changes

  • PCA now depends on Eigen3 instead of LAPACK [Parijat Mazumdar]
  • Removing redundant and fixing implicit imports [Thoralf Klein]
  • Hide many methods from SWIG, reducing compile memory by 500MiB [Heiko Strathmann, Fernando Iglesias, Thoralf Klein]

Boosted Decision Trees and Lists 1.0.4

by melamed - July 25, 2014, 23:08:32 CET [ BibTeX Download ] 19278 views, 5572 downloads, 0 subscriptions

About: Boosting algorithms for classification and regression, with many variations. Features include: scalable and robust; easily customizable loss functions; one-shot training for an entire regularization path; continuous checkpointing; and much more.

Changes:
  • added ElasticNets as a regularization option
  • fixed some segfaults, memory leaks, and out-of-range errors that occurred in some corner cases
  • added a couple of I/O optimizations

JMLR MultiBoost 1.2.02

by busarobi - March 31, 2014, 16:13:04 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 91930 views, 15282 downloads, 0 subscriptions

About: MultiBoost is a multi-purpose boosting package implemented in C++. It is based on the multi-class/multi-task AdaBoost.MH algorithm [Schapire-Singer, 1999]. Basic base learners (stumps, trees, products, Haar filters for image processing) can be easily complemented by new data representations and the corresponding base learners, without interfering with the main boosting engine.

Changes:

Major changes:

  • The “early stopping” feature can now be based on any metric output with the --outputinfo command line argument.

  • Early stopping now works with the --slowresume command line argument.

Minor fixes:

  • More informative output when testing.

  • Fixed various compilation glitches with recent clang (OS X/Linux).


LIBOL 0.3.0

by stevenhoi - December 12, 2013, 15:26:14 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 41761 views, 13043 downloads, 0 subscriptions

About: LIBOL is an open-source library with a family of state-of-the-art online learning algorithms for machine learning and big data analytics research. The current version supports 16 online algorithms for binary classification and 13 online algorithms for multiclass classification.

Changes:

In contrast to our last version (V0.2.3), the new version (V0.3.0) has made some important changes as follows:

• Add a template and guide for adding new algorithms;

• Improve parameter settings and make documentation clear;

• Improve documentation on data formats and key functions;

• Amend the "OGD" function to use different loss types;

• Fix some naming inconsistencies and other minor bugs.


KMLib sparse GPU SVM 0.1

by ksopyla - March 20, 2013, 14:30:08 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 10542 views, 3026 downloads, 0 subscriptions

About: Support Vector Machine library in .NET with CUDA support. The library includes a GPU SVM solver for the linear, RBF, Chi-Square and Exp Chi-Square kernels, which uses NVIDIA CUDA technology. It allows classification of feature-rich sparse datasets by using the sparse matrix formats CSR, Ellpack-R or Sliced EllR-T.

Changes:

Initial Announcement on mloss.org.
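
To illustrate the CSR (compressed sparse row) format named in the description above, here is a minimal CPU-side sketch that stores only non-zero entries and computes a matrix-vector product, the core operation in kernel evaluation on sparse data. It is plain Python, not KMLib's .NET/CUDA code, and the helper names `dense_to_csr` and `csr_matvec` are hypothetical.

```python
def dense_to_csr(rows):
    """Convert a dense list-of-lists matrix into CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in values/col_idx.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by a dense vector x."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


matrix = [[1, 0, 2], [0, 0, 0], [0, 3, 0]]
values, col_idx, row_ptr = dense_to_csr(matrix)
print(values, col_idx, row_ptr)
print(csr_matvec(values, col_idx, row_ptr, [1, 1, 1]))
```

Ellpack-style formats pad rows to a common length instead, which tends to map better onto GPU memory access patterns at the cost of some wasted storage.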


pGBRT, Parallel Gradient Boosted Regression Trees 0.9

by swtyree - September 16, 2011, 22:15:46 CET [ Project Homepage BibTeX Download ] 21040 views, 3243 downloads, 0 subscriptions

About: Learns gradient boosted regression tree ensembles in parallel on shared memory or cluster systems

Changes:

Initial Announcement on mloss.org.


mldata-utils 0.5.0

by sonne - April 8, 2011, 10:02:44 CET [ Project Homepage BibTeX Download ] 79069 views, 17221 downloads, 0 subscriptions

About: Tools to convert datasets between various formats, performance measures, and API functions to communicate with mldata.org

Changes:
  • Change task file format, so that data splits can have a variable number of items and be put into up to 256 categories of training/validation/test/not used/...
  • Various bugfixes.

redsvd 0.1.0

by hillbig - August 30, 2010, 18:13:55 CET [ Project Homepage BibTeX Download ] 11455 views, 2762 downloads, 0 subscriptions

About: redsvd is a library for solving several matrix decompositions (SVD, PCA, eigenvalue decomposition). redsvd can handle very large matrices efficiently, and is optimized for a truncated SVD of sparse matrices. For example, redsvd can compute a truncated SVD with the top 20 singular values for a 100K x 100K matrix with 10M nonzero entries in about two seconds.

Changes:

Initial Announcement on mloss.org.


sofia ml 0.1

by dsculley - December 29, 2009, 23:30:58 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 14295 views, 2573 downloads, 0 comments, 0 subscriptions

About: A fast implementation of several stochastic gradient descent learners for classification, ranking, and ROC area optimization, suitable for large, sparse data sets. Includes Pegasos SVM, SGD-SVM, Passive-Aggressive Perceptron, Perceptron with Margins, Logistic Regression, and ROMMA. Commandline utility and API libraries are provided.

Changes:

Initial Announcement on mloss.org.
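
As a rough illustration of one of the learners named above, the following sketch implements the basic Pegasos stochastic sub-gradient update for a linear SVM on sparse examples. It follows the published update rule without the optional projection step. It is not sofia-ml's code, and the function name `pegasos_train`, its parameters, and the 0-based feature indices are assumptions made for the example.

```python
import random


def pegasos_train(examples, dim, lam=0.1, iterations=1000, seed=0):
    """Train a linear SVM with the basic Pegasos update.

    examples: list of (label, {index: value}) with label in {-1, +1}
              and 0-based feature indices below dim.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    for t in range(1, iterations + 1):
        y, x = rng.choice(examples)
        eta = 1.0 / (lam * t)  # step size decays as 1 / (lambda * t)
        # Margin is computed with the current weights, before shrinking.
        margin = y * sum(w[i] * v for i, v in x.items())
        scale = 1.0 - eta * lam  # shrinkage from the L2 regularizer
        w = [scale * wi for wi in w]
        if margin < 1.0:
            # Hinge loss is active: move towards the example.
            for i, v in x.items():
                w[i] += eta * y * v
    return w


data = [(1, {0: 1.0}), (1, {0: 2.0}), (-1, {1: 1.0}), (-1, {1: 2.0})]
weights = pegasos_train(data, dim=2)
print(weights)
```

Each step touches only the non-zero features of one example when applying the gradient, which is what makes this family of methods attractive for large sparse data. A production implementation would also avoid the dense rescaling of `w` on every step.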


Elefant 0.4

by kishorg - October 17, 2009, 08:48:19 CET [ Project Homepage BibTeX Download ] 35183 views, 11546 downloads, 0 subscriptions

Rating: 2.5 out of 5 stars
(based on 2 votes)

About: Elefant is an open source software platform for the Machine Learning community licensed under the Mozilla Public License (MPL) and developed using Python, C, and C++. We aim to make it the platform [...]

Changes:

This release contains the Stream module as a first step in the direction of providing C++ library support. Stream aims to be a software framework for the implementation of large scale online learning algorithms. Large scale, in this context, should be understood as something that does not fit in the memory of a standard desktop computer.

Added Bundle Methods for Regularized Risk Minimization (BMRM), allowing users to choose from a list of loss functions and solvers (linear and quadratic).

Added the following loss classes: BinaryClassificationLoss, HingeLoss, SquaredHingeLoss, ExponentialLoss, LogisticLoss, NoveltyLoss, LeastMeanSquareLoss, LeastAbsoluteDeviationLoss, QuantileRegressionLoss, EpsilonInsensitiveLoss, HuberRobustLoss, PoissonRegressionLoss, MultiClassLoss, WinnerTakesAllMultiClassLoss, ScaledSoftMarginMultiClassLoss, SoftmaxMultiClassLoss, MultivariateRegressionLoss

The graphical user interface now provides extensive documentation for each component, explaining state variables and port descriptions.

Changed saving and loading of experiments to XML (thereby avoiding storage of large input data structures).

Unified automatic input checking via new static typing extending Python properties.

Full support for recursive composition of larger components containing arbitrary statically typed state variables.


Dirichlet Forest LDA 0.1.1

by davidandrzej - July 16, 2009, 21:59:53 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 12857 views, 2685 downloads, 0 subscriptions

About: This software implements the Dirichlet Forest (DF) Prior within the Latent Dirichlet Allocation (LDA) model. When combined with LDA, the Dirichlet Forest Prior allows the user to encode domain knowledge (must-links and cannot-links between words) into the prior on topic-word multinomials.

Changes:

Initial Announcement on mloss.org.


LibSGDQN 1.1

by antojne - July 2, 2009, 15:02:44 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 18985 views, 3872 downloads, 0 subscriptions

About: LibSGDQN provides an implementation of SGD-QN, a carefully designed quasi-Newton stochastic gradient descent solver for linear SVMs.

Changes:

small bug fix (thx nicolas ;)


OLaRankGreedy 1.0

by antojne - June 24, 2009, 17:07:57 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 11951 views, 2595 downloads, 0 subscriptions

About: OLaRankGreedy is an online solver of the dual formulation of support vector machines for sequence labeling using greedy inference.

Changes:

Initial Announcement on mloss.org.


OLaRankExact 1.0

by antojne - June 24, 2009, 17:03:48 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 11666 views, 2661 downloads, 0 subscriptions

About: OLaRank is an online solver of the dual formulation of support vector machines for sequence labeling using Viterbi decoding.

Changes:

Initial Announcement on mloss.org.
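
For readers unfamiliar with the exact inference step mentioned in this entry, the sketch below shows generic Viterbi decoding over additive scores: it finds the highest-scoring label sequence given per-position label scores and label-to-label transition scores. It is a textbook formulation, not OLaRank's code. The function name `viterbi` and the score layout are assumptions, and the input is assumed to be non-empty.

```python
def viterbi(emission, transition):
    """Return (best_path, best_score) for additive sequence scores.

    emission[t][j]   : score of label j at position t
    transition[i][j] : score of moving from label i to label j
    """
    n_labels = len(transition)
    best = list(emission[0])  # best score of any path ending in each label
    back = []                 # back[t][j] = best previous label for j
    for scores in emission[1:]:
        prev = best
        best, pointers = [], []
        for j in range(n_labels):
            i_best = max(range(n_labels),
                         key=lambda i: prev[i] + transition[i][j])
            best.append(prev[i_best] + transition[i_best][j] + scores[j])
            pointers.append(i_best)
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    last = max(range(n_labels), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


emission = [[1, 0], [0, 1], [1, 0]]
print(viterbi(emission, [[0, 0], [0, 0]]))
print(viterbi(emission, [[0, -5], [-5, 0]]))
```

Greedy inference, as used in the OLaRankGreedy entry above, instead picks each label in turn given the previous choice, trading exactness for speed.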


BMRM 2.1

by chteo - May 8, 2009, 08:08:20 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 14556 views, 3177 downloads, 0 subscriptions

About: BMRM is an open source, modular and scalable convex solver for many machine learning problems cast as regularized risk minimization problems.

Changes:

Initial Announcement on mloss.org.


CoFiRank 0.1

by alexis - March 30, 2009, 17:17:34 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 14146 views, 2987 downloads, 0 subscriptions

About: CoFiRank is a Collaborative Filtering system based on matrix factorization. CoFiRank is based on the idea that it is better to predict the relative order of preferences (ranking) instead of the absolute rating.

Changes:

Initial Announcement on mloss.org.