Project details for MLPACK

JMLR MLPACK 2.0.0

by rcurtin - January 11, 2016, 17:24:35 CET

Overall: 4.5 of 5 stars
Features: 5 of 5 stars
Usability: 5 of 5 stars
Documentation: 4 of 5 stars
(based on 1 vote)

Description:

mlpack is a scalable C++ machine learning library. Its aim is to make large-scale machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and maximum flexibility for expert users.

The following methods are provided:

  • Collaborative Filtering (with NMF)
  • Decision Stumps
  • Density Estimation Trees
  • Euclidean Minimum Spanning Trees
  • Fast Exact Max-Kernel Search (FastMKS)
  • Gaussian Mixture Models (GMMs)
  • Hidden Markov Models (HMMs)
  • Hoeffding trees (streaming decision trees)
  • Kernel Principal Components Analysis (KPCA)
  • K-Means Clustering
  • Least-Angle Regression (LARS/LASSO)
  • Local Coordinate Coding
  • Locality-Sensitive Hashing (LSH)
  • Logistic regression
  • Naive Bayes Classifier
  • Neighborhood Components Analysis (NCA)
  • Nonnegative Matrix Factorization (NMF)
  • Perceptron
  • Principal Components Analysis (PCA)
  • QUIC-SVD
  • RADICAL (ICA)
  • Regularized SVD
  • Rank-Approximate Nearest Neighbor (RANN)
  • Simple Least-Squares Linear Regression (and Ridge Regression)
  • Sparse Autoencoder
  • Sparse Coding
  • Tree-based Neighbor Search (all-k-nearest-neighbors, all-k-furthest-neighbors), using either kd-trees or cover trees
  • Tree-based Range Search

Command-line executables are provided for each of these, and the C++ classes which define the methods are highly flexible, extensible, and modular. More information (including documentation, tutorials, and bug reports) is available at http://www.mlpack.org/.

Changes to previous version:
  • Removed overclustering support from k-means because it is not well-tested, may be buggy, and is (I think) unused. If this was support you were using, open a bug or get in touch with us; it would not be hard for us to reimplement it.
  • Refactored KMeans to allow different types of Lloyd iterations.
  • Added implementations of k-means: Elkan's algorithm, Hamerly's algorithm, Pelleg-Moore's algorithm, and the DTNN (dual-tree nearest neighbor) algorithm.
  • Significant acceleration of LRSDP via the use of accu(a % b) instead of trace(a * b).
  • Added MatrixCompletion class (matrix_completion), which performs nuclear norm minimization to fill unknown values of an input matrix.
  • No more dependence on Boost.Random; now we use C++11 STL random support.
  • Add softmax regression, contributed by Siddharth Agrawal and QiaoAn Chen.
  • Changed NeighborSearch, RangeSearch, FastMKS, LSH, and RASearch API; these classes now take the query sets in the Search() method, instead of in the constructor.
  • Use OpenMP, if available. For now OpenMP support is only available in the DET training code.
  • Add support for predicting new test point values to LARS and the command-line 'lars' program.
  • Add serialization support for Perceptron and LogisticRegression.
  • Refactor SoftmaxRegression to predict into an arma::Row object, and add a softmax_regression program.
  • Refactor LSH to allow loading and saving of models.
  • ToString() is removed entirely (#487).
  • Add --input_model_file and --output_model_file options to appropriate machine learning algorithms.
  • Rename all executables to start with an "mlpack" prefix (#229).
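
The LRSDP speedup listed above rests on a standard identity: trace(a * b) sums a(i, k) * b(k, i), while accu(a % b) (Armadillo's sum of the element-wise product) sums a(i, k) * b(i, k). The two agree whenever b (or a) is symmetric, which is assumed here to hold for the matrices involved in LRSDP. The following is a minimal sketch in plain C++, without Armadillo; the `Matrix` alias and both function names are hypothetical and are not part of mlpack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense square matrix stored row by row; an illustrative stand-in for arma::mat.
using Matrix = std::vector<std::vector<double>>;

// trace(a * b) the expensive way: build the full product, then sum its
// diagonal. This costs O(n^3) operations and O(n^2) temporary storage.
double TraceOfProduct(const Matrix& a, const Matrix& b)
{
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  Matrix product(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t k = 0; k < n; ++k)
        product[i][j] += a[i][k] * b[k][j];

  double trace = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    trace += product[i][i];
  return trace;
}

// The counterpart of accu(a % b): sum of element-wise products.
// This costs O(n^2) operations and needs no temporary matrix.
double SumOfElementwiseProduct(const Matrix& a, const Matrix& b)
{
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      sum += a[i][j] * b[i][j];
  return sum;
}
```

For a = [[1, 2], [3, 4]] and the symmetric b = [[5, 6], [6, 7]], both functions return 63. With the non-symmetric b = [[5, 6], [7, 8]] they return 69 and 70 respectively, so the shortcut is only valid under the symmetry assumption.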

See also https://mailman.cc.gatech.edu/pipermail/mlpack/2015-December/000706.html for more information.
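
The change to NeighborSearch, RangeSearch, FastMKS, LSH, and RASearch listed above moves the query set from the constructor to Search(). The pattern can be pictured with the toy class below. It is a brute-force sketch in plain C++ written only for illustration: `ToyNeighborSearch`, its `Search()` signature, and the `Point` alias are hypothetical and do not reproduce mlpack's actual classes or signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::vector<double>;

// Hypothetical stand-in, NOT mlpack's NeighborSearch. The reference set is
// fixed at construction; each call to Search() supplies its own query set.
class ToyNeighborSearch
{
 public:
  explicit ToyNeighborSearch(std::vector<Point> refs)
      : referenceSet(std::move(refs)) {}

  // For every query point, store the indices of its k nearest reference
  // points in neighbors[q], nearest first (brute force, squared Euclidean).
  void Search(const std::vector<Point>& querySet,
              std::size_t k,
              std::vector<std::vector<std::size_t>>& neighbors) const
  {
    assert(k <= referenceSet.size());
    neighbors.assign(querySet.size(), std::vector<std::size_t>());
    for (std::size_t q = 0; q < querySet.size(); ++q)
    {
      // Pair each reference index with its distance to the query point.
      std::vector<std::pair<double, std::size_t>> scored;
      for (std::size_t r = 0; r < referenceSet.size(); ++r)
        scored.emplace_back(SquaredDistance(querySet[q], referenceSet[r]), r);

      // Only the k smallest distances need to be in sorted order.
      std::partial_sort(scored.begin(), scored.begin() + k, scored.end());
      for (std::size_t i = 0; i < k; ++i)
        neighbors[q].push_back(scored[i].second);
    }
  }

 private:
  static double SquaredDistance(const Point& a, const Point& b)
  {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d)
      sum += (a[d] - b[d]) * (a[d] - b[d]);
    return sum;
  }

  std::vector<Point> referenceSet;
};
```

With this shape, one object built on a reference set can answer any number of query sets by calling Search() repeatedly, without being reconstructed for each one.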

Supported Operating Systems: Platform Independent
Data Formats: Plain Ascii, Ascii, Txt, Hdf, Bin, Csv, Xml
Tags: Gmm, Hmm, Machine Learning, Sparse, Dual Tree, Fast, Scalable, Tree

Comments

Eileen (on February 13, 2009, 12:13:23)
I'm having this problem when running fl-build-all:

/bin/sh: g++4: not found
make: *** [$FASTLIBPATH/bin/i686_Linux_fast_gcc4_-DDISABLE_DISK_MATRIX/obj/mlpack_allnn_main.o] Error 127

and a whole lot of similar errors. Am I missing something?
fastlab (on February 14, 2009, 03:55:05)
You need to install gcc 4. Which platform are you running on?
Paul Rodriguez (on December 21, 2010, 21:38:24)
Hi, I've set up the ccmake configuration options as appropriate, but now I'm having trouble with the make command described below. Thanks, Paul Rodriguez

Using a santos linux, on an Intel 64-bit processor, when I execute "make install" I get the following error regarding pthread_atfork:

-- A library with BLAS API found.
-- A library with BLAS API found.
-- A library with LAPACK API found.
-- Configuring done
-- Generating done
-- Build files have been written to: /users/sdsc/prodriguez/mlpack-0.2/fastlib/build
[ 2%] Built target template_types
[ 5%] Built target template_types_detect
[ 17%] Built target base
[ 20%] Built target col
[ 23%] Built target file
[ 30%] Built target fx
[ 33%] Built target la
[ 35%] Built target data
[ 35%] Built target tree
[ 43%] Built target math
[ 46%] Built target par
[ 87%] Built target fastlib
[ 89%] Built target otrav_test
[ 92%] Built target col_test
[ 94%] Building CXX object fastlib/data/CMakeFiles/dataset_test.dir/dataset_test.cc.o
Linking CXX executable dataset_test
/rmount/usr_apps/compilers/intel/Compiler/11.1/038/lib/intel64/libguide.so: undefined reference to `pthread_atfork'
collect2: ld returned 1 exit status
make[2]: *** [fastlib/data/dataset_test] Error 1
make[1]: *** [fastlib/data/CMakeFiles/dataset_test.dir/all] Error 2
make: *** [all] Error 2
Andreas Mueller (on March 20, 2012, 13:29:07)
Two comments:

1) I have not found a way to contact the project on the project website. Having to come to mloss and log in to contact the developers seems a bit weird.

2) mlpack does not seem to build with Armadillo in a non-standard location. After trying to feed cmake the correct paths for a while, I gave up and installed it globally. In particular, setting the paths in the CMake configuration doesn't help much. It would be cool if you could fix that.

Cheers, Andy
Ryan Curtin (on March 20, 2012, 20:22:49)
Hello Andy,

I've clarified www.mlpack.org a bit to note that the Trac site is where bugs can be filed. As for finding Armadillo, I have not had a problem doing the following (in this instance, I've got Armadillo 2.99.1 built in /home/ryan/src/armadillo-2.99.1/):

`build$ cmake -D ARMADILLO_INCLUDE_DIR=/home/ryan/src/armadillo-2.99.1/build/ -D ARMADILLO_LIBRARY=/home/ryan/src/armadillo-2.99.1/libarmadillo.so ../`

Did those two variables (ARMADILLO_INCLUDE_DIR and ARMADILLO_LIBRARY) not work for you? If you're still having problems (or have other problems), feel free to file a ticket at http://trac.research.cc.gatech.edu/fastlab/
