- Description:
mlpack is a scalable C++ machine learning library. Its aim is to make large-scale machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and maximum flexibility for expert users.
The following methods are provided:
- Collaborative Filtering (with NMF)
- Decision Stumps
- Density Estimation Trees
- Euclidean Minimum Spanning Trees
- Fast Exact Max-Kernel Search (FastMKS)
- Gaussian Mixture Models (GMMs)
- Hidden Markov Models (HMMs)
- Hoeffding trees (streaming decision trees)
- Kernel Principal Components Analysis (KPCA)
- K-Means Clustering
- Least-Angle Regression (LARS/LASSO)
- Local Coordinate Coding
- Locality-Sensitive Hashing (LSH)
- Logistic regression
- Naive Bayes Classifier
- Neighborhood Components Analysis (NCA)
- Nonnegative Matrix Factorization (NMF)
- Perceptron
- Principal Components Analysis (PCA)
- QUIC-SVD
- RADICAL (ICA)
- Regularized SVD
- Rank-Approximate Nearest Neighbor (RANN)
- Simple Least-Squares Linear Regression (and Ridge Regression)
- Sparse Autoencoder
- Sparse Coding
- Tree-based Neighbor Search (all-k-nearest-neighbors, all-k-furthest-neighbors), using either kd-trees or cover trees
- Tree-based Range Search
Command-line executables are provided for each of these, and the C++ classes which define the methods are highly flexible, extensible, and modular. More information (including documentation, tutorials, and bug reports) is available at http://www.mlpack.org/.
- Changes to previous version:
- Removed overclustering support from k-means because it is not well-tested, may be buggy, and is (I think) unused. If this was support you were using, open a bug or get in touch with us; it would not be hard for us to reimplement it.
- Refactored KMeans to allow different types of Lloyd iterations.
- Added implementations of k-means: Elkan's algorithm, Hamerly's algorithm, Pelleg-Moore's algorithm, and the DTNN (dual-tree nearest neighbor) algorithm.
- Significant acceleration of LRSDP via the use of accu(a % b) instead of trace(a * b).
- Added MatrixCompletion class (matrix_completion), which performs nuclear norm minimization to fill unknown values of an input matrix.
- No more dependence on Boost.Random; now we use C++11 STL random support.
- Add softmax regression, contributed by Siddharth Agrawal and QiaoAn Chen.
- Changed NeighborSearch, RangeSearch, FastMKS, LSH, and RASearch API; these classes now take the query sets in the Search() method, instead of in the constructor.
- Use OpenMP, if available. For now OpenMP support is only available in the DET training code.
- Add support for predicting new test point values to LARS and the command-line 'lars' program.
- Add serialization support for Perceptron and LogisticRegression.
- Refactor SoftmaxRegression to predict into an arma::Row object, and add a softmax_regression program.
- Refactor LSH to allow loading and saving of models.
- ToString() is removed entirely (#487).
- Add --input_model_file and --output_model_file options to appropriate machine learning algorithms.
- Rename all executables to start with an "mlpack" prefix (#229).
See also https://mailman.cc.gatech.edu/pipermail/mlpack/2015-December/000706.html for more information.
- Supported Operating Systems: Platform Independent
- Data Formats: Plain Ascii, Ascii, Txt, Hdf, Bin, Csv, Xml
- Tags: Gmm, Hmm, Machine Learning, Sparse, Dual Tree, Fast, Scalable, Tree
Comments
-
- Eileen (on February 13, 2009, 12:13:23)
- I'm having this problem when running fl-build-all:
  /bin/sh: g++4: not found
  make: *** [$FASTLIBPATH/bin/i686_Linux_fast_gcc4_-DDISABLE_DISK_MATRIX/obj/mlpack_allnn_main.o] Error 127
  and a whole lot of similar errors. Am I missing something?
-
- fastlab (on February 14, 2009, 03:55:05)
- You need to install gcc 4. Which platform are you running on?
-
- Paul Rodriguez (on December 21, 2010, 21:38:24)
- Hi, I've set up the ccmake configuration options as appropriate, but now I'm having trouble with the make command, as described below. Thanks, Paul Rodriguez
  Using a santos linux, on an Intel 64-bit processor, when I execute "make install" I get the following error regarding pthread_atfork:
  -- A library with BLAS API found.
  -- A library with BLAS API found.
  -- A library with LAPACK API found.
  -- Configuring done
  -- Generating done
  -- Build files have been written to: /users/sdsc/prodriguez/mlpack-0.2/fastlib/build
  [ 2%] Built target template_types
  [ 5%] Built target template_types_detect
  [ 17%] Built target base
  [ 20%] Built target col
  [ 23%] Built target file
  [ 30%] Built target fx
  [ 33%] Built target la
  [ 35%] Built target data
  [ 35%] Built target tree
  [ 43%] Built target math
  [ 46%] Built target par
  [ 87%] Built target fastlib
  [ 89%] Built target otrav_test
  [ 92%] Built target col_test
  [ 94%] Building CXX object fastlib/data/CMakeFiles/dataset_test.dir/dataset_test.cc.o
  Linking CXX executable dataset_test
  /rmount/usr_apps/compilers/intel/Compiler/11.1/038/lib/intel64/libguide.so: undefined reference to `pthread_atfork'
  collect2: ld returned 1 exit status
  make[2]: *** [fastlib/data/dataset_test] Error 1
  make[1]: *** [fastlib/data/CMakeFiles/dataset_test.dir/all] Error 2
  make: *** [all] Error 2
-
- Andreas Mueller (on March 20, 2012, 13:29:07)
- Two comments: 1) I have not found a way to contact the project on the project website. Having to come to mloss and log in to contact the developers seems a bit weird. 2) mlpack does not seem to build with Armadillo in a non-standard location. After trying to feed cmake the correct paths for a while I gave up and installed it globally. In particular, setting the paths in the CMake configuration doesn't help much. Would be cool if you could fix that. Cheers, Andy
-
- Ryan Curtin (on March 20, 2012, 20:22:49)
- Hello Andy, I've clarified www.mlpack.org a bit to note that the Trac site is where bugs can be filed. As for finding Armadillo, I have not had a problem doing the following (in this instance, I've got Armadillo 2.99.1 built in /home/ryan/src/armadillo-2.99.1/) `build$ cmake -D ARMADILLO_INCLUDE_DIR=/home/ryan/src/armadillo-2.99.1/build/ -D ARMADILLO_LIBRARY=/home/ryan/src/armadillo-2.99.1/libarmadillo.so ../` Did those two variables (ARMADILLO_INCLUDE_DIR and ARMADILLO_LIBRARY) not work for you? If you're still having problems (or have other problems) feel free to file a ticket at http://trac.research.cc.gatech.edu/fastlab/