- Description:
Milk is a machine learning toolkit in Python.
Its focus is on supervised classification, with several classifiers available: SVMs (based on libsvm), k-NN, random forests, and decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems. It works with many data types, with a preference for numpy arrays.
For unsupervised learning, milk supports k-means clustering and affinity propagation.
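For a quick sense of the interface, here is a minimal usage sketch of both the supervised and unsupervised sides. It assumes milk's train/apply pattern and the milk.defaultclassifier and milk.kmeans entry points; consult the project documentation for the exact API of your version.

```python
import numpy as np
import milk

# Toy data: two classes of 10 samples, 5 features each.
features = np.random.rand(20, 5)
features[10:] += 0.75          # shift the second class so it is separable
labels = np.repeat([0, 1], 10)

# Supervised classification: learn a model, then apply it to a new sample.
learner = milk.defaultclassifier()
model = learner.train(features, labels)
print(model.apply(np.random.rand(5)))

# Unsupervised learning: k-means clustering into two groups.
cluster_ids, centroids = milk.kmeans(features, 2)
print(cluster_ids)
```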
- Changes to previous version:
Added LASSO (using coordinate descent optimization). Made SVM classification (both learning and applying) much faster: a 2.5x speedup on the yeast UCI dataset.
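The LASSO addition refers to coordinate descent optimization. As a rough illustration of how that algorithm works (this is not milk's actual implementation; the names lasso_coordinate_descent and soft_threshold are hypothetical), a NumPy sketch of cyclic coordinate descent with soft-thresholding:

```python
import numpy as np

def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_coordinate_descent(X, y, alpha=0.1, n_iter=100):
    """Minimize (1/2n)||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0) / n_samples   # per-coordinate curvature
    for _ in range(n_iter):
        for j in range(n_features):
            # Residual with the contribution of coordinate j removed.
            residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n_samples
            w[j] = soft_threshold(rho, alpha) / col_sq[j] if col_sq[j] else 0.0
    return w

# Small synthetic check: recover a sparse weight vector.
rng = np.random.RandomState(0)
X = rng.randn(50, 10)
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.0, 0.5]
y = X @ true_w + 0.01 * rng.randn(50)
print(lasso_coordinate_descent(X, y, alpha=0.05))
```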
- Supported Operating Systems: Agnostic
- Data Formats: Agnostic
- Tags: Python, SVM, Feature Selection, K-means, Decision Tree Learning, Random Forests, Supervised, libsvm, Affinity Propagation, Nonnegative Matrix Factorization
Other available revisions
- 0.5 (November 7, 2012, 13:08:28): Added LASSO (using coordinate descent optimization). Made SVM classification (learning and applying) much faster: a 2.5x speedup on the yeast UCI dataset.
- 0.4.3 (September 23, 2012, 14:31:44): Multiprocessing support (for cross-validation); more ways of handling multi-label problems (error-correcting output codes, trees); very significant performance increases. Many bug fixes.
- 0.3.10 (May 11, 2011, 04:18:53): Added a new module, milk.ext.jugparallel, to interface with jug (http://luispedro.org/software/jug); this makes it easy to parallelise things such as n-fold cross-validation (each fold runs on its own processor) or multiple k-means random starts. Added new functions: measures.curves.precision_recall and milk.unsupervised.kmeans.select_best.kmeans. Fixed a tricky bug in SDA and a few minor issues elsewhere.
- 0.3.9 (March 18, 2011, 22:45:11): Speed improvements and bug fixes. Added a folds argument to nfoldcrossvalidation and an assign_centroids function.
- 0.3.8 (February 12, 2011, 23:53:03): Fixed compilation on Windows.
- 0.3.7 (February 10, 2011, 15:32:54): Logistic regression; source demos included (in source and documentation); added cluster agreement metrics; fixed an nfoldcrossvalidation bug when using origins.
- 0.3.6 (December 20, 2010, 19:04:15): Unsupervised (1-class) kernel density modeling; fix for when SDA returns empty; weights option for some learners; stump learner; AdaBoost (a result of the above changes).
- 0.3.5 (November 4, 2010, 05:25:35): Fixes for 64-bit machines.
- 0.3.4 (November 1, 2010, 02:01:11): Random forest learners; decision trees sped up 20x; much faster grid search (finds the optimum without computing all folds).
- 0.3.1 (September 26, 2010, 23:46:27): Fixed sparse non-negative matrix factorisation; mean grouped classifier; updated the multi classifier to the newer interface; documentation and testing fixes.
- 0.3 (September 24, 2010, 01:24:30): No scipy.weave dependency; flatter namespace; faster k-means; affinity propagation (borrowed from scikits-learn and slightly improved to take less memory and time); pdist(); more documentation.
- 0.2 (May 21, 2010, 22:05:04): Cleaned up and tested code. Removed some dependencies. Better documentation. Changed the classification interface to separate model learning from model usage.
- alpha-1 (December 17, 2009, 18:44:18): Improved performance. Removed files from the distribution that were mistakenly included.
- alpha-0 (November 24, 2009, 00:16:42): Initial announcement on mloss.org.