Project details for Milk

Milk 0.5

by luispedro - November 7, 2012, 13:08:28 CET

Ratings (based on 2 votes):
Overall: 3 of 5 stars
Features: 2.5 of 5 stars
Usability: 3 of 5 stars
Documentation: 3.5 of 5 stars

Milk is a machine learning toolkit in Python.

Its focus is on supervised classification, with several classifiers available: SVMs (based on libsvm), k-NN, random forests, and decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems. It works over many datatypes, with a preference for numpy arrays.
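As a rough illustration of the supervised workflow, the sketch below shows a classifier in which learning and use are separate steps: a learner is trained on labelled data and returns a model, and the model is then applied to new examples. It is written in plain Python and does not call milk. The class names CentroidLearner and CentroidModel, the method names train and apply, and the nearest-centroid rule are all assumptions chosen for illustration.

```python
# Hypothetical sketch, not milk's API: a learner whose train() returns a
# separate model object, which is then applied to new examples.

class CentroidModel:
    """Fitted model: stores one mean vector per label."""

    def __init__(self, centroids):
        self.centroids = centroids  # dict: label -> list of floats

    def apply(self, x):
        # Return the label whose centroid is closest (squared Euclidean distance).
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: dist2(self.centroids[lab]))


class CentroidLearner:
    """Untrained learner: holds no data, only knows how to fit a model."""

    def train(self, features, labels):
        sums, counts = {}, {}
        for x, y in zip(features, labels):
            if y not in sums:
                sums[y] = [0.0] * len(x)
                counts[y] = 0
            sums[y] = [s + v for s, v in zip(sums[y], x)]
            counts[y] += 1
        centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}
        return CentroidModel(centroids)


# Toy data (made up for this example): two well-separated classes.
features = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
labels = [0, 0, 1, 1]

model = CentroidLearner().train(features, labels)
prediction = model.apply([0.8, 1.0])
```

Because the trained model is a separate object from the learner, it can be stored or passed on to another stage, which is one way classifiers can be combined into larger systems.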

For unsupervised learning, milk supports k-means clustering and affinity propagation.
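For readers unfamiliar with k-means, here is a minimal plain-Python sketch of the standard algorithm (alternating assignment and centroid update). It does not use milk and is not its implementation. The function name kmeans, its signature, and the choice to initialise centroids from the first k points are assumptions made to keep the example deterministic.

```python
def kmeans(points, k, max_iter=100):
    """Plain k-means sketch; returns (assignments, centroids)."""
    # Assumption: initialise centroids from the first k points (deterministic).
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the procedure has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Toy data (made up for this example): two obvious groups.
points = [[0.0, 0.0], [5.0, 5.0], [0.2, 0.1], [5.1, 4.9], [0.1, 0.3]]
assignments, centroids = kmeans(points, 2)
```

In practice k-means is usually run from several random starting points and the best result kept, since a single initialisation can converge to a poor local optimum.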

Changes to previous version:

Added LASSO (using coordinate descent optimization). Made SVM classification (learning and applying) much faster: 2.5x speedup on yeast UCI dataset.
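To make "LASSO using coordinate descent" concrete, the following is a minimal plain-Python sketch of the textbook method. It is not milk's implementation. It assumes the objective (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 and updates one coefficient at a time with soft-thresholding. The names lasso_cd and soft_threshold, and the fixed iteration count, are illustrative choices.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; exactly zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            # Closed-form minimiser over w[j] with all other weights fixed.
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


# Toy data (made up): the second feature is only weakly related to y,
# so the L1 penalty drives its coefficient to exactly zero.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
w = lasso_cd(X, y, lam=0.1)
```

The soft-thresholding step is what produces sparse solutions: a coefficient whose correlation with the residual falls below lam is set to exactly zero rather than merely shrunk.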

Supported Operating Systems: Agnostic
Data Formats: None, Agnostic
Tags: Python, Svm, Feature Selection, Kmeans, Decision Tree Learning, Random Forests, Supervised, Libsvm, Affinity Propagation, Nonnegative Matrix Factorization

Other available revisions

Version Changelog Date

Added LASSO (using coordinate descent optimization). Made SVM classification (learning and applying) much faster: 2.5x speedup on yeast UCI dataset.

November 7, 2012, 13:08:28

Multiprocessing support (for cross-validation); more ways of handling multi-label problems (error-correcting output codes, trees); very significant performance increases. Many bug fixes.

September 23, 2012, 14:31:44
  • Added a new module, milk.ext.jugparallel, to interface with jug. This makes it easy to parallelise things such as n-fold cross-validation (each fold runs on its own processor) or multiple k-means random starts.

  • Added some new functions: measures.curves.precision_recall, milk.unsupervised.kmeans.select_best_kmeans.

  • Fixed a tricky bug in SDA and a few minor issues elsewhere

May 11, 2011, 04:18:53

Speed improvements. Bug fixes. Added a folds argument to nfoldcrossvalidation. Added the assign_centroids function.

March 18, 2011, 22:45:11

Fix compilation on Windows.

February 12, 2011, 23:53:03
  • Logistic regression
  • Source demos included (in source and documentation)
  • Add cluster agreement metrics
  • Fix nfoldcrossvalidation bug when using origins
February 10, 2011, 15:32:54
  • Unsupervised (1-class) kernel density modeling
  • Fix for when SDA returns empty
  • weights option to some learners
  • stump learner
  • Adaboost (result of above changes)
December 20, 2010, 19:04:15
  • fixes for 64-bit machines
November 4, 2010, 05:25:35
  • Random forest learners
  • Decision trees sped up 20x
  • Much faster gridsearch (finds optimum without computing all folds)
November 1, 2010, 02:01:11
  • fix sparse non-negative matrix factorisation
  • mean grouped classifier
  • update multi classifier to newer interface
  • documentation & testing fixes
September 26, 2010, 23:46:27
  • no scipy.weave dependency
  • flatter namespace
  • faster kmeans
  • affinity propagation (borrowed from scikits-learn & slightly improved to take less memory and time)
  • pdist()
  • more documentation
September 24, 2010, 01:24:30

Cleaned up and tested code. Removed some dependencies. Better documentation. Changed the classification interface to separate model learning from model usage.

May 21, 2010, 22:05:04

Improved Performance. Removed files from the distribution that were mistakenly included.

December 17, 2009, 18:44:18

Initial announcement.

November 24, 2009, 00:16:42

