Project details for MLPY Machine Learning Py


by albanese - January 14, 2009, 10:39:26 CET


Overall: 3.5 out of 5 stars
Features: 3.5 out of 5 stars
Usability: 3.5 out of 5 stars
Documentation: 3.5 out of 5 stars
(based on 3 votes)
Description:

We introduce mlpy, a high-performance Python package for predictive modeling. It makes extensive use of NumPy to provide fast N-dimensional array manipulation and easy integration of C code. mlpy provides high-level procedures that support, in a few lines of code, the design of rich Data Analysis Protocols (DAPs) for predictive classification and feature selection. Methods are available for feature weighting and ranking, data resampling, error evaluation and experiment landscaping. The package includes tools to measure the stability of sets of ranked feature lists, which is of special interest in bioinformatics for functional genomics, where large-scale experiments with up to 10^6 classifiers have been run on Linux clusters and on the Grid.

The modular structure of mlpy makes it easy to add new algorithms to each of the 7 categories into which the package is organized:

Classification. Each algorithm exposes distinct methods for the training and the testing phases (whenever possible, a real-valued prediction can be obtained). The implemented algorithms belong to the families of Support Vector Machines (SVMs, four kernels available), Discriminant Analysis (DA: Fisher, Penalized, Diagonal Linear and Spectral Regression) and Nearest Neighbours.
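
The separation between a training method and a testing method, with an optional real-valued output, can be illustrated with a minimal nearest-neighbour sketch. The class name `NearestNeighbour` and the method names `learn`, `pred` and `pred_value` are hypothetical choices made for this illustration and are not taken from the mlpy source; labels are assumed to be +1 and -1.

```python
import math


def _euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


class NearestNeighbour:
    """Illustrative 1-NN classifier for labels in {-1, +1} (not mlpy's API)."""

    def learn(self, x, y):
        # Training phase: a nearest-neighbour model only stores the data.
        self._x = [list(row) for row in x]
        self._y = list(y)
        return self

    def _nearest(self, t, label):
        # Distance from t to the closest training sample with the given label.
        return min(_euclidean(t, row)
                   for row, lab in zip(self._x, self._y) if lab == label)

    def pred_value(self, t):
        # Real-valued prediction: positive when t is closer to class +1.
        return self._nearest(t, -1) - self._nearest(t, 1)

    def pred(self, t):
        # Testing phase: the predicted label is the sign of the real value.
        return 1 if self.pred_value(t) >= 0 else -1


model = NearestNeighbour().learn(
    [[0, 0], [0, 1], [5, 5], [6, 5]], [-1, -1, 1, 1])
print(model.pred([5, 6]), model.pred([1, 0]))
```

Keeping the real-valued output separate from the label makes it possible to feed the same model into threshold-based measures such as the area under the ROC curve.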

Feature weighting. A total of nine methods is made available to obtain weights from models such as SVMs or DAs; classifier-independent methods for weighting features are also implemented, including I-RELIEF and Discrete Wavelet Transform.

Feature ranking. Two main schemas are used for selecting and ranking purposes, belonging either to the Recursive Feature Elimination or the Recursive Forward Selection family (for a total of six variants).
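
The recursive elimination idea can be sketched in a few lines: train a linear model, discard the feature with the smallest absolute weight, and repeat on the remaining features. The sketch below uses a plain perceptron as the linear model purely to stay self-contained; this is an assumption of the example, not the classifier mlpy uses, and the function names are hypothetical.

```python
def train_perceptron(x, y, epochs=10):
    """Plain perceptron for labels in {-1, +1}; returns the weight vector."""
    w = [0.0] * len(x[0])
    b = 0.0
    for _ in range(epochs):
        for row, label in zip(x, y):
            score = sum(wj * v for wj, v in zip(w, row)) + b
            if label * score <= 0:  # misclassified: move towards the label
                w = [wj + label * v for wj, v in zip(w, row)]
                b += label
    return w


def rfe_ranking(x, y):
    """Rank feature indices from most to least important by one-step RFE."""
    remaining = list(range(len(x[0])))
    eliminated = []
    while remaining:
        # Retrain using only the features that are still in play.
        sub = [[row[j] for j in remaining] for row in x]
        w = train_perceptron(sub, y)
        # Drop the feature whose weight has the smallest magnitude.
        worst = min(range(len(remaining)), key=lambda k: abs(w[k]))
        eliminated.append(remaining.pop(worst))
    # The last feature eliminated is the most important one.
    return eliminated[::-1]


x = [[2.0, 0.1, 0.0], [3.0, -0.1, 0.0], [-2.0, 0.1, 0.0], [-3.0, -0.1, 0.0]]
y = [1, 1, -1, -1]
print(rfe_ranking(x, y))
```

Variants of the scheme typically differ in how many features are removed per iteration; removing one at a time, as here, is the most expensive and the simplest to reason about.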

Resampling methods. The classification and feature ranking operations can be organized within a sampling procedure such as Textbook/Monte-Carlo cross validation (stratification over labels is available), leave-one-out or user-defined train/test split schema.
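
As an illustration of stratification over labels, the following sketch builds k-fold train/test index splits in which every fold receives roughly the same share of each class. It uses only the Python standard library; the function name `stratified_kfold` and its parameters are assumptions of this example rather than mlpy's interface.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k, seed=0):
    """Return k (train_idx, test_idx) pairs with class proportions preserved."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)

    folds = [[] for _ in range(k)]
    for y in sorted(by_label):
        idx = by_label[y]
        rng.shuffle(idx)
        # Deal the shuffled indices of each class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)

    splits = []
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        splits.append((train, test))
    return splits


labels = [1, 1, 1, 1, -1, -1, -1, -1]
for train, test in stratified_kfold(labels, k=2):
    print("train:", train, "test:", test)
```

A Monte Carlo variant would instead draw a fresh random train/test partition at every repetition, so that test sets may overlap between repetitions.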

Metric functions. Performance can be evaluated by a set of different measures, including Error, Accuracy, Matthews Correlation Coefficient and Area Under the ROC Curve. Variability can be assessed by Standard Deviation or Bootstrap Confidence Intervals.
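
For reference, three of these measures can be computed from the confusion counts of a binary problem as in the sketch below. The function names are chosen for this example, labels are assumed to be +1 and -1, and returning 0 when the MCC denominator vanishes is a common convention adopted here as an assumption.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for a binary classification problem."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def error(y_true, y_pred):
    return 1.0 - accuracy(y_true, y_pred)


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient, in [-1, 1]."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (assumed here): MCC is 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


y_true = [1, 1, 1, -1, -1, -1]
y_pred = [1, 1, -1, -1, -1, 1]
print(accuracy(y_true, y_pred), error(y_true, y_pred), mcc(y_true, y_pred))
```

Unlike accuracy, the MCC stays informative when the two classes are strongly unbalanced, because it uses all four cells of the confusion matrix.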

Feature list analysis. The ordered lists from the feature ranking experiments can be analyzed in terms of stability (Canberra indicator, extraction/position indicator) and an optimal list can be retrieved (Borda count).
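
A simplified version of both ideas is sketched below: a Canberra distance between the rank positions of two lists, averaged over all pairs of lists as a rough stability score, and a Borda count that merges the lists into one. The sketch assumes complete lists over the same features and omits any normalisation of the distance, so it should not be read as the exact indicator implemented in mlpy; all function names are hypothetical.

```python
def positions(ranked):
    """Map each feature to its 1-based position in a ranked list."""
    return {f: r for r, f in enumerate(ranked, start=1)}


def canberra_distance(list_a, list_b):
    """Canberra distance between the position vectors of two rankings."""
    pa, pb = positions(list_a), positions(list_b)
    return sum(abs(pa[f] - pb[f]) / (pa[f] + pb[f]) for f in pa)


def mean_pairwise_canberra(lists):
    """Average Canberra distance over all pairs of lists (0 = identical)."""
    pairs = [(i, j) for i in range(len(lists))
             for j in range(i + 1, len(lists))]
    total = sum(canberra_distance(lists[i], lists[j]) for i, j in pairs)
    return total / len(pairs)


def borda_list(lists):
    """Merge rankings: a feature at 0-based position pos earns n - pos points."""
    n = len(lists[0])
    score = {f: 0 for f in lists[0]}
    for ranked in lists:
        for pos, f in enumerate(ranked):
            score[f] += n - pos
    # Highest total first; ties are broken by feature name.
    return sorted(score, key=lambda f: (-score[f], f))


lists = [
    ["g1", "g2", "g3", "g4"],
    ["g2", "g1", "g3", "g4"],
    ["g1", "g3", "g2", "g4"],
]
print(mean_pairwise_canberra(lists))
print(borda_list(lists))
```

Because each term divides by the sum of the two positions, disagreements near the top of the lists weigh more than disagreements near the bottom, which matches the usual interest in the highest-ranked features.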

Landscaping tools. A system of executable scripts to be used off-the-shelf to tabulate performance (e.g. Error, MCC and stability measures) on a grid of different experimental conditions by a basic DAP implementation (resampling by k-fold or Monte Carlo CV, training, feature ranking, test).

mlpy is a project developed by the MPBA research unit at FBK, the Bruno Kessler Foundation in Trento, Italy (http://mpba.fbk.eu).

Changes to previous version:

Initial Announcement on mloss.org.

Supported Operating Systems: Linux, Mac OS X, Windows, Unix
Data Formats: None
Tags: Svm, Classification, Fda, Feature Weighting, Irelief, Rfe, Feature Ranking, Resampling, Srda, Nn, Dwt, Pda, Nips2008, Dlda

Other available revisions

Version Changelog Date
3.5.0

New features:

  • LibSvm(): pred_probability() now returns probability estimates; pred_values() added
  • LibLinear(): pred_values() and pred_probability() added
  • dtw_std: squared Euclidean option added
  • LCS for series composed by real values (lcs_real()) added
  • Documentation

Fix:

  • wavelet submodule: cwt() returned only real values for the morlet and paul wavelets
  • IRelief(): remove np. in learn()
  • fix rfe_kfda and rfe_w2 when p=1
March 15, 2012, 09:52:41
3.4.0

New features:

  • Standard DTW added
  • Subsequence DTW added
  • Standard LCS added

Fix:

  • LibSvm: fix error when x is a list in learn() method
  • fix code for vc++
  • fix setup.py (cblas)
January 9, 2012, 12:10:16
3.3.0

New features:

  • Maximum Likelihood Classifier added
  • Classification Tree added
  • KNN: remove labels restrictions

Fix:

  • fix elasticnet classifier doc
  • fix PCA (method parameter): PCA method was always svd
  • setup.py: fix classifiers
  • from this version, mlpy for Windows is compiled with Visual Studio Express 2008 in order to avoid runtime errors
December 19, 2011, 11:35:05
3.2.1

Fix:

  • fix stats import in init
  • PLS: speed improved
  • remove "function declaration isn't a prototype" warnings from libml
  • clean findpeaks
  • mlpy works with python 3.X
  • add KNN to all
December 9, 2011, 16:12:50
3.2

Version 3.2

New features:

  • PLS added

Fix:

  • fix docs in LibSVM and KernelAdatron
  • fix svg logo
  • minor fix in LibSVM and KernelAdatron
  • include stddef.h in fastcluster
December 5, 2011, 16:20:01
3.1

Version 3.1

November 30, 2011, 16:00:02
2.2.1

New features:

  • Elastic Net
  • FSSun sped up
  • doctests added (mlpy-tests)
  • Documentation improved

Several bugs fixed

August 17, 2010, 14:45:50
2.2.0

New features:

  • OLS
  • Ridge Regression
  • Kernel Ridge Regression
  • LASSO
  • LARS
  • Gradient Descent for Regression
  • K-Means
  • Documentation improved

Bug fixes:

  • FSSun() SigmaErrorFS fixed
July 13, 2010, 18:25:57
2.1.0

New features:

  • Svm optimal offset option added
  • FSSun for feature weighting/selection added
  • Dlda: adaptive offset for classification implemented
  • Srda: memory usage optimized, sped up
  • added Tversky kernel for SVM

Bug fixes:

  • fixed gaussian weights for SVM
November 24, 2009, 10:27:46
2.0.8

New features:

  • HCluster: sample <-> feature in input data x. Groups are now in 0, ..., N-1
  • k-medoids added
  • minkowski distance added
  • Documentation improved

Bug fixes:

  • canberraq tool fixed
  • Svm(): MatrixKernelGaussian() for Svm.weights() sped up
September 9, 2009, 15:22:55
2.0.7

New features:

  • New function span_pd(). three_points_pd() deprecated.
  • New Dtw class (dtw() has been removed):
    • Naive and Derivative DTW
    • Symmetric, Asymmetric, Quasi-Symmetric implementation with Slope Constraint Condition P=0
    • Sakoe-Chiba window condition option
    • Linear space-complexity implementation option
    • (0, 0) boundary condition option
  • canberra() - canberraq(): new option 'dist' returns partial distances
  • canberra - canberraq: partial distances to file(s) added
  • Documentation improved

Bug fixes:

  • Derivative DTW algorithm fixed
  • knn_imputing() inf2 bug fixed
August 28, 2009, 15:42:38
2.0.6

New features:

  • DTW and DDTW (Naive Dynamic Time Warping and Derivative Dynamic Time Warping) added
  • documentation improved
  • cwt(): option pad removed, use extmethod and extlen instead (see extend())
  • extend() function added
  • is_power(n, b) and next_power(n, b) added
July 20, 2009, 17:07:19
2.0.5

Bug fixes:

  • purify() fixed

New features:

  • knn_imputing() euclidean squared distance and median method added
June 18, 2009, 14:10:19
2.0.4
  • _imputing.py: purify() function added
  • imputing.py added; knnimputing() added
  • data_fromfile(): ytype parameter for label type added
  • knn.predict() fixed
June 16, 2009, 13:56:57
2.0.3
  • canberracore, nncore, svmcore improved
  • misc.c added (away())
  • Ranking(): onestep fixed
  • new mlpy logo
  • lmatrix_from_numpy() added; canberra*() now work with int64
  • Svm(): Problem int64 with numpy array fixed
June 3, 2009, 11:15:19
2.0.2
  • Undecimated Wavelet Transform (uwt() and iuwt()) added
  • Documentation improved
  • cdf_gaussian_P() added
May 18, 2009, 12:08:40
2.0.1
  • Three points peaks detection added
  • Miscellaneous documentation improved
  • _wavelet.py removed
  • icwt() sped up
April 27, 2009, 13:30:37
2.0.0

Initial Announcement on mloss.org.

April 17, 2009, 20:36:45
1.2.8

Initial Announcement on mloss.org.

February 15, 2008, 09:32:35

Comments

jacob Yang (on April 30, 2010, 14:24:11)

When the program is running, there is no output. I don't know when it will finish.

Michele Filosi (on December 13, 2011, 10:04:04)

Very useful and well implemented!
