Project details for MLPY Machine Learning Py

Screenshot MLPY Machine Learning Py 1.2.7

by albanese - November 11, 2008, 12:13:33 CET


Description:

We introduce mlpy, a high-performance Python package for predictive modeling. It makes extensive use of NumPy to provide fast N-dimensional array manipulation and easy integration of C code. mlpy provides high-level procedures that support, in a few lines of code, the design of rich Data Analysis Protocols (DAPs) for predictive classification and feature selection. Methods are available for feature weighting and ranking, data resampling, error evaluation and experiment landscaping. The package includes tools to measure stability in sets of ranked feature lists, which is of special interest in bioinformatics for functional genomics, where large-scale experiments with up to 10^6 classifiers have been run on Linux clusters and on the Grid.

The modular structure of mlpy makes it easy to add new algorithms to each of the seven categories in which the package is organized:

Classification. For each algorithm, distinct methods are provided for the training and the testing phases (whenever possible, a real-valued prediction can be obtained). The implemented algorithms belong to the families of Support Vector Machines (SVMs, four kernels available), Discriminant Analysis (DA: Fisher, Penalized and Spectral Regression) and Nearest Neighbours.
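The separation between a training method, a real-valued output and a label prediction can be illustrated with a minimal nearest-neighbour sketch in plain NumPy. The class name and the method names `train`, `realvalue` and `predict` are hypothetical choices for this example and are not taken from the mlpy API; labels are assumed to be coded as +1 / -1.

```python
import numpy as np


class NearestNeighbours:
    """Illustrative k-NN classifier (hypothetical API, not mlpy's)."""

    def __init__(self, k=3):
        self.k = k  # use an odd k to avoid tied votes with +1 / -1 labels

    def train(self, X, y):
        # Training only stores the data; labels are assumed to be +1 / -1.
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y, dtype=float)
        return self

    def realvalue(self, X):
        # Real-valued output: mean label of the k nearest training points,
        # a number in [-1, 1] whose sign gives the predicted class.
        X = np.atleast_2d(np.asarray(X, dtype=float))
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        nearest = np.argsort(d2, axis=1)[:, :self.k]
        return self.y[nearest].mean(axis=1)

    def predict(self, X):
        # Label prediction: threshold the real-valued output at zero.
        return np.where(self.realvalue(X) >= 0, 1, -1)
```

Keeping the real-valued output separate from the thresholded label is what allows threshold-free measures such as the area under the ROC curve to be computed from the same model.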

Feature weighting. Nine methods are available to obtain weights from models such as SVMs or DAs; classifier-independent methods for weighting features are also implemented, including I-RELIEF and the Discrete Wavelet Transform.

Feature ranking. Two main schemes are used for selection and ranking, belonging either to the Recursive Feature Elimination or to the Recursive Forward Selection family (six variants in total).
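As a rough illustration of the Recursive Feature Elimination idea, the sketch below repeatedly fits a linear model, drops the feature with the smallest squared weight and refits on the remaining features. It uses ridge regression weights as a stand-in for the SVM or DA weights mentioned above; the function names and the `lam` parameter are assumptions made for this example and do not reflect mlpy's implementation or any of its six variants.

```python
import numpy as np


def ridge_weights(X, y, lam=1.0):
    """Weights of a ridge-regularized linear fit (stand-in for SVM/DA weights)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def rfe_ranking(X, y, lam=1.0):
    """Return feature indices ordered from most to least important.

    One feature is eliminated per step: the one with the smallest
    squared weight in the model fitted on the features still in play.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    remaining = list(range(X.shape[1]))
    eliminated = []
    while remaining:
        w = ridge_weights(X[:, remaining], y, lam)
        worst = int(np.argmin(w ** 2))
        eliminated.append(remaining.pop(worst))
    # The last feature eliminated is the most important one.
    return eliminated[::-1]
```

Eliminating one feature per step is the simplest variant; removing a fraction of the remaining features at each step reduces the number of model fits when the feature count is large.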

Resampling methods. The classification and feature ranking operations can be organized within a resampling procedure such as textbook (k-fold) or Monte Carlo cross-validation (stratification over labels is available), leave-one-out, or a user-defined train/test split scheme.
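The two cross-validation schemes can be sketched with NumPy index arrays as follows. This is a minimal illustration under the assumption that a split is represented as a pair of train and test index arrays; the function names and parameters are hypothetical and are not mlpy's resampling API.

```python
import numpy as np


def stratified_kfold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each test fold keeps roughly
    the class proportions of the full label vector."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        # Split the shuffled indices of this class into k nearly equal chunks.
        for i, chunk in enumerate(np.array_split(idx, k)):
            folds[i].extend(chunk.tolist())
    all_idx = np.arange(labels.size)
    for fold in folds:
        test = np.sort(np.array(fold, dtype=int))
        yield np.setdiff1d(all_idx, test), test


def monte_carlo_splits(n_samples, n_splits, test_fraction=0.25, seed=0):
    """Yield n_splits independent random (train_idx, test_idx) pairs."""
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_fraction * n_samples)))
    for _ in range(n_splits):
        perm = rng.permutation(n_samples)
        yield np.sort(perm[n_test:]), np.sort(perm[:n_test])
```

In k-fold cross-validation every sample is tested exactly once, whereas Monte Carlo splits are drawn independently, so a sample may appear in several test sets or in none.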

Metric functions. Performance can be evaluated by a set of different measures, including Error, Accuracy, Matthews Correlation Coefficient and Area Under the ROC Curve. Variability can be assessed by Standard Deviation or Bootstrap Confidence Intervals.
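For reference, two of these measures can be written in a few lines of NumPy. The sketch below computes the Matthews Correlation Coefficient for binary labels, assumed here to be coded as +1 / -1, and a percentile bootstrap confidence interval for the mean of a set of per-split scores. The function names, the label coding and the choice of the percentile method are assumptions of this example, not a description of mlpy's metric functions.

```python
import numpy as np


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels coded as +1 / -1; returns 0.0 when undefined."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == -1) & (y_pred == -1)))
    fp = int(np.sum((y_true == -1) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == -1)))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample with replacement n_boot times and take the mean of each resample.
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    means = values[idx].mean(axis=1)
    lo, hi = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)
```

A typical use is to collect one error or MCC value per resampling split and pass the resulting list to `bootstrap_ci` to obtain an interval around the average performance.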

Feature list analysis. The ordered lists from the feature ranking experiments can be analyzed in terms of stability (Canberra indicator, extraction/position indicator) and an optimal list can be retrieved (Borda count).
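The following sketch shows one simple way to compare ranked lists and to aggregate them. It assumes every list is a complete ordering of the same feature indices, best feature first. Stability is measured here as the plain mean pairwise Canberra distance between position vectors, and the aggregate list is obtained by a Borda count based on mean position. The normalized Canberra indicator and the extraction/position indicator mentioned above are not reproduced, and all function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def positions(ranked_list):
    """Map an ordered list of feature indices to 1-based positions per feature."""
    ranked_list = np.asarray(ranked_list)
    pos = np.empty(ranked_list.size, dtype=float)
    pos[ranked_list] = np.arange(1, ranked_list.size + 1)
    return pos


def canberra(list_a, list_b):
    """Canberra distance between the position vectors of two ranked lists."""
    a, b = positions(list_a), positions(list_b)
    return float(np.sum(np.abs(a - b) / (a + b)))


def mean_pairwise_canberra(lists):
    """Average Canberra distance over all pairs; lower means more stable."""
    return float(np.mean([canberra(a, b) for a, b in combinations(lists, 2)]))


def borda(lists):
    """Aggregate list: features sorted by their mean position across lists."""
    mean_pos = np.mean([positions(l) for l in lists], axis=0)
    return np.argsort(mean_pos, kind="stable")
```

Because the Canberra distance divides each position difference by the sum of the positions, disagreements near the top of the lists weigh more than disagreements near the bottom.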

Landscaping tools. A system of executable scripts that can be used off-the-shelf to tabulate performance (e.g. Error, MCC and stability measures) on a grid of different experimental conditions through a basic DAP implementation (resampling by k-fold or Monte Carlo CV, training, feature ranking, testing).

mlpy is a project developed by the MPBA research unit at FBK, the Bruno Kessler Foundation in Trento, Italy (http://mpba.fbk.eu).

Supported Operating Systems:
Linux, Windows, Unix
Tags:
Svm, Classification, Fda, Feature Weighting, Irelief, Rfe, Feature Ranking, Resampling, Srda, Nn, Dwt, Pda, Nips2008
