- Description:
GPDT is C++ software designed to train large-scale SVMs for binary classification. The algorithm is also implemented in parallel (PGPDT) for distributed-memory, strictly coupled multiprocessor systems.
GPDT uses Joachims' problem decomposition technique to split the whole quadratic programming (QP) problem into a sequence of smaller QP subproblems. The distinguishing feature of this software is that it is explicitly designed to use subproblems much larger than two variables (up to hundreds or thousands of variables), each one solved by a suitable gradient projection method (GPM). The currently implemented GPMs are the Generalized Variable Projection Method (GVPM) [1] and the Dai-Fletcher Method (DFGPM) [2].
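To make the idea of a gradient projection step concrete, here is a minimal sketch, not GPDT's own code. It assumes each subproblem has the usual SVM dual form: minimize 0.5 x^T A x + q^T x subject to y^T x = b and 0 <= x_i <= C, with y_i in {+1, -1}. It uses a plain fixed-step projected gradient iteration rather than GVPM or DFGPM, whose steplength rules and line searches are described in [1] and [2]. All names (`project`, `gradient_projection`, `shifted_clip`) and the dense matrix representation are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Clamp each z_i + lambda * y_i into the box [0, C].
static Vec shifted_clip(const Vec& z, const Vec& y, double lambda, double C) {
    Vec x(z.size());
    for (std::size_t i = 0; i < z.size(); ++i)
        x[i] = std::min(C, std::max(0.0, z[i] + lambda * y[i]));
    return x;
}

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Euclidean projection of z onto { x : y^T x = b, 0 <= x_i <= C }.
// The projection has the form clip(z + lambda * y) for a scalar multiplier
// lambda, and y^T clip(z + lambda * y) is nondecreasing in lambda, so lambda
// can be located by bisection. Assumes the set is non-empty.
Vec project(const Vec& z, const Vec& y, double b, double C) {
    double bound = C;
    for (double v : z) bound = std::max(bound, std::fabs(v) + C);
    double lo = -bound, hi = bound;  // bracket that saturates every component
    for (int it = 0; it < 100; ++it) {
        double mid = 0.5 * (lo + hi);
        if (dot(y, shifted_clip(z, y, mid, C)) < b) lo = mid; else hi = mid;
    }
    return shifted_clip(z, y, 0.5 * (lo + hi), C);
}

// Fixed-step projected gradient (a simplification, not GVPM or DFGPM):
// x <- P(x - step * (A x + q)), starting from the projection of the origin.
Vec gradient_projection(const std::vector<Vec>& A, const Vec& q, const Vec& y,
                        double b, double C, double step, int iters) {
    Vec x = project(Vec(q.size(), 0.0), y, b, C);
    for (int k = 0; k < iters; ++k) {
        Vec z(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            double g = dot(A[i], x) + q[i];  // i-th component of the gradient
            z[i] = x[i] - step * g;
        }
        x = project(z, y, b, C);
    }
    return x;
}
```

For example, with A the 2x2 identity, q = (-1, -1), y = (1, -1), b = 0 and C = 10, calling `gradient_projection(A, q, y, 0.0, 10.0, 0.5, 100)` should approach the minimizer (1, 1). The fixed step must stay below 2 divided by the largest eigenvalue of A for this simple variant to converge.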
GPDT can read and write data in the SVMlight format (T. Joachims, 1998) and implements very effective caching and working set selection strategies. It is mainly targeted at heavy binary classification tasks, with large amounts of data and costly nonlinear kernels, but it can also be used effectively with small data sets and linear kernels.
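For readers unfamiliar with the SVMlight format, each line holds one example as `<label> <index>:<value> ... # optional comment`, with feature indices starting at 1 and only nonzero features listed. The following is a minimal sketch of a reader for one such line. It is not GPDT's own I/O code, and the type `SparseExample` and the function `parse_svmlight_line` are hypothetical names chosen for illustration. Tokens that are not numeric `index:value` pairs (such as the `qid:` field used in ranking mode) are not handled.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// One example in sparse form (hypothetical type for illustration).
struct SparseExample {
    double label = 0.0;                            // +1 or -1 for binary classification
    std::vector<std::pair<int, double>> features;  // (index, value), indices start at 1
};

// Parse a single line of the form "<label> <index>:<value> ... # comment".
SparseExample parse_svmlight_line(const std::string& line) {
    // Drop the optional trailing comment (everything from '#' onward).
    std::string body = line.substr(0, line.find('#'));
    std::istringstream in(body);

    SparseExample ex;
    if (!(in >> ex.label)) throw std::runtime_error("missing label");

    std::string token;
    while (in >> token) {
        std::size_t colon = token.find(':');
        if (colon == std::string::npos)
            throw std::runtime_error("expected index:value, got " + token);
        int index = std::stoi(token.substr(0, colon));
        double value = std::stod(token.substr(colon + 1));
        ex.features.emplace_back(index, value);
    }
    return ex;
}
```

Calling `parse_svmlight_line("+1 1:0.5 3:-2 # note")` yields label 1 and the two features (1, 0.5) and (3, -2). Storing only the nonzero entries keeps memory use proportional to the number of nonzeros, which matters for the large, sparse data sets this kind of solver targets.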
GPDT is designed as a tool, not as a complete machine learning (ML) system. For this reason it is command-line oriented and has no native graphical user interface (GUI). However, it is embedded in other GUI-oriented ML systems such as SHOGUN (by G. Raetsch, S. Sonnenburg). The command-line syntax is very similar to that of SVMlight.
Currently, GPDT solves binary classification problems with standard SVMs. The upcoming version will add further features, such as regression and possibly multiclass classification.
The parallel version PGPDT implements an MPI-based approach and is described in [3].
This work was supported by the Italian FIRB Projects "Statistical Learning: Theory, Algorithms and Applications" (grant RBAU01877P, http://slipguru.disi.unige.it/ASTA) and "Parallel Algorithms and Numerical Nonlinear Optimization" (grant RBAU01JYPN, http://dm.unife.it/pn2o).
Authors
- Thomas Serafini, Luca Zanni: Department of Mathematics, University of Modena and Reggio Emilia, Italy (serafini.thomas@unimo.it, zanni.luca@unimo.it)
- Gaetano Zanghirati: Department of Mathematics, University of Ferrara, Italy (g.zanghirati@unife.it)
Copyright (C) 2004 by T. Serafini, G. Zanghirati, L. Zanni.
References
[1] T. Serafini, G. Zanghirati, L. Zanni, "Gradient Projection Methods for Quadratic Programs and Applications in Training Support Vector Machines", Optim. Meth. Soft. 20 (2005), 353-378.
[2] Y. Dai, R. Fletcher, "New Algorithms for Singly Linear Constrained Quadratic Programs Subject to Lower and Upper Bounds", Math. Prog. 106 (2006), 403-421.
[3] L. Zanni, T. Serafini, G. Zanghirati, "Parallel Software for Training Large-Scale Support Vector Machines on Multiprocessor Systems", JMLR 7 (2006), 1467-1492.
- Changes to previous version:
Initial Announcement on mloss.org.
- Supported Operating Systems: Cygwin, Linux, Windows, Unix
- Data Formats: None
- Tags: Large Scale, Classification, Support Vector Machines, Kernel Methods, Convex Optimization, Gradient Based Learning, Distributed, Parallel