Project details for sofia ml

sofia ml 0.1

by dsculley - December 29, 2009, 23:30:58 CET


The suite of fast incremental algorithms for machine learning (sofia-ml) can be used for training models for classification or ranking, using several different techniques. This release is intended to aid researchers and practitioners who require fast methods for classification and ranking on large, sparse data sets.

Supported learners include:

* Pegasos SVM
* Stochastic Gradient Descent (SGD) SVM
* Passive-Aggressive Perceptron
* Perceptron with Margins
* ROMMA
* Logistic Regression (with Pegasos Projection)
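
To give a feel for the first learner in the list, here is a minimal sketch of a Pegasos-style SVM training loop in Python. It follows the textbook description of Pegasos (step size 1/(lambda * t), hinge-loss update, projection onto a ball of radius 1/sqrt(lambda)) and makes no claim to match sofia-ml's internal implementation. The function name `pegasos_train`, its parameters, and the dict-based sparse vectors are assumptions chosen for illustration only.

```python
import math
import random


def pegasos_train(examples, lam=0.1, iterations=1000, seed=0):
    """Train a linear SVM with Pegasos-style stochastic updates.

    examples: list of (label, features) pairs, where label is -1 or +1
    and features is a sparse dict {feature_index: value}.
    Returns the weight vector as a sparse dict.
    (Hypothetical helper for illustration; not part of sofia-ml.)
    """
    rng = random.Random(seed)
    w = {}
    radius = 1.0 / math.sqrt(lam)
    for t in range(1, iterations + 1):
        y, x = rng.choice(examples)       # sample one example uniformly
        eta = 1.0 / (lam * t)             # decaying step size
        # Margin is computed with the weights from before this step.
        margin = y * sum(w.get(i, 0.0) * v for i, v in x.items())
        # Regularization step: shrink every weight.
        scale = 1.0 - eta * lam
        for i in w:
            w[i] *= scale
        # Hinge-loss step: only examples inside the margin move the weights.
        if margin < 1.0:
            for i, v in x.items():
                w[i] = w.get(i, 0.0) + eta * y * v
        # Projection step: keep ||w|| <= 1 / sqrt(lam).
        norm = math.sqrt(sum(v * v for v in w.values()))
        if norm > radius:
            factor = radius / norm
            for i in w:
                w[i] *= factor
    return w
```

Each step touches one example, which is why the cost of a fixed number of iterations does not grow with the size of the data set. This sketch rescales the whole weight vector on every step for clarity; a production implementation would avoid that cost.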

These learners can be configured for binary classification, ranking, and optimizing ROC area, through the use of several available sampling methods for stochastic gradient descent.
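
One common way to optimize ROC area with a stochastic learner is to sample a positive and a negative example at each step and train on their difference vector as if it were a single positive example. The sketch below illustrates that sampling idea only; the names `sparse_diff` and `sample_roc_pair` are hypothetical, and whether sofia-ml samples in exactly this way is an assumption, not something stated on this page.

```python
import random


def sparse_diff(a, b):
    """Return a - b for sparse vectors stored as {index: value} dicts."""
    diff = dict(a)
    for i, v in b.items():
        diff[i] = diff.get(i, 0.0) - v
    # Drop explicit zeros so the result stays sparse.
    return {i: v for i, v in diff.items() if v != 0.0}


def sample_roc_pair(positives, negatives, rng):
    """Sample one positive and one negative example uniformly at random.

    Returns (label, features) with label +1 and features x_pos - x_neg,
    so a binary learner that scores this vector above zero is ranking
    the positive example above the negative one.
    (Hypothetical helper for illustration; not part of sofia-ml.)
    """
    x_pos = rng.choice(positives)
    x_neg = rng.choice(negatives)
    return 1, sparse_diff(x_pos, x_neg)


if __name__ == "__main__":
    rng = random.Random(0)
    pos = [{0: 1.0, 1: 2.0}]
    neg = [{1: 2.0, 2: 3.0}]
    print(sample_roc_pair(pos, neg, rng))
```

The resulting pair can be fed to any of the binary update rules unchanged, which is what lets one learner be reused for classification, ranking, and ROC optimization by swapping only the sampling step.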

This implementation is very fast. For example, 100,000 Pegasos SVM training iterations can be performed on data from the CCAT task from the RCV1 benchmark data set (with roughly 780,000 examples) in 0.1 CPU seconds on an ordinary 2.4GHz laptop, with no loss in classification performance compared with other SVM methods. On LETOR learning-to-rank benchmark tasks, training with 100,000 Pegasos SVM rank steps completes in 0.2 CPU seconds on an ordinary laptop.

The primary computational bottleneck is reading the data from disk; sofia-ml reads and parses data from disk substantially faster than other SVM packages we tested.
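
For readers unfamiliar with the svmlight data format listed for this project, the sketch below parses a single line of it in Python. It assumes the conventional layout `<label> <index>:<value> ... # comment`, with an optional `qid:<id>` field as used for ranking data. The function name `parse_svmlight_line` is hypothetical, and this is not sofia-ml's own (C++) parser.

```python
def parse_svmlight_line(line):
    """Parse one svmlight-format line into (label, qid, features).

    features is a sparse dict {feature_index: value}; qid is the raw
    query-id string, or None when the line has no qid field.
    Returns None for blank or comment-only lines.
    (Hypothetical helper for illustration; not part of sofia-ml.)
    """
    # Everything after '#' is a trailing comment.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None
    tokens = body.split()
    label = float(tokens[0])
    qid = None
    features = {}
    for token in tokens[1:]:
        key, value = token.split(":", 1)
        if key == "qid":
            qid = value
        else:
            features[int(key)] = float(value)
    return label, qid, features


if __name__ == "__main__":
    print(parse_svmlight_line("1 3:0.5 7:1.25 # a positive example"))
```

Only nonzero features appear on each line, so the cost of parsing an example is proportional to its number of nonzero entries rather than to the full dimensionality.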

This package provides a command-line utility for training models and using them to predict on new data, and also exposes an API for model training and prediction that can be used in new applications. The underlying libraries for data sets, weight vectors, and example vectors are also provided for researchers wishing to use these classes to implement other algorithms.

Changes to previous version:

Initial Announcement on

Supported Operating Systems: Platform Independent
Data Formats: Svmlight
Tags: Svm, Classification, Online Learning, Stochastic Gradient Descent, Ranking, Logistic Regression, Passive Aggressive Perceptron, Pegasos, Perceptron, Romma