Projects authored by Soeren Sonnenburg.


JMLR SHOGUN 4.0.0

by sonne - February 5, 2015, 09:09:37 CET - 219069 views, 35567 downloads, 0 subscriptions

Rating: 3 of 5 stars (based on 6 votes)

About: The SHOGUN machine learning toolbox focuses on large-scale learning methods, in particular Support Vector Machines (SVMs), and provides interfaces to Python, Octave, MATLAB, R and the command line.

Changes:

This release features the work of our 8 GSoC 2014 students [student; mentors]:

  • OpenCV Integration and Computer Vision Applications [Abhijeet Kislay; Kevin Hughes]
  • Large-Scale Multi-Label Classification [Abinash Panda; Thoralf Klein]
  • Large-scale structured prediction with approximate inference [Jiaolong Xu; Shell Hu]
  • Essential Deep Learning Modules [Khaled Nasr; Sergey Lisitsyn, Theofanis Karaletsos]
  • Fundamental Machine Learning: decision trees, kernel density estimation [Parijat Mazumdar; Fernando Iglesias]
  • Shogun Missionary & Shogun in Education [Saurabh Mahindre; Heiko Strathmann]
  • Testing and Measuring Variable Interactions With Kernels [Soumyajit De; Dino Sejdinovic, Heiko Strathmann]
  • Variational Learning for Gaussian Processes [Wu Lin; Heiko Strathmann, Emtiyaz Khan]

It also contains new features, several cleanups and bugfixes:

Features

  • New Shogun project description [Heiko Strathmann]
  • ID3 algorithm for decision tree learning [Parijat Mazumdar]
  • New modes for PCA matrix factorizations: SVD & EVD, in-place or reallocating [Parijat Mazumdar]
  • Add Neural Networks with linear, logistic and softmax neurons [Khaled Nasr]
  • Add kernel multiclass strategy examples in multiclass notebook [Saurabh Mahindre]
  • Add decision trees notebook containing examples for ID3 algorithm [Parijat Mazumdar]
  • Add Sudoku recognizer IPython notebook [Alejandro Hernandez]
  • Add in-place subsets on features, labels, and custom kernels [Heiko Strathmann]
  • Add Principal Component Analysis notebook [Abhijeet Kislay]
  • Add Multiple Kernel Learning notebook [Saurabh Mahindre]
  • Add Multi-Label classes to enable Multi-Label classification [Thoralf Klein]
  • Add rectified linear neurons, dropout and max-norm regularization to neural networks [Khaled Nasr]
  • Add C4.5 algorithm for multiclass classification using decision trees [Parijat Mazumdar]
  • Add support for arbitrary acyclic graph-structured neural networks [Khaled Nasr]
  • Add CART algorithm for classification and regression using decision trees [Parijat Mazumdar]
  • Add CHAID algorithm for multiclass classification and regression using decision trees [Parijat Mazumdar]
  • Add Convolutional Neural Networks [Khaled Nasr]
  • Add Random Forests algorithm for ensemble learning using CART [Parijat Mazumdar]
  • Add Restricted Boltzmann Machines [Khaled Nasr]
  • Add Stochastic Gradient Boosting algorithm for ensemble learning [Parijat Mazumdar]
  • Add Deep contractive and denoising autoencoders [Khaled Nasr]
  • Add Deep belief networks [Khaled Nasr]

Bugfixes

  • Fix reference counting bugs in CList when reference counting is on [Heiko Strathmann, Thoralf Klein, lambday]
  • Fix memory problem in PCA::apply_to_feature_matrix [Parijat Mazumdar]
  • Fix crash in LeastAngleRegression for the case D greater than N [Parijat Mazumdar]
  • Fix memory violations in bundle method solvers [Thoralf Klein]
  • Fix failure in the library_mldatahdf5.cpp example when http://mldata.org is not working properly [Parijat Mazumdar]
  • Fix memory leaks in Vowpal Wabbit, LibSVMFile and KernelPCA [Thoralf Klein]
  • Fix memory and control flow issues discovered by Coverity [Thoralf Klein]
  • Fix R modular interface SWIG typemap (Requires SWIG >= 2.0.5) [Matt Huska]

Cleanup and API Changes

  • PCA now depends on Eigen3 instead of LAPACK [Parijat Mazumdar]
  • Remove redundant imports and fix implicit imports [Thoralf Klein]
  • Hide many methods from SWIG, reducing compile memory by 500MiB [Heiko Strathmann, Fernando Iglesias, Thoralf Klein]

mldata.org svn-r1070-Apr-2011

by sonne - April 8, 2011, 10:15:49 CET - 10741 views, 2681 downloads, 0 subscriptions

About: The source code of the mldata.org site - a community portal for machine learning data sets.

Changes:

Initial Announcement on mloss.org.


mldata-utils 0.5.0

by sonne - April 8, 2011, 10:02:44 CET - 78783 views, 17159 downloads, 0 subscriptions

About: Tools to convert datasets between various formats, plus performance measures and API functions to communicate with mldata.org.

Changes:
  • Change the task file format so that data splits can have a variable number of items, which can be put into up to 256 categories (training/validation/test/not used/...).
  • Various bugfixes.

mloss.org svn-r645-Mar-2011

by sonne - March 23, 2011, 11:09:18 CET - 34753 views, 5641 downloads, 0 subscriptions

About: This is the source code of the mloss.org website.

Changes:

Now works with newer Django versions and fixes several warnings and minor bugs under the hood. The only user-visible change is probably that the subscription and bookmark buttons work again.


LIBOCAS 0.93

by vf - June 20, 2010, 12:22:05 CET - 24033 views, 4130 downloads, 0 subscriptions

About: The library implements the Optimized Cutting Plane Algorithm (OCAS) for efficient training of linear SVM classifiers on large-scale data.

Changes:

Implemented the COFFIN framework, which allows efficient training of invariant image classifiers via virtual examples.


asp 0.3

by sonne - May 7, 2010, 10:25:39 CET - 21725 views, 4089 downloads, 0 subscriptions

About: Accurate splice site predictor for a variety of genomes.

Changes:

Asp now supports three formats:

  • -g fname for gff format
  • -s fname for spf format
  • -b dir for a binary format compatible with mGene

It also has a new switch:

  • -t switches on a sigmoid-based transformation of the SVM scores to get scores between 0 and 1
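
A sigmoid-based score transformation of the kind the -t switch enables can be sketched as follows. This is a minimal illustration and not asp's actual code: the function name `sigmoid_score` and the `slope` and `offset` parameters are hypothetical, and the release notes do not say which sigmoid form or parameter values asp uses.

```python
import math


def sigmoid_score(svm_score, slope=1.0, offset=0.0):
    """Map a raw SVM score to the range 0..1 with a logistic function.

    slope and offset are hypothetical parameters; a real tool would
    typically fit them on held-out data.
    """
    z = slope * svm_score + offset
    # Branch on the sign of z so that math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    for raw in (-3.0, 0.0, 3.0):
        print(raw, sigmoid_score(raw))
```

A score of 0 (a point on the decision boundary) maps to 0.5, and the ordering of the raw scores is preserved, so ranking-based evaluation is unaffected by the transformation.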


arts 0.2

by sonne - May 25, 2009, 09:56:31 CET - 12290 views, 2653 downloads, 0 subscriptions

About: ARTS is an accurate predictor for Transcription Start Sites (TSS).

Changes:

Initial Announcement on mloss.org.


dataformat 0.1.1

by mikio - March 12, 2009, 16:07:55 CET - 17868 views, 3787 downloads, 0 subscriptions

About: The goal of this project is to provide code for reading and writing machine learning data sets for as many programming languages as possible.

Changes:

Forgot to include the Java sources.


mSplicer 0.3

by sonne - May 18, 2008, 13:07:40 CET - 13457 views, 2801 downloads, 0 subscriptions

Rating: 4 of 5 stars (based on 2 votes)

About: For modern biology, precise genome annotations are of prime importance as they allow the accurate definition of genic regions. We employ state of the art machine learning methods to assay and [...]

Changes:

Initial Announcement on mloss.org.