-
- Description:
Many applications in information and computer technology domains deal with structured data. For example, in Natural Language Processing (NLP), sentences are typically represented as syntactic parse trees or in Biology, chemical compounds can be represented as undirected graphs. In contrast, most Machine Learning (ML) methods and toolkits represent data as feature vectors, whose definition and computation is typically costly, especially in case of structured data. For example, the number of times a substructure appears in a structure can be an important feature. However, the number of substructures in a tree grows exponentially with the size of its nodes leading to an exponential number of structural features, which cannot thus be fully exploited in practice. A solution to the above-mentioned problem is given by Kernel Methods applied with kernel machines, e.g., SVMs or online learning models. The Kernel-based Learning Platform is a Java framework that aims to facilitate kernel-based learning, in particular on structural data. It contains the implementation of several kernel machines as well as kernel functions, enabling an easy and agile definition of new methods over generic data representations, e.g., vectorial data or discrete structures, such as trees and strings. The framework has been designed to decouple kernel functions and learning algorithms thanks to the definition of specific interfaces. Once a new kernel function is implemented, it can be immediately used in all available kernel-machines, which include different online and batch algorithms for Classification, Regression and Clustering. The library is highly interoperable: data objects, kernel functions and algorithms are serializable in XML and JSON, enabling the agile definition of kernel-based learning systems. Additionally, such engineering choice allows for defining kernel and algorithm combinations by simply changing parameters in the XML and JSON files (without the need of writing new code).
Some available kernels:
Tree Kernels: SubTreeKernel, SubSetTreeKernel, PartialTreeKernel, SmoothedPartialTreeKernel, CompositionallySmoothedPartialTreeKernel
Graph Kernels: ShortestPathKernel. Weisfeiler-Lehman Subtree Kernel for Graphs
SequenceKernel
PreferenceKernel and other kernels defined over pairs
Standard Kernels: LinearKernel, PolynomialKernel, RBFKernel, NormalizationKernel, LinearKernelCombination, KernelMultiplication
Some available algorithms:
Batch Learning: OneClassSVM, C-SVM, nu-SVM, LinearSVM, LinearSVMRegression, epsilon-regression, Dual Coordinate Descent
Online Learning: Perceptron, PassiveAggressive, BudgetedPassiveAggressive, Stoptron, RandomizedPerceptronOnBudget
Clustering: KernelizedKMean
- Changes to previous version:
This is a major release that includes brand new features as well as a renewed architecture of the entire project.
Now KeLP is organized in four maven projects:
kelp-core: it contains the infrastructure of abstract classes and interfaces to work with KeLP. Furthermore, some implementations of algorithms, kernels and representations are included, to provide a base operative environment.
kelp-additional-kernels: it contains several kernel functions that extend the set of kernels made available in the kelp-core project. Moreover, this project implements the specific representations required to enable the application of such kernels. In this project the following kernel functions are considered: Sequence kernels, Tree kernels and Graphs kernels.
kelp-additional-algorithms: it contains several learning algorithms extending the set of algorithms provided in the kelp-core project, e.g. the C-Support Vector Machine or ν-Support Vector Machine learning algorithms. In particular, advanced learning algorithms for classification and regression can be found in this package. The algorithms are grouped in: 1) Batch Learning, where the complete training dataset is supposed to be entirely available during the learning phase; 2) Online Learning, where individual examples are exploited one at a time to incrementally acquire the model.
kelp-full: this is the complete package of KeLP. It aggregates the previous modules in one jar. It contains also a set of fully functioning examples showing how to implement a learning system with KeLP. Batch learning algorithm as well as Online Learning algorithms usage is shown here. Different examples cover the usage of standard kernels, Tree Kernels and Graph Kernels, with caching mechanisms.
Furthermore this new release includes:
CsvDatasetReader: it allows to read files in CSV format
DCDLearningAlgorithm: it is the implementation of the Dual Coordinate Descent learning algorithm
methods for checking the consistency of a dataset.
Check out this new version from our repositories. API Javadoc is already available. Your suggestions will be very precious for us, so download and try KeLP 2.0.0!
- BibTeX Entry: Download
- Corresponding Paper BibTeX Entry: Download
- Supported Operating Systems: Platform Independent
- Data Formats: Csv, Libsvm, Json, Multiple Representations Format
- Tags: Svm, Classification, Clustering, Regression, Kernels, Online Learning, Kernel Methods, Graph Kernels, Structured Data, Linear Models, Tree Kernels
- Archive: download here
Comments
No one has posted any comments yet. Perhaps you'd like to be the first?
Leave a comment
You must be logged in to post comments.