Project details for KeBABS

KeBABS 1.2.0

by UBod - April 17, 2015, 21:15:37 CET

The package provides functionality for kernel-based analysis of DNA, RNA, and amino acid sequences via SVM-based methods. As core functionality, kebabs implements the following sequence kernels: spectrum kernel, mismatch kernel, gappy pair kernel, and motif kernel. Apart from an efficient implementation of the standard position-independent functionality, the kernels are extended in a novel way to take the position of patterns into account in the similarity measure. Because of the flexibility of the kernel formulation, other kernels such as the weighted degree kernel or the shifted weighted degree kernel with constant weighting of positions are included as special cases. An annotation-specific variant of the kernels uses annotation information placed along the sequence together with the patterns in the sequence.

For all available kernels, the package can generate a kernel matrix or an explicit feature representation in dense or sparse format, either of which can be used with methods implemented in other R packages. With a focus on SVM-based methods, kebabs provides a framework that simplifies the use of the existing SVM implementations in kernlab, e1071, and LiblineaR. Binary and multi-class classification as well as regression tasks can be handled in a unified way, without having to deal with the different functions, parameters, and formats of the selected SVM.

To support the choice of hyperparameters, the package provides cross validation (including grouped cross validation), grid search, and model selection functions. For easier biological interpretation of the results, the package computes feature weights for all SVMs, as well as prediction profiles that show the contribution of individual sequence positions to the prediction result and indicate the relevance of sequence sections for the learning result and the underlying biological functions.
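To make the kernel idea in the description concrete, here is a minimal Python sketch of a spectrum kernel and of a simple position-specific variant in which a k-mer only counts when it occurs at the same position in both sequences. This is an illustration of the general technique, not the kebabs implementation or its R API; the function names (`spectrum_features`, `spectrum_kernel`, `position_specific_kernel`) and the cosine-style normalization are assumptions made for this example.

```python
import math
from collections import Counter


def spectrum_features(seq, k):
    """Explicit feature representation: counts of every length-k substring."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k, normalized=True):
    """Position-independent spectrum kernel: dot product of k-mer counts.

    The normalization (dividing by the feature-vector norms) is an
    assumption of this sketch.
    """
    fx, fy = spectrum_features(x, k), spectrum_features(y, k)
    dot = sum(count * fy[kmer] for kmer, count in fx.items())
    if not normalized:
        return dot
    norm_x = sum(c * c for c in fx.values())
    norm_y = sum(c * c for c in fy.values())
    if norm_x == 0 or norm_y == 0:
        return 0.0  # sequence shorter than k: no k-mers to compare
    return dot / math.sqrt(norm_x * norm_y)


def position_specific_kernel(x, y, k):
    """Count k-mers that match at identical positions in x and y."""
    n = min(len(x), len(y)) - k + 1
    return sum(1 for i in range(n) if x[i:i + k] == y[i:i + k])


def kernel_matrix(seqs, k):
    """Symmetric matrix of normalized spectrum kernel values."""
    return [[spectrum_kernel(a, b, k) for b in seqs] for a in seqs]


if __name__ == "__main__":
    seqs = ["ACGTAC", "ACGG", "TTTT"]
    for row in kernel_matrix(seqs, 2):
        print(["%.3f" % v for v in row])
```

A kernel matrix built this way can be passed to any SVM implementation that accepts precomputed kernels, while `spectrum_features` corresponds to the explicit feature representation route.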

Changes to previous version:
  • inclusion of dense LIBSVM 3.20 for dense kernel matrix support to provide a reliable way for training with kernel matrices
  • new accessors folds and performance for CrossValidationResult
  • removed fold performance from show of CV result
  • adaptations for user-defined sequence kernels with new export isUserDefined; example in inst/examples/UserDefinedKernel
  • correction of errors with position offset for position-specific kernels
  • computation of AUC via trapezoidal rule
  • changes for auto mode in CV, grid search, model selection
  • check for non-negative mixing coefficients in spectrum and gappy pair kernels
  • build warnings on Windows removed
  • added definition of performance parameters for binary and multiclass classification to vignette
  • update of citation file and reference section in help pages
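
The change list above mentions computing the AUC via the trapezoidal rule. As a rough illustration of that technique (not the package's actual code), the Python sketch below builds ROC points from decision scores and integrates them with trapezoids. The function names and the convention that label 1 marks the positive class are assumptions of this example.

```python
def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), sweeping the threshold from high to low.

    Assumes label 1 is the positive class and any other label is negative.
    Tied scores are processed together so that they yield a single point.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    pts = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
    print(auc_trapezoid(pts))
```

Grouping tied scores into one ROC point is what makes the trapezoids matter: a tie between a positive and a negative sample produces a diagonal segment rather than a step.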
Supported Operating Systems: Platform Independent
Data Formats: Any Format Supported By R
Tags: Bioinformatics, Support Vector Machine, Sequence Analysis, Classification, Kernels, Kernel Methods, Supervised Learning