- Description:
Sequence analysis is one of the major subjects of bioinformatics. Several existing libraries combine the representation of biological sequences with exact and approximate pattern matching as well as alignment algorithms. We present Jstacs, an open-source Java library that focuses instead on the statistical analysis of biological sequences. Jstacs comprises an efficient representation of sequence data and provides implementations of many statistical models with generative and discriminative approaches for parameter learning. Using Jstacs, classifiers can be assessed and compared on test datasets or by cross-validation experiments evaluating several performance measures. Due to its strictly object-oriented design, Jstacs is easy to use and readily extensible.
- Changes to previous version:
New classes:
- MultipleIterationsCondition: Requires another TerminationCondition to fail a specified number of consecutive times
- ClassifierFactory: Allows for creating standard classifiers
- SeqLogoPlotter: Plot PNG sequence logos from within Jstacs
- MultivariateGaussianEmission: Multivariate Gaussian emission density for a Hidden Markov Model
- MEManager: Maximum entropy model
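The idea behind a condition such as MultipleIterationsCondition, stopping only after a wrapped condition has failed several times in a row, can be sketched as follows. This is a minimal illustration, not the Jstacs implementation: the class ConsecutiveFailureStop, the nested Condition interface, and the method shouldContinue are hypothetical names chosen for this example, and the actual Jstacs signatures may differ.

```java
/**
 * Illustrative sketch only. All names here are hypothetical and are NOT the Jstacs API.
 * Wraps an inner stopping condition and signals "stop" only after the inner
 * condition has failed a given number of consecutive times.
 */
final class ConsecutiveFailureStop {

    /** Hypothetical inner condition: returns true while optimization should continue. */
    interface Condition {
        boolean shouldContinue(double improvement);
    }

    private final Condition inner;
    private final int threshold;
    private int consecutiveFailures = 0;

    ConsecutiveFailureStop(Condition inner, int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.inner = inner;
        this.threshold = threshold;
    }

    /** Returns true while fewer than 'threshold' consecutive failures have been seen. */
    boolean shouldContinue(double improvement) {
        if (inner.shouldContinue(improvement)) {
            consecutiveFailures = 0;   // a success breaks the run of failures
        } else {
            consecutiveFailures++;     // extend the current run of failures
        }
        return consecutiveFailures < threshold;
    }

    public static void main(String[] args) {
        // Inner condition (assumed for this example): continue while the improvement exceeds 1e-6.
        ConsecutiveFailureStop stop = new ConsecutiveFailureStop(d -> d > 1e-6, 3);
        double[] improvements = {1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.2};
        for (int i = 0; i < improvements.length; i++) {
            if (!stop.shouldContinue(improvements[i])) {
                System.out.println("Stopped at iteration " + i);
                break;
            }
        }
    }
}
```

A single successful check resets the counter, so isolated stalls in an optimization do not end it prematurely; only an uninterrupted run of failures does.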
New features and improvements:
- Alignment: Added free shift alignment
- PerformanceMeasure and sub-classes: Extension to weighted test data
- AbstractClassifier, ClassifierAssessment and sub-classes: Adaptation to weighted PerformanceMeasures
- DNAAlphabet: Parser speed-up
- PFMComparator: Extension to PFM from other sources/databases
- ToolBox: New convenience methods for computing several statistics (e.g., median, correlation)
- SignificantMotifOccurrencesFinder: New methods for computing PWMs and statistics from predictions
- SequenceScore and sub-classes: New method toString(NumberFormat)
- DataSet: Adaptation to weighted data, e.g., partitioning
- REnvironment: Changed several methods from String to CharSequence
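For readers unfamiliar with the statistics mentioned for ToolBox above, the following sketch shows what a median and a Pearson correlation compute. It is a self-contained illustration under stated assumptions: the class StatsSketch and its method names are hypothetical, the choice of Pearson (rather than another correlation) is an assumption, and none of this reflects the actual ToolBox method signatures.

```java
/**
 * Illustrative sketch only. StatsSketch is a hypothetical class and is NOT
 * the Jstacs ToolBox; it shows what a median and a Pearson correlation compute.
 */
final class StatsSketch {

    /** Median of the values; the input array is left unmodified. */
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty input");
        }
        double[] sorted = values.clone();
        java.util.Arrays.sort(sorted);
        int mid = sorted.length / 2;
        // Odd length: middle element. Even length: mean of the two middle elements.
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Pearson correlation coefficient of two equally long samples. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        // The normalisation by n cancels between numerator and denominator.
        // The result is NaN if either sample has zero variance.
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        double[] scores = {0.2, 0.9, 0.4, 0.7};
        double[] labels = {0.0, 1.0, 0.0, 1.0};
        System.out.println("median = " + median(scores));
        System.out.println("pearson = " + pearson(scores, labels));
    }
}
```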
Restructuring:
- Renamed MultiDimensionalSequenceWrapperDiffSM to MultiDimensionalSequenceWrapperDiffSS
Several minor new features, bug fixes, and code cleanups
- Supported Operating Systems: Cygwin, Linux, Mac OS X, Windows, Unix, Agnostic, Solaris, FreeBSD, Platform Independent
- Data Formats: Plain ASCII, FASTA
- Tags: Bioinformatics, R, Classification, Machine Learning, Bayesian Networks, Markov Random Fields, Supervised Learning, EM, Mixture Models, Java, Learning Principles, Probabilistic Models, Motif Discovery