Project details for Jstacs

JMLR Jstacs 2.3

by keili - September 13, 2017, 14:25:38 CET


Description:

Sequence analysis is one of the major subjects of bioinformatics. Several existing libraries combine the representation of biological sequences with exact and approximate pattern matching as well as alignment algorithms. We present Jstacs, an open-source Java library that focuses instead on the statistical analysis of biological sequences. Jstacs comprises an efficient representation of sequence data and provides implementations of many statistical models with generative and discriminative approaches for parameter learning. Using Jstacs, classifiers can be assessed and compared on test datasets or by cross-validation experiments evaluating several performance measures. Due to its strictly object-oriented design, Jstacs is easy to use and readily extensible.

Changes to previous version:

New classes and packages:

  • Jstacs 2.3 is the first release to be accompanied by JstacsFX, a library for building JavaFX-based graphical user interfaces based on JstacsTools
  • new interface MultiThreadedFunction
  • new class LargeSequenceReader for reading large sequence files in chunks
  • new interface QuickScanningSequenceScore
  • new class RegExpValidator for checking String inputs against a regular expression
  • new class IUPACDNAAlphabet
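
The idea of reading a large sequence file in chunks, as named for LargeSequenceReader above, can be sketched as follows. This is a minimal illustration only: the class ChunkedFastaReader, its methods, and the simplified FASTA handling are assumptions made for this example and are not the Jstacs API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; NOT the Jstacs LargeSequenceReader API.
class ChunkedFastaReader {
    private final BufferedReader in;
    // true if a '>' header line has been consumed but its record was not returned yet
    private boolean recordPending;

    ChunkedFastaReader(Reader reader) throws IOException {
        in = new BufferedReader(reader);
        recordPending = readUntilNextHeader(null); // skip anything before the first header
    }

    // Reads lines up to the next header; sequence lines are appended to seq if it is non-null.
    private boolean readUntilNextHeader(StringBuilder seq) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.startsWith(">")) return true;
            if (seq != null) seq.append(line.trim());
        }
        return false; // end of input
    }

    /** Returns at most maxRecords sequences; an empty list signals the end of the input. */
    List<String> nextChunk(int maxRecords) throws IOException {
        List<String> chunk = new ArrayList<>();
        while (recordPending && chunk.size() < maxRecords) {
            StringBuilder seq = new StringBuilder();
            recordPending = readUntilNextHeader(seq);
            chunk.add(seq.toString());
        }
        return chunk;
    }

    /** Convenience wrapper for in-memory input: splits the records into consecutive chunks. */
    static List<List<String>> chunksOf(String fasta, int maxRecords) {
        try {
            ChunkedFastaReader r = new ChunkedFastaReader(new StringReader(fasta));
            List<List<String>> all = new ArrayList<>();
            for (List<String> c = r.nextChunk(maxRecords); !c.isEmpty(); c = r.nextChunk(maxRecords)) {
                all.add(c);
            }
            return all;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only one chunk is held in memory at a time when nextChunk is called in a loop, which is the point of chunked reading; chunksOf collects all chunks and is meant for small demonstrations only.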

New features and improvements:

  • Alignments may now handle different costs for insert and delete gaps
  • ListResults may now be constructed from Collections of ResultSets
  • Several minor improvements and bugfixes in many classes
  • Improvements of documentation of several classes
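
What different costs for insert and delete gaps mean for an alignment can be sketched with a small dynamic program. The class AsymmetricGapAlignment and its parameters are hypothetical and use simple linear gap costs; this is not the Jstacs Alignment implementation.

```java
// Hypothetical sketch: global alignment cost with separate insertion and deletion costs.
class AsymmetricGapAlignment {
    /**
     * Minimal cost of aligning a to b, where a symbol present only in a costs 'delete'
     * and a symbol present only in b costs 'insert' (linear gap costs, assumed here).
     */
    static int cost(String a, String b, int mismatch, int insert, int delete) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) d[i][0] = i * delete; // delete a prefix of a
        for (int j = 1; j <= b.length(); j++) d[0][j] = j * insert; // insert a prefix of b
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int sub = d[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : mismatch);
                int del = d[i - 1][j] + delete;
                int ins = d[i][j - 1] + insert;
                d[i][j] = Math.min(sub, Math.min(del, ins));
            }
        }
        return d[a.length()][b.length()];
    }
}
```

With unequal costs the result is no longer symmetric in its arguments: aligning "ACGT" to "ACT" needs one deletion, while aligning "ACT" to "ACGT" needs one insertion.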
Supported Operating Systems: Cygwin, Linux, Mac OS X, Windows, Unix, Agnostic, Solaris, FreeBSD, Platform Independent
Data Formats: Plain ASCII, FASTA
Tags: Bioinformatics, R, Classification, Machine Learning, Bayesian Networks, Markov Random Fields, Supervised Learning, EM, Mixture Models, Java, Learning Principles, Probabilistic Models, Motif Discovery

Other available revisions

Version Changelog Date
2.3

See "Changes to previous version" above.
September 13, 2017, 14:25:38
2.2

New classes and packages:

  • CorrelationCoefficient: PerformanceMeasure
  • de.jstacs.clustering: package with classes for hierarchical clustering
  • DeBruijnGraphSequenceGenerator and DeBruijnSequenceGenerator for generating De Bruijn sequences
  • CyclicSequenceAdaptor for representing cyclic sequences
  • PlotGeneratorResult for representing results that plot images to a Graphics2D object
  • TextResult for results that may be stored as text files
  • package de.jstacs.results.savers for generic classes that store results to disk
  • LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder for sparse local inhomogeneous mixture (Slim) models
  • PFMWrapperTrainSM for representing position frequency matrices and position weight matrices from databases
  • package de.jstacs.tools with classes for generic Jstacs tools that may be used in different user interfaces (command line, Galaxy, JavaFX)
  • Compression for ZIP compression of Strings
  • package de.jstacs.utils.graphics with generic GraphicsAdaptor using Apache XML commons
  • projects: Dimont, GeMoMa, Slim, TALEN, motif comparison
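
For readers unfamiliar with De Bruijn sequences: a De Bruijn sequence of order n over an alphabet of size k is a cyclic sequence of length k^n that contains every word of length n exactly once. The sketch below uses the standard construction by concatenating Lyndon words; the class DeBruijn is hypothetical, and whether the Jstacs generators use this construction is not stated in the changelog.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a De Bruijn sequence generator (Lyndon word concatenation).
class DeBruijn {
    /** Returns a De Bruijn sequence of order n over the symbols of alphabet. */
    static String generate(String alphabet, int n) {
        int[] a = new int[n + 1]; // a[1..n] holds the current candidate word
        StringBuilder out = new StringBuilder();
        extend(1, 1, n, alphabet, a, out);
        return out.toString();
    }

    private static void extend(int t, int p, int n, String alphabet, int[] a, StringBuilder out) {
        if (t > n) {
            // a[1..p] is a Lyndon word; it is emitted if its length divides n
            if (n % p == 0) {
                for (int i = 1; i <= p; i++) out.append(alphabet.charAt(a[i]));
            }
        } else {
            a[t] = a[t - p];
            extend(t + 1, p, n, alphabet, a, out);
            for (int j = a[t - p] + 1; j < alphabet.length(); j++) {
                a[t] = j;
                extend(t + 1, t, n, alphabet, a, out);
            }
        }
    }

    /** Checks that seq, read cyclically, contains each of the k^n words of length n exactly once. */
    static boolean coversAllWords(String seq, int k, int n) {
        int expected = 1;
        for (int i = 0; i < n; i++) expected *= k;
        if (seq.length() != expected) return false;
        String cyclic = seq + seq.substring(0, n - 1);
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < seq.length(); i++) seen.add(cyclic.substring(i, i + n));
        return seen.size() == expected;
    }
}
```

For the DNA alphabet, DeBruijn.generate("ACGT", 3) yields a sequence of length 64 that covers all 64 trinucleotides when read cyclically.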

New features and improvements:

  • Major restructuring of Alignment for better efficiency
  • Alignment Costs and StringAlignment now Storable
  • New constructor of DataSet allowing a specified percentage of sequences to mismatch the given alphabet
  • BioJavaAdapter ported to BioJava 1.9
  • XMLParser now also allows for storing Sequences
  • New method for parsing HMMer profile HMMs in HMMFactory
  • Several minor improvements and bugfixes in many classes
  • Improvements of documentation of several classes
February 17, 2016, 11:57:56
2.1

New classes:

  • MultipleIterationsCondition: Requires another TerminationCondition to fail a contiguous, specified number of times
  • ClassifierFactory: Allows for creating standard classifiers
  • SeqLogoPlotter: Plot PNG sequence logos from within Jstacs
  • MultivariateGaussianEmission: Multivariate Gaussian emission density for a Hidden Markov Model
  • MEManager: Maximum entropy model
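
The behaviour described for MultipleIterationsCondition, stopping only after an inner condition has failed a given number of times in a row, can be sketched as follows. The nested Condition interface, the class name, and the method signatures are simplified assumptions for this example and do not reproduce the Jstacs TerminationCondition interface.

```java
// Hypothetical sketch: stop only after the inner condition has failed 'required' times in a row.
class RepeatedFailureCondition {
    /** Simplified stand-in for a termination condition (assumed signature). */
    interface Condition {
        boolean doNextIteration(double lastValue, double currentValue);
    }

    private final Condition inner;
    private final int required;
    private int consecutiveFailures = 0;

    RepeatedFailureCondition(Condition inner, int required) {
        this.inner = inner;
        this.required = required;
    }

    boolean doNextIteration(double lastValue, double currentValue) {
        if (inner.doNextIteration(lastValue, currentValue)) {
            consecutiveFailures = 0; // any success resets the streak
            return true;
        }
        consecutiveFailures++;
        return consecutiveFailures < required;
    }

    /** Demo: index of the value at which the optimization would stop, or values.length if it never stops. */
    static int stepsUntilStop(double[] values, double epsilon, int required) {
        Condition smallChange = (last, current) -> Math.abs(current - last) >= epsilon;
        RepeatedFailureCondition cond = new RepeatedFailureCondition(smallChange, required);
        for (int i = 1; i < values.length; i++) {
            if (!cond.doNextIteration(values[i - 1], values[i])) return i;
        }
        return values.length;
    }
}
```

Such a wrapper makes an optimization robust against a single iteration with a small change that is followed by further progress.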

New features and improvements:

  • Alignment: Added free shift alignment
  • PerformanceMeasure and sub-classes: Extension to weighted test data
  • AbstractClassifier, ClassifierAssessment and sub-classes: Adaption to weighted PerformanceMeasures
  • DNAAlphabet: Parser speed-up
  • PFMComparator: Extension to PFM from other sources/databases
  • ToolBox: New convenience methods for computing several statistics (e.g., median, correlation)
  • SignificantMotifOccurrencesFinder: New methods for computing PWMs and statistics from predictions
  • SequenceScore and sub-classes: New method toString(NumberFormat)
  • DataSet: Adaption to weighted data, e.g., partitioning
  • REnvironment: Changed several methods from String to CharSequence
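
The two statistics named as examples for the ToolBox convenience methods, median and correlation, can be computed as in the sketch below. The class SimpleStats and its method names are hypothetical, and Pearson correlation is assumed here; the actual ToolBox signatures may differ.

```java
import java.util.Arrays;

// Hypothetical sketch of two descriptive statistics; not the Jstacs ToolBox API.
class SimpleStats {
    /** Median of a non-empty array; the input is not modified. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Pearson correlation coefficient of two arrays of equal length. */
    static double pearson(double[] x, double[] y) {
        double meanX = Arrays.stream(x).average().orElse(Double.NaN);
        double meanY = Arrays.stream(y).average().orElse(Double.NaN);
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }
}
```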

Restructuring:

  • changed MultiDimensionalSequenceWrapperDiffSM to MultiDimensionalSequenceWrapperDiffSS

Several minor new features, bug fixes, and code cleanups

June 3, 2013, 07:32:55
2.0

February 2, 2012: Jstacs 2.0 released

Jstacs 2.0 changes many names and the structure of several packages. It is not code-compatible with Jstacs 1.5 and earlier.

RESTRUCTURING and RENAMING:

former ScoringFunction, NormalizableScoringFunction, Model

  • new base-interface SequenceScore
  • new sub-interface StatisticalModel of SequenceScore for all statistical models with sub-interfaces DifferentiableStatisticalModel and TrainableStatisticalModel
  • new interface DifferentiableSequenceScore replaces ScoringFunction
  • new interface DifferentiableStatisticalModel replaces NormalizableScoringFunction
  • new interface TrainableStatisticalModel replaces Model
  • new abstract class AbstractDifferentiableSequenceScore
  • new abstract class AbstractDifferentiableStatisticalModel replaces AbstractNormalizableScoringFunction
  • new abstract class AbstractTrainableStatisticalModel replaces AbstractModel
  • former Models renamed to TrainSM
  • former ScoringFunction renamed to DiffSS or DiffSM
  • getProbFor removed from TrainableStatisticalModel (former Model) and conceptually replaced by getLogProbFor
  • getLogScore(Sequence,int,int) with changed meaning of arguments: getLogScore(Sequence,start,end) instead of getLogScore(Sequence,start,length)
  • isTrained() replaced by common method isInitialized()
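
The change from (start, length) to (start, end) arguments affects every call site that passes a length. The sketch below shows the index conversion; it assumes that the end position is inclusive, which should be checked against the Javadoc. The class ScoreRangeMigration and the array-based scorer are hypothetical stand-ins, not Jstacs code.

```java
// Hypothetical sketch of migrating (start, length) call sites to (start, end).
class ScoreRangeMigration {
    /**
     * Stand-in for a scorer: sums per-position log scores from start to end.
     * ASSUMPTION: end is inclusive.
     */
    static double logScore(double[] perPositionLogScores, int start, int end) {
        double sum = 0;
        for (int i = start; i <= end; i++) sum += perPositionLogScores[i];
        return sum;
    }

    /** Converts a pre-2.0 (start, length) pair to the inclusive end index. */
    static int endFromLength(int start, int length) {
        return start + length - 1;
    }
}
```

A former call with start 2 and length 3 therefore becomes a call with start 2 and end 4 under this assumption.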

Parameters and Results

  • new super-class of Parameters and Results: AnnotatedEntity
  • common list-type for Parameters and Results: AnnotatedEntityList
  • Renaming: CollectionParameter -> SelectionParameter, MultiSelectionCollectionParameter -> MultiSelectionParameter, new super-class AbstractSelectionParameter
  • major refactoring due to common hierarchy and code-cleanup
  • lazy evaluation of Parameter/ParameterSet hierarchies moved from ParameterSet (loadParameters()) to ParameterSetContainer (constructor on class)
  • SubclassFinder adapted to lazy evaluation

performance measures

  • new abstract super-class AbstractPerformanceMeasure of all performance measures
  • new interface NumericalPerformanceMeasure for all performance measures that return a single number (as opposed, e.g., to curves)
  • new class PerformanceMeasureParameterSet for a collection of general performance measures
  • new class NumericalPerformanceMeasureParameterSet for a collection of NumericalPerformanceMeasures
  • used in evaluate-method of AbstractClassifier and in ClassifierAssessments

further changes

  • Sample renamed to DataSet
  • evaluate and evaluateAll in AbstractClassifier joined
  • new class IndependentProductDiffSS as super-class of IndependentProductDiffSM (former IndependentProductScoringFunction)
  • new class UniformDiffSS as super-class of UniformDiffSM (former UniformScoringFunction)

NEW FUNCTIONALITY:

  • multi-threaded implementation of Baum-Welch and Viterbi training of hidden Markov models
  • new interface Singleton that can be used for singleton instances to save memory, current examples: DNAAlphabet, DNAAlphabetContainer, ProteinAlphabet
  • added ProteinAlphabet
  • added possibility to use NaN-values with ContinuousAlphabets
  • added ArbitraryFloatSequence including static methods for DataSet creation for cases where double-precision is not needed
  • new performance measure MaximumFMeasure
  • access to Parameters in ParameterSets and Results in ResultSets by name
  • emitDataSet in BayesianNetworkDiffSM
  • new static method Time.getTimeInstance that returns UserTime or RealTime depending on availability of shared lib
  • SubclassFinder allows for adding own base packages
  • new method overlaps() in LocatedSequenceAnnotationWithLength
  • AbstractTerminationCondition used in ScoreClassifier and sub-classes
  • public method propagateESS in HMMFactory
  • new method generateLog in DirichletMRG for drawing log-values
  • added DifferentiableStatisticalModelFactory
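
A maximum F-measure, as named for MaximumFMeasure above, is the largest F-measure obtained over all classification thresholds. The sketch below computes it for the F1 case; the class MaxFMeasure is hypothetical, and whether the Jstacs class supports other weightings of precision and recall is not covered here.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch: maximum F1 over all score thresholds; not the Jstacs implementation.
class MaxFMeasure {
    /** scores[i] is the classifier score of item i; positive[i] tells whether item i is a true positive example. */
    static double maxF1(double[] scores, boolean[] positive) {
        Integer[] order = new Integer[scores.length];
        int totalPositives = 0;
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
            if (positive[i]) totalPositives++;
        }
        // Sort indices by descending score, then lower the threshold item by item.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -scores[i]));
        double best = 0;
        int tp = 0, fp = 0;
        for (int r = 0; r < order.length; r++) {
            if (positive[order[r]]) tp++; else fp++;
            // Evaluate only after the last item of a group of tied scores.
            boolean lastOfTie = r + 1 == order.length || scores[order[r + 1]] != scores[order[r]];
            if (lastOfTie && tp > 0) {
                int fn = totalPositives - tp;
                best = Math.max(best, 2.0 * tp / (2.0 * tp + fp + fn));
            }
        }
        return best;
    }
}
```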

BUGFIXES/IMPROVEMENTS:

  • bugfix in propagation of equivalent sample size in HMMFactory
  • bugfix in random initialization of BasicHigherOrderTransition
  • improved Alignment implementation
  • SafeOutputStream with new static factory method getSafeOutputStream, write methods now work on Objects

DOCUMENTATION:

  • improved Javadocs in many classes and packages
  • new Cookbook with extensive documentation and explanation

MISC:

  • output of NonParsableException more verbose
  • Exceptions in multi-threaded code now lead to exit of program instead of only stopping the thread
  • update of RServe/RClient
February 2, 2012, 17:14:02
1.5

June 1st, 2011: Jstacs 1.5 released

  • new package de.jstacs.algorithms.alignment for sequence alignment algorithms

  • new class de.jstacs.models.ModelFactory with static classes to construct many standard models

  • de.jstacs.utils.galaxy.GalaxyAdaptor, an adaptor to Galaxy that allows for creating Galaxy applications using Jstacs ParameterSets; this also requires the new interface GalaxyConvertible

  • new package de.jstacs.models.hmm for a variety of hidden Markov models, which can be learned by different learning principles including generative and discriminative learning principles, maximization and sampling methods

  • new package de.jstacs.sampling that contains general infrastructure for parameter sampling

  • new class de.jstacs.scoringFunctions.MappingScoringFunction that allows for internal mapping of symbols from the alphabet

  • new package de.jstacs.classifier.scoringFunctionBases.sampling containing classifiers that sample their parameters by the Metropolis-Hastings algorithm

  • new interface de.jstacs.scoringFunctions.SamplingScoringFunction for NormalizableScoringFunctions that can be used in Metropolis-Hastings sampling of parameters

  • bugfix in XMLParser for cases where the tag of interest also occurs within other, nested tags
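
The kind of problem addressed by the XMLParser bugfix can be illustrated with a depth-tracking extractor: a naive search for the first occurrence of a tag returns the wrong content when the same tag name also appears inside other tags. The class TopLevelTagExtractor below is a hypothetical sketch that assumes well-formed input with attribute-free tags and no self-closing tags; it is not the Jstacs XMLParser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extract the content of the first top-level occurrence of a tag.
class TopLevelTagExtractor {
    // Matches <name> and </name>; attributes and self-closing tags are not supported (assumption).
    private static final Pattern TAG = Pattern.compile("<(/?)([A-Za-z_][A-Za-z0-9_]*)>");

    /** Returns the content of the first occurrence of tag at nesting depth 0, or null if there is none. */
    static String extract(String xml, String tag) {
        Matcher m = TAG.matcher(xml);
        int depth = 0;
        int contentStart = -1;
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            if (!closing) {
                if (depth == 0 && contentStart < 0 && m.group(2).equals(tag)) {
                    contentStart = m.end(); // content begins right after the opening tag
                }
                depth++;
            } else {
                depth--;
                if (depth == 0 && contentStart >= 0) {
                    return xml.substring(contentStart, m.start());
                }
            }
        }
        return null;
    }
}
```

Tracking the depth skips occurrences of the tag inside other elements and also finds the matching closing tag when the tag is nested within itself.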

June 6, 2011, 10:48:08
1.4

December 31, 2010: Jstacs 1.4 released

  • added DinucleotideProperty for computing properties like melting temperature, twist, or G/C content

  • support for multidimensional sequence data

  • more widespread use of TerminationConditions

  • completely rewritten XMLParser

  • extension of motif discovery to weighted data

  • OneSampleLogGenDisMixFunction for using the same Sample with different weights for the different classes

  • Jstacs requires Java 1.6 now
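
A dinucleotide property assigns a value to each pair of adjacent nucleotides, which turns a DNA sequence into a numerical profile. The sketch below does this for G/C content, the simplest of the properties mentioned above; the class DinucleotideGc is hypothetical and does not reflect the DinucleotideProperty API or its parameter tables.

```java
// Hypothetical sketch: per-dinucleotide G/C content profile of a DNA sequence.
class DinucleotideGc {
    private static boolean isGc(char c) {
        return c == 'G' || c == 'C' || c == 'g' || c == 'c';
    }

    /** For each pair of adjacent positions, the fraction of G/C among its two bases (0, 0.5, or 1). */
    static double[] profile(String seq) {
        double[] values = new double[Math.max(0, seq.length() - 1)];
        for (int i = 0; i < values.length; i++) {
            int gc = (isGc(seq.charAt(i)) ? 1 : 0) + (isGc(seq.charAt(i + 1)) ? 1 : 0);
            values[i] = gc / 2.0;
        }
        return values;
    }

    /** Mean of a profile; returns NaN for an empty profile. */
    static double mean(double[] values) {
        if (values.length == 0) return Double.NaN;
        double sum = 0;
        for (double v : values) sum += v;
        return sum / values.length;
    }
}
```

Properties such as melting temperature or twist would replace the G/C count by a lookup in a table of 16 dinucleotide values, which is omitted here because the changelog does not give those values.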

January 2, 2011, 11:07:10
1.3.1

March 2, 2010: Jstacs 1.3.1 released

  • Partitioning of Samples including weights

  • Release of Dispom (de-novo discovery of differentially abundant transcription factor binding sites including their positional preference)

  • Several bugfixes

March 2, 2010, 14:15:46
1.3

Initial Announcement on mloss.org.

December 4, 2009, 11:27:55
