Projects supporting the csv data format.
Showing Items 1-20 of 23 on page 1 of 2: 1 2 Next

Cognitive Foundry 3.3.3

by Baz - May 21, 2013, 05:59:37 CET - 8684 views, 1723 downloads, 2 subscriptions

About: The Cognitive Foundry is a modular Java software library of machine learning components and algorithms designed for research and applications.

Changes:
  • General:
    • Made code able to compile under both Java 1.6 and 1.7. This required removing some potentially unsafe methods that used varargs with generics.
    • Upgraded XStream dependency to 1.4.4.
    • Improved support for regression algorithms in learning.
    • Added general-purpose adapters to make it easier to compose learning algorithms and adapt their input or output.
  • Common Core:
    • Added isSparse, toArray, dotDivide, and dotDivideEquals methods for Vector and Matrix.
    • Added scaledPlus, scaledPlusEquals, scaledMinus, and scaledMinusEquals to Ring (and thus Vector and Matrix) for potentially faster such operations.
    • Fixed issue where equals for matrices and dense vectors did not check for equal dimensionality.
    • Added transform, transformEquals, transformNonZeros, and transformNonZerosEquals to Vector.
    • Made LogNumber into a signed version of a log number and moved the prior unsigned implementation into UnsignedLogNumber.
    • Added EuclideanRing interface that provides methods for times, timesEquals, divide, and divideEquals. Also added Field interface that provides methods for inverse and inverseEquals. These interfaces are now implemented by the appropriate number classes such as ComplexNumber, MutableInteger, MutableLong, MutableDouble, LogNumber, and UnsignedLogNumber.
    • Added interface for Indexer and DefaultIndexer implementation for creating a zero-based indexing of values.
    • Added interfaces for MatrixFactoryContainer and DivergenceFunctionContainer.
    • Added ReversibleEvaluator, which various identity functions implement, as well as a new utility class ForwardReverseEvaluatorPair to create a reversible evaluator from a pair of other evaluators.
    • Added method to create an ArrayList from a pair of values in CollectionUtil.
    • ArgumentChecker now properly throws assertion errors for NaN values. Also added checks for long types.
    • Fixed handling of Infinity in subtraction for LogMath.
    • Fixed issue with angle method that would cause a NaN if cosine had a rounding error.
    • Added new createMatrix methods to MatrixFactory that initialize the Matrix with the given value.
    • Added copy, reverse, and isEmpty methods for several array types to ArrayUtil.
    • Added utility methods for creating a HashMap, LinkedHashMap, HashSet, or LinkedHashSet with an expected size to CollectionUtil.
    • Added getFirst and getLast methods for List types to CollectionUtil.
    • Removed some calls to System.out and Exception.printStackTrace.
  • Common Data:
    • Added create method for IdentityDataConverter.
    • ReversibleDataConverter now is an extension of ReversibleEvaluator.
  • Learning Core:
    • Added general learner transformation capability to make it easier to adapt and compose algorithms. InputOutputTransformedBatchLearner provides this capability for supervised learning algorithms by composing together a triplet. CompositeBatchLearnerPair does it for a pair of algorithms.
    • Added constant and identity learners.
    • Added Chebyshev, Identity, and Minkowski distance metrics.
    • Added methods to DatasetUtil to get the output values for a dataset and to compute the sum of weights.
    • Made generics more permissive for supervised cost functions.
    • Added ClusterDistanceEvaluator, which takes a clustering, computes the distance from an input value to all clusters, and returns the result as a vector.
    • Fixed potential round-off issue in decision tree splitter.
    • Added random subspace technique, implemented in RandomSubspace.
    • Separated functionality from LinearFunction into IdentityScalarFunction. LinearFunction by default is the same, but has parameters that can change the slope and offset of the function.
    • Default squashing function for GeneralizedLinearModel and DifferentiableGeneralizedLinearModel is now a linear function instead of an atan function.
    • Added a weighted estimator for the Poisson distribution.
    • Added Regressor interface for evaluators that are the output of (single-output) regression learning algorithms. Existing such evaluators have been updated to implement this interface.
    • Added support for regression ensembles including additive and averaging ensembles with and without weights. Added a learner for regression bagging in BaggingRegressionLearner.
    • Added a simple univariate regression class in UnivariateLinearRegression.
    • MultivariateDecorrelator now is a VectorInputEvaluator and VectorOutputEvaluator.
    • Added bias term to PrimalEstimatedSubGradient.
  • Text Core:
    • Fixed issue with the start position for tokens from LetterNumberTokenizer being off by one except for the first one.
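
The angle fix listed under Common Core (a NaN when the cosine had a rounding error) can be illustrated with a short sketch. It is written in Python rather than Java and is not the Cognitive Foundry code; the function name `angle` and the clamping approach are assumptions about how such a fix is commonly done. In Java, `Math.acos` returns NaN for arguments outside [-1, 1], whereas Python's `math.acos` raises `ValueError`, so the clamp matters in both languages.

```python
import math


def angle(u, v):
    """Angle in radians between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    cosine = dot / (norm_u * norm_v)
    # Rounding can push the cosine slightly outside [-1, 1] for (anti)parallel
    # vectors; clamp it so that acos stays within its domain.
    cosine = max(-1.0, min(1.0, cosine))
    return math.acos(cosine)
```

Without the clamp, nearly parallel vectors can produce a cosine a few ulps above 1, which is exactly the situation the changelog entry describes.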

MLPACK 1.0.5

by rcurtin - May 2, 2013, 07:24:32 CET - 20432 views, 3608 downloads, 4 subscriptions

About: A scalable, fast C++ machine learning library, with emphasis on usability.

Changes:

Speedups of cover tree traversers; addition of rank-approximate nearest neighbor (RANN); addition of fast exact max-kernel search (FastMKS); fix for EM covariance estimation; more parameters for GMM estimation; force GMM and GaussianDistribution covariance matrices to be positive definite during training; add a tolerance parameter to the Baum-Welch algorithm for HMM training; fix for compilation with clang; fix for k-furthest neighbor search.


HDDM 0.5

by Wiecki - April 24, 2013, 02:53:07 CET - 1782 views, 386 downloads, 1 subscription

About: HDDM is a python toolbox for hierarchical Bayesian parameter estimation of the Drift Diffusion Model (via PyMC). Drift Diffusion Models are used widely in psychology and cognitive neuroscience to study decision making.

Changes:
  • New and improved HDDM model with the following changes:
    • Priors: by default the model will use informative priors (see http://ski.clps.brown.edu/hddm_docs/methods.html#hierarchical-drift-diffusion-models-used-in-hddm). If you want uninformative priors, set informative=False.
    • Sampling: This model uses slice sampling, which leads to faster convergence while being slower to generate an individual sample. In our experiments, a burn-in of 20 is often good enough.
    • Inter-trial variability parameters are only estimated at the group level, not for individual subjects.
    • The old model has been renamed to HDDMTransformed.
    • HDDMRegression and HDDMStimCoding are also using this model.
  • HDDMRegression takes patsy model specification strings. See http://ski.clps.brown.edu/hddm_docs/howto.html#estimate-a-regression-model and http://ski.clps.brown.edu/hddm_docs/tutorial_regression_stimcoding.html#chap-tutorial-hddm-regression
  • Improved online documentation at http://ski.clps.brown.edu/hddm_docs
  • A new HDDM demo at http://ski.clps.brown.edu/hddm_docs/demo.html
  • Ratcliff's quantile optimization method for single subjects and groups using the .optimize() method
  • Maximum likelihood optimization.
  • Many bugfixes and better test coverage.
  • hddm_fit.py command line utility is deprecated.
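
To make the slice sampling item above more concrete, here is a minimal univariate slice sampler with stepping-out and shrinkage. It is a generic sketch, not the sampler used by HDDM or PyMC; the function name `slice_sample` and its parameters (`width`, `burn`, `rng`) are assumptions chosen for illustration, with `burn=20` echoing the burn-in mentioned in the changelog.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width=1.0, burn=20, rng=None):
    """Draw samples from an unnormalized 1-D log density by slice sampling."""
    rng = rng or random.Random(0)
    x = x0
    samples = []
    for i in range(n_samples + burn):
        # Pick a height uniformly under the density at x (done in log space).
        log_y = log_density(x) + math.log(1.0 - rng.random())
        # Step out: grow an interval around x until both ends leave the slice.
        left = x - width * rng.random()
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # Shrink: propose uniformly in the interval, narrowing it on rejection.
        while True:
            candidate = rng.uniform(left, right)
            if log_density(candidate) > log_y:
                x = candidate
                break
            if candidate < x:
                left = candidate
            else:
                right = candidate
        # Discard the first `burn` draws.
        if i >= burn:
            samples.append(x)
    return samples
```

Each draw may evaluate the density several times (stepping out and shrinking), which matches the note that an individual sample is slower to generate, while the chain typically needs only a short burn-in.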

JMLR MultiBoost 1.2.00

by busarobi - April 22, 2013, 15:42:53 CET - 14257 views, 2490 downloads, 1 subscription

About: MultiBoost is a multi-purpose boosting package implemented in C++. It is based on the multi-class/multi-task AdaBoost.MH algorithm [Schapire-Singer, 1999]. Basic base learners (stumps, trees, products, Haar filters for image processing) can be easily complemented by new data representations and the corresponding base learners, without interfering with the main boosting engine.

Changes:
  • A new fast (sublinear in the number of instances) stump algorithm is implemented. The gain in time is proportional to the sparsity of the features (it is significant when a lot of instances take the most frequent feature value). See Section B.2 in the documentation.
  • A parametrized early stopping option is added in --traintest mode. We stop if the (smoothed) test error does not improve for a certain number of iterations. See Section 4.1.3 in the documentation.
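
The early stopping rule described above can be sketched as follows. The changelog does not say how the test error is smoothed or what the option is called, so the trailing moving average, the function name `early_stopping_iteration`, and the parameters `window` and `patience` are all assumptions for illustration, not MultiBoost's implementation.

```python
def early_stopping_iteration(test_errors, window=5, patience=10):
    """Return the iteration at which training would stop, or None."""
    best_error = float("inf")
    best_iteration = 0
    for t in range(len(test_errors)):
        # Smooth with a trailing moving average (an assumed smoothing rule).
        start = max(0, t - window + 1)
        recent = test_errors[start:t + 1]
        smoothed = sum(recent) / len(recent)
        if smoothed < best_error:
            # The smoothed error improved: remember this iteration.
            best_error = smoothed
            best_iteration = t
        elif t - best_iteration >= patience:
            # No improvement for `patience` iterations: stop here.
            return t
    return None
```

Smoothing keeps a single noisy iteration from resetting or triggering the stop, at the cost of reacting a few iterations late.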

Armadillo library 3.810

by cu24gjf - April 22, 2013, 05:24:18 CET - 27347 views, 6201 downloads, 2 subscriptions

Rating: 4 out of 5 stars (based on 2 votes)

About: Armadillo is a template C++ linear algebra library aiming towards a good balance between speed and ease of use. Matrix decompositions are provided through optional integration with LAPACK, or one of its high performance drop-in replacements (e.g. Intel MKL).

Changes:
  • added fast Fourier transform
  • added handling of .imbue() and .transform() by submatrices and subcubes
  • added batch insertion constructors for sparse matrices
  • minor fix for multiplication of complex sparse matrices
  • better detection of recent Intel MKL versions during installation

CAM Java 2.0

by wangny - April 11, 2013, 18:21:12 CET - 1365 views, 555 downloads, 1 subscription

About: The CAM R-Java software provides a novel way to solve the blind source separation problem.

Changes:
  1. Three classic BSS algorithms - NMF, fastICA and Factor Analysis - are combined into the software. Users can readily call the three functions from the Java GUI.
  2. A simple plug-in mechanism is added. Users can add their own BSS algorithm to the software by following the Software Plugin Adding Guide and User Manual.

JMLR Waffles 2013-04-06

by mgashler - April 7, 2013, 02:04:10 CET - 16183 views, 5326 downloads, 1 subscription

About: A broad collection of script-friendly command-line tools for machine learning and data mining tasks. (The command-line tools wrap functionality from a C++ class library.)

Changes:

See the change log at http://waffles.sourceforge.net/changelog.html


EnsembleSVM 1.2

by claesenm - March 30, 2013, 14:04:13 CET - 856 views, 197 downloads, 1 subscription

About: The EnsembleSVM library offers functionality to perform ensemble learning using Support Vector Machine (SVM) base models. In particular, we offer routines for binary ensemble models using SVM base classifiers. Experimental results have shown the predictive performance to be comparable with standard SVM models but with drastically reduced training time. Ensemble learning with SVM models is particularly useful for semi-supervised tasks.

Changes:

Fixed bug in IndexedFile, which caused esvm-train to fail when used without bootstrap mask. Library API/ABI remain unchanged, library revision increased.


MLDemos 0.5.1

by basilio - March 2, 2013, 16:06:13 CET - 13017 views, 2933 downloads, 2 subscriptions

About: MLDemos is a user-friendly visualization interface for various machine learning algorithms for classification, regression, clustering, projection, dynamical systems, reward maximisation and reinforcement learning.

Changes:

New Visualization and Dataset Features

  • Added 3D visualization of samples and classification, regression and maximization results
  • Added Visualization panel with individual plots, correlations, density, etc.
  • Added Editing tools to drag/magnet data, change class, increase or decrease dimensions of the dataset
  • Added categorical dimensions (indexed dimensions with non-numerical values)
  • Added Dataset Editing panel to swap, delete and rename dimensions, classes or categorical values
  • Several bug-fixes for display, import/export of data, classification performance

New Algorithms and methodologies

  • Added Projections to pre-process data (which can then be classified/regressed/clustered), with LDA, PCA, KernelPCA, ICA, CCA
  • Added Grid-Search panel for batch-testing ranges of values for up to two parameters at a time
  • Added One-vs-All multi-class classification for non-multi-class algorithms
  • Trained models can now be kept and tested on new data (training on one dataset, testing on another)
  • Added a dataset generator panel for standard toy datasets (e.g. swissroll, checkerboard,...)
  • Added a number of clustering, regression and classification algorithms (FLAME, DBSCAN, LOWESS, CCA, KMEANS++, GP Classification, Random Forests)
  • Added Save/Load Model option for GMMs and SVMs
  • Added Growing Hierarchical Self Organizing Maps (original code by Michael Dittenbach)
  • Added Automatic Relevance Determination for SVM with RBF kernel (Thanks to Ashwini Shukla!)
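
Batch-testing ranges of values for two parameters at a time, as the Grid-Search panel does, amounts to scoring every combination and keeping the best one. The sketch below shows that idea only; MLDemos is a C++ GUI application, and the function name `grid_search` and the scoring callback are hypothetical stand-ins, not part of its code.

```python
from itertools import product


def grid_search(score, first_values, second_values):
    """Return (best_score, a, b) over all combinations of two parameter ranges.

    `score(a, b)` is a caller-supplied function where higher is better,
    for example cross-validated accuracy of a model trained with (a, b).
    """
    best = None
    for a, b in product(first_values, second_values):
        current = score(a, b)
        # Keep the first combination that reaches the highest score.
        if best is None or current > best[0]:
            best = (current, a, b)
    return best
```

The cost grows with the product of the two range sizes, which is one practical reason to limit a grid to a couple of parameters at a time.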


JKernelMachines 2.0

by dpicard - February 28, 2013, 21:09:31 CET - 3703 views, 859 downloads, 1 subscription

About: A machine learning library in Java for easy development of new kernels.

Changes:

Version 2.0.

  • Separation of the core library and unit testing
  • Junit testing added
  • Lots of bug fixes
  • Better examples, and many useless test classes removed
  • A small demo script to benchmark the library was added

Warning: all classes have migrated under the fr.lip6.jkernelmachines package, which breaks backward compatibility, but was necessary to keep the project going on sanely.


ADAMS 0.4.2

by fracpete - February 26, 2013, 03:26:25 CET - 1792 views, 317 downloads, 1 subscription

About: The Advanced Data mining And Machine learning System (ADAMS) is a novel, flexible workflow engine aimed at quickly building and maintaining real-world, complex knowledge workflows.

Changes:
  • Added almost 20 more conversions and 20 new actors
  • R-Project integration using Rserve
  • WEKA webservice allows for programming language agnostic training, evaluation and use of WEKA models (classifiers, clusterers) and data processing using filters
  • Spreadsheets now come with basic formula support
  • Spreadsheets can be used for lookup tables in the flow
  • Support for "chunked" reading/writing of spreadsheets to process millions of rows
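
Chunked reading, as mentioned in the last item, keeps memory use bounded by handing rows to the consumer a fixed number at a time. The following sketch shows the general idea for a CSV file using only the Python standard library; it is not the ADAMS API (ADAMS is a Java workflow engine), and the function name `read_csv_in_chunks` is hypothetical.

```python
import csv
import io
from itertools import islice


def read_csv_in_chunks(handle, chunk_size):
    """Yield lists of at most `chunk_size` rows (as dicts) from an open CSV file."""
    reader = csv.DictReader(handle)
    while True:
        # islice pulls the next chunk_size rows without loading the whole file.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Usage with an in-memory file; a real file opened with open(path, newline="")
# works the same way.
example = io.StringIO("id,value\n1,a\n2,b\n3,c\n")
chunk_sizes = [len(chunk) for chunk in read_csv_in_chunks(example, 2)]
```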

MyMediaLite 3.07

by zenog - February 9, 2013, 13:14:25 CET - 27806 views, 4634 downloads, 1 subscription

About: MyMediaLite is a lightweight, multi-purpose library of recommender system algorithms.

Changes:

Important changes:

  • new rating predictor GSVD++ (contributed by Marcelo Manzato)
  • new recommenders ExternalRatingPredictor and ExternalItemRecommender to evaluate external tools with the MyMediaLite evaluation framework
  • incremental update support for item recommendation UserKNN and ItemKNN (based on a pull request by João Vinagre)
  • --cross-validation support for the rating_based_ranking tool (as requested by Pieter-Jan Verbrugen)
  • removed the group recommendation code
  • cleaner item recommendation evaluation, with a bug fix in the cross-validation code and a complete rewrite of online evaluation
  • removed unused matrix and vector math, faster and simplified matrix code

Encog Machine Learning Framework 3.1

by jeffheaton - January 1, 2013, 00:05:08 CET - 1051 views, 203 downloads, 1 subscription

About: Encog is a Machine Learning framework for Java, C#, JavaScript and C/C++ that supports SVMs, Genetic Programming, Bayesian Networks, Hidden Markov Models and other algorithms.

Changes:

Initial Announcement on mloss.org.


Neural network designer 1.1.1

by bragi - December 28, 2012, 11:38:10 CET - 1234 views, 284 downloads, 1 subscription

About: A DBMS for resonating neural networks. Create and use different types of machine learning algorithms.

Changes:

AIML compatible (AIML files can be imported); new 'Grid channel' for developing board games; improved topics editor; new demo project: ALice (from AIML); lots of bug-fixes and speed improvements


ELKI 0.5.5

by erich - December 14, 2012, 18:49:58 CET - 4241 views, 757 downloads, 2 subscriptions

About: ELKI is a framework for implementing data-mining algorithms with support for index structures, that includes a wide variety of clustering and outlier detection methods.

Changes:

This is mostly a bug fix release. A lot of small issues have been fixed that improve performance, make error reporting a lot better, ease the use of sparse vectors and external precomputed distances, for example.

This will be the last ELKI release to support Java 6. The next ELKI release will require Java 7.

Algorithms

  • Some new LOF variants (LDF, SimpleLOF, SimpleKernelDensityLOF)
  • Correlation Outlier Probabilities (ICDM 2012)
  • A naive mean-shift clustering
  • Single-link clustering (SLINK algorithm) should be significantly faster due to optimized data structures
  • "Benchmarking" algorithms for measuring the performance of index structures

Index layer

  • Bulk loading R-Trees should be faster - in particular Sort Tile Recursive can work very well.
  • M-Trees have been refactored and optimized for double distances

Database layer

  • Bundle format (work in progress): low-level binary format for fast data exchange
  • DBID and DataStore layer received some additional classes for further performance improvements
  • KNN heap structures were revisited. The code is less clean now, but performs better in benchmarks.

Visualizations

  • General clean up and API simplifications
  • Some additional modules and improvements

Various

  • There is a new parameter class, RandomParameter
  • Some new distributions were added, also to the data set generator.

Tutorials

  • The website has new tutorials, including one on a k-means variation that produces equal sized clusters.

Divvy 1.1.1

by jlewis - November 14, 2012, 20:21:29 CET - 505 views, 157 downloads, 1 subscription

About: Divvy is a Mac OS X application for performing dimensionality reduction, clustering, and visualization.

Changes:

Initial Announcement on mloss.org.


Pattern 2.4

by tomdesmedt - August 31, 2012, 02:26:01 CET - 4449 views, 864 downloads, 1 subscription

About: "Pattern" is a web mining module for Python. It bundles tools for data retrieval, text analysis, clustering and classification, and data visualization.

Changes:
  • Small overall bug fixes and performance improvements.
  • Module pattern.web: updated to the new Bing API (the Bing API is a paid service now).
  • Module pattern.en: now includes Norvig's spell checking algorithm.
  • Module pattern.de: new German tagger/chunker, courtesy of Schneider & Volk (1998) who kindly agreed to release their work in Pattern under BSD.
  • Module pattern.search: the search syntax now includes { } syntax to define match groups.
  • Module pattern.vector: fast implementation of information gain for feature selection.
  • Module pattern.graph: now includes a toy semantic network of commonsense (see examples).
  • Module canvas.js: image pixel effects & editor now supports live editing
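
Information gain, mentioned above for feature selection in pattern.vector, measures how much knowing a feature's value reduces the entropy of the class labels. The sketch below uses the textbook definition for a discrete feature; it is not Pattern's implementation, and the function names `entropy` and `information_gain` are chosen here for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy given the feature."""
    total = len(labels)
    # Group the labels by the value the feature takes for each instance.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder
```

Features can then be ranked by this score and the lowest-scoring ones dropped before training a classifier.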

MLWizard 5.2

by remat - July 26, 2012, 15:04:14 CET - 1636 views, 354 downloads, 1 subscription

About: MLwizard recommends and optimizes classification algorithms based on meta-learning. It is a software wizard fully integrated into RapidMiner, but it can also be used as a library.

Changes:

Faster parameter optimization using genetic algorithm with predefined start population.


NaN toolbox 2.5.2

by schloegl - February 10, 2012, 11:45:52 CET - 21511 views, 4100 downloads, 1 subscription

About: NaN-toolbox is a statistics and machine learning toolbox for handling data with and without missing values.

Changes:

Changes in v.2.5.2:

  • faster version of quantile if multiple quantiles are requested
  • removes the dependency on ZLIB and thus fixes "pkg install nan" for Octave on Windows
  • a number of minor improvements

For details see the CHANGELOG at http://pub.ist.ac.at/~schloegl/matlab/NaN/CHANGELOG
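
One plausible reason requesting several quantiles at once can be faster is that the data only needs to be filtered and sorted a single time. The sketch below illustrates this while skipping missing values (NaN); it is written in Python, not Octave/Matlab, and is not the toolbox's code. The function name `quantiles_ignoring_nan` and the linear-interpolation convention are assumptions, and the toolbox may use a different quantile definition.

```python
import math


def quantiles_ignoring_nan(values, probabilities):
    """Quantiles of the non-NaN values, using linear interpolation."""
    # Drop missing values and sort once, however many quantiles are requested.
    data = sorted(v for v in values if not math.isnan(v))
    if not data:
        return [math.nan for _ in probabilities]
    result = []
    for p in probabilities:
        # Fractional position of the quantile in the sorted data.
        position = p * (len(data) - 1)
        lower = math.floor(position)
        upper = min(lower + 1, len(data) - 1)
        fraction = position - lower
        result.append(data[lower] + fraction * (data[upper] - data[lower]))
    return result
```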


JMLR SSA Toolbox 1.3

by paulbuenau - January 24, 2012, 15:51:02 CET - 8147 views, 2334 downloads, 1 subscription

About: The SSA Toolbox is an efficient, platform-independent, standalone implementation of the Stationary Subspace Analysis algorithm with a friendly graphical user interface and a bridge to Matlab. Stationary Subspace Analysis (SSA) is a general purpose algorithm for the explorative analysis of non-stationary data, i.e. data whose statistical properties change over time. SSA helps to detect, investigate and visualize temporal changes in complex high-dimensional data sets.

Changes:
  • Various bugfixes.
