Projects supporting the csv data format.
Showing Items 1-20 of 52 on page 1 of 3: 1 2 3 Next

MLweb 0.1.2

by lauerfab - October 9, 2015, 11:55:52 CET [ Project Homepage BibTeX Download ] 881 views, 227 downloads, 2 subscriptions

About: MLweb is an open source project that aims at bringing machine learning capabilities into web pages and web applications, while maintaining all computations on the client side. It includes (i) a javascript library to enable scientific computing within web pages, (ii) a javascript library implementing machine learning algorithms for classification, regression, clustering and dimensionality reduction, (iii) a web application providing a matlab-like development environment.

  • Add Regression:AutoReg method
  • Add KernelRidgeRegression tuning function
  • More efficient predictions for KRR, SVM, SVR
  • Add BFGS optimization method
  • Faster QR, SVD and eigendecomposition
  • Better support for sparse vectors and matrices
  • Add linear algebra benchmark at
  • Fix plots in LALOlib/ML.js
  • Fix cross-origin issues in new MLlab()
  • Small bug fixes

Armadillo library 6.100

by cu24gjf - October 3, 2015, 07:12:38 CET [ Project Homepage BibTeX Download ] 65067 views, 13229 downloads, 5 subscriptions

Rating: 4 out of 5 stars (based on 3 votes)

About: Armadillo is a template C++ linear algebra library aiming towards a good balance between speed and ease of use, with a function syntax similar to MATLAB. Matrix decompositions are provided through optional integration with LAPACK, or one of its high performance drop-in replacements (e.g. Intel MKL, OpenBLAS).

  • faster norm() and normalise() when using Intel MKL, ATLAS or OpenBLAS
  • faster handling of compound expressions by join_rows() and join_cols()
  • added Schur decomposition: schur()
  • added .each_slice() for repeated matrix operations on each slice of a cube
  • expanded join_slices() to handle joining cubes with matrices
  • expanded .each_col() and .each_row() to handle out-of-place operations
  • stricter handling of matrix objects by hist() and histc()
  • Cube class now delays allocation of .slice() related structures until needed

python weka wrapper 0.3.3

by fracpete - September 26, 2015, 06:11:42 CET [ Project Homepage BibTeX Download ] 17487 views, 3764 downloads, 3 subscriptions

About: A thin Python wrapper that uses the javabridge Python library to communicate with a Java Virtual Machine executing Weka API calls.

  • updated to Weka 3.7.13
  • documentation now covers the API as well

BayesPy 0.3.7

by jluttine - September 23, 2015, 14:29:20 CET [ Project Homepage BibTeX Download ] 8204 views, 1999 downloads, 3 subscriptions

About: Variational Bayesian inference tools for Python

  • Enable keyword arguments when plotting via the inference engine
  • Add initial support for logging

ELKI 0.7.0-20150828

by erich - September 17, 2015, 10:20:30 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 14806 views, 2729 downloads, 4 subscriptions

About: ELKI is a framework for implementing data-mining algorithms with support for index structures, that includes a wide variety of clustering and outlier detection methods.


Additions and Improvements from ELKI 0.6.0:

  • Uncertain data types, and clustering algorithms for uncertain data.

  • Major refactoring of distances - removal of Distance values and removed support for non-double-valued distance functions. While this reduces the generality of ELKI, we could remove about 2.5% of the codebase by not having to have optimized codepaths for double-distance anymore. Generics for distances were present in almost any distance-based algorithm, and we were also happy to reduce the use of generics this way. Support for non-double-valued distances can trivially be added again, e.g. by adding the specialization one level higher: at the query instead of the distance level, for example.

  • In this process, we also removed the Generics from NumberVector. The object-based get was deprecated for a good reason long ago, and e.g. doubleValue is more efficient (even for non-DoubleVectors).

  • Dropped some long-deprecated classes

Clustering algorithms:


  • speedups for some initialization heuristics
  • K-means++ initialization no longer squares distances (again)
  • farthest-point heuristics now uses minimum instead of sum (renamed)
  • additional evaluation criteria
  • Elkan's and Hamerly's faster k-means variants
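
The farthest-point item in the list above can be illustrated with a minimal Python sketch. It is not ELKI's Java implementation; the function name, the choice of the first center and the sample points are hypothetical. Each new center is the point whose minimum distance to the already chosen centers is largest, which is one common reading of "minimum instead of sum".

```python
import math


def farthest_point_init(points, k, first=0):
    """Return indices of k initial centers (hypothetical helper).

    Each new center is the point that maximizes its minimum distance
    to the centers chosen so far. Assumes k <= len(points).
    """
    centers = [first]
    # min_dist[i] holds the distance from point i to its nearest chosen center.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        centers.append(nxt)
        # Only the newly added center can shrink the minimum distances.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return centers


# Made-up sample data, for illustration only.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0), (5.0, 8.0)]
chosen = farthest_point_init(points, 3)
print(chosen)
```

Using the minimum rather than the sum keeps a candidate that sits right next to an existing center from being selected just because it is far from all the other centers.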

CLARA clustering


Hierarchical clustering

  • Renamed naive algorithm to AGNES
  • Anderberg's algorithm (faster than AGNES, slower than SLINK)
  • CLINK for complete linkage clustering in O(n²) time, O(n) memory
  • Simple extraction from HDBSCAN
  • "Optimal" extraction from HDBSCAN
  • HDBSCAN, in two variants

LSDBC clustering

EM clustering was refactored and moved into its own package. The new version is much more extensible.

Parallel computation framework, and some parallelized algorithms

  • Parallel k-means
  • Parallel LOF and variants


  • LibSVM format parser


  • kNN classification (with index acceleration)

Evaluation: Internal cluster evaluation:

  • Silhouette index
  • Simplified Silhouette index (faster)
  • Davies-Bouldin index
  • PBM index
  • Variance-Ratio-Criteria
  • Sum of squared errors
  • C-Index
  • Concordant pair indexes (Gamma, Tau)
  • Different noise handling strategies for internal indexes

Statistical dependence measures:

  • Distance correlation dCor.
  • Hoeffding's D.
  • Some divergence / mutual information measures.

Distance functions:

  • Big refactoring.
  • Time series distances refactored, allow variable length series now.
  • Hellinger distance and kernel function.


  • Faster MDS implementation using power iterations.
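
As background for the power-iteration item above, here is a minimal Python sketch of plain power iteration for the dominant eigenpair of a symmetric matrix. It is a generic illustration and not ELKI's code; the function name, the iteration count and the sample matrix are assumptions. In classical MDS, such an iteration would be applied to the double-centered matrix of squared distances.

```python
import math


def power_iteration(matrix, iterations=200):
    """Approximate the dominant eigenvalue and eigenvector (hypothetical helper).

    Assumes a square symmetric matrix, given as a list of rows, whose
    largest-magnitude eigenvalue is unique.
    """
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # Multiply by the matrix, then renormalize to unit length.
        prod = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in prod))
        vec = [x / norm for x in prod]
    # The Rayleigh quotient of the unit vector estimates the eigenvalue.
    mv = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
    eigenvalue = sum(v * m for v, m in zip(vec, mv))
    return eigenvalue, vec


# Made-up sample matrix, for illustration only.
matrix = [[4.0, 1.0], [1.0, 3.0]]
eigenvalue, eigenvector = power_iteration(matrix)
print(eigenvalue, eigenvector)
```

Power iteration only needs matrix-vector products, so it avoids a full eigendecomposition when only a few leading components are required.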

Indexing improvements:

  • Precomputed distance matrix "index".
  • iDistance index (static only).
  • Inverted-list index for sparse data and cosine/arccosine distance.
  • cover tree index (static only).

Frequent Itemset Mining:

  • Improved APRIORI implementation.
  • FP-Growth added.
  • Eclat (basic version only) added.

Uncertain clustering:

  • Discrete and continuous data models
  • FDBSCAN clustering
  • UKMeans clustering
  • CKMeans clustering
  • Representative Uncertain Clustering (Meta-algorithm)
  • Center-of-mass meta Clustering (allows using other clustering algorithms on uncertain objects) (KDD'14)

Outlier detection changes / smaller improvements:

  • KDEOS outlier detection (SDM14)
  • k-means based outlier detection (distance to centroid) and a Silhouette coefficient based approach (which does not work too well on the toy data sets: the lowest silhouette values are usually where two clusters touch).
  • bug fix in kNN weight, when distances are tied and kNN yields more than k results.
  • kNN and kNN weight outlier have their k parameter changed: old 2NN outlier is now 1NN outlier, as commonly understood in classification literature (1 nearest neighbor other than the query object; whereas in database literature the 1NN is usually the query object itself). You can easily get the old result back by decreasing k by one.
  • LOCI implementation is now only O(n^3 log n) instead of O(n^4).
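
The changed meaning of k in the kNN outlier item above can be shown with a minimal Python sketch. It is a brute-force illustration, not ELKI's implementation; the function name and sample data are hypothetical. The query object is excluded from its own neighbor list, so k=1 refers to the nearest other point.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor.

    Hypothetical helper: the point itself is not counted as a neighbor,
    matching the convention used in classification literature.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Made-up one-dimensional sample data, for illustration only.
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
scores = knn_outlier_scores(points, 1)
print(scores)
```

Under the database-style convention, where the query object counts as its own first neighbor, the same scores would be obtained with k=2.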


  • MiniGUI has two "secret" new options: -minigui.last -minigui.autorun to load the last saved configuration and run it, for convenience.

  • Logging API has been extended, to make logging more convenient in a number of places (saving some lines for progress logging and timing).

YCML 0.2.2

by yconst - August 24, 2015, 20:28:45 CET [ Project Homepage BibTeX Download ] 640 views, 118 downloads, 3 subscriptions

About: A Machine Learning framework for Objective-C and Swift (OS X / iOS)


Initial Announcement on

Java Data Mining Package 0.3.0

by arndt - August 19, 2015, 15:44:46 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 863 views, 155 downloads, 3 subscriptions

About: A Java library for machine learning and data analytics


Initial Announcement on

JMLR dlib ml 18.17

by davis685 - August 16, 2015, 04:33:39 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 114384 views, 19176 downloads, 4 subscriptions

About: This project is a C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems.


This release adds new clustering tools as well as upgrades the shape_predictor to allow training on datasets with missing landmarks. It also includes bug fixes and minor usability improvements.

RiVal 0.1

by alansaid - July 29, 2015, 12:39:54 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 628 views, 153 downloads, 2 subscriptions

About: Rival is an open source Java toolkit for recommender system evaluation. It provides a simple way to create evaluation results comparable across different recommendation frameworks.


Initial Announcement on

NaN toolbox 2.8.1

by schloegl - July 6, 2015, 22:43:23 CET [ Project Homepage BibTeX Download ] 38052 views, 7930 downloads, 3 subscriptions

About: NaN-toolbox is a statistics and machine learning toolbox for handling data with and without missing values.


Changes in v.2.8.1:

  • number of bug fixes
  • compatibility issues with recent versions of Octave are addressed
  • upgrade to libsvm 3-12

For details see the CHANGELOG at

ADAMS 0.4.10

by fracpete - June 22, 2015, 23:14:58 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 14160 views, 2890 downloads, 3 subscriptions

About: The Advanced Data mining And Machine learning System (ADAMS) is a novel, flexible workflow engine aimed at quickly building and maintaining real-world, complex knowledge workflows.

  • fixes a glitch in the debugging functionality, when using the Breakpoint control actor

About: R package implementing statistical tests and post hoc tests to compare multiple algorithms in multiple problems.


Initial Announcement on

deepdetect 0.1

by beniz - June 2, 2015, 09:25:28 CET [ Project Homepage BibTeX Download ] 846 views, 242 downloads, 3 subscriptions

About: A Deep Learning API and server


Initial Announcement on

Probabilistic Classification Vector Machine 0.21

by fmschleif - May 26, 2015, 16:24:17 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 2137 views, 509 downloads, 3 subscriptions

About: The PCVM library is a C++/Armadillo implementation of the Probabilistic Classification Vector Machine.


27.05.2015: Matlab binding under Windows available. Added a solution file for VS 2013 Express to compile a Matlab MEX binding. It is not yet confirmed that the code really uses multiple cores under Windows (under Linux it does).

Cognitive Foundry 3.4.1

by Baz - May 13, 2015, 06:55:24 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 21992 views, 3650 downloads, 3 subscriptions

About: The Cognitive Foundry is a modular Java software library of machine learning components and algorithms designed for research and applications.

  • General:
    • Updated MTJ to version 1.0.2 and netlib-java to 1.1.2.
    • Updated XStream to version 1.4.8.
  • Common:
    • Fixed issue in VectorUnionIterator.
  • Learning:
    • Added Alternating Least Squares (ALS) Factorization Machine training implementation.
    • Fixed performance issue in Factorization Machine where linear component was not making use of sparsity.
    • Added utility function to sigmoid unit.

streamDM 0.0.1

by abifet - April 28, 2015, 12:34:00 CET [ Project Homepage BibTeX Download ] 917 views, 375 downloads, 1 subscription

About: streamDM is a new open source data mining and machine learning library, designed on top of Spark Streaming, an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of data streams.


Initial Announcement on

Choquistic Utilitaristic Regression 1.00

by AliFall - April 17, 2015, 11:31:20 CET [ BibTeX BibTeX for corresponding Paper Download ] 836 views, 349 downloads, 2 subscriptions

About: This Matlab package implements a method for learning a choquistic regression model (represented by a corresponding Moebius transform of the underlying fuzzy measure), using the maximum likelihood approach proposed in [2], equipped with sigmoid normalization, see [1].


Initial Announcement on

Loom 0.2.10

by fritzo - March 19, 2015, 19:22:03 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 1162 views, 267 downloads, 2 subscriptions

About: A streaming inference and query engine for the Cross-Categorization model of tabular data.


Initial Announcement on

Hivemall 0.3

by myui - March 13, 2015, 17:08:22 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 6905 views, 1156 downloads, 3 subscriptions

About: Hivemall is a scalable machine learning library running on Hive/Hadoop.

  • Supported Matrix Factorization
  • Added support for TF-IDF computation
  • Supported AdaGrad/AdaDelta
  • Supported AdaGradRDA classification
  • Added normalization scheme

Machine Learning Support System MALSS 0.5.0

by canard0328 - February 20, 2015, 15:56:02 CET [ Project Homepage BibTeX Download ] 1019 views, 271 downloads, 1 subscription

About: MALSS is a python module to facilitate machine learning tasks.


Initial Announcement on
