All entries.
Showing Items 1-10 of 605 on page 1 of 61

ELKI 0.7.0

by erich - November 27, 2015, 18:23:16 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 15723 views, 2864 downloads, 4 subscriptions

About: ELKI is a framework for implementing data-mining algorithms with support for index structures, that includes a wide variety of clustering and outlier detection methods.


Additions and Improvements from ELKI 0.6.0:

ELKI is now available on Maven (group de.lmu.ifi.dbs.elki, artifact elki, version 0.7.0, packaging jar).

Please clone for a minimal project example.

Uncertain data types, and clustering algorithms for uncertain data.

Major refactoring of distances: Distance values were removed, along with support for non-double-valued distance functions (in particular, DoubleDistance was removed). While this reduces the generality of ELKI, it let us remove about 2.5% of the codebase, because optimized code paths for double distances are no longer needed. Generics for distances were present in almost every distance-based algorithm, and we were happy to reduce the use of generics this way. Support for non-double-valued distances can trivially be added again, for example by adding the specialization one level higher, at the query level instead of the distance level. In this process, we also removed the generics from NumberVector. The object-based get was deprecated long ago for good reason, and methods such as doubleValue are more efficient (even for non-DoubleVectors).

Dropped some long-deprecated classes.

K-means:


  • speedups for some initialization heuristics.

  • K-means++ initialization no longer squares distances (again).

  • farthest-point heuristic now uses the minimum instead of the sum (renamed).

  • additional evaluation criteria.

  • Elkan's and Hamerly's faster k-means variants.
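
As a rough illustration of the farthest-point change listed above, the following sketch compares the two aggregation rules. It is not ELKI's code: the function name farthest_point_init, its parameters, and the sample points are hypothetical, and Euclidean distance is assumed. With the minimum, each new center is the point farthest from its closest already-chosen center; with the sum, a point close to one existing center can still win.

```python
import math


def farthest_point_init(points, k, first=0, aggregate=min):
    """Pick k initial center indexes (hypothetical helper, not ELKI's API).

    `aggregate` combines a candidate's distances to the centers chosen
    so far: `min` for the minimum rule, `sum` for the sum rule.
    """
    chosen = [first]
    while len(chosen) < k:
        best, best_score = None, -1.0
        for i, p in enumerate(points):
            if i in chosen:
                continue
            score = aggregate(math.dist(p, points[c]) for c in chosen)
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return chosen


# Made-up sample data: two far-apart points, one point next to the
# second of them, and one point midway between the two.
points = [(0.0, 0.0), (10.0, 0.0), (9.0, 3.0), (5.0, 0.0)]
by_min = farthest_point_init(points, 3, aggregate=min)
by_sum = farthest_point_init(points, 3, aggregate=sum)
```

On these points the minimum rule picks index 3 as the third center, the point midway between the first two centers, while the sum rule picks index 2, which sits close to an existing center.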

CLARA clustering.


Hierarchical clustering:

  • Renamed naive algorithm to AGNES.

  • Anderberg's algorithm (faster than AGNES, slower than SLINK).

  • CLINK for complete linkage clustering in O(n²) time, O(n) memory.

  • Simple extraction from HDBSCAN.

  • "Optimal" extraction from HDBSCAN.

  • HDBSCAN, in two variants.

LSDBC clustering.

EM clustering was refactored and moved into its own package. The new version is much more extensible.

OPTICS clustering:

  • Added a list-based variant of OPTICS alongside our heap-based one.

  • FastOPTICS (contributed by Johannes Schneider).

  • Improved OPTICS Xi cluster extraction.

Outlier detection:

  • KDEOS outlier detection (SDM14).

  • k-means based outlier detection (distance to centroid) and a Silhouette coefficient based approach (which does not work too well on the toy data sets: the lowest silhouette values are usually found where two clusters touch).

  • bug fix in kNN weight, when distances are tied and kNN yields more than k results.

  • kNN and kNN weight outlier have had their k parameter changed: the old 2NN outlier is now the 1NN outlier, as commonly understood in classification literature (1 nearest neighbor other than the query object; whereas in database literature the 1NN is usually the query object itself). You can easily get the old result back by decreasing k by one.

  • LOCI implementation is now only O(n^3 log n) instead of O(n^4).

  • Local Isolation Coefficient (LIC).

  • IDOS outlier detection with intrinsic dimensionality.

  • Baseline intrinsic dimensionality outlier detection.

  • Variance-of-Volumes outlier detection (VOV).
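
The change to the k parameter of the kNN outlier methods listed above can be sketched as follows. This is an illustration only, not ELKI's implementation: knn_distance and its count_query flag are hypothetical names, and Euclidean distance is assumed. Counting the query object as its own nearest neighbor (the database convention) shifts every rank by one relative to the classification convention.

```python
import math


def knn_distance(points, i, k, count_query=False):
    """Distance from points[i] to its k-th nearest neighbor.

    Hypothetical helper. With count_query=True the query object counts
    as its own first neighbor (distance 0); with count_query=False it
    is excluded, so k=1 means the nearest *other* object.
    """
    dists = sorted(
        math.dist(points[i], q)
        for j, q in enumerate(points)
        if count_query or j != i
    )
    return dists[k - 1]


# Made-up sample data: the last point is isolated.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]

one_nn_excluding_query = knn_distance(points, 2, k=1)
two_nn_counting_query = knn_distance(points, 2, k=2, count_query=True)
```

Both calls return the distance from the isolated point to its nearest other point, which is why reducing k by one recovers the score computed under the other convention.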

Parallel computation framework, and some parallelized algorithms:

  • Parallel k-means.

  • Parallel LOF and variants.

LibSVM format parser.

kNN classification (with index acceleration).

Internal cluster evaluation:

  • Silhouette index.

  • Simplified Silhouette index (faster).

  • Davies-Bouldin index.

  • PBM index.

  • Variance-Ratio-Criteria.

  • Sum of squared errors.

  • C-Index.

  • Concordant pair indexes (Gamma, Tau).

  • Different noise handling strategies for internal indexes.

Statistical dependence measures:

  • Distance correlation dCor.

  • Hoeffding's D.

  • Some divergence / mutual information measures.

Distance functions:

  • Big refactoring.

  • Time series distances refactored; they now allow variable-length series.

  • Hellinger distance and kernel function.


  • Faster MDS implementation using power iterations.
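
For readers unfamiliar with the technique named above: power iteration approximates the dominant eigenvector of a matrix by repeated multiplication and normalization, and classical MDS needs the leading eigenvectors of a centered matrix. The sketch below shows only the generic technique on a small symmetric matrix; the function name and the example matrix are assumptions, and the release notes do not say how ELKI applies it.

```python
import math


def power_iteration(matrix, iterations=100):
    """Approximate the dominant eigenpair of a square matrix.

    Hypothetical helper for illustration; `matrix` is a list of rows.
    """
    n = len(matrix)

    def multiply(vec):
        return [sum(matrix[r][c] * vec[c] for c in range(n)) for r in range(n)]

    vec = [1.0] * n
    for _ in range(iterations):
        nxt = multiply(vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]

    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    eigenvalue = sum(v * m for v, m in zip(vec, multiply(vec)))
    return eigenvalue, vec


# Made-up symmetric example; its eigenvalues are (7 ± sqrt(5)) / 2.
eigenvalue, eigenvector = power_iteration([[4.0, 1.0], [1.0, 3.0]])
```

Further eigenvectors are typically obtained by removing the component already found (deflation) and iterating again.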

Indexing improvements:

  • Precomputed distance matrix "index".

  • iDistance index (static only).

  • Inverted-list index for sparse data and cosine/arccosine distance.

  • Cover tree index (static only).

  • Additional LSH hash functions.

Frequent Itemset Mining:

  • Improved APRIORI implementation.

  • FP-Growth added.

  • Eclat (basic version only) added.

Uncertain clustering:

  • Discrete and continuous data models.

  • FDBSCAN clustering.

  • UKMeans clustering.

  • CKMeans clustering.

  • Representative Uncertain Clustering (Meta-algorithm).

  • Center-of-mass meta Clustering (allows using other clustering algorithms on uncertain objects).


  • Several estimators for intrinsic dimensionality.

MiniGUI has two "secret" new options: -minigui.last -minigui.autorun to load the last saved configuration and run it, for convenience.

Logging API has been extended, to make logging more convenient in a number of places (saving some lines for progress logging and timing).

KeLP 2.0.0

by kelpadmin - November 26, 2015, 16:14:53 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 3993 views, 993 downloads, 3 subscriptions

About: Kernel-based Learning Platform (KeLP) is a Java framework that supports the implementation of kernel-based learning algorithms, as well as an agile definition of kernel functions over generic data representations, e.g. vectorial data or discrete structures. The framework has been designed to decouple kernel functions and learning algorithms through the definition of specific interfaces. Once a new kernel function has been implemented, it can be automatically adopted in all the available kernel-machine algorithms. KeLP includes different Online and Batch Learning algorithms for Classification, Regression and Clustering, as well as several kernel functions, ranging from vector-based to structural kernels. It allows building complex kernel-machine-based systems, leveraging JSON/XML interfaces to instantiate classifiers without writing a single line of code.


This is a major release that includes brand new features as well as a renewed architecture of the entire project.

KeLP is now organized into four Maven projects:

  • kelp-core: it contains the infrastructure of abstract classes and interfaces to work with KeLP. Furthermore, some implementations of algorithms, kernels and representations are included, to provide a base operative environment.

  • kelp-additional-kernels: it contains several kernel functions that extend the set of kernels made available in the kelp-core project. Moreover, this project implements the specific representations required to enable the application of such kernels. The following kernel functions are provided in this project: Sequence kernels, Tree kernels and Graph kernels.

  • kelp-additional-algorithms: it contains several learning algorithms extending the set of algorithms provided in the kelp-core project, e.g. the C-Support Vector Machine or ν-Support Vector Machine learning algorithms. In particular, advanced learning algorithms for classification and regression can be found in this package. The algorithms are grouped in: 1) Batch Learning, where the complete training dataset is supposed to be entirely available during the learning phase; 2) Online Learning, where individual examples are exploited one at a time to incrementally acquire the model.

  • kelp-full: this is the complete package of KeLP. It aggregates the previous modules in one jar. It also contains a set of fully functioning examples showing how to implement a learning system with KeLP. The usage of both Batch Learning and Online Learning algorithms is shown here. Different examples cover the usage of standard kernels, Tree Kernels and Sequence Kernels, with caching mechanisms.

Furthermore this new release includes:

  • CsvDatasetReader: it reads files in CSV format

  • DCDLearningAlgorithm: an implementation of the Dual Coordinate Descent learning algorithm

  • methods for checking the consistency of a dataset.

Check out this new version from our repositories. API Javadoc is already available. Your suggestions are very valuable to us, so download and try KeLP 2.0.0!

PROFET 1.0.0

by Hamda - November 26, 2015, 13:20:28 CET [ Project Homepage BibTeX Download ] 136 views, 25 downloads, 1 subscription

About: Software for Automatic Construction and Inference of DBNs Based on Mathematical Models


Initial Announcement on

A Library for Online Streaming Feature Selection 1.0

by ykui713 - November 25, 2015, 13:23:01 CET [ BibTeX Download ] 152 views, 41 downloads, 0 subscriptions

About: LOFS is a software toolbox for online streaming feature selection


Initial Announcement on

PyScriptClassifier 0.3.0

by cjb60 - November 25, 2015, 04:07:51 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 892 views, 236 downloads, 2 subscriptions

About: Easily prototype WEKA classifiers and filters using Python scripts.



  • Filters have now been implemented.
  • Classifier and filter classes satisfy base unit tests.


  • Can now choose to save the script in the model using the -save flag.


  • Added Python 3 support.
  • Added uses decorator to prevent non-essential arguments from being passed.
  • Fixed nasty bug where imputation, binarisation, and standardisation would not actually be applied to test instances.
  • GUI in WEKA now displays the exception as well.
  • Fixed bug where single quotes in attribute values could mess up args creation.
  • ArffToPickle now recognises class index option and arguments.
  • Fix nasty bug where filters were not being saved and were made from scratch from test data.


  • ArffToArgs gets temporary folder in a platform-independent way, instead of assuming /tmp/.
  • Can now save args in ArffToPickle using save.


  • Initial release.

bandicoot 0.4

by yvesalexandre - November 20, 2015, 17:08:31 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 293 views, 54 downloads, 2 subscriptions

About: An open-source Python toolbox to analyze mobile phone metadata.


Initial Announcement on

ADAMS 0.4.11

by fracpete - November 18, 2015, 10:58:55 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 15417 views, 3086 downloads, 3 subscriptions

About: The Advanced Data mining And Machine learning System (ADAMS) is a novel, flexible workflow engine aimed at quickly building and maintaining real-world, complex knowledge workflows.


Some highlights of this release:

  • switch to Java 8
  • preferred IDE is now IntelliJ IDEA
  • removed OSX builds
  • 43 new actors
  • 13 new conversions
  • removed obsolete actors and conversions
  • added video support (video files and webcams)
  • added object detection and tracking (incl recording of object trails)
  • proof-of-concept remote-execution of jobs
  • SSH console
  • support for webscraping using JSoup
  • MEKA upgraded to 1.9.0
  • MOA regressor support added
  • better syntax highlighting for Groovy/Jython
  • several new Weka classifiers (eg Veto, LeanMultiScheme, ThresholdedBinaryClassification, InputSmearing)
  • new genetic algorithm: Hermione
  • extended the abstaining classifier framework (integrates with Weka)
  • adams-imaging split into: adams-imaging, adams-boofcv, adams-imagemagick, adams-imagej, adams-openimaj (newly added)

Deep Semantic Ranking Based Hashing 1.0

by openpr_nlpr - November 18, 2015, 07:25:16 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 319 views, 79 downloads, 3 subscriptions

About: This algorithm is described in Deep Semantic Ranking Based Hashing for Multi-Label Image Retrieval. See


Initial Announcement on

Hype 0.1.0

by gbaydin - November 16, 2015, 18:35:57 CET [ Project Homepage BibTeX Download ] 276 views, 43 downloads, 3 subscriptions

About: Hype is a proof-of-concept deep learning library, where you can perform optimization on compositional machine learning systems of many components, even when such components themselves internally perform optimization.


Initial Announcement on

Armadillo library 6.200

by cu24gjf - November 15, 2015, 06:54:50 CET [ Project Homepage BibTeX Download ] 68718 views, 14004 downloads, 5 subscriptions

Rating: 4 out of 5 stars (based on 3 votes)

About: Armadillo is a template C++ linear algebra library aiming towards a good balance between speed and ease of use, with a function syntax similar to MATLAB. Matrix decompositions are provided through optional integration with LAPACK, or one of its high-performance drop-in replacements (e.g. Intel MKL, OpenBLAS).

  • expanded diagmat() to handle non-square matrices and arbitrary diagonals
  • expanded trace() to handle non-square matrices
  • correction for datum::Z_0 constant
  • bug fixes for sparse eigen decomposition
