Projects that are tagged with visualization.


ELKI 0.6.0

by erich - January 10, 2014, 18:32:28 CET - 9801 views, 1792 downloads, 3 subscriptions

About: ELKI is a framework for implementing data-mining algorithms with support for index structures. It includes a wide variety of clustering and outlier detection methods.

Changes:

Additions and Improvements from ELKI 0.5.5:

Algorithms

Clustering:

  • Hierarchical Clustering - the slower naive variants were added, and the code was refactored
  • Partition extraction from hierarchical clusterings - different linkage strategies (e.g. Ward)
  • Canopy pre-Clustering
  • Naive Mean-Shift Clustering
  • Affinity propagation clustering (both with distances and similarities / kernel functions)
  • K-means variations: Best-of-multiple-runs, bisecting k-means
  • New k-means initialization: farthest points, sample initialization
  • Cheng and Church Biclustering
  • P3C Subspace Clustering
  • One-dimensional clustering algorithm based on kernel density estimation
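
As an illustration of the farthest-points k-means initialization listed above, here is a minimal Python sketch. It is not ELKI's implementation (ELKI is written in Java); the function name `farthest_points_init`, the random choice of the first center, and the Euclidean distance are all assumptions made for the example.

```python
import math
import random


def farthest_points_init(points, k, seed=0):
    """Pick k initial centers (hypothetical helper, not ELKI's API).

    The first center is drawn at random; every further center is the point
    whose distance to its nearest already chosen center is largest.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    first = rng.randrange(len(points))
    centers = [points[first]]
    # min_dist[i] holds the distance from points[i] to its nearest center.
    min_dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        idx = max(range(len(points)), key=min_dist.__getitem__)
        centers.append(points[idx])
        # Update nearest-center distances with the newly added center.
        min_dist = [min(d, math.dist(p, points[idx]))
                    for d, p in zip(min_dist, points)]
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 20.0)]
    print(farthest_points_init(data, 3))
```

Because each new center maximizes the distance to the centers already chosen, isolated points tend to be selected early, which is why this strategy is sensitive to outliers.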

Outlier detection

  • COP - correlation outlier probabilities
  • LDF - a kernel density based LOF variant
  • Simplified LOF - a simpler version of LOF (not using reachability distance)
  • Simple Kernel Density LOF - a simple LOF using kernel density (more consistent than LDF)
  • Simple outlier ensemble algorithm
  • PINN - projection indexed nearest neighbors, via projected indexes.
  • ODIN - kNN graph based outlier detection
  • DWOF - Dynamic-Window Outlier Factor (contributed by Omar Yousry)
  • ABOD refactored, into ABOD, FastABOD and LBABOD
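
The Simplified LOF entry above can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not ELKI's code: density is taken as the inverse of the distance to the k-th nearest neighbor (instead of LOF's reachability-based density), the score is the mean neighbor density divided by the point's own density, and the function name `simplified_lof` is hypothetical.

```python
import math


def simplified_lof(points, k):
    """Brute-force Simplified LOF scores (illustrative sketch only)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must be at least 1 and smaller than the number of points")
    neighbors = []
    density = []
    for i, p in enumerate(points):
        # Distances to all other points, ascending; ties are broken by index.
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        knn = dists[:k]
        neighbors.append([j for _, j in knn])
        k_dist = knn[-1][0]
        # Assumption: a zero k-distance (duplicates) means infinite density.
        density.append(1.0 / k_dist if k_dist > 0 else math.inf)
    scores = []
    for i in range(n):
        if math.isinf(density[i]):
            # Assumption: points sitting on duplicates are treated as inliers.
            scores.append(1.0)
            continue
        mean_neighbor_density = sum(density[j] for j in neighbors[i]) / k
        scores.append(mean_neighbor_density / density[i])
    return scores


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(simplified_lof(data, 2))
```

Scores close to 1 indicate a point whose density matches that of its neighbors; values well above 1 indicate a point in a much sparser region than its neighbors.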

Distances

  • Geodetic distances now support different world models (WGS84 etc.) and are substantially faster.
  • Levenshtein distances for processing strings, e.g. for analyzing phonemes (contributed code, see "Word segmentation through cross-lingual word-to-phoneme alignment", SLT2013, Stahlberg et al.)
  • Bray-Curtis, Clark, Kulczynski1 and Lorentzian distances with R-tree indexing support
  • Histogram matching distances
  • Probabilistic divergence distances (Jeffrey, Jensen-Shannon, Chi2, Kullback-Leibler)
  • Kulczynski2 similarity
  • Kernel similarity code has been refactored, and additional kernel functions have been added
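
For readers unfamiliar with the Levenshtein distance mentioned above, the following Python sketch shows the standard dynamic-programming formulation with unit costs. It is an illustration only, not the contributed ELKI code; the function name `levenshtein` is chosen for the example.

```python
def levenshtein(a, b):
    """Edit distance between two sequences with unit costs for
    insertion, deletion and substitution."""
    # Keep the shorter sequence in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Because the function only compares elements for equality, it also works on lists of tokens, for example sequences of phoneme symbols rather than characters.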

Database Layer and Data Types

  • Projection layer
  • Parser for simple textual data (for use with Levenshtein distance)
  • Various random projection families (including Feature Bagging, Achlioptas, and p-stable)
  • Latitude+Longitude to ECEF
  • Sparse vector improvements and bug fixes
  • New filter: remove NaN values and missing values
  • New filter: add histogram-based jitter
  • New filter: normalize using statistical distributions
  • New filter: robust standardization using Median and MAD
  • New filter: Linear discriminant analysis (LDA)
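
To illustrate the idea behind robust standardization using the median and the MAD (median absolute deviation), here is a minimal single-attribute sketch in Python. It is not the ELKI filter: the function name `robust_standardize` is hypothetical, and whether ELKI applies a consistency constant to the MAD is not stated above, so none is applied here.

```python
import statistics


def robust_standardize(values):
    """Center by the median and scale by the MAD (illustrative sketch).

    Assumption: the raw MAD is used, without any consistency constant.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the attribute cannot be rescaled")
    return [(v - med) / mad for v in values]


if __name__ == "__main__":
    print(robust_standardize([1, 2, 3, 4, 100]))
```

Unlike mean and standard deviation, the median and MAD are barely affected by the extreme value 100 in this example, so the remaining values keep a sensible scale.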

Index Layer

  • Another speed up in R-trees
  • Refactoring of M- and R-trees:
      • Support for different strategies in M-tree
      • New strategies for M-tree splits
      • Speedups in M-tree
  • New index structure: in-memory k-d-tree
  • New index structure: in-memory Locality Sensitive Hashing (LSH)
  • New index structure: approximate projected indexes, such as PINN
  • Index support for geodetic data - (Details: Geodetic Distance Queries on R-Trees for Indexing Geographic Data, SSTD13)
  • Sampled k nearest neighbors: reference KDD13 "Subsampling for Efficient and Effective Unsupervised Outlier Detection Ensembles"
  • Cached (precomputed) k-nearest neighbors to share across multiple runs
  • Benchmarking "algorithms" for indexes

Mathematics and Statistics

  • Many new distributions have been added; 28 different distributions are now supported
  • Additional estimation methods (using advanced statistics such as L-Moments); 44 estimators are now available
  • Trimming and Winsorizing
  • Automatic best-fit distribution estimation
  • Preprocessor using these distributions for rescaling data sets
  • API changes related to the new distributions support
  • More kernel density functions
  • RANSAC covariance matrix builder (unfortunately rather slow)
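
The difference between trimming and Winsorizing, both listed above, can be shown with a short Python sketch. The helper names `trim` and `winsorize` and the symmetric `fraction` parameter are assumptions for this example and do not reflect ELKI's API.

```python
def _cut_count(n, fraction):
    """Number of values affected at each end (hypothetical helper)."""
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    return int(n * fraction)


def trim(values, fraction):
    """Trimming: discard the lowest and highest fraction of the values.

    Returns the remaining values in sorted order.
    """
    data = sorted(values)
    cut = _cut_count(len(data), fraction)
    return data[cut:len(data) - cut]


def winsorize(values, fraction):
    """Winsorizing: clamp the extreme values to the nearest retained value.

    The number and order of the values are preserved.
    """
    data = sorted(values)
    cut = _cut_count(len(data), fraction)
    if cut == 0:
        return list(values)
    low, high = data[cut], data[-cut - 1]
    return [min(max(v, low), high) for v in values]


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    print(trim(sample, 0.1))
    print(winsorize(sample, 0.1))
```

Trimming shrinks the sample, whereas Winsorizing keeps its size and replaces the extremes, which matters for estimators that depend on the sample count.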

Visualization

  • 3D projected coordinates (Details: Interactive Data Mining with 3D-Parallel-Coordinate-Trees, SIGMOD2013)
  • Convex hulls now also include nested hierarchical clusters

Other

  • Parser speedups
  • Sparse vector bug fixes and improvements
  • Various bug fixes
  • PCA, MDS and LDA filters
  • Text output was slightly improved (but still needs to be redesigned from scratch - please contribute!)
  • Refactoring of hierarchy classes
  • New heap classes and infrastructure enhancements
  • Classes can have aliases, e.g. "l2" for euclidean distance.
  • Some error messages were made more informative.
  • Benchmarking classes, also for approximate nearest neighbor search.

Differential Dependency Network cabig cytoscape plugin 1.0

by cbil - October 27, 2013, 17:31:58 CET - 1159 views, 259 downloads, 1 subscription

About: DDN learns and visualizes differential dependency networks from condition-specific data.

Changes:

Initial Announcement on mloss.org.


MLDemos 0.5.1

by basilio - March 2, 2013, 16:06:13 CET - 17906 views, 4264 downloads, 2 subscriptions

About: MLDemos is a user-friendly visualization interface for various machine learning algorithms for classification, regression, clustering, projection, dynamical systems, reward maximisation and reinforcement learning.

Changes:

New Visualization and Dataset Features

  • Added 3D visualization of samples and classification, regression and maximization results
  • Added Visualization panel with individual plots, correlations, density, etc.
  • Added Editing tools to drag/magnet data, change class, increase or decrease dimensions of the dataset
  • Added categorical dimensions (indexed dimensions with non-numerical values)
  • Added Dataset Editing panel to swap, delete and rename dimensions, classes or categorical values
  • Several bug-fixes for display, import/export of data, classification performance

New Algorithms and methodologies

  • Added Projections to pre-process data (which can then be classified/regressed/clustered), with LDA, PCA, KernelPCA, ICA, CCA
  • Added Grid-Search panel for batch-testing ranges of values for up to two parameters at a time
  • Added One-vs-All multi-class classification for non-multi-class algorithms
  • Trained models can now be kept and tested on new data (training on one dataset, testing on another)
  • Added a dataset generator panel for standard toy datasets (e.g. swissroll, checkerboard,...)
  • Added a number of clustering, regression and classification algorithms (FLAME, DBSCAN, LOWESS, CCA, KMEANS++, GP Classification, Random Forests)
  • Added Save/Load Model option for GMMs and SVMs
  • Added Growing Hierarchical Self Organizing Maps (original code by Michael Dittenbach)
  • Added Automatic Relevance Determination for SVM with RBF kernel (Thanks to Ashwini Shukla!)


Orange 2.6

by janez - February 14, 2013, 18:15:08 CET - 11477 views, 2257 downloads, 1 subscription

Rating: 4 of 5 stars (based on 1 vote)

About: Orange is a component-based machine learning and data mining software. It includes a friendly yet powerful and flexible graphical user interface for visual programming. For more advanced use(r)s, [...]

Changes:

The core of the system (except the GUI) no longer includes any GPL code and can be licensed under the terms of BSD upon request. The graphical part remains under GPL.

Changed the BibTeX reference to the paper recently published in JMLR MLOSS.


Divvy 1.1.1

by jlewis - November 14, 2012, 20:21:29 CET - 1565 views, 789 downloads, 1 subscription

About: Divvy is a Mac OS X application for performing dimensionality reduction, clustering, and visualization.

Changes:

Initial Announcement on mloss.org.


MLPlot Beta

by pascal - August 22, 2011, 11:07:53 CET - 2186 views, 440 downloads, 1 subscription

About: MLPlot is a lightweight plotting library written in Java.

Changes:

Initial Announcement on mloss.org.


Finding nonlinear and stochastic structures in time series 1

by Dante - October 29, 2008, 11:14:44 CET - 5797 views, 1144 downloads, 2 subscriptions

About: The delay vector variance (DVV) method uses the predictability of the signal in phase space to characterize the time series. Using the surrogate data methodology, so-called DVV plots and DVV scatter [...]

Changes:

Initial Announcement on mloss.org.