All entries.
Showing items 141-150 of 540.

Ordinal Choquistic Regression 1.00

by AliFall - January 30, 2014, 15:42:34 CET [ BibTeX BibTeX for corresponding Paper Download ] 990 views, 233 downloads, 1 subscription

About: "Ordinal Choquistic Regression" model using the maximum likelihood

Changes:

Initial Announcement on mloss.org.


A Parallel LDA Learning Toolbox 1.0

by yanjianfeng - January 24, 2014, 11:48:07 CET [ BibTeX Download ] 925 views, 309 downloads, 1 subscription

About: PLL is a parallel LDA learning toolbox for large-scale topic modeling.

Changes:

Fix some compiling errors.


DRVQ 1.0.1-beta

by iavr - January 18, 2014, 17:26:34 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 1036 views, 257 downloads, 1 subscription

About: DRVQ is a C++ library implementation of dimensionality-recursive vector quantization, a fast vector quantization method in high-dimensional Euclidean spaces under arbitrary data distributions. It is an approximation of k-means that is practically constant in data size and applies to arbitrarily high dimensions, but it only scales to a few thousand centroids. As a by-product of training, a tree structure performs either exact or approximate quantization on the trained centroids; the latter is less precise but extremely fast.
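
As a rough illustration of the coarse-to-fine idea behind tree-based approximate quantization, the Python sketch below contrasts exhaustive nearest-centroid search with a crude two-level approximation. It is a conceptual sketch only: the names are made up, and it does not reproduce DRVQ's dimensionality-recursive construction or its C++ interface (see the project homepage for those).

    # Conceptual sketch only -- not the DRVQ API (which is C++).
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 64))     # points to quantize
    C = rng.normal(size=(1024, 64))    # stand-in for trained centroids

    def quantize_exact(x):
        # exhaustive nearest-centroid search: cost grows with the number of centroids
        return int(np.argmin(((C - x) ** 2).sum(axis=1)))

    # crude coarse-to-fine approximation: pick the best coarse cell first,
    # then search only the centroids assigned to that cell
    n_cells = 32
    cell_of = rng.integers(0, n_cells, size=len(C))   # toy cell assignment
    cell_centres = np.array([C[cell_of == k].mean(axis=0) for k in range(n_cells)])

    def quantize_approx(x):
        k = int(np.argmin(((cell_centres - x) ** 2).sum(axis=1)))
        members = np.where(cell_of == k)[0]
        return int(members[np.argmin(((C[members] - x) ** 2).sum(axis=1))])

    print(quantize_exact(X[0]), quantize_approx(X[0]))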

Changes:

Initial Announcement on mloss.org.


ELKI 0.6.0

by erich - January 10, 2014, 18:32:28 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 10256 views, 1866 downloads, 3 subscriptions

About: ELKI is a framework for implementing data-mining algorithms with support for index structures; it includes a wide variety of clustering and outlier detection methods.

Changes:

Additions and improvements since ELKI 0.5.5:

Algorithms

Clustering:

  • Hierarchical Clustering - the slower naive variants were added, and the code was refactored
  • Partition extraction from hierarchical clusterings - different linkage strategies (e.g. Ward)
  • Canopy pre-Clustering
  • Naive Mean-Shift Clustering
  • Affinity propagation clustering (both with distances and similarities / kernel functions)
  • K-means variations: Best-of-multiple-runs, bisecting k-means
  • New k-means initialization: farthest points, sample initialization
  • Cheng and Church Biclustering
  • P3C Subspace Clustering
  • One-dimensional clustering algorithm based on kernel density estimation

Outlier detection

  • COP - correlation outlier probabilities
  • LDF - a kernel density based LOF variant
  • Simplified LOF - a simpler version of LOF (not using reachability distance; see the sketch after this list)
  • Simple Kernel Density LOF - a simple LOF using kernel density (more consistent than LDF)
  • Simple outlier ensemble algorithm
  • PINN - projection indexed nearest neighbors, via projected indexes.
  • ODIN - kNN graph based outlier detection
  • DWOF - Dynamic-Window Outlier Factor (contributed by Omar Yousry)
  • ABOD refactored into ABOD, FastABOD and LBABOD
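
As noted in the Simplified LOF item above, that variant drops LOF's reachability distance in favour of plain k-nearest-neighbour distances. The sketch below shows the idea in Python for illustration only; ELKI's own implementation is in Java and this is not its API.

    # Conceptual sketch of the Simplified LOF idea (not ELKI code).
    import numpy as np

    def simplified_lof(X, k=5):
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        np.fill_diagonal(D, np.inf)
        knn = np.argsort(D, axis=1)[:, :k]            # k nearest neighbours of each point
        kdist = np.take_along_axis(D, knn, axis=1)    # their distances
        density = 1.0 / kdist.mean(axis=1)            # simplified local density
        # score: average neighbour density relative to the point's own density
        return density[knn].mean(axis=1) / density

    X = np.vstack([np.random.default_rng(1).normal(size=(100, 2)), [[8.0, 8.0]]])
    print(round(simplified_lof(X)[-1], 2))            # the isolated point scores well above 1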

Distances

  • Geodetic distances now support different world models (WGS84 etc.) and are substantially faster.
  • Levenshtein distances for processing strings, e.g. for analyzing phonemes (contributed code, see "Word segmentation through cross-lingual word-to-phoneme alignment", SLT2013, Stahlberg et al.)
  • Bray-Curtis, Clark, Kulczynski1 and Lorentzian distances with R-tree indexing support
  • Histogram matching distances
  • Probabilistic divergence distances (Jeffrey, Jensen-Shannon, Chi2, Kullback-Leibler) - see the sketch after this list
  • Kulczynski2 similarity
  • Kernel similarity code has been refactored, and additional kernel functions have been added
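
For illustration, the probabilistic divergence distances referenced above reduce to short formulas on discrete distributions; the Python sketch below shows two of them. This is explanatory only, not ELKI's Java implementation.

    # Kullback-Leibler based divergences for discrete distributions (sketch only).
    import numpy as np

    def kl(p, q):
        nz = p > 0                     # convention: 0 * log 0 = 0
        return float(np.sum(p[nz] * np.log(p[nz] / q[nz])))

    def jeffrey(p, q):                 # symmetrised KL divergence
        return kl(p, q) + kl(q, p)

    def jensen_shannon(p, q):          # KL against the mixture (p + q) / 2
        m = 0.5 * (p + q)
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    p = np.array([0.1, 0.4, 0.5])
    q = np.array([0.3, 0.3, 0.4])
    print(jeffrey(p, q), jensen_shannon(p, q))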

Database Layer and Data Types

  • Projection layer
  • Parser for simple textual data (for use with Levenshtein distance)
  • Various random projection families (including Feature Bagging, Achlioptas, and p-stable)
  • Latitude+Longitude to ECEF
  • Sparse vector improvements and bug fixes
  • New filter: remove NaN values and missing values
  • New filter: add histogram-based jitter
  • New filter: normalize using statistical distributions
  • New filter: robust standardization using Median and MAD (see the sketch below)
  • New filter: Linear discriminant analysis (LDA)
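
To make the robust standardization filter concrete, the sketch below shows the usual median/MAD computation in Python; it illustrates the statistic only and is not the ELKI filter class itself.

    # Robust standardization via median and MAD (sketch only).
    import numpy as np

    def robust_standardize(x):
        med = np.median(x)
        mad = np.median(np.abs(x - med))      # median absolute deviation
        return (x - med) / (1.4826 * mad)     # 1.4826 makes MAD consistent with sigma for normal data

    x = np.array([1.0, 2.0, 2.5, 3.0, 100.0]) # one gross outlier
    print(robust_standardize(x).round(2))     # the outlier stands out; the rest stay near 0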

Index Layer

  • Another speed up in R-trees
  • Refactoring of M- and R-trees: support for different strategies in the M-tree, new M-tree split strategies, and M-tree speed-ups
  • New index structure: in-memory k-d-tree
  • New index structure: in-memory Locality Sensitive Hashing (LSH)
  • New index structure: approximate projected indexes, such as PINN
  • Index support for geodetic data - (Details: Geodetic Distance Queries on R-Trees for Indexing Geographic Data, SSTD13)
  • Sampled k nearest neighbors: reference KDD13 "Subsampling for Efficient and Effective Unsupervised Outlier Detection Ensembles"
  • Cached (precomputed) k-nearest neighbors to share across multiple runs
  • Benchmarking "algorithms" for indexes

Mathematics and Statistics

  • Many new distributions have been added; 28 different distributions are now supported
  • Additional estimation methods (using advanced statistics such as L-Moments); 44 estimators are now available
  • Trimming and Winsorizing
  • Automatic best-fit distribution estimation
  • Preprocessor using these distributions for rescaling data sets
  • API changes related to the new distributions support
  • More kernel density functions
  • RANSAC covariance matrix builder (unfortunately rather slow)

Visualization

  • 3D projected coordinates (Details: Interactive Data Mining with 3D-Parallel-Coordinate-Trees, SIGMOD2013)
  • Convex hulls now also include nested hierarchical clusters

Other

  • Parser speedups
  • Sparse vector bug fixes and improvements
  • Various bug fixes
  • PCA, MDS and LDA filters
  • Text output was slightly improved (but still needs to be redesigned from scratch - please contribute!)
  • Refactoring of hierarchy classes
  • New heap classes and infrastructure enhancements
  • Classes can have aliases, e.g. "l2" for Euclidean distance.
  • Some error messages were made more informative.
  • Benchmarking classes, also for approximate nearest neighbor search.

AIDE 0.2

by khalili - January 3, 2014, 18:01:06 CET [ Project Homepage BibTeX Download ] 968 views, 224 downloads, 1 subscription

About: AIDE (Automata Identification Engine) is a free, open-source tool implementing automata inference algorithms, developed in C#/.NET.

Changes:

Initial Announcement on mloss.org.


About: Kaiye Wang, Ran He, Wei Wang, Liang Wang, Tieniu Tan. Learning Coupled Feature Spaces for Cross-modal Matching. In ICCV, 2013.

Changes:

Initial Announcement on mloss.org.


hapFabia 1.4.2

by hochreit - December 28, 2013, 17:24:29 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 2397 views, 478 downloads, 1 subscription

About: hapFabia is an R package for identification of very short segments of identity by descent (IBD) characterized by rare variants in large sequencing data. It detects segments 100 times shorter than previous methods can.

Changes:

  • citation update

  • plot function improved


Malheur 0.5.4

by konrad - December 25, 2013, 13:20:31 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 12410 views, 2383 downloads, 1 subscription

About: Automatic Analysis of Malware Behavior using Machine Learning

Changes:

Support for new version of libarchive. Minor bug fixes.


minFunc 2012

by markSchmidt - December 18, 2013, 01:07:07 CET [ Project Homepage BibTeX Download ] 1076 views, 238 downloads, 1 subscription

About: minFunc is a Matlab function for unconstrained optimization of differentiable real-valued multivariate functions using line-search methods. It uses an interface very similar to the Matlab Optimization Toolbox function fminunc and can be called as a drop-in replacement for that function. On many problems, minFunc requires fewer function evaluations to converge than fminunc (or minimize.m). Furthermore, it can optimize problems with a much larger number of variables (fminunc is restricted to several thousand variables), and it uses a line search that is robust to several common function pathologies.
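
To illustrate what a line-search method is, independently of minFunc's MATLAB interface, here is a minimal Python sketch of steepest descent with a backtracking Armijo line search. It is a conceptual sketch only, not minFunc's API; minFunc itself offers far more sophisticated search directions such as L-BFGS and conjugate gradient.

    # Minimal line-search optimizer (sketch only, not minFunc).
    import numpy as np

    def minimize(fun, grad, x0, tol=1e-6, max_iter=500):
        x = x0.astype(float)
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:
                break
            d = -g                    # steepest-descent direction
            t = 1.0
            # backtracking line search: shrink the step until the Armijo
            # sufficient-decrease condition holds
            while fun(x + t * d) > fun(x) + 1e-4 * t * (g @ d):
                t *= 0.5
            x = x + t * d
        return x

    # toy usage: minimize f(x) = ||x - 3||^2 starting from the origin
    fun  = lambda x: float(np.sum((x - 3.0) ** 2))
    grad = lambda x: 2.0 * (x - 3.0)
    print(minimize(fun, grad, np.zeros(5)))   # approximately [3 3 3 3 3]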

Changes:

Initial Announcement on mloss.org.

