Projects supporting the txt data format.


Salad 0.4.3

by chwress - August 11, 2014, 11:16:49 CET - 2870 views, 535 downloads, 1 subscription

About: A Content Anomaly Detector based on n-Grams

Changes:

Fixes a bug in prediction mode that occurred when archives were used and -DUSE_NETWORK=ON was set.


Harry 0.3

by konrad - July 30, 2014, 16:15:26 CET - 1482 views, 315 downloads, 2 subscriptions

About: A Tool for Measuring String Similarity

Changes:

This new release implements 21 similarity measures for strings (Option -M). It supports splitting the computation of large similarity matrices into blocks and thus allows comparing large sets of strings (Option -s as well as -x and -y). The command-line interface has been improved and several minor bugs have been fixed.
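The block-splitting idea can be illustrated with a short sketch. It is not Harry's implementation: the function names, the index-range arguments and the choice of Levenshtein distance as the measure are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def blocks(n, size):
    """Yield half-open index ranges (start, end) that cover 0..n in steps of size."""
    for start in range(0, n, size):
        yield start, min(start + size, n)


def matrix_block(strings, x_range, y_range, measure=levenshtein):
    """Compute only the sub-matrix for the given row (y) and column (x) ranges."""
    x_start, x_end = x_range
    y_start, y_end = y_range
    return [[measure(strings[i], strings[j]) for j in range(x_start, x_end)]
            for i in range(y_start, y_end)]


# Hypothetical usage: process a 4x4 matrix as four 2x2 blocks, one at a time.
words = ["kitten", "sitting", "mitten", "fitting"]
for y_range in blocks(len(words), 2):
    for x_range in blocks(len(words), 2):
        print(y_range, x_range, matrix_block(words, x_range, y_range))
```

Because each block depends only on its own row and column ranges, blocks can be computed independently and never require the full matrix in memory.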


MLPACK 1.0.9 (JMLR)

by rcurtin - July 28, 2014, 20:52:10 CET - 30770 views, 6186 downloads, 6 subscriptions

Rating: 4.5 out of 5 stars (based on 1 vote)

About: A scalable, fast C++ machine learning library, with emphasis on usability.

Changes:
  • GMM initialization is now safer and provides a working GMM when constructed with only the dimensionality and number of Gaussians (#314).
  • Check for division by 0 in Forward-Backward Algorithm in HMMs (#314).
  • Fix MaxVarianceNewCluster (used when re-initializing clusters for k-means) (#314).
  • Fixed implementation of Viterbi algorithm in HMM::Predict() (#316).
  • Significant speedups for dual-tree algorithms using the cover tree (#243, #329) including a faster implementation of FastMKS.
  • Fix for LRSDP optimizer so that it compiles and can be used (#325).
  • CF (collaborative filtering) now expects users and items to be zero-indexed, not one-indexed (#324).
  • CF::GetRecommendations() API change: now requires the number of recommendations as the first parameter. The number of users in the local neighborhood should be specified with CF::NumUsersForSimilarity().
  • Removed incorrect PeriodicHRectBound (#30).
  • Refactor LRSDP into LRSDP class and standalone function to be optimized (#318).
  • Fix for centering in kernel PCA (#355).
  • Added simulated annealing (SA) optimizer, contributed by Zhihao Lou.
  • HMMs now support initial state probabilities; these can be set in the constructor, trained, or set manually with HMM::Initial() (#315).
  • Added Nyström method for kernel matrix approximation by Marcus Edel.
  • Kernel PCA now supports using Nyström method for approximation.
  • Ball trees now work with dual-tree algorithms, via the BallBound<> bound structure (#320); fixed by Yash Vadalia.
  • The NMF class is now AMF<>, and supports far more types of factorizations, by Sumedh Ghaisas.
  • A QUIC-SVD implementation has returned, written by Siddharth Agrawal and based on older code from Mudit Gupta.
  • Added perceptron and decision stump by Udit Saxena (these are weak learners for an eventual AdaBoost class).
  • Sparse autoencoder added by Siddharth Agrawal.
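
For data sets that still use one-indexed user and item IDs, a preprocessing step along the following lines may be needed after the CF indexing change (#324). This is a plain-Python sketch that does not call any MLPACK API; the function name and the (user, item, rating) triple layout are assumptions for illustration.

```python
def to_zero_indexed(ratings):
    """Shift one-indexed (user, item, rating) triples to zero-indexed IDs."""
    shifted = []
    for user, item, rating in ratings:
        if user < 1 or item < 1:
            # An ID below 1 means the input was not one-indexed to begin with.
            raise ValueError(
                f"expected one-indexed IDs, got user={user}, item={item}")
        shifted.append((user - 1, item - 1, rating))
    return shifted


# Hypothetical usage with two ratings.
print(to_zero_indexed([(1, 1, 5.0), (2, 3, 1.5)]))
```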

Sally 0.9.0 (JMLR)

by konrad - July 1, 2014, 22:43:51 CET - 18830 views, 3834 downloads, 2 subscriptions

About: A Tool for Embedding Strings in Vector Spaces

Changes:

Support for hash-based dimension reduction: simhash, minhash and Bloom filter. Support for several n-gram variants: regular, sorted, positional and blended n-grams. Simplified configuration.
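
The n-gram variants and the idea of hash-based dimension reduction can be sketched as follows. This is not Sally's code: the definitions of sorted, positional and blended n-grams are one plausible reading of the terms, all function names are hypothetical, and the reduction shown is the plain hashing trick (CRC32 modulo the dimension) rather than simhash, minhash or a Bloom filter.

```python
import zlib
from collections import Counter


def ngrams(text, n):
    """Regular character n-grams."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def sorted_ngrams(text, n):
    """n-grams with their characters sorted, so permutations collide."""
    return ["".join(sorted(g)) for g in ngrams(text, n)]


def positional_ngrams(text, n):
    """n-grams tagged with the position at which they occur."""
    return [f"{i}:{g}" for i, g in enumerate(ngrams(text, n))]


def blended_ngrams(text, n):
    """All k-grams for k = 1..n."""
    return [g for k in range(1, n + 1) for g in ngrams(text, k)]


def hashed_vector(features, dim):
    """Map feature counts into a fixed-size vector via the hashing trick."""
    vec = [0] * dim
    for feat, count in Counter(features).items():
        vec[zlib.crc32(feat.encode("utf-8")) % dim] += count
    return vec


# Hypothetical usage: embed a string into a 16-dimensional count vector.
print(hashed_vector(ngrams("hello world", 3), 16))
```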


DCABags 0.7

by wbuntine - June 5, 2014, 05:34:44 CET - 1960 views, 454 downloads, 4 subscriptions

About: Document/Text preprocessing for topic models: suite of Perl scripts for preprocessing text collections to create dictionaries and bag/list files for use by topic modelling software.

Changes:

Moved distribution and code across to GitHub. Changed "ldac" format to have 0 offset for word indices. Added "document frequency" (df) filtering on selection of tokens for linkTables. Experimented with linkParse, but it is still generally unusable.
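
As an illustration of document-frequency filtering and zero-offset word indices, here is a minimal sketch. It is written in Python rather than Perl and is not taken from DCABags: the function names and the min_df threshold are hypothetical, and the output line follows the common LDA-C layout of a unique-term count followed by index:count pairs.

```python
from collections import Counter


def build_vocabulary(docs, min_df=2):
    """Keep tokens that occur in at least min_df documents; indices start at 0."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each token once per document
    kept = sorted(t for t, c in df.items() if c >= min_df)
    return {t: i for i, t in enumerate(kept)}


def to_ldac_line(tokens, vocab):
    """Format one document as '<unique terms> <index>:<count> ...'."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    pairs = " ".join(f"{idx}:{cnt}" for idx, cnt in sorted(counts.items()))
    return f"{len(counts)} {pairs}".rstrip()


# Hypothetical usage on three tiny tokenized documents.
docs = [["a", "b", "a"], ["b", "c"], ["a", "c", "d"]]
vocab = build_vocabulary(docs, min_df=2)
for doc in docs:
    print(to_ldac_line(doc, vocab))
```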


A Parallel LDA Learning Toolbox 1.0

by yanjianfeng - January 24, 2014, 11:48:07 CET - 685 views, 231 downloads, 1 subscription

About: We introduce PLL, a parallel LDA learning toolbox for big topic modeling.

Changes:

Initial Announcement on mloss.org.


Malheur 0.5.4

by konrad - December 25, 2013, 13:20:31 CET - 11725 views, 2267 downloads, 1 subscription

About: Automatic Analysis of Malware Behavior using Machine Learning

Changes:

Support for new version of libarchive. Minor bug fixes.


CAM Java 3.1 (JMLR)

by wangny - October 14, 2013, 22:46:03 CET - 5034 views, 2450 downloads, 1 subscription

About: The CAM R-Java software provides a novel way to solve the blind source separation problem.

Changes:

This version fixes a problem that prevented the software from working under the newest R version, R-3.0.


About: TBEEF, a doubly ensemble framework for recommendation and prediction problems.

Changes:

Updated the included documentation.


About: Fast and robust learning of Bayesian networks

Changes:

Initial Announcement on mloss.org.


fastICA 0.1

by maitra - February 28, 2013, 06:30:20 CET - 1340 views, 343 downloads, 1 subscription

About: The open-source C package fastICA implements the fastICA algorithm of Aapo Hyvärinen et al. (URL: http://www.cs.helsinki.fi/u/ahyvarin/) to perform Independent Component Analysis (ICA) and Projection Pursuit. fastICA is released under the GNU General Public License (GPL).

Changes:

Initial Announcement on mloss.org.


Neural network designer 1.1.1

by bragi - December 28, 2012, 11:38:10 CET - 2504 views, 684 downloads, 1 subscription

About: A DBMS for resonating neural networks. Create and use different types of machine learning algorithms.

Changes:

AIML compatible (AIML files can be imported); new 'Grid channel' for developing board games; improved topics editor; new demo project: ALice (from AIML); lots of bug-fixes and speed improvements


Reranker Framework 1.0

by zenog - October 29, 2012, 10:05:30 CET - 1008 views, 328 downloads, 1 subscription

About: ReFr is a software architecture for specifying, training and using reranking models.

Changes:

Initial Announcement on mloss.org.


Pattern 2.4

by tomdesmedt - August 31, 2012, 02:26:01 CET - 6420 views, 1634 downloads, 1 subscription

About: "Pattern" is a web mining module for Python. It bundles tools for data retrieval, text analysis, clustering and classification, and data visualization.

Changes:
  • Small bug fixes overall, plus performance improvements.
  • Module pattern.web: updated to the new Bing API (the Bing API is a paid service now).
  • Module pattern.en: now includes Norvig's spell checking algorithm.
  • Module pattern.de: new German tagger/chunker, courtesy of Schneider & Volk (1998) who kindly agreed to release their work in Pattern under BSD.
  • Module pattern.search: the search syntax now includes { } syntax to define match groups.
  • Module pattern.vector: fast implementation of information gain for feature selection.
  • Module pattern.graph: now includes a toy semantic network of commonsense (see examples).
  • Module canvas.js: image pixel effects & editor now supports live editing

Random Forests 5.1

by zenog - September 21, 2011, 14:14:17 CET - 2418 views, 431 downloads, 1 subscription

About: The original Random Forests implementation by Breiman and Cutler.

Changes:

Initial Announcement on mloss.org.


gWT graph indexing wavelet tree 1.0.0

by ytabei - May 12, 2011, 23:01:17 CET - 2916 views, 554 downloads, 1 subscription

About: Software for graph similarity search for massive graph databases

Changes:

Initial Announcement on mloss.org.


Surrogate Modeling Toolbox 7.0.2 (JMLR)

by dgorissen - September 4, 2010, 07:48:59 CET - 11270 views, 3328 downloads, 1 subscription

About: The SUMO Toolbox is a Matlab toolbox that automatically builds accurate surrogate models (also known as metamodels or response surface models) of a given data source (e.g., simulation code, data set, script, ...) within the accuracy and time constraints set by the user. The toolbox minimizes the number of data points (which it selects automatically) since they are usually expensive.

Changes:

Incremental update, fixing some cosmetic issues, coincides with JMLR publication.


HSSVM 1.0.1

by xjbean - June 8, 2010, 16:16:05 CET - 8782 views, 1743 downloads, 1 subscription

Rating: 5 out of 5 stars (based on 1 vote)

About: HSSVM is software for solving multi-class problems using the hyper-sphere support vector machine model, implemented in Java.

Changes:
  1. Starting with this version, the version number is normalized to hssvm1.0.1.
  2. The parameter-search and run-all features have been removed from the Ant script: the commands "ant search-param" and "ant run-all" from the previous version are no longer available. They are replaced by the commands "svm search conf" and "svm runall conf", both of which run on Linux (or any other POSIX system). To use this program on Windows, Cygwin must be installed.

Bilingual Text Classification 0.1

by jorcisai - April 9, 2010, 15:13:08 CET - 2372 views, 872 downloads, 1 subscription

About: This software package implements a series of statistical mixture models for bilingual text classification, trained by the EM algorithm.

Changes:

Initial Announcement on mloss.org.


Universal Java Matrix Package 0.2.5

by arndt - February 9, 2010, 15:55:23 CET - 9892 views, 1772 downloads, 1 subscription

About: The Universal Java Matrix Package (UJMP) is a data processing tool for Java. Unlike JAMA and Colt, it supports multi-threading and is therefore much faster on current hardware. It supports not only matrices with double values but handles every type of data as a matrix through a common interface, e.g. CSV files, Excel files, images, WAVE audio files, tables in SQL databases, and much more.

Changes:

Metadata updated.