mloss.org new softwarehttp://mloss.orgUpdates and additions to mloss.orgenSun, 22 Jan 2017 12:24:59 -0000NaN toolbox 3.1.2http://mloss.org/revision/view/2050/<html><p>The NaN-toolbox provides a number of statistics functions and machine learning methods for use with Octave and Matlab. The functions can handle data with missing values encoded as NaNs, weighting of data samples, and multi-class classification (using a one-versus-rest scheme). There is a common interface to a number of different classification methods (including FDA, LDA, Naive Bayes, QDA, RDA, sparse classifiers, interfaces to some SVMs, regression/PLS, Wiener-Hopf).<br />
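The NaN-handling idea can be sketched in a few lines (shown in Python for illustration; the toolbox itself targets Octave/Matlab, and these helper names are not part of its API):

```python
import math

def nanmean(xs):
    """Mean over the non-NaN entries, mirroring how such toolboxes
    skip missing values encoded as NaN."""
    vals = [x for x in xs if not math.isnan(x)]
    return sum(vals) / len(vals) if vals else float("nan")

def nanstd(xs):
    """Sample standard deviation ignoring NaN entries."""
    vals = [x for x in xs if not math.isnan(x)]
    m = sum(vals) / len(vals)
    return math.sqrt(sum((v - m) ** 2 for v in vals) / (len(vals) - 1))
```

For example, `nanmean([1.0, float("nan"), 3.0])` returns 2.0, averaging only the observed values.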
</p></html>alois schloeglSun, 22 Jan 2017 12:24:59 -0000http://mloss.org/software/rss/comments/2050http://mloss.org/revision/view/2050/classificationmulti classmachine learningmissing datastatisticsweightingMLweb 0.1.5http://mloss.org/revision/view/2049/<html><p>MLweb is an open source project that aims to bring machine learning capabilities into web pages and web applications, while keeping all computations on the client side, i.e., in the browser.
</p>
<p>It includes the following.
</p>
<h2>LALOLib: a JavaScript library to enable and ease scientific computing within web pages</h2>
<p>LALOLib provides functions for
</p>
<ul>
<li>
linear algebra: basic vector and matrix operations, linear system solvers, matrix factorizations (QR, Cholesky), eigendecomposition, singular value decomposition, a conjugate gradient solver for sparse linear systems, and more,
</li>
<li>
statistics: sampling from and estimating standard distributions,
</li>
<li>
optimization: steepest descent, BFGS, linear programming (thanks to glpk.js), quadratic programming.
</li>
</ul>
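The conjugate gradient solver mentioned above can be sketched compactly (shown in Python for illustration; LALOLib's actual JavaScript API differs):

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite matrix A
    (given as a list of rows) using the conjugate gradient method."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual b - A x (x starts at zero)
    p = r[:]                      # initial search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:          # residual small enough: converged
            break
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x
```

For an n-by-n system the method needs at most n iterations in exact arithmetic, which is why it suits the large sparse systems listed above.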
<p>Documentation is available at <a href="http://mlweb.loria.fr/lalolab/lalolib.html">http://mlweb.loria.fr/lalolab/lalolib.html</a>
</p>
<p>See also the benchmark at <a href="http://mlweb.loria.fr/benchmark/">http://mlweb.loria.fr/benchmark/</a>
</p>
<h2>ML.js: a JavaScript library for machine learning</h2>
<p>In addition to all the functions of LALOLib, ML.js implements the following algorithms.
</p>
<h3>Classification</h3>
<ul>
<li>
K-nearest neighbors,
</li>
<li>
Linear discriminant analysis,
</li>
<li>
Naive Bayes classifier,
</li>
<li>
Logistic regression,
</li>
<li>
Perceptron,
</li>
<li>
Multi-layer perceptron,
</li>
<li>
Support vector machines,
</li>
<li>
Multi-class support vector machines,
</li>
<li>
Decision trees
</li>
</ul>
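As an illustration of the simplest of these classifiers, a minimal perceptron might look like this (a Python sketch for exposition; ML.js has its own JavaScript API):

```python
def train_perceptron(X, y, epochs=20, lr=1.0):
    """Classic perceptron for labels in {-1, +1}; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:   # misclassified: move the boundary
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1
```

On linearly separable data the update rule is guaranteed to converge; the multi-layer perceptron in the list generalizes this with hidden layers and gradient training.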
<h3>Regression</h3>
<ul>
<li>
Least squares,
</li>
<li>
Least absolute deviations,
</li>
<li>
K-nearest neighbors,
</li>
<li>
Ridge regression,
</li>
<li>
LASSO,
</li>
<li>
LARS,
</li>
<li>
Orthogonal least squares,
</li>
<li>
Multi-layer perceptron,
</li>
<li>
Kernel ridge regression,
</li>
<li>
Support vector regression,
</li>
<li>
K-LinReg
</li>
</ul>
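The least squares entry above reduces, in one dimension, to the familiar closed form; a sketch (Python, for illustration only):

```python
def least_squares_fit(x, y):
    """Fit y = a*x + b by minimising the sum of squared residuals
    (closed-form simple linear regression)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    a = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    b = my - a * mx
    return a, b
```

Ridge regression and the LASSO in the list above modify this objective with L2 and L1 penalties respectively.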
<h3>Clustering</h3>
<ul>
<li>
K-means,
</li>
<li>
Spectral clustering
</li>
</ul>
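K-means, the first entry above, is usually implemented as Lloyd's algorithm; a minimal sketch on 1-D points (Python, for illustration):

```python
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm on 1-D points; returns the k centroids, sorted."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # initialise from the data
    for _ in range(n_iter):
        # assignment step: each point goes to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[j].append(p)
        # update step: move each centroid to its cluster mean
        new = [sum(c) / len(c) if c else centroids[j]
               for j, c in enumerate(clusters)]
        if new == centroids:                   # assignments stable: done
            break
        centroids = new
    return sorted(centroids)
```

Spectral clustering instead builds a similarity graph and runs k-means on the leading eigenvectors of its Laplacian.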
<h3>Dimensionality reduction</h3>
<ul>
<li>
Principal component analysis,
</li>
<li>
Locally linear embedding,
</li>
<li>
Local tangent space alignment
</li>
</ul>
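In two dimensions, principal component analysis has a closed form via the 2x2 covariance matrix; a sketch (Python, for exposition only):

```python
import math

def pca_first_component(points):
    """First principal component of 2-D data: the unit eigenvector of the
    2x2 covariance matrix [[sxx, sxy], [sxy, syy]] with the larger
    eigenvalue, whose angle is 0.5 * atan2(2*sxy, sxx - syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))
```

In higher dimensions the same component is found by a full eigendecomposition or SVD of the covariance matrix, as provided by LALOLib.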
<p>Documentation is available at <a href="http://mlweb.loria.fr/lalolab/lalolib.html">http://mlweb.loria.fr/lalolab/lalolib.html</a>
</p>
<h2>LALOLab: a matlab-like development environment</h2>
<p>Try it at <a href="http://mlweb.loria.fr/lalolab/">http://mlweb.loria.fr/lalolab/</a>
</p></html>fabien lauer, pedro ernesto garcia rodriguezTue, 17 Jan 2017 15:47:41 -0000http://mloss.org/software/rss/comments/2049http://mloss.org/revision/view/2049/classificationclusteringregressiondimensionality reductionlinear algebradevelopment environmentscientific computingwebLogRegCrowds, Logistic Regression from Crowds 1.0http://mloss.org/revision/view/2048/<html><p>LogReg-Crowds is a collection of Julia implementations of various approaches for learning a logistic regression model from multiple annotators and crowds, namely the works of:
</p>
<ul>
<li><p>Rodrigues, F., Pereira, F., and Ribeiro, B. Learning from multiple annotators: distinguishing good from random labelers. Pattern Recognition Letters, pp. 1428–1436, 2013.
</p>
</li>
<li><p>Raykar, V., Yu, S., Zhao, L., Valadez, G., Florin, C., Bogoni, L., and Moy, L. Learning from Crowds. Journal of Machine Learning Research, pp. 1297–1322, 2010.
</p>
</li>
<li><p>Dawid, A. P. and Skene, A. M. Maximum likelihood estimation of observer error-rates using the EM algorithm. Journal of the Royal Statistical Society. Series C, 28(1):20–28, 1979.
</p>
</li>
</ul>
<p>All implementations are able to handle multi-class problems and do not require repeated labelling (i.e. annotators do not have to provide labels for the entire dataset). The code was written with interpretability in mind and is well commented, making it easy to use (see the file “demo.jl”). At the same time, the Julia language provides great performance, especially when compared to other scientific languages such as MATLAB or Python/NumPy, without compromising readability.
</p>
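For readers unfamiliar with the Dawid & Skene model cited above, its EM loop can be sketched as follows (in Python rather than Julia, for illustration; all names are mine, not the package's API):

```python
def dawid_skene(labels, n_classes, n_iter=20):
    """EM estimator in the style of Dawid & Skene (1979).

    labels: dict mapping item -> {annotator: observed_label}, with labels
    in 0..n_classes-1 (items may be labelled by any subset of annotators).
    Returns a dict item -> posterior distribution over true labels."""
    items = list(labels)
    annotators = sorted({a for obs in labels.values() for a in obs})
    # Initialise the posterior over true labels by majority voting.
    T = {}
    for i in items:
        counts = [0.0] * n_classes
        for l in labels[i].values():
            counts[l] += 1.0
        T[i] = [c / sum(counts) for c in counts]
    for _ in range(n_iter):
        # M-step: class priors and per-annotator confusion matrices.
        prior = [sum(T[i][k] for i in items) / len(items)
                 for k in range(n_classes)]
        pi = {a: [[1e-6] * n_classes for _ in range(n_classes)]
              for a in annotators}
        for i in items:
            for a, l in labels[i].items():
                for k in range(n_classes):
                    pi[a][k][l] += T[i][k]
        for a in annotators:
            for k in range(n_classes):
                row_sum = sum(pi[a][k])
                pi[a][k] = [v / row_sum for v in pi[a][k]]
        # E-step: recompute the posterior over each item's true label.
        for i in items:
            post = []
            for k in range(n_classes):
                p = prior[k]
                for a, l in labels[i].items():
                    p *= pi[a][k][l]
                post.append(p)
            z = sum(post)
            T[i] = [p / z for p in post]
    return T
```

Raykar et al. (2010) extend this scheme by learning a logistic regression classifier jointly with the annotator confusion matrices.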
<p>Project homepage: https://github.com/fmpr/LogReg-Crowds
</p></html>filipe rodriguesMon, 16 Jan 2017 18:10:57 -0000http://mloss.org/software/rss/comments/2048http://mloss.org/revision/view/2048/emlogistic regressioncrowdsourcinglatent variable modelMulti Annotator Supervised LDA for regression 1.0http://mloss.org/revision/view/2047/<html><p>MA-sLDAr is a C++ implementation of the supervised topic models with response variables provided by multiple annotators with different levels of expertise, as proposed in:
</p>
<ul>
<li>
Rodrigues, F., Lourenço, M, Ribeiro, B, Pereira, F. Learning Supervised Topic Models for Classification and Regression from Crowds. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017.
</li>
</ul>
<p>Sample multiple-annotator data based on the MovieReviews dataset, along with additional datasets, is available at: http://www.fprodrigues.com/software/
</p></html>filipe rodriguesMon, 16 Jan 2017 18:10:19 -0000http://mloss.org/software/rss/comments/2047http://mloss.org/revision/view/2047/topic modelingsupervised learningcrowdsourcingMulti Annotator Supervised LDA for classification 1.0http://mloss.org/revision/view/2044/<html><p>MA-sLDAc is a C++ implementation of the supervised topic models with labels provided by multiple annotators with different levels of expertise, as proposed in:
</p>
<ul>
<li><p>Rodrigues, F., Lourenço, M, Ribeiro, B, Pereira, F. Learning Supervised Topic Models for Classification and Regression from Crowds. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017.
</p>
</li>
<li><p>Rodrigues, F., Lourenço, M, Ribeiro, B, Pereira, F. Learning supervised topic models from crowds. The Third AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2015.
</p>
</li>
</ul>
<p>The code is based on the supervised LDA (sLDA) implementation by Chong Wang and David Blei (http://www.cs.cmu.edu/~chongw/slda/). Three different variants of the proposed model are provided:
</p>
<ul>
<li>
MA-sLDAc (mle): This implementation uses maximum likelihood estimates for the topic distributions (beta) and the annotators' confusion matrices (pi);
</li>
<li>
MA-sLDAc (smooth): This implementation places priors on beta and pi and performs approximate Bayesian inference;
</li>
<li>
MA-sLDAc (svi): This implementation is similar to “MA-sLDAc (smooth)”, but uses stochastic variational inference.
</li>
</ul>
<p>For simplicity, first-time users are advised to start with "MA-sLDAc (mle)", since this version has fewer parameters that need to be specified.
</p>
<p>Sample multiple-annotator data based on the 20newsgroups dataset, along with additional datasets, is available at: http://www.fprodrigues.com/software/
</p></html>filipe rodriguesMon, 16 Jan 2017 18:01:36 -0000http://mloss.org/software/rss/comments/2044http://mloss.org/revision/view/2044/topic modelingsupervised learningcrowdsourcingJava Statistical Analysis Tool 0.0.7http://mloss.org/revision/view/2043/<html><p>JSAT is a general purpose Java Machine Learning library, primarily focused on classification, regression, and clustering algorithms. It supports reading and writing datasets in the common LIBSVM format, ARFF files, and a custom binary format for efficiency. It has no dependencies besides Java 6, and is meant to be relatively fast with multi-threaded implementations where possible.<br />
</p>
<p>JSAT aims to be useful for both practitioners and researchers. It provides a broad range of algorithms that can be used for comparison, or to select the method most appropriate for a given dataset.
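For example, the LIBSVM text format mentioned above stores one example per line as a label followed by sparse index:value pairs; a minimal reader might look like this (a Python sketch of the format, not JSAT's own API):

```python
def parse_libsvm_line(line):
    """Parse one line of LIBSVM format: '<label> <index>:<value> ...'.
    Feature indices are conventionally 1-based; absent indices are
    implicitly zero, so features are returned as a sparse dict."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for token in parts[1:]:
        idx, val = token.split(":")
        features[int(idx)] = float(val)
    return label, features
```

A real reader would also handle blank lines and comments, but this captures why the format suits sparse, high-dimensional data.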
</p></html>Edward RaffSun, 15 Jan 2017 22:21:50 -0000http://mloss.org/software/rss/comments/2043http://mloss.org/revision/view/2043/classificationregressiononline learningmachine learningjavaFEAST 2.0.0http://mloss.org/revision/view/2042/<html><p>A FEAture Selection Toolbox for C/C++ and MATLAB/OCTAVE, v2.0.0.
</p>
<p>FEAST provides implementations of common mutual information based filter feature selection algorithms, and an implementation of RELIEF. All functions expect discrete inputs (except RELIEF, which does not depend on the MIToolbox), and they return the selected feature indices. These implementations were developed to help our research into the similarities between these algorithms, and our results are presented in the following paper:
</p>
<p> Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection
G. Brown, A. Pocock, M.-J. Zhao, M. Lujan
Journal of Machine Learning Research, 13:27-66 (2012)
</p>
<p>The weighted feature selection algorithms are described in:
</p>
<p> Information Theoretic Feature Selection for Cost-Sensitive Problems
A. Pocock, N. Edakunni, M.-J. Zhao, M. Lujan, G. Brown.
ArXiv
</p>
<p>If you use these implementations for academic research please cite the relevant paper above. All FEAST code is licensed under the BSD 3-Clause License.
</p>
<p>Contains implementations of:
mim, mrmr, mifs, cmim, jmi, disr, cife, icap, condred, cmi, relief, fcbf, betagamma
</p>
<p>And weighted implementations of:
mim, cmim, jmi, disr, cmi
</p>
<p>References for these algorithms are provided in the accompanying feast.bib file (in BibTeX format).
</p>
<p>FEAST works on discrete inputs, and all continuous values <strong>must</strong> be discretised before use with FEAST. In our experiments we have found that 10 equal-width bins are suitable for many problems, though this depends on the size of the data set. When used with continuous inputs, FEAST produces unreliable results, runs slowly, and uses much more memory than usual. The discrete inputs should also have small cardinality; FEAST will treat the values {1,10,100} the same way it treats {1,2,3}, and the latter will be both faster and use less memory.
</p>
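A minimal equal-width discretiser along these lines (a Python sketch, not part of FEAST itself) could be:

```python
def discretise_equal_width(values, n_bins=10):
    """Map continuous values to integer bin ids 0..n_bins-1 using
    equal-width bins spanning the observed range of the data."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # guard against constant input
    # clamp the maximum value into the last bin rather than n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]
```

The output also has small cardinality (at most `n_bins` distinct values), which satisfies the memory advice above.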
<p>MATLAB Example (using "data" as our feature matrix, and "labels" as the class label vector):
</p>
<pre><code>>> size(data)
ans =
    (569,30)   %% denoting 569 examples, and 30 features

>> selectedIndices = feast('jmi',5,data,labels)   %% selecting the top 5 features using the jmi algorithm
selectedIndices =
    28
    21
     8
    27
    23

>> selectedIndices = feast('mrmr',10,data,labels)   %% selecting the top 10 features using the mrmr algorithm
selectedIndices =
    28
    24
    22
     8
    27
    21
    29
     4
     7
    25

>> selectedIndices = feast('mifs',5,data,labels,0.7)   %% selecting the top 5 features using the mifs algorithm with beta = 0.7
selectedIndices =
    28
    24
    22
    20
    29
</code></pre><p>The library is written in ANSI C for compatibility with the MATLAB mex compiler, except for MIM, FCBF and RELIEF, which are written in MATLAB/OCTAVE script. A different implementation of MIM is available for use in the C library.
</p>
<p>MIToolbox v3.0.0 is required to compile these algorithms, and these implementations supersede the example implementations given in that package (they have more robust behaviour when used with unexpected inputs).
</p>
<p>MIToolbox can be found at: http://www.github.com/Craigacp/MIToolbox/
</p>
<p>The C library expects all matrices in column-major format (i.e. Fortran style). This is for two reasons: (a) MATLAB generates Fortran-style arrays, and (b) feature selection iterates over columns rather than rows, unlike most other ML processes.
</p>
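Concretely, column-major layout places element (row i, column j) of an m-by-n matrix at flat offset j*m + i, so each column (i.e. each feature) is contiguous in memory; a small illustration (Python, for exposition only):

```python
def colmajor_index(i, j, n_rows):
    """Flat offset of element (i, j) in a column-major (Fortran-style) array."""
    return j * n_rows + i

# A 2x3 matrix stored column by column:
# [[1, 3, 5],
#  [2, 4, 6]]  ->  [1, 2, 3, 4, 5, 6]
flat = [1, 2, 3, 4, 5, 6]
# Column 1 (0-based) occupies a contiguous slice of the flat storage.
column_1 = flat[colmajor_index(0, 1, 2):colmajor_index(0, 2, 2)]
```

This contiguity is what makes the per-feature scans in feature selection cache-friendly.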
<p>Compilation instructions:<br />
* MATLAB/OCTAVE: run CompileFEAST.m<br />
* Linux C shared library: use the included makefile<br />
* Java: see java/README.md
</p></html>adam pocock, gavin brownSun, 08 Jan 2017 00:49:19 -0000http://mloss.org/software/rss/comments/2042http://mloss.org/revision/view/2042/matlabfeature selectionfeature rankingmutual informationjmlrMIToolbox 3.0.0http://mloss.org/revision/view/2041/<html><p>A mutual information library for C/C++ and Mex bindings for MATLAB.
</p>
<p>This toolbox is aimed at people who wish to use mutual information for feature selection, and provides a range of information theoretic functions. All functions estimate the probabilities from the supplied data vectors. Some example implementations of common mutual information based feature selection algorithms are provided in both C and MATLAB: CMIM (Fleuret, 2004), mRMR (Peng et al., 2005), and DISR (Bontempi & Meyer, 2006). The implementations contained here are early versions of those in the <a href="http://mloss.org/software/view/386/">FEAST library</a>, and the implementations in FEAST should be used in preference to the ones in MIToolbox.
</p>
<p>All functions discretise the inputs by rounding down to the nearest integer.
</p>
<p>This toolbox was developed to support our work in feature selection, which resulted in the paper "Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection", G Brown, A Pocock, M-J Zhao, M Lujan. JMLR 2012 (<a href="http://jmlr.csail.mit.edu/papers/v13/brown12a.html">link</a>). Please cite this paper if you use our toolbox.
</p>
<p>The feature selection algorithms developed for that paper form the FEAST toolbox, published at mloss <a href="http://mloss.org/software/view/386/">here</a>.
</p>
<p>List of functions:<br />
Entropy, Conditional Entropy, Joint Entropy,<br />
Mutual Information, Conditional Mutual Information,<br />
Renyi's Entropy, Renyi's Mutual Information,<br />
Weighted Entropy, Weighted Conditional Entropy,<br />
and creation of a joint random variable.
</p>
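The plug-in estimate behind such functions can be sketched compactly (shown in Python for illustration; the toolbox itself is C with MATLAB bindings):

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Plug-in mutual information (in bits) between two discrete
    vectors of equal length, with probabilities estimated from counts."""
    n = len(x)
    px = Counter(x)               # marginal counts of x
    py = Counter(y)               # marginal counts of y
    pxy = Counter(zip(x, y))      # joint counts of (x, y)
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), in count form
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi
```

Identical vectors give MI equal to their entropy, and independent vectors give MI of zero, matching the information theoretic definitions.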
<p>A Java implementation of the Shannon entropy functions is available from my GitHub page <a href="https://github.com/Craigacp/JavaMI">here</a>.
</p></html>Adam PocockSun, 08 Jan 2017 00:43:22 -0000http://mloss.org/software/rss/comments/2041http://mloss.org/revision/view/2041/conditional entropyconditional mutual informationentropymutual informationrenyi entropypython weka wrapper3 0.1.2http://mloss.org/revision/view/2040/<html><p>A thin Python3 wrapper that uses the javabridge Python library to communicate with a Java Virtual Machine executing Weka API calls. Offers all major APIs, like data generators, loaders, savers, filters, classifiers, clusterers, attribute selection, associations and experiments. Weka packages can be listed/installed/uninstalled as well. It does not provide any graphical frontend, but some basic plotting and graph visualizations are available through matplotlib and pygraphviz.
</p></html>peter reutemannWed, 04 Jan 2017 10:27:40 -0000http://mloss.org/software/rss/comments/2040http://mloss.org/revision/view/2040/machine learningwekapython weka wrapper 0.3.10http://mloss.org/revision/view/2039/<html><p>A thin Python wrapper that uses the javabridge Python library to communicate with a Java Virtual Machine executing Weka API calls. Offers all major APIs, like data generators, loaders, savers, filters, classifiers, clusterers, attribute selection, associations and experiments. Weka packages can be listed/installed/uninstalled as well. It does not provide any graphical frontend, but some basic plotting and graph visualizations are available through matplotlib and pygraphviz.
A simple workflow engine was added with release 0.3.0.
</p></html>peter reutemannWed, 04 Jan 2017 10:21:33 -0000http://mloss.org/software/rss/comments/2039http://mloss.org/revision/view/2039/machine learningweka