About: Document/text preprocessing for topic models: a suite of Perl scripts for preprocessing text collections to create dictionaries and bag/list files for use by topic modelling software. Changes: Moved distribution and code across to GitHub. Changed the "ldac" format to use 0-offset word indices. Added "document frequency" (df) filtering on selection of tokens for linkTables. Experimenting with linkParse, but it is still generally unusable.
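
As a rough illustration of the two changes mentioned in the entry above (document-frequency filtering and 0-offset word indices in "ldac"-style bag files), here is a minimal Python sketch. It is not the suite's Perl code: the function names `build_vocab` and `to_ldac_line` and the `min_df` threshold are hypothetical, and the line layout assumed is the usual LDA-C one, `N index:count index:count ...`, where N is the number of distinct terms in the document.

```python
from collections import Counter


def build_vocab(docs, min_df=2):
    """Map each kept token to a 0-offset index.

    A token is kept if it appears in at least `min_df` documents
    (hypothetical threshold, chosen only for illustration).
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each token once per document
    kept = sorted(t for t, n in df.items() if n >= min_df)
    return {t: i for i, t in enumerate(kept)}


def to_ldac_line(tokens, vocab):
    """Render one document as 'N index:count ...' with 0-offset indices."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    pairs = " ".join(f"{i}:{c}" for i, c in sorted(counts.items()))
    return f"{len(counts)} {pairs}".rstrip()


docs = [
    ["topic", "model", "topic"],
    ["model", "text"],
    ["topic", "text", "rare"],
]
vocab = build_vocab(docs, min_df=2)
lines = [to_ldac_line(d, vocab) for d in docs]
```

With this toy collection, "rare" occurs in only one document and is dropped, so the remaining three tokens receive indices 0 to 2.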

About: Big Random Forests Changes: Fetched by r-cran-robot on 2015-11-01 00:00:04.072762

About: SAMOA is a platform for mining big data streams. It is a distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. Changes: Initial Announcement on mloss.org.

About: DAL is an efficient and flexible MATLAB toolbox for sparse/low-rank learning/reconstruction based on the dual augmented Lagrangian method. Changes:

About: Estimates statistical significance of association between variables and their principal components (PCs). Changes: Initial Announcement on mloss.org.

About: DRVQ is a C++ library implementation of dimensionality-recursive vector quantization, a fast vector quantization method in high-dimensional Euclidean spaces under arbitrary data distributions. It is an approximation of k-means that is practically constant in data size and applies to arbitrarily high dimensions but can only scale to a few thousands of centroids. As a by-product of training, a tree structure performs either exact or approximate quantization on trained centroids, the latter being not very precise but extremely fast. Changes: Initial Announcement on mloss.org.
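
For readers unfamiliar with the quantization step mentioned in the DRVQ entry, the following Python sketch shows plain brute-force exact quantization: each point is assigned to its nearest centroid under squared Euclidean distance. This is only the baseline operation that a structure like DRVQ's tree is meant to speed up; it is not DRVQ's algorithm or API, and the function name `quantize` is hypothetical.

```python
def quantize(points, centroids):
    """Return, for each point, the index of its nearest centroid.

    Brute-force exact quantization: every point is compared
    against every centroid using squared Euclidean distance.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(centroids)), key=lambda j: sqdist(p, centroids[j]))
        for p in points
    ]


centroids = [(0.0, 0.0), (10.0, 10.0)]
points = [(1.0, 1.0), (9.0, 8.0), (5.0, 4.0)]
codes = quantize(points, centroids)
```

The cost of this baseline grows with the number of points times the number of centroids, which is why approximate, tree-based lookups are attractive when speed matters more than precision.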

About: hapFabia is an R package for identification of very short segments of identity by descent (IBD) characterized by rare variants in large sequencing data. It detects segments 100 times smaller than previous methods could. Changes: citation update; plot function improved.

About: hapFabia is an R package for identification of very short segments of identity by descent (IBD) characterized by rare variants in large sequencing data. Changes: citation update; plot function improved.

About: A library for calculating and accessing generalized Stirling numbers of the second kind, which are used for inference in Poisson-Dirichlet processes. Changes: Initial Announcement on mloss.org.
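
As background for the entry above, the sketch below tabulates the classical Stirling numbers of the second kind with the standard recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1). Note the assumption: this is the ordinary, non-generalized case, shown only to illustrate the table-building idea. The generalized numbers the library computes follow a different recurrence, and the function name `stirling2_table` is hypothetical.

```python
def stirling2_table(n_max):
    """Return a table S where S[n][k] is the classical Stirling
    number of the second kind, for 0 <= k <= n <= n_max."""
    S = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    S[0][0] = 1  # one way to partition the empty set into zero blocks
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            # Element n either joins one of k existing blocks
            # or forms a new block on its own.
            S[n][k] = k * S[n - 1][k] + S[n - 1][k - 1]
    return S


table = stirling2_table(5)
```

Caching the whole table, rather than recomputing entries on demand, reflects the "calculating and accessing" use case: inference routines typically look up many entries repeatedly.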

About: Evolutionary Learning of Globally Optimal Trees Changes: Fetched by r-cran-robot on 2014-05-01 00:00:05.459097

About: The glmie toolbox contains scalable estimation routines for GLMs (generalised linear models) and SLMs (sparse linear models) as well as an implementation of a scalable convex variational Bayesian inference relaxation. We designed the glmie package to be simple, generic and easily extensible. Most of the code is written in Matlab, including some MEX files. The code is fully compatible with both Matlab 7.x and GNU Octave 3.2.x. Probabilistic classification, sparse linear modelling and logistic regression are covered in a common algorithmic framework allowing for both MAP estimation and approximate Bayesian inference. Changes: added factorial mean field inference as a third algorithm complementing expectation propagation and variational Bayes; generalised non-Gaussian potentials so that affine instead of linear functions of the latent variables can be used.

About: ALgebraic COmbinatorial COmpletion of MAtrices. A collection of algorithms to impute or denoise single entries in an incomplete rank-one matrix, to determine for which entries this is possible with any algorithm, and to provide algorithm-independent error estimates. Includes demo scripts. Changes: Initial Announcement on mloss.org.

About: ClowdFlows is a web-based platform for service-oriented data mining, publicly available at http://clowdflows.org. A web-based interface allows users to construct data mining workflows that are hosted on the web and can be (if allowed by the author) accessed by anyone by following the workflow's URL. Changes: Initial Announcement on mloss.org.

About: [FACTORIE](http://factorie.cs.umass.edu) is a toolkit for deployable probabilistic modeling, implemented as a software library in [Scala](http://scala-lang.org). It provides its users with a succinct language for creating [factor graphs](http://en.wikipedia.org/wiki/Factor_graph), estimating parameters and performing inference. It also has implementations of many machine learning tools and a full NLP pipeline. Changes: Initial Announcement on mloss.org.

About: Data-efficient policy search framework using probabilistic Gaussian process models. Changes: Initial Announcement on mloss.org.

About: PRoNTo is freely available software and aims to facilitate the interaction between the neuroimaging and machine learning communities. The toolbox is based on pattern recognition techniques for the analysis of neuroimaging data. PRoNTo supports the analysis of all image modalities as long as they are NIfTI format files. However, only the following modalities have been tested for version 1.1: sMRI, fMRI, PET, FA (fractional anisotropy) and Beta (GLM coefficients) images. Changes: Initial Announcement on mloss.org.

About: Approximate Rank One FACtorization of tensors. An algorithm for factorization of three-way tensors and determination of their rank; includes example applications. Changes: Initial Announcement on mloss.org.

About: This is the core MCMC sampler for the nonparametric sparse factor analysis model presented in David A. Knowles and Zoubin Ghahramani (2011). Nonparametric Bayesian Sparse Factor Models with application to Gene Expression modelling. Annals of Applied Statistics. Changes: Initial Announcement on mloss.org.

About: Regularization paTH for LASSO problem (thalasso). thalasso solves problems of the following form: minimize 1/2*||X*beta - y||^2 + lambda*sum_i |beta_i|, where X and y are problem data and beta and lambda are variables. Changes: Initial Announcement on mloss.org.
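
To make the LASSO objective in the entry above concrete (half the squared residual norm plus lambda times the sum of absolute coefficients), here is a small pure-Python sketch. It is not thalasso's code: `lasso_objective` and `lasso_1d` are hypothetical names, and the second function covers only the single-predictor special case, where the minimizer has the well-known soft-thresholding closed form.

```python
def lasso_objective(X, y, beta, lam):
    """Evaluate 1/2 * ||X*beta - y||^2 + lam * sum_i |beta_i|.

    X is a list of rows, y and beta are lists of floats.
    """
    residuals = [
        sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) - y_i
        for row, y_i in zip(X, y)
    ]
    fit = 0.5 * sum(r * r for r in residuals)
    penalty = lam * sum(abs(b) for b in beta)
    return fit + penalty


def lasso_1d(x, y, lam):
    """Minimizer of the objective when X has a single column x.

    Closed form: soft-threshold x.y by lam, then divide by x.x.
    """
    z = sum(a * b for a, b in zip(x, y))
    xx = sum(a * a for a in x)
    shrunk = max(abs(z) - lam, 0.0)
    return (shrunk if z >= 0 else -shrunk) / xx


value = lasso_objective([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0], [1.0, 0.0], 0.5)
# One-dimensional regularization path: the coefficient shrinks to zero as lam grows.
path = [lasso_1d([1.0, 1.0], [2.0, 4.0], lam) for lam in (0.0, 2.0, 10.0)]
```

In the one-predictor case the path is piecewise linear in lambda and reaches exactly zero once lambda exceeds |x.y|, which is the behaviour a regularization-path solver traces out for every coefficient in the general problem.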
