About: Multi-core non-parametric and bursty topic models (HDP-LDA, DCMLDA, and other variants of LDA) implemented in C using efficient Gibbs sampling, with hyperparameter sampling and other flexible controls. Changes: Corrected the new normalised Gamma model for topics so it works with multicore. Improvements to documentation. Added an asymptotic version of the generalised Stirling numbers so it no longer fails when they run out of bounds on bigger data.
|
About: Document/Text preprocessing for topic models: suite of Perl scripts for preprocessing text collections to create dictionaries and bag/list files for use by topic modelling software. Changes: Moved distribution and code across to GitHub. Changed "ldac" format to have 0 offset for word indices. Added "document frequency" (df) filtering on selection of tokens for linkTables. Still experimenting with linkParse, but it is not yet generally usable.
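
As an illustration of the 0-offset "ldac" format mentioned above, here is a minimal Python sketch of reading one document line. It assumes the usual lda-c layout, where each line starts with the number of distinct terms followed by `index:count` pairs. The function name `parse_ldac_line` is hypothetical and is not part of the Perl scripts.

```python
def parse_ldac_line(line):
    """Parse one "ldac" document line into (word_index, count) pairs.

    Assumed layout (lda-c convention): "M i1:c1 i2:c2 ... iM:cM",
    with word indices starting at 0.
    """
    fields = line.split()
    n_unique = int(fields[0])
    pairs = []
    for field in fields[1:]:
        index, count = field.split(":")
        pairs.append((int(index), int(count)))
    # The leading number must match the number of index:count pairs.
    if len(pairs) != n_unique:
        raise ValueError("declared %d terms, found %d" % (n_unique, len(pairs)))
    return pairs


if __name__ == "__main__":
    print(parse_ldac_line("3 0:2 4:1 7:5"))
```

With 0-offset indices, index 0 refers to the first entry of the dictionary, so a reader written for 1-offset files would need its indices shifted by one.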
|
About: Generalised Stirling Numbers for Pitman-Yor Processes: this library provides ways of computing generalised 2nd-order Stirling numbers for Pitman-Yor and Dirichlet processes. Included is a tester and parameter optimiser. This accompanies Buntine and Hutter's article: http://arxiv.org/abs/1007.0296, and a series of papers by Buntine and students at NICTA and ANU. Changes: Moved repository to GitHub, and added thread support to use the main table lookups in multi-threaded code.
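
To give a feel for what such a table lookup contains, the following Python sketch fills a table of log generalised Stirling numbers using the standard recursion S(n+1, m) = S(n, m-1) + (n - m·a)·S(n, m), with S(0, 0) = 1 and discount 0 ≤ a < 1. This is only an illustration under those assumptions: the function name `log_stirling_table` is hypothetical, and the sketch is not the library's API or its implementation.

```python
import math


def log_stirling_table(n_max, a):
    """Return table with table[n][m] = log S(n, m) for discount a.

    Entries that are zero (m > n, or m == 0 with n > 0) are stored as -inf.
    Working in log space avoids overflow for larger n.
    """
    neg_inf = float("-inf")
    table = [[neg_inf] * (n_max + 1) for _ in range(n_max + 1)]
    table[0][0] = 0.0  # S(0, 0) = 1
    for n in range(n_max):
        for m in range(1, n + 2):
            terms = []
            # First term of the recursion: S(n, m-1)
            if table[n][m - 1] > neg_inf:
                terms.append(table[n][m - 1])
            # Second term: (n - m*a) * S(n, m), non-zero only when m <= n
            if m <= n and table[n][m] > neg_inf:
                terms.append(math.log(n - m * a) + table[n][m])
            if terms:
                # log-sum-exp of the non-zero terms
                hi = max(terms)
                table[n + 1][m] = hi + math.log(
                    sum(math.exp(t - hi) for t in terms))
    return table


if __name__ == "__main__":
    t = log_stirling_table(4, 0.0)
    print(math.exp(t[4][2]))
```

For a = 0 the values reduce to unsigned Stirling numbers of the first kind, which gives a simple sanity check on the recursion.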
|
About: THIS VERSION DISCONTINUED, see "http://mloss.org/software/view/424/". This library provides ways of computing generalised 2nd-order Stirling numbers for Pitman-Yor and Dirichlet processes. Included is a tester and parameter optimiser. This accompanies Buntine and Hutter's article: http://arxiv.org/abs/1007.0296 Changes: See the alternative MLOSS entry "libstb". Updated to 1.4!
|