Project details for hca

hca 0.4

by wbuntine - November 25, 2013, 05:15:27 CET



Non-parametric topic models implemented using efficient Gibbs sampling. Early theory from the ECML-PKDD 2011 paper cited.

Coded in C with no other dependencies. Input can be in LdaC format, docword format, or various Matlab-style formats. Implements HDP-LDA, HPYP-LDA, and symmetric-symmetric, symmetric-asymmetric, asymmetric-symmetric, and asymmetric-asymmetric priors with Pitman-Yor or Dirichlet processes. Hyper-parameters can be fully fitted or set initially. A special "turbo boost" function gives even better performance. Does not use Chinese restaurant processes, so it is quite fast (non-parametric methods run 1.5-3.0 times slower than regular LDA with Gibbs sampling). Estimates various vectors (document and topic vectors). Supports diagnostics, control, restarts, and test likelihood via document completion. Computes coherence of results using PMI and normalised PMI.
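The PMI and normalised PMI coherence scores mentioned above can be illustrated with a small sketch. This is not hca's own code (hca is written in C, and its exact estimation procedure is not described here). It assumes probabilities are estimated from document-level co-occurrence counts, that a pair that never co-occurs scores -1, and that a topic's coherence is the mean NPMI over all pairs of its top words. The function names `cooccurrence_probs`, `npmi`, and `topic_coherence` are hypothetical.

```python
import math
from itertools import combinations


def cooccurrence_probs(docs):
    """Estimate p(w) and p(w1, w2) as fractions of documents containing them."""
    n = len(docs)
    word_df, pair_df = {}, {}
    for doc in docs:
        words = sorted(set(doc))  # count each word once per document
        for w in words:
            word_df[w] = word_df.get(w, 0) + 1
        for a, b in combinations(words, 2):  # pairs come out as (smaller, larger)
            pair_df[(a, b)] = pair_df.get((a, b), 0) + 1
    p_word = {w: c / n for w, c in word_df.items()}
    p_pair = {k: c / n for k, c in pair_df.items()}
    return p_word, p_pair


def npmi(a, b, p_word, p_pair):
    """Normalised PMI in [-1, 1]: PMI(a, b) / -log p(a, b)."""
    key = (a, b) if a < b else (b, a)
    p_ab = p_pair.get(key, 0.0)
    if p_ab == 0.0:
        return -1.0  # assumed convention: never co-occurring
    if p_ab == 1.0:
        return 0.0  # degenerate case (both words in every document); arbitrary choice
    pmi = math.log(p_ab / (p_word[a] * p_word[b]))
    return pmi / -math.log(p_ab)


def topic_coherence(top_words, p_word, p_pair):
    """Mean NPMI over all unordered pairs of a topic's top words."""
    scores = [npmi(a, b, p_word, p_pair) for a, b in combinations(top_words, 2)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy reference corpus; real use would take a large external corpus.
    docs = [["a", "b"], ["a", "b"], ["c"], ["c", "a"]]
    p_word, p_pair = cooccurrence_probs(docs)
    print(topic_coherence(["a", "b", "c"], p_word, p_pair))
```

Normalising by -log p(a, b) bounds the score between -1 and 1, which makes coherence values easier to compare across topics than raw PMI.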

Changes to previous version:

Added example on using burstiness.

Supported Operating Systems: Agnostic
Data Formats: ASCII
Tags: Topic Modeling, Nonparametric Bayes

Other available revisions

Date: Changelog

August 6, 2014, 14:24:57: Modified command line -A and -B formats. Overhaul of diagnostics. Described changes in manual. Bug fixes: multi-core crashing when huge number of topics; -B when using number and fitting beta, beta sampling wasn't working; both now fixed.

June 4, 2014, 04:08:12: Implemented multi-core using atomic operations. Improved manual. Various extensions and bugs fixed. Also "-B" flag has a different argument, so watch it!

November 29, 2013, 03:16:11: Added example on using burstiness.

November 25, 2013, 05:15:27: Added example on using burstiness.


Wray Buntine (on June 24, 2014, 06:21:54)

I noticed that in this update, hyper-parameter fitting of "beta" when using -B doesn't update the parameter. I'll have a new version out shortly to fix this, along with a few other improvements.

Wray Buntine (on June 24, 2014, 06:29:59)

Get more details about the theory from the KDD 2014 paper. Will be presenting in New York!
