jLDADMM
http://mloss.org
Updates and additions to jLDADMM
Wed, 19 Aug 2015 12:52:36 -0000
jLDADMM 1.0
<html><p>jLDADMM: A Java package for the LDA and DMM topic models</p>
<p>The Java package jLDADMM provides alternatives for topic modeling on normal-length or short texts. Probabilistic topic models, such as Latent Dirichlet Allocation (LDA) and related models, are widely used to discover latent topics in document collections. However, applying topic models to short texts (e.g. Tweets) is more challenging because of data sparsity and the limited context in such texts. One approach is to combine short texts into long pseudo-documents before training LDA. Another approach is to assume that there is only one topic per document.</p>
<p>jLDADMM provides implementations of the LDA topic model and the one-topic-per-document Dirichlet Multinomial Mixture (DMM) model (also known as the mixture of unigrams). Both implementations use Gibbs sampling for inference. Furthermore, jLDADMM supplies a document clustering evaluation to compare topic models, using two common metrics: purity and normalized mutual information (NMI).</p>
<p>See for more details.</p></html>
Dat Quoc Nguyen
dirichlet allocation, dirichlet multinomial mixture model, dmm, mixture of unigrams, short texts, topic model, tweets
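
The description mentions a document clustering evaluation based on purity and normalized mutual information. As a rough illustration of what these two metrics measure, here is a minimal sketch in Java. It is not taken from jLDADMM: the class name `ClusteringMetrics` and its methods are hypothetical, each document is assumed to have one predicted cluster id and one gold label id, and NMI is assumed to be normalized by the arithmetic mean of the two entropies (other normalizations exist).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of jLDADMM): purity and NMI for hard cluster assignments. */
public class ClusteringMetrics {

    /** Builds the contingency table: table.get(cluster).get(label) = number of documents. */
    static Map<Integer, Map<Integer, Integer>> contingency(int[] clusters, int[] labels) {
        if (clusters.length != labels.length) {
            throw new IllegalArgumentException("clusters and labels must have the same length");
        }
        Map<Integer, Map<Integer, Integer>> table = new HashMap<>();
        for (int i = 0; i < clusters.length; i++) {
            table.computeIfAbsent(clusters[i], k -> new HashMap<>())
                 .merge(labels[i], 1, Integer::sum);
        }
        return table;
    }

    /** Purity: fraction of documents that carry the majority gold label of their cluster. */
    public static double purity(int[] clusters, int[] labels) {
        int majoritySum = 0;
        for (Map<Integer, Integer> row : contingency(clusters, labels).values()) {
            int best = 0;
            for (int count : row.values()) {
                best = Math.max(best, count);
            }
            majoritySum += best;
        }
        return (double) majoritySum / clusters.length;
    }

    /** Shannon entropy (natural log) of a hard assignment. */
    static double entropy(int[] assignment) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int a : assignment) {
            freq.merge(a, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : freq.values()) {
            double p = (double) count / assignment.length;
            h -= p * Math.log(p);
        }
        return h;
    }

    /** NMI = I(clusters; labels) / ((H(clusters) + H(labels)) / 2). Normalization is an assumption. */
    public static double nmi(int[] clusters, int[] labels) {
        int n = clusters.length;
        Map<Integer, Integer> clusterSize = new HashMap<>();
        Map<Integer, Integer> labelSize = new HashMap<>();
        for (int i = 0; i < n; i++) {
            clusterSize.merge(clusters[i], 1, Integer::sum);
            labelSize.merge(labels[i], 1, Integer::sum);
        }
        double mutualInfo = 0.0;
        for (Map.Entry<Integer, Map<Integer, Integer>> row : contingency(clusters, labels).entrySet()) {
            for (Map.Entry<Integer, Integer> cell : row.getValue().entrySet()) {
                double joint = (double) cell.getValue() / n;
                double pCluster = (double) clusterSize.get(row.getKey()) / n;
                double pLabel = (double) labelSize.get(cell.getKey()) / n;
                mutualInfo += joint * Math.log(joint / (pCluster * pLabel));
            }
        }
        double meanEntropy = (entropy(clusters) + entropy(labels)) / 2.0;
        // If both partitions consist of a single group, they agree trivially.
        return meanEntropy == 0.0 ? 1.0 : mutualInfo / meanEntropy;
    }

    public static void main(String[] args) {
        int[] clusters = {0, 0, 1, 1, 1}; // predicted cluster per document (toy data)
        int[] labels   = {7, 7, 8, 8, 7}; // gold label per document (toy data)
        System.out.println("purity = " + purity(clusters, labels));
        System.out.println("NMI    = " + nmi(clusters, labels));
    }
}
```

For a topic model, the predicted cluster of a document would typically be its single assigned topic (DMM) or its most probable topic (LDA). Both metrics are invariant to how cluster ids are numbered, so no alignment between topic ids and gold labels is needed.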