- Description:
MA-sLDAc is a C++ implementation of supervised topic models that learn from labels provided by multiple annotators with different levels of expertise, as proposed in:
Rodrigues, F., Lourenço, M., Ribeiro, B., Pereira, F. Learning Supervised Topic Models for Classification and Regression from Crowds. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017.
Rodrigues, F., Lourenço, M., Ribeiro, B., Pereira, F. Learning Supervised Topic Models from Crowds. The Third AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2015.
The code is based on the supervised LDA (sLDA) implementation by Chong Wang and David Blei (http://www.cs.cmu.edu/~chongw/slda/). Three different variants of the proposed model are provided:
- MA-sLDAc (mle): This implementation uses maximum likelihood estimates for the topic distributions (beta) and the annotators' confusion matrices (pi);
- MA-sLDAc (smooth): This implementation places priors on beta and pi and performs approximate Bayesian inference;
- MA-sLDAc (svi): This implementation is similar to "MA-sLDAc (smooth)", but uses stochastic variational inference.
For simplicity, I recommend that first-time users start with "MA-sLDAc (mle)", since this version has fewer parameters that need to be specified.
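To make the role of the confusion matrices (pi) in the "mle" variant more concrete, here is a minimal sketch of a maximum likelihood estimate for a single annotator. It is not taken from the MA-sLDAc sources: the function name `estimate_confusion_matrix`, the `Matrix` alias, the use of -1 for a missing label, and the uniform fallback for unseen classes are all assumptions made for illustration. The sketch assumes that an estimate of each document's true-class probabilities is already available and simply normalizes the expected counts of (true class, observed label) pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper, not part of the MA-sLDAc code base.
// class_probs[d][c]: estimated probability that document d has true class c.
// labels[d]: label given by one annotator to document d, or -1 if unlabeled
//            (the -1 convention is an assumption of this sketch).
// Returns pi, where pi[c][l] estimates P(annotator answers l | true class c).
Matrix estimate_confusion_matrix(const Matrix& class_probs,
                                 const std::vector<int>& labels,
                                 std::size_t num_classes) {
    assert(class_probs.size() == labels.size());
    Matrix pi(num_classes, std::vector<double>(num_classes, 0.0));

    // Accumulate expected counts of (true class c, observed label l).
    for (std::size_t d = 0; d < labels.size(); ++d) {
        if (labels[d] < 0) continue;  // annotator skipped this document
        const std::size_t l = static_cast<std::size_t>(labels[d]);
        assert(l < num_classes && class_probs[d].size() == num_classes);
        for (std::size_t c = 0; c < num_classes; ++c) {
            pi[c][l] += class_probs[d][c];
        }
    }

    // Normalize each row so it sums to one; fall back to a uniform row
    // when no probability mass was observed for a class.
    for (std::size_t c = 0; c < num_classes; ++c) {
        double row_sum = 0.0;
        for (double v : pi[c]) row_sum += v;
        for (std::size_t l = 0; l < num_classes; ++l) {
            pi[c][l] = row_sum > 0.0 ? pi[c][l] / row_sum
                                     : 1.0 / static_cast<double>(num_classes);
        }
    }
    return pi;
}
```

Calling the function once per annotator would yield one matrix each. A reliable annotator would be expected to have most of the mass on the diagonal, while a noisy one would spread it across the off-diagonal entries.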
Sample multiple-annotator data based on the 20newsgroups dataset, along with further datasets, is available here: http://www.fprodrigues.com/software/
- Changes to previous version:
Initial Announcement on mloss.org.
- Supported Operating Systems: Linux, Mac
- Data Formats: Various
- Tags: Topic Modeling, Supervised Learning, Crowdsourcing