Latent Topic Models for Hypertext 1.0

We present a probabilistic generative model for hypertext document collections that explicitly models the generation of links. Specifically, links from a word w to a document d depend directly on how frequent the topic of w is in d, as well as on the in-degree of d. We show how to perform EM learning on this model efficiently. Because links are not modeled as analogous to words, the model uses far fewer free parameters and obtains better link prediction results.
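
To make the link dependence concrete, here is a minimal sketch, not the model's actual parameterization. It assumes that the score of a link from a word to a target document is the product of two quantities: the weight of the word's topic in the target's topic mixture, and a per-document importance value standing in for in-degree. The names link_score, rank_targets, theta and importance, and all numbers, are hypothetical and chosen only for illustration.

```python
def link_score(topic_of_word, theta_target, importance_target):
    """Hypothetical link score (an assumption, not the published formula).

    topic_of_word:     index of the topic assigned to the source word
    theta_target:      topic mixture of the candidate target document
    importance_target: scalar standing in for the target's in-degree
    """
    return importance_target * theta_target[topic_of_word]


# Toy topic mixtures over three topics for two candidate target documents.
theta = {
    "doc_a": [0.7, 0.2, 0.1],
    "doc_b": [0.1, 0.8, 0.1],
}

# Toy per-document importance values (illustrative stand-ins for in-degree).
importance = {"doc_a": 0.05, "doc_b": 0.02}


def rank_targets(topic_of_word):
    """Return candidate target documents sorted by descending link score."""
    scores = {
        doc: link_score(topic_of_word, theta[doc], importance[doc])
        for doc in theta
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, a word assigned to topic 1 ranks doc_b above doc_a even though doc_a has the larger importance value, because topic 1 dominates doc_b's mixture. Each document contributes only its topic mixture and one scalar, which is one way to picture why treating links differently from words needs fewer free parameters.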

Amit Gruber
Wed, 02 Sep 2009