Spring 2014 ECE 6504 Probabilistic Graphical Models: Class Project

Virginia Tech

The basic idea is to use latent Dirichlet allocation (LDA) as the main component of the implementation. Several methods can be used to approximate the conditional (posterior) distribution of the LDA latent variables; they generally fall into two categories: sampling-based and variational methods. For the sampling-based approach, collapsed Gibbs sampling (CGS) was implemented and tested on random Wikipedia pages. For the variational approach, Online LDA (OLDA), which uses stochastic optimization, was implemented and compared against CGS on the same datasets and for different numbers of topics.
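
To make the sampling-based side concrete, the following is a minimal sketch of collapsed Gibbs sampling for LDA. It is not the project's actual implementation. The function name collapsed_gibbs_lda, the symmetric hyperparameters alpha and beta, their default values, and the toy corpus are all assumptions made for illustration. Each token's topic is resampled from its conditional distribution given every other assignment, which is proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).

```python
import random


def collapsed_gibbs_lda(docs, n_topics, vocab_size,
                        alpha=0.1, beta=0.01, n_iters=50, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (hypothetical helper).

    docs is a list of documents, each a list of integer word ids in
    [0, vocab_size). Returns (n_dk, n_kw): per-document topic counts
    and per-topic word counts after the final sweep.
    """
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]               # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(n_topics)]  # word counts per topic
    n_k = [0] * n_topics                                # total tokens per topic
    z = []                                              # topic assignment of each token

    # Random initialisation of topic assignments.
    for d, doc in enumerate(docs):
        z_d = [rng.randrange(n_topics) for _ in doc]
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1

                # Unnormalised conditional p(z = t | all other assignments).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    return n_dk, n_kw


# Toy corpus (assumed for illustration): 3 documents over a 4-word vocabulary.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
doc_topic, topic_word = collapsed_gibbs_lda(docs, n_topics=2, vocab_size=4)
print(doc_topic)
print(topic_word)
```

The sketch favors readability over speed: it uses plain Python lists and returns raw counts from a single final sample. A practical version would typically add a burn-in period, average over several samples, and normalize the counts (with alpha and beta added) to obtain the document-topic and topic-word distributions.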

