Some quick notes.
Definition on Wikipedia: http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
The key point: "it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics."
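That is, a document is treated as a mixture of topics, and each word in it has some probability of being assigned to each topic. Each topic is in turn a probability distribution over words.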
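
To make the "mixture of topics" idea concrete, here is a minimal sketch of how one document could be generated under this model. It only illustrates the generative story, not how topics are learned from data. The function names (sample_dirichlet, generate_document), the toy sizes (3 topics, 6 words in the vocabulary) and the value alpha = 0.5 are my own assumptions for illustration, not something from the Wikipedia article.

```python
import random

def sample_dirichlet(alpha, k, rng):
    """Draw a point on the k-simplex from a symmetric Dirichlet(alpha)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(n_words, alpha, topic_word, rng):
    """Sample one document: topic mixture, per-word topics, and word ids."""
    n_topics = len(topic_word)
    vocab = range(len(topic_word[0]))
    # The document's own mixture over topics (small alpha favors few topics).
    theta = sample_dirichlet(alpha, n_topics, rng)
    # Each word position is attributed to exactly one topic.
    z = rng.choices(range(n_topics), weights=theta, k=n_words)
    # Each word is drawn from the word distribution of its assigned topic.
    words = [rng.choices(vocab, weights=topic_word[t])[0] for t in z]
    return theta, z, words

rng = random.Random(0)
# Hypothetical toy setup: 3 topics, each a distribution over 6 word ids.
topic_word = [sample_dirichlet(0.5, 6, rng) for _ in range(3)]
theta, z, words = generate_document(20, 0.5, topic_word, rng)
```

Here theta is the document's topic mixture, z holds the topic picked for each word position, and words holds the resulting word ids. Fitting LDA means going the other way: recovering theta and topic_word from observed words.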
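Related model: probabilistic latent semantic analysis (PLSA). LDA can be viewed as a Bayesian version of PLSA: it places a Dirichlet prior on each document's topic distribution instead of treating that distribution as a free parameter.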
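
The relationship between the two models is easier to see side by side. The formulas below are a sketch in the usual notation, which these notes do not define, so the symbols are assumptions on my part: $\theta$ is a document's topic mixture, $z_n$ the topic of the $n$-th word $w_n$, $\alpha$ the Dirichlet parameter, and $\beta$ the per-topic word distributions.

```latex
% PLSA: the topic weights p(z | d) are free parameters fitted per document
p(w \mid d) = \sum_{z} p(w \mid z)\, p(z \mid d)

% LDA: the topic weights \theta are drawn from a Dirichlet prior and integrated out
\theta \sim \mathrm{Dir}(\alpha), \qquad
p(w_1, \dots, w_N \mid \alpha, \beta)
  = \int p(\theta \mid \alpha)
    \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)
    \, d\theta
```

In both cases a word is produced by first choosing a topic and then choosing a word from that topic. The difference is only in where the document's topic weights come from.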