aMMAI course blog: paper critique & summarization: Probabilistic Latent Semantic Indexing

Title: Probabilistic Latent Semantic Indexing
Author: Thomas Hofmann

summarization:
This paper proposes probabilistic latent semantic analysis (pLSA), an automatic indexing method based on a statistical latent class model. Given a document-term count matrix, the model extracts latent "topics": for each topic it estimates a probability distribution over words, and for each document it estimates a distribution over topics. The parameters are fitted with the EM algorithm, which iteratively updates these probabilities to increase the likelihood of the observed counts and stops once it converges to a (local) maximum.
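
The EM procedure described above can be sketched as follows. This is a minimal illustration in plain Python, not the paper's implementation: it assumes the parameterization P(w|d) = sum over z of P(w|z) P(z|d), random initialization, a fixed number of iterations instead of a convergence test, and that every document contains at least one word. The function names plsa and log_likelihood are hypothetical.

```python
import math
import random


def _normalize(row):
    """Scale a list of non-negative numbers so it sums to 1."""
    total = sum(row)
    if total == 0:  # guard: fall back to a uniform distribution
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by plain EM on a document-term count matrix (list of lists).

    Returns (p_w_z, p_z_d), where p_w_z[z][w] approximates P(w|z)
    and p_z_d[d][z] approximates P(z|d).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    # Random starting point (an assumption); every row sums to 1.
    p_w_z = [_normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [_normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iter):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue  # zero counts contribute nothing
                # E-step: posterior P(z|d,w), proportional to P(w|z) * P(z|d).
                joint = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                total = sum(joint)
                for z in range(n_topics):
                    expected = n_dw * joint[z] / total
                    # Accumulate expected counts for the M-step.
                    new_w_z[z][w] += expected
                    new_z_d[d][z] += expected
        # M-step: renormalize expected counts into probabilities.
        p_w_z = [_normalize(row) for row in new_w_z]
        p_z_d = [_normalize(row) for row in new_z_d]
    return p_w_z, p_z_d


def log_likelihood(counts, p_w_z, p_z_d):
    """Log-likelihood of the counts, omitting the constant P(d) term."""
    ll = 0.0
    for d, row in enumerate(counts):
        for w, n_dw in enumerate(row):
            if n_dw:
                mix = sum(p_w_z[z][w] * p_z_d[d][z] for z in range(len(p_w_z)))
                ll += n_dw * math.log(mix)
    return ll
```

For example, calling plsa(counts, n_topics=2) on a small matrix such as [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 6, 3], [0, 1, 4, 5]] returns one word distribution per topic and one topic distribution per document; log_likelihood can be used to check that the fit does not get worse as the number of iterations grows.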

critique:
pLSA is a useful tool for extracting hidden topics, which I find very interesting. Its main problem is efficiency: a large amount of computation is needed during fitting, especially when the vocabulary is large. How to improve its efficiency is therefore an important issue.

Saturday, June 20, 2009
