Title: Rapid object detection using a boosted cascade of simple features
Author: Paul Viola and Michael Jones
Summarization:
AdaBoost is built on the idea that many "weak classifiers" can be combined into a single, reliable "strong classifier". In every round it evaluates all the training examples, increases the weights of the misclassified ones and decreases the weights of the correctly classified ones, so that later weak classifiers concentrate on the hard cases and the training error is driven down. The paper also uses a representation called the "integral image", which lets the sum of the pixels inside any rectangle be computed with a few additions and subtractions, so the rectangle features are cheap to evaluate. AdaBoost can also serve as a feature selector, and this paper uses it that way for face detection: the classifiers are arranged one after another, each one filters out a large share of the remaining non-face windows, and only the promising windows reach the later classifiers. The same approach is suitable for selecting the valuable features in other problems.
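To make the integral image idea concrete, here is a minimal sketch in plain Python. The function names (`integral_image`, `rect_sum`, `two_rect_feature`) and the padded table layout are my own assumptions for illustration, not the paper's code. It shows that, once the table is built, any rectangle sum needs only four lookups.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a rectangle (width is assumed to be even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

For example, `rect_sum(integral_image(img), 1, 1, 2, 2)` adds up the 2 x 2 block whose top-left pixel is at row 1, column 1, no matter how large the image is.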
critique:
The idea of this paper is cool and simple, but it is a supervised algorithm, so the quality of the training data matters a great deal for performance. Moreover, as a feature selection algorithm it picks out the most important features easily, which could make it useful and efficient for approximate classification.
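The feature selection role of AdaBoost can be sketched as follows. This is textbook discrete AdaBoost with one-feature threshold classifiers ("stumps"), written under my own assumptions: labels are +1/-1, and the names `best_stump`, `adaboost` and `predict` are hypothetical. The weight update used in the paper differs in its details, so treat this as an illustration of the idea only.

```python
import math


def best_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[f] >= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def adaboost(X, y, rounds):
    """Each round picks one feature, so the ensemble doubles as a feature selector."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, f, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, thr, pol))
        # Raise the weights of misclassified examples, lower the others, renormalize.
        w = [wi * math.exp(-alpha * yi * (pol if row[f] >= thr else -pol))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all selected stumps."""
    score = sum(a * (pol if row[f] >= thr else -pol) for a, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1
```

The features that appear in `ensemble` are the selected ones; features that never win a round are simply ignored at prediction time.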
Saturday, June 20, 2009
paper critique & summarization: Algorithms for Fast Vector Quantization
Title: Algorithms for Fast Vector Quantization
Author: Sunil Arya and David M. Mount
Summarization:
For fast vector quantization, this paper introduces three algorithms:
1. Standard k-d tree search with incremental distance calculation, which uses the distance between the query and the cell boundaries to decide which branches can be abandoned.
2. Priority k-d tree search, which maintains a priority queue of the sibling subtrees passed over on the way down, ordered by their distance to the query, so that the cells closest to the query are visited first.
3. Neighborhood graphs, which use a neighborhood graph to improve precision: the search moves from the current point to whichever of its neighbors is nearest to the query and continues until it reaches a point whose neighbors have all been visited. This method gets the best performance of the three.
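The second method can be sketched roughly as below. This is a simplified version under my own assumptions: the tree is stored as nested tuples, the lower bound for a skipped subtree uses only the squared distance to the splitting plane (not the paper's full incremental distance calculation), and `build`, `sq_dist`, `priority_nearest` and `max_visits` are hypothetical names.

```python
import heapq


def build(points, depth=0):
    """Build a k-d tree as nested tuples: (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def priority_nearest(root, query, max_visits=None):
    """Visit subtrees in order of their lower bound on the distance to the query."""
    best, best_d = None, float("inf")
    heap = [(0.0, 0, root)]  # (lower bound, tie-breaker, subtree)
    counter, visits = 1, 0
    while heap:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_d:
            break  # every remaining subtree is at least this far away
        while node is not None:
            point, axis, left, right = node
            visits += 1
            d = sq_dist(point, query)
            if d < best_d:
                best, best_d = point, d
            diff = query[axis] - point[axis]
            near, far = (left, right) if diff < 0 else (right, left)
            if far is not None:
                # Remember the sibling we skip, keyed by how close it could be.
                heapq.heappush(heap, (max(bound, diff * diff), counter, far))
                counter += 1
            node = near
        if max_visits is not None and visits >= max_visits:
            break  # approximate answer: stop after a fixed budget of visited nodes
    return best, best_d
```

With `max_visits=None` the search is exact; setting a small budget trades accuracy for speed, which is the kind of approximate search that makes sense for vector quantization.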
critique:
k-d tree based quantization is efficient, but only in low-dimensional vector spaces. How to use these methods in high dimensions while keeping nearly the same performance is a very important question now.
paper critique & summarization: Probabilistic Latent Semantic Indexing
Title: Probabilistic Latent Semantic Indexing
Author: Thomas Hofmann
summarization:
This paper proposes pLSA, an automatic indexing method based on a statistical latent class model. As a simple example, given a document-term matrix, the model extracts latent "topics": it estimates the probability of each word under each topic and the relation between each topic and each document. An EM algorithm iteratively recomputes these probabilities to maximize the likelihood, and the process is done once it converges.
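The EM loop can be sketched in a few lines. This is a minimal version under my own assumptions: it uses the parameterization P(w|d) = sum over z of P(w|z) P(z|d), plain EM with a fixed number of iterations instead of a convergence test, and the name `plsa` and its arguments are hypothetical. It is not the exact procedure from the paper.

```python
import math
import random


def plsa(counts, n_topics, n_iters=50, seed=0):
    """counts[d][w] is how often word w occurs in document d."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        total = sum(row) or 1.0  # guard against an all-zero row
        return [v / total for v in row]

    # Random starting values for P(w|z) and P(z|d).
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalized([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    history = []  # log-likelihood before each update
    for _ in range(n_iters):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        loglik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d).
                joint = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                p_w_d = sum(joint)
                loglik += n * math.log(p_w_d)
                for z in range(n_topics):
                    r = n * joint[z] / p_w_d  # expected count assigned to topic z
                    new_w_z[z][w] += r
                    new_z_d[d][z] += r
        # M-step: turn the expected counts back into probabilities.
        p_w_z = [normalized(row) for row in new_w_z]
        p_z_d = [normalized(row) for row in new_z_d]
        history.append(loglik)
    return p_w_z, p_z_d, history
```

The triple loop over documents, words and topics in every iteration is also where the cost comes from when the vocabulary is large.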
critique:
pLSA is a useful tool for extracting hidden topics, which I think is very interesting. But its problem is efficiency: a large amount of computation is needed during the process, especially when the number of words is large. So how to improve its efficiency would be an important issue.
paper critique & summarization: The structure and function of complex networks
Title: The structure and function of complex networks
Author: M. E. J. Newman
summarization:
This paper introduces the basic properties and models of networks.
First, it tells us what a network is and describes several types of networks, which may be useful in real life.
Then it introduces the properties:
small-world effect (the famous one),
transitivity (the paper mentions a good measure for the density of triangles in a network),
degree distributions (which show us the long tail),
network resilience (some vertices are more important than others to the whole network),
mixing patterns (rules for which vertices connect to which),
community structure (the paper mentions a hierarchical algorithm to extract the clusters in a network).
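Two of the properties above are easy to compute on a small graph. The sketch below assumes an undirected graph without self-loops stored as a dict of neighbor sets; the helper names `from_edges`, `transitivity` and `degree_distribution` are my own. Transitivity is computed as the fraction of connected triples whose two end vertices are also linked, which equals 3 x (number of triangles) / (number of connected triples).

```python
from collections import Counter
from itertools import combinations


def from_edges(edges):
    """Build an undirected adjacency map: vertex -> set of neighbors."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def transitivity(adj):
    """Fraction of connected triples (paths of length two) that are closed."""
    closed = 0   # each triangle is counted three times, once per center vertex
    triples = 0
    for v, nbrs in adj.items():
        for a, b in combinations(nbrs, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


def degree_distribution(adj):
    """Map each degree k to the number of vertices that have degree k."""
    return Counter(len(nbrs) for nbrs in adj.values())
```

For a triangle with one extra vertex hanging off it, there are five connected triples and three of them are closed, so the transitivity is 0.6.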
Moreover, some models for constructing networks are also mentioned:
configuration model (a simple model in which vertices with given degrees are connected at random)
Price's model (based on the idea that "the rich get richer", which generates the long tail)
Barabási and Albert's model (based on the same idea as Price's model, but undirected)
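The "rich get richer" rule behind the last two models can be sketched as a small generator. This is a simplified undirected version under my own assumptions: the starting graph is a complete graph on m + 1 vertices, and the function name `preferential_attachment` and its parameters are hypothetical, not taken from the paper.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Grow an undirected graph to n vertices; each new vertex adds m edges."""
    rng = random.Random(seed)
    # Assumed starting point: a complete graph on m + 1 vertices.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every vertex appears in `stubs` once per incident edge, so a uniform draw
    # from this list picks a vertex with probability proportional to its degree.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # m distinct existing vertices
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges
```

Vertices that already have many edges occupy more slots in `stubs`, so they keep collecting new edges, which is what produces the long tail in the degree distribution.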
The paper also discusses some real-world problems, such as the spread of disease over a network: how to measure the speed of transmission and how the process unfolds. But I think it is still far from real cases.
critique:
Networks provide a good tool for visualizing many problems in the world, but as far as I have seen, most research still does not fit real cases well, much like the semantic gap problem in image retrieval. Still, it gives us a direction for solving these problems.