Title:Rapid object detection using a boosted cascade of simple features
Author: Paula Viola and Michael Jones
Summarization:
Adaboost is a useful tool which based on the concept that collect many "weak classify" will construct a "strong classify" that is solid. In every stage, it test all data and add the weight with the wrongly cases and reduce the weight with the collect cases, which want to minimize the training error. And this paper also uses a technique called "Integral image" to improve the calculation efficiency based on a simple addition and subtraction. Adaboost could use for feature selection, this paper use it for face detection, which could filter the non-face image one by one classifiier with important decrease, and it is also suitable for other features to select the valuable feature.
critique:
The idea of this paper is cool and simple, but it is a supervised algorithm, so the training data is important for the performance. Moreover, as a feature selection algorithm, it also select the most important feature easily, whcih could be useful and efficiency for approximate classify.
2009年6月20日 星期六
paper critique &summarization : Algorithms for Fast Vector Quantization
Title: Algorithms for Fast Vector Quantization
Author: Sunil Arya and David M. Mount
Summarization:
For fast vector quantization, this paper introduced three algorithm:
1.standard KD-tree with incremental distance calculation, which uses the information of distance between query and boundaries to decide which path could be stopped.
2.priority kD-tree search , which maintains a priority queue of subtrees to record the sibling of the node pass down and could find the really near point with query.
3. neighborhood graphs, which use the neighborhood graph to improve precision, it will expand query to the nearest neighbor and pass down until a point which all neighbor have been parsed, this method could get best performance of the three methods.
critique:
KD-tree based quantize algorithm is efficiency, but only in low dimension vector space, how to use these methods in high dimension and could reach nearly same performance is very important now.
Author: Sunil Arya and David M. Mount
Summarization:
For fast vector quantization, this paper introduced three algorithm:
1.standard KD-tree with incremental distance calculation, which uses the information of distance between query and boundaries to decide which path could be stopped.
2.priority kD-tree search , which maintains a priority queue of subtrees to record the sibling of the node pass down and could find the really near point with query.
3. neighborhood graphs, which use the neighborhood graph to improve precision, it will expand query to the nearest neighbor and pass down until a point which all neighbor have been parsed, this method could get best performance of the three methods.
critique:
KD-tree based quantize algorithm is efficiency, but only in low dimension vector space, how to use these methods in high dimension and could reach nearly same performance is very important now.
paper critique &summarization : Probabilistic Latent Semantic Indexing
Title: Probabilistic Latent Semantic Indexing
Author: Thomas Hoffmann
summarization:
This paper proposed a pLSA model which is a automatic indexing method based on statistical latent class model. As a simple example, give the matrix of document-term, this model could extract the concept of "topic", which measures the probability of each word to each topic and the relation between topic and document. A EM model is used to iterative compute the probabilities and try to reach the maximum likelihood and the process is done then.
critique:
pLSA is a useful tool that to extract the hidden topic, which I think is very interesting. But the problem of it is the efficiency, largely computation need to be compute during the process, especially when the word number is large. So, how to enhence the efficiency would be a important issue for it.
Author: Thomas Hoffmann
summarization:
This paper proposed a pLSA model which is a automatic indexing method based on statistical latent class model. As a simple example, give the matrix of document-term, this model could extract the concept of "topic", which measures the probability of each word to each topic and the relation between topic and document. A EM model is used to iterative compute the probabilities and try to reach the maximum likelihood and the process is done then.
critique:
pLSA is a useful tool that to extract the hidden topic, which I think is very interesting. But the problem of it is the efficiency, largely computation need to be compute during the process, especially when the word number is large. So, how to enhence the efficiency would be a important issue for it.
paper critique &summarization : The structure and function of complex networks
Title : The structure and function of complex networks.
Author: M. E. Newman
summarization:
This paper introduces the basic properties of and models of network.
First, it tells us what is a network, and some types of different networks, which maybe useful in our real life.
And it starts to introduce the properties:
small-word(which famous),
transitivity(mention a good measure for density of network),
degree distributions(which tell us the long tail),
network resilience(some vertices are more important in the whole network),
miximg pattern ( a connect rule between vertices),
community structure (mentions a hierarchical algorithm to extract the cluster in network).
moreover, some models to construct the network are also be mentioned:
configuration model(the simplest model which just random connected)
Price's model(based on the theory that "The rich get richer", so generate the long tail)
Barabasi and Albert's model(based on Price's model, but undirected)
This paper also discussed some real world problem like the transmitted disease, measure the transmit speed and how to process. but I think it is still far from real case.
critique:
network provides a good tool for visualize many problems in the world, but as I have seem, most research still couldn't fit the real case well just like the semantic gap problem in image retrieval, but it provides us a direction to solve the problems.
Author: M. E. Newman
summarization:
This paper introduces the basic properties of and models of network.
First, it tells us what is a network, and some types of different networks, which maybe useful in our real life.
And it starts to introduce the properties:
small-word(which famous),
transitivity(mention a good measure for density of network),
degree distributions(which tell us the long tail),
network resilience(some vertices are more important in the whole network),
miximg pattern ( a connect rule between vertices),
community structure (mentions a hierarchical algorithm to extract the cluster in network).
moreover, some models to construct the network are also be mentioned:
configuration model(the simplest model which just random connected)
Price's model(based on the theory that "The rich get richer", so generate the long tail)
Barabasi and Albert's model(based on Price's model, but undirected)
This paper also discussed some real world problem like the transmitted disease, measure the transmit speed and how to process. but I think it is still far from real case.
critique:
network provides a good tool for visualize many problems in the world, but as I have seem, most research still couldn't fit the real case well just like the semantic gap problem in image retrieval, but it provides us a direction to solve the problems.
2009年5月12日 星期二
paper critique &summarization : Rapid Object Detection using a Boosted Cascade of Simple Features
Title : Rapid Object Detection using a Boosted Cascade of Simple Features
Author : P. Viola and M. Jones
summarization:
Three major contributions:
1. introduce a new image representation call "integral image".
there 2 reasons why using features rather than pixels dirctly:
(1) can encode ad-hoc domain knowledge from training data.
(2) feature based system much faster.
this feature compute the sum of pixel values with a rectangular regions,
and use the difference between 2, 3, or 4 regions as the feature.
The best advantage is this paper use a dynamic programming way to compute the
difference between regions efficiently.
2. learning algorithm based on AdaBoost.
the object of AdaBoost is to select small set form all feature set and train the classify.
It selects the feature which is significant and achieve fewer than 1% false negatives
and 40% false positives by the experiment.
3. the method for combining increasingly more complex classifiers in a "cascade",
which like a filter and quickly discarded the noise region. This method could reject
many of negative regions and nearly detect all of positive regions.
critique : I think two methods in this paper is attractive. one is the efficiently feature computing.
the dynamic technique reduce the total computation needed. second is the cascade
method, reject most non-target at earlier stage but still guarantee the detection of
positive region.
and there are discussion after each issue, I think it is more readable for readers.
Author : P. Viola and M. Jones
summarization:
Three major contributions:
1. introduce a new image representation call "integral image".
there 2 reasons why using features rather than pixels dirctly:
(1) can encode ad-hoc domain knowledge from training data.
(2) feature based system much faster.
this feature compute the sum of pixel values with a rectangular regions,
and use the difference between 2, 3, or 4 regions as the feature.
The best advantage is this paper use a dynamic programming way to compute the
difference between regions efficiently.
2. learning algorithm based on AdaBoost.
the object of AdaBoost is to select small set form all feature set and train the classify.
It selects the feature which is significant and achieve fewer than 1% false negatives
and 40% false positives by the experiment.
3. the method for combining increasingly more complex classifiers in a "cascade",
which like a filter and quickly discarded the noise region. This method could reject
many of negative regions and nearly detect all of positive regions.
critique : I think two methods in this paper is attractive. one is the efficiently feature computing.
the dynamic technique reduce the total computation needed. second is the cascade
method, reject most non-target at earlier stage but still guarantee the detection of
positive region.
and there are discussion after each issue, I think it is more readable for readers.
2009年4月7日 星期二
paper critique & summarization :Algorithm for Fast Vector Quantization
Title : Algorithms for Fast Vector Quantization
Authors : Sunil Arya, David M. Mount
summarization:
This paper proposed three algorithms:
The first is the standard K-d tree with incremental distance calculation, which use a offset technique, to examine surrounding buckets to check that if any point is more near but in different buckets, to find the really nearly neighbor points with query point.
The second algorithm is the priority k-d three search, which try to find the nearly neighbor before run to the all algorithm termination. It maintains a priority queue which record the priority of every possibly subtree, and search will terminate when the queue empty(means that all tree have been traversed) or the highest priority subtree has larger distance to query point than the closest point we find in the sequence step.
And the third algorithm is the Neighborhood graphs, which is a simple and greedy algorithm.
It builds graph from data points by the rule that if any two points, we can't find any point which has both shorter distance to this two point than the distance between these two points, then they have a edge. And the algorithm expand the points that is closest to the query point recursively, until arrive at a point which its neighbor are all expanded or get a predefined number. Then output the closest data point visited.
critique:
Actually, what I care about of this issue is how they perform on the high dimensions domain, the dimensions of their test data are just 8~16, for my research, the dimensions always over then 100, and we have known that at least for the standard K-d tree, we have worse performance in time issue, so maybe we need to find another quantization method in image retrieval domain.
Authors : Sunil Arya, David M. Mount
summarization:
This paper proposed three algorithms:
The first is the standard K-d tree with incremental distance calculation, which use a offset technique, to examine surrounding buckets to check that if any point is more near but in different buckets, to find the really nearly neighbor points with query point.
The second algorithm is the priority k-d three search, which try to find the nearly neighbor before run to the all algorithm termination. It maintains a priority queue which record the priority of every possibly subtree, and search will terminate when the queue empty(means that all tree have been traversed) or the highest priority subtree has larger distance to query point than the closest point we find in the sequence step.
And the third algorithm is the Neighborhood graphs, which is a simple and greedy algorithm.
It builds graph from data points by the rule that if any two points, we can't find any point which has both shorter distance to this two point than the distance between these two points, then they have a edge. And the algorithm expand the points that is closest to the query point recursively, until arrive at a point which its neighbor are all expanded or get a predefined number. Then output the closest data point visited.
critique:
Actually, what I care about of this issue is how they perform on the high dimensions domain, the dimensions of their test data are just 8~16, for my research, the dimensions always over then 100, and we have known that at least for the standard K-d tree, we have worse performance in time issue, so maybe we need to find another quantization method in image retrieval domain.
2009年3月24日 星期二
paper critique & summarization : Shape Matching and Object Recognition Using Shape Contexts
Title : Shape Matching and Object Recognition Using Shape Contexts
Author : S. Belongie, J. Malik, and J. Puzicha
Summarization:
This paper proposed a shape descriptor : shape context, to describe the coarse distribution of the rest of the shape to a point on the shape. And solve the correspondence problem by computing sum of matching errors between corresponding points to find the nearest neighbor.
In their definition, a shape is a set of edge pixels which are found by edge detector. For computing the distance between two shape, they used the shape context which compute the coarse histogram of n-1 points in the relative coordinates position by the query point, and used this histogram to compute the difference with the other shape to find the best match points-pair with the other shape.
Author : S. Belongie, J. Malik, and J. Puzicha
Summarization:
This paper proposed a shape descriptor : shape context, to describe the coarse distribution of the rest of the shape to a point on the shape. And solve the correspondence problem by computing sum of matching errors between corresponding points to find the nearest neighbor.
In their definition, a shape is a set of edge pixels which are found by edge detector. For computing the distance between two shape, they used the shape context which compute the coarse histogram of n-1 points in the relative coordinates position by the query point, and used this histogram to compute the difference with the other shape to find the best match points-pair with the other shape.
2009年3月10日 星期二
paper critique & summarization : Nonlinear Dimensionality Reduction by Locally Linear Embeddign
Title : Nonlinear Dimensionality Reduction by Locally Linear Embedding.
Author : S. T. Roweis and L. K. Saul
summarization:
This paper developed a unsupervised learning algorithm that could be used for dimension reduction.
The based ideas are:
1. Find the neighbors for each points in original coordinate system(maybe by KNN).
2. find some point in the neighbors could reconstruct the original point by linear combination.
3. map all the information above onto the low dimensions coordinate system and minimize the reconstruction error.
critique:
this approach is easy to figure out, but I like the figure 1 he draw. For PCA or MDS, they also directly project the data onto the low dimension system, so some far data would be projected into near position as the figure show. And this method use the neighbor concept to avoid this situation, it easy but make sense, I like this figure.
Author : S. T. Roweis and L. K. Saul
summarization:
This paper developed a unsupervised learning algorithm that could be used for dimension reduction.
The based ideas are:
1. Find the neighbors for each points in original coordinate system(maybe by KNN).
2. find some point in the neighbors could reconstruct the original point by linear combination.
3. map all the information above onto the low dimensions coordinate system and minimize the reconstruction error.
critique:
this approach is easy to figure out, but I like the figure 1 he draw. For PCA or MDS, they also directly project the data onto the low dimension system, so some far data would be projected into near position as the figure show. And this method use the neighbor concept to avoid this situation, it easy but make sense, I like this figure.
paper critique & summarization : Eigenfaces for Recognition
Title : Eigenfaces for Recognition
Author : Matthew Turk and Alex Pentland
summarization:
This paper developed a method to recognize human face based on projecting new faces onto the feature space, rather then depend on the 3-D information or geometry which before researches used, and then called these feature space "eigenface".
The idea is com from a technique developed by Sirovich and Kirby(1987&1990), they represented pictures of faces by using principal component analysis. They calculated a coordinate system for image compression and each coordinate is an image, they called them "eigenpicture".
And this approach has some base intuitions:
1. every human faces have some similarities, like the rough positions of eyes, nose, and mouth.
2. we can compute a average face, which would be the average of all training faces.
3. compute the differ of average face and all training faces, and keeping the M faces which correspond to the highest eigenvaluse, defined them as "face space".
4. then every face could be combined by using this eigenfaces with different weight.
critique:
This paper is similar to the other paper we read this week, they both use some component to combine the original data. And the number of component used also decides how precise the combined data with the original data. And by selecting few principal component, we could get the not bad performance and use little dimension to store the information, to get the effect of dimension reduction we want.
Author : Matthew Turk and Alex Pentland
summarization:
This paper developed a method to recognize human face based on projecting new faces onto the feature space, rather then depend on the 3-D information or geometry which before researches used, and then called these feature space "eigenface".
The idea is com from a technique developed by Sirovich and Kirby(1987&1990), they represented pictures of faces by using principal component analysis. They calculated a coordinate system for image compression and each coordinate is an image, they called them "eigenpicture".
And this approach has some base intuitions:
1. every human faces have some similarities, like the rough positions of eyes, nose, and mouth.
2. we can compute a average face, which would be the average of all training faces.
3. compute the differ of average face and all training faces, and keeping the M faces which correspond to the highest eigenvaluse, defined them as "face space".
4. then every face could be combined by using this eigenfaces with different weight.
critique:
This paper is similar to the other paper we read this week, they both use some component to combine the original data. And the number of component used also decides how precise the combined data with the original data. And by selecting few principal component, we could get the not bad performance and use little dimension to store the information, to get the effect of dimension reduction we want.
2009年3月3日 星期二
paper critique & summarization : Distinctive Image Features from Scale-Invariant Keypoints
Title: Distinctive Image Features from Scale-Invariant Keypoints
Author: D. G. Lowe
This paper proposes a approach named "Scale Invariant Feature Transform(SIFT)", which transforms image data into scale-invariant coordinates relative to local features.
the approach has 4 major stages, which introduced as follow:
1. Scale-space extrema detection - uses the DoG function to find the potential interest points, by comparing the 26 neighbor points and be selected if the DoG is larger or smaller then all others.
2. Keypoint localization - a detailed model is fit to determine location and scale. To delete the candidate points which contrast is low or on the edge.
3. Orientation assignment - one or more orientations are assigned to each keypoint location based on local image gradient directions. And the image data are transformed by this orientation to make sure the invariant of rotation.
4. Keypoint descriptor - represents these points with the parameters of location, scale, and orientation which assigned to these points.
And the keypoint descriptors generated by the above stages are highly distinctive, which is invariant in rotation, scale, and viewpoint.
Author: D. G. Lowe
This paper proposes a approach named "Scale Invariant Feature Transform(SIFT)", which transforms image data into scale-invariant coordinates relative to local features.
the approach has 4 major stages, which introduced as follow:
1. Scale-space extrema detection - uses the DoG function to find the potential interest points, by comparing the 26 neighbor points and be selected if the DoG is larger or smaller then all others.
2. Keypoint localization - a detailed model is fit to determine location and scale. To delete the candidate points which contrast is low or on the edge.
3. Orientation assignment - one or more orientations are assigned to each keypoint location based on local image gradient directions. And the image data are transformed by this orientation to make sure the invariant of rotation.
4. Keypoint descriptor - represents these points with the parameters of location, scale, and orientation which assigned to these points.
And the keypoint descriptors generated by the above stages are highly distinctive, which is invariant in rotation, scale, and viewpoint.
paper critique & summarization : Scale & Affine Invariant Interest Point Detectors
Title : Scale & Affine Invariant Interest Point Detectors
Author: K. Mikolajczyk and C. Schmid
This paper proposes a method for detecting interest points invariant to scale and affine transformations. It combines the Harris detector with Laplacian to get the goal.
Harris detector is a interest point detector, but it is not invariant to scale changes. To solve this problem, they generate multi-scale for each point, and use Laplacian-of-Faussians to find the highest percentage of correct characteristic scales. Then can generate a scale invariant detector, and they call it "Harris-Laplace detector".
And for the affine invariant interest point detector, because the Harris-Laplace detector could not reflect the real transformation of a point when it has not only scale change but also affine transformation. They use the "Second Moment Matrix" to fix the problem. But in fact, the math formulas is hard for me.
And the "Harris-Affine Interest Point Detector" which they propose have the following parts:
1. spatial localization - determined by the local maximum of the Harris function.
2. integration scale - selected at the extremum over scale of the normalized Laplacian.
3. differentiation scale - selected at the maximum of normalized isotropy
4. shape adaptation matrix - estimated with the second moment matrix.
Author: K. Mikolajczyk and C. Schmid
This paper proposes a method for detecting interest points invariant to scale and affine transformations. It combines the Harris detector with Laplacian to get the goal.
Harris detector is a interest point detector, but it is not invariant to scale changes. To solve this problem, they generate multi-scale for each point, and use Laplacian-of-Faussians to find the highest percentage of correct characteristic scales. Then can generate a scale invariant detector, and they call it "Harris-Laplace detector".
And for the affine invariant interest point detector, because the Harris-Laplace detector could not reflect the real transformation of a point when it has not only scale change but also affine transformation. They use the "Second Moment Matrix" to fix the problem. But in fact, the math formulas is hard for me.
And the "Harris-Affine Interest Point Detector" which they propose have the following parts:
1. spatial localization - determined by the local maximum of the Harris function.
2. integration scale - selected at the extremum over scale of the normalized Laplacian.
3. differentiation scale - selected at the maximum of normalized isotropy
4. shape adaptation matrix - estimated with the second moment matrix.
2009年2月24日 星期二
paper critique & summarization : How to give a good research talk
Title : How to give a good research talk
Author: Simon L Peyton Jones, John Hughes, and John Launchbury
Summarization:
This paper propose some notices that may help the paper presentation.
1. About the material, the most important is, who are your audience?
Although for the same topic, you should have different material depend on what your audience would like to listen.
2. Ddopt a non-uniform approach to your talk! Pick some really important to your audience to talk, don't waste too much time on the section which is boring.
3. Put only 6~7 things in a slide, and maybe some images or diagrams are better then only the text. "Present" your slides, not "read" the text in your slides.
4. Don't prepare the slides early. (In fact, I disagree this point, I think that spend more time on preparing slides is OK for me to have more chance to make the slides better.)
5. Attract the eyes of audience in short minute and don't over-run.
Citique:
This paper propose some simple tips for paper presentation, some is useful for me, I always spent some time on the non-important parts in my presentation. But maybe because of my english is poor, I feel the way this paper written is not so smooth for me, some sentances take my more time to figure out what it wants to present.
Author: Simon L Peyton Jones, John Hughes, and John Launchbury
Summarization:
This paper propose some notices that may help the paper presentation.
1. About the material, the most important is, who are your audience?
Although for the same topic, you should have different material depend on what your audience would like to listen.
2. Ddopt a non-uniform approach to your talk! Pick some really important to your audience to talk, don't waste too much time on the section which is boring.
3. Put only 6~7 things in a slide, and maybe some images or diagrams are better then only the text. "Present" your slides, not "read" the text in your slides.
4. Don't prepare the slides early. (In fact, I disagree this point, I think that spend more time on preparing slides is OK for me to have more chance to make the slides better.)
5. Attract the eyes of audience in short minute and don't over-run.
Citique:
This paper propose some simple tips for paper presentation, some is useful for me, I always spent some time on the non-important parts in my presentation. But maybe because of my english is poor, I feel the way this paper written is not so smooth for me, some sentances take my more time to figure out what it wants to present.
paper critique & summarization : How to read a paper
Title: How to Read a Paper
Author: S. Keshav
Published: ACM SIGCOMM 2007
summarization:
Three passes to read a paper:
First Pass: Scan the title, abstract, introduction, conclusion,
reference and the headings for each sections.
And find what type this paper is, what assumptions it has,
what contributions it claims.
Second Pass: Read the figures, diagrams, and illustrations carefully to get the key points.
Read the reference paper first if you need the background knowledge.
Third Pass: Virtually re-implement the paper.
Challenge every assumptions and consider how to solve the problem yourself.
Find the advantage and disadvantage of this paper.
For literature survay, the three passes mathod is also useful,
the first pass should be read the "recent" few papers about the field;
second pass is finding the good researcher in this area and knowing what he do the last years;
and final pass we could look around the websit of top conferences of this field to get more key papers.
Author: S. Keshav
Published: ACM SIGCOMM 2007
summarization:
Three passes to read a paper:
First Pass: Scan the title, abstract, introduction, conclusion,
reference and the headings for each sections.
And find what type this paper is, what assumptions it has,
what contributions it claims.
Second Pass: Read the figures, diagrams, and illustrations carefully to get the key points.
Read the reference paper first if you need the background knowledge.
Third Pass: Virtually re-implement the paper.
Challenge every assumptions and consider how to solve the problem yourself.
Find the advantage and disadvantage of this paper.
For literature survay, the three passes mathod is also useful,
the first pass should be read the "recent" few papers about the field;
second pass is finding the good researcher in this area and knowing what he do the last years;
and final pass we could look around the websit of top conferences of this field to get more key papers.
訂閱:
文章 (Atom)