Prakriti Banik
Spring
2014 ECE 6504 Probabilistic Graphical Models: Class Project
Virginia
Tech
In vision problems like category label segmentation
we predict structured objects. In structured prediction, we are interested in
the prediction of a structured label, where the input is a complex object.
Evaluation measures for such structured prediction problems are often
task-specific. Intersection-over-Union (IOU) accuracy is one such well known
evaluation measures (used by PASCAL VOC) which fixes background bias.
Discriminative machine learning tries to train a system to optimize such specific
measure of performance or loss.
In this project, the goal is to optimize this evaluation measure in a greedy
fashion to maximize performance. The algorithm takes as input a model that
produces properly calibrated probabilities and makes predictions that seek to
maximize expected score on the test set with a greedy optimization.
Optimization is done at corpus-level, as PASCAL VOC [5]
segmentation challenge uses corpus-level IOU accuracy. IOU accuracy is measured
according to the following equation.
Where, * is the ground truth label and is
the predicted label for all superpixels, K
is number of categories/classes.
We
consider category label segmentation, which is a structured learning problem.
Let G = (V, E) be a graph with nodes V and edges E defined over an output variable space ,
where each component of takes one of K discrete values. DivMBest [1]
algorithm reasons about such an exponentially large output space, makes a
joint-prediction and outputs a diverse set of M plausible solutions. The steps of the approach are as follows
1.
Compute feature for each superpixel.
2.
Obtain a predictive distribution Q(), which fully factorizes using Softmax
Regression.
3.
Take MAP solution of all images in the
corpus.
4.
Compute surrogate gain [2]
using Q using the following equation
of
5.
For each k, among the superpixels not labeled as k, find the superpixel with highest probability to be label k in the corpus, and flip that
superpixel to label k.
6.
Repeat until no more flips can improve the
surrogate gain.
Features: In the proposed approach
we take diverse M best solutions for
each image and for each superpixel in G,
we compute the histogram of K
discrete labels assigned in all M
solutions proposed by DivMBest algorithm and use this histogram as feature. An
example image and its M=8 diverse
best solutions are shown below.
Superpixels of an image is obtained by
taking intersection of top 150 CPMC [3]
segments. The superpixels of the above image is as follows.
We also experimented with 21 dimensional O2P
[4]
score along with the previously mentioned histogram as feature for each
superpixel. The results are shown in the next section.
The IOU accuracy of 3-fold cross-validation
vs. M DivMBest solutions plot for
softmax regression on Pascal validation dataset is shown in figure (a). The
feature used for this plot was both histogram and O2P score for each
superpixel. The predictive distribution obtained here is used for optimizing
the surrogate expected gain in the above mentioned greedy fashion. The IOU
accuracy vs. Surrogate expected gain plot is shown in figure (b).
From
figure (b) we can observe that, MAP IOU accuracy is 45.1%. Initially the IOU
accuracy goes up as surrogated expected gain goes up, but at iteration 17243,
when the optimization algorithm converges, the IOU accuracy reaches around 43.9%.
This is because the predictive distribution Q(), obtained from softmax regression does not
seem to capture to the true distribution. A better feature representation might
help to improve the predictive distribution, and the ultimate IOU accuracy.