Active Learning for Structured Probabilistic Models with Histogram Approximation

Qing Sun (Virginia Tech), Ankit Laddha (CMU) and Dhruv Batra (Virginia Tech)

Abstract

This paper studies active learning in structured probabilistic models such as Conditional Random Fields (CRFs). This is a challenging problem because, unlike unstructured prediction problems such as binary or multi-class classification, structured prediction problems involve a distribution with an exponentially large support, for instance, over the space of all possible segmentations of an image. Thus, the entropy of such models is typically intractable to compute. We propose a crude yet surprisingly effective histogram approximation to the Gibbs distribution, which replaces the exponentially large support with a coarsened distribution that may be viewed as a histogram over M bins. We show that our approach outperforms a number of baselines and results in a 90% reduction in the number of annotations needed to achieve nearly the same accuracy as learning from the entire dataset.
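To make the idea of a histogram over M bins concrete, the following is a minimal sketch of how an entropy-based query score could be computed once the Gibbs distribution has been coarsened. It assumes each unlabeled example already comes with M unnormalized log-masses, one per bin; how those bins are constructed is not specified in the abstract and is left out here. The names `histogram_entropy`, `most_uncertain`, and `pool` are hypothetical and chosen for illustration only.

```python
import math


def histogram_entropy(log_masses):
    """Entropy (in nats) of a histogram whose M bins carry the given
    unnormalized log-masses. Assumption: the bins are already available."""
    if not log_masses:
        raise ValueError("need at least one bin")
    peak = max(log_masses)
    # Log-sum-exp normalization, shifted by the peak for numerical stability.
    log_z = peak + math.log(sum(math.exp(s - peak) for s in log_masses))
    probs = [math.exp(s - log_z) for s in log_masses]
    # Bins with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def most_uncertain(pool):
    """Return the key of the example whose histogram has the highest entropy.
    `pool` maps a hypothetical example id to its list of bin log-masses."""
    return max(pool, key=lambda key: histogram_entropy(pool[key]))
```

Under these assumptions, an active learner would request an annotation for the example returned by `most_uncertain`, retrain, and repeat. The cost of scoring one example is linear in M rather than in the size of the full output space, which is the point of the coarsening.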