ECE 5984: Advanced Topics in Computer Vision, Spring 2014

Electrical and Computer Engineering Department, Virginia Tech

Meets: TR 3:30 pm to 4:45 pm in McBryde Hall (MCB) 233 (previously Hutcheson Hall (HUTCH) 207).

Instructor: Devi Parikh
Email: parikh@vt.edu
Office: 440 Whittemore Hall
Office hours: By appointment.

 

Prize Winners!

Best Paper Presentation: Jason Ziglar and John Peterson

Best Discussion Participant: Ramakrishna Vedantam

Best Project: Michael Cogswell

Congratulations!

 

Class projects:

Dual Channel Analytics and Tracking of Cells Experiencing the Dielectrophoretic Force (Lisa Anders)

Semantic Segmentation with Deep Learning (Michael Cogswell)

Building High-Level Object Vocabularies (Jacob Dennis)

Application of Selective Search to Pose Estimation (Ujwal Krothapalli)

Making Intelligent and Interpretable Classification Systems (Shrenik Lad)

Towards Cascade Object Detection (John Peterson)

Person of Interest in Images (Clint Solomon)

Understanding Predictions of Structured Probabilistic Vision Systems (Qing Sun)

Improving Image Segmentation using Object Proposals (Sean Thweatt)

Understanding and Predicting Importance for Abstract Images (Ramakrishna Vedantam)

Improving SIFT Matching by Interest Points Filtering (Rabih Younes)

Salient Superpixels (Jason Ziglar)

Improving DPM accuracy with Component Selection Strategy (Peng Zhang)

  

Course overview:

This is a graduate course in computer vision. The focus of this course is to survey and critique current and state-of-the-art approaches in computer vision. We will read research papers on a variety of important topics pertaining to visual recognition, analyze their strengths and weaknesses, and identify open research questions. See the schedule for a list of topics we will cover.

Pre-requisite:

An introductory computer vision course or equivalent. A machine learning or pattern recognition course may also be beneficial.

Requirements:

The following are required to successfully complete this course:

Discussion

Paper reviews

Presentations

Project

Important dates:

Schedule (topic and papers)

Date Topic and papers Presenter
01/21 Introduction
 
Devi
[slides]
01/23 Research overview
 
Devi
01/28 Local features-based image descriptions

Detail:
Object Categorization by Learned Universal Visual Dictionary. J. Winn, A. Criminisi and T. Minka. ICCV 2005. [project page]

High-level:
A Performance Evaluation of Local Descriptors. K. Mikolajczyk and C. Schmid.  CVPR 2003.

Background:

Seminal paper: Object Recognition from Local Scale-Invariant Features. D. Lowe. ICCV 1999.  [code] [other implementations of SIFT] [IJCV paper]

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid and J. Ponce. CVPR 2006. [15 scenes dataset] [pyramid match toolkit] [Matlab code]

Rabih
01/30 Object discovery

Detail:
Using Multiple Segmentations to Discover Objects and their Extent in Image Collections. B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman and A. Zisserman.  CVPR 2006. [code]

High-level:
Foreground Focus: Finding Meaningful Features in Unlabeled Images. Y. J. Lee and K. Grauman. BMVC 2008. [project page]

Jacob
02/04 Object detection

Detail:
A Discriminatively Trained, Multiscale, Deformable Part Model. P. Felzenszwalb,  D.  McAllester and D. Ramanan. CVPR 2008. [code]

High-level:
Histograms of Oriented Gradients for Human Detection. N. Dalal and B. Triggs. CVPR 2005. [video] [PASCAL datasets]

Extra:

Rapid Object Detection Using a Boosted Cascade of Simple Features. P. Viola and M. Jones. CVPR 2001.

Diagnosing Error in Object Detectors. D. Hoiem, Y. Chodpathumwan and Q. Dai. ECCV 2012. [code and data]
 
Ujwal
02/06 Object proposals

Detail:
What is an Object?  B. Alexe, T. Deselaers and V. Ferrari.  CVPR 2010. [code]

High-level:
Category Independent Object Proposals. I. Endres and D. Hoiem. ECCV 2010. [project]

Extra:

Constrained Parametric Min-Cuts for Automatic Object Segmentation. J. Carreira and C. Sminchisescu. CVPR 2010. [code]
 
Qing
02/11 Segmentation

Detail:
Learning a Classification Model for Segmentation. X. Ren and J. Malik. ICCV 2003.

High-level:
Combining Top-down and Bottom-up Segmentation. E. Borenstein, E. Sharon and S. Ullman.  CVPR  workshop 2004. [data]
 
Sean (papers)
Michael (experiments)
02/13 Classes canceled (snow)
 
N/A
02/18 Pose

Detail:
Real-Time Human Pose Recognition in Parts from a Single Depth Image.  J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman and A. Blake. CVPR 2011. [video] [project page]

High-level:
Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations. L. Bourdev and J. Malik.  ICCV 2009. [code]

Extra:

Articulated Pose Estimation using Flexible Mixtures of Parts. Y. Yang and D. Ramanan. CVPR 2011. [code]
 
Ujwal (papers)
Qing (experiments)
02/20 Context

Detail:
Object-Graphs for Context-Aware Category Discovery.  Y. J. Lee and K. Grauman.  CVPR 2010. [code]

High-level:
An Empirical Study of Context in Object Detection. S. Divvala, D. Hoiem, J. Hays, A. Efros and M. Hebert. CVPR 2009. [project page]
 
Jason
02/25 Holistic scene understanding

Detail:
TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation.  J. Shotton, J. Winn, C. Rother and A. Criminisi.  ECCV 2006. [project page] [data] [code]

High-level:
Describing the Scene as a Whole: Joint Object Detection, Scene Classification and Semantic Segmentation. J. Yao, S. Fidler and R. Urtasun. CVPR 2012. 
 
Michael
02/27 Groups of objects

Detail:
Recognition Using Visual Phrases.  M. Sadeghi and A. Farhadi.  CVPR 2011.

High-level:
Automatic Discovery of Groups of Objects for Scene Understanding. C. Li, D. Parikh and T. Chen. CVPR 2012. [project page]
 
Prakriti
03/04 Saliency

Detail:
Learning to Predict Where Humans Look. T. Judd, K. Ehinger, F. Durand, and A. Torralba. ICCV 2009. [project page]
 
High-level:
Learning to Detect a Salient Object.  T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum. CVPR 2007. [results] [data] [code]
 
Extra:

A Model of Saliency-based Visual Attention for Rapid Scene Analysis.  L. Itti, C. Koch, and E. Niebur. PAMI 1998.

Clint
03/06 Project proposals due

Importance

Detail:
Reading Between the Lines: Object Localization Using Implicit Cues from Image Tags. S. J. Hwang and K. Grauman. CVPR 2010.

High-level:
Understanding and Predicting Importance in Images. A. Berg, T. Berg, H. Daume, J. Dodge, A. Goyal, X. Han, A. Mensch, M. Mitchell, A. Sood, K. Stratos and K. Yamaguchi. CVPR 2012. [UIUC sentence dataset] [ImageClef dataset]

Extra:
 
Some Objects are More Equal Than Others: Measuring and Predicting Importance. M. Spain and P. Perona. ECCV 2008.
 
What Makes an Image Memorable?  P. Isola, J. Xiao, A. Torralba, A. Oliva. CVPR 2011. [project page] [code and data]
 
Sean
03/11 Spring break: no class
 
N/A
03/13 Spring break: no class
 
N/A
03/18 Action Recognition

Detail:
Learning a Hierarchy of Discriminative Space-Time Neighborhood Features for Human Action Recognition. A. Kovashka and K. Grauman. CVPR 2010.

High-level:
Action Recognition from a Distributed Representation of Pose and Appearance. S. Maji, L. Bourdev and J.  Malik. CVPR 2011. [code]

John
03/20 Global and high-level image descriptions

Detail:
Efficient Object Category Recognition Using Classemes. L. Torresani, M. Szummer and A. Fitzgibbon. ECCV 2010. [code and data]
 
High-level:
Objects as Attributes for Scene Classification. L.-J. Li, H. Su, Y. Lim and L. Fei-Fei. 1st International Workshop on Parts and Attributes, ECCV 2010.

Extra:

Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope. A. Oliva and A. Torralba. IJCV 2001. [Gist code]
 
Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification. L.-J. Li, H. Su, E. Xing and L. Fei-Fei. NIPS 2010. [code]
 
Clint (papers)
Rama (experiments)
03/25 Attributes

Detail:
Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer. C. Lampert, H. Nickisch and S. Harmeling. CVPR 2009. [project page with data]

High-level:
Describing Objects by Their Attributes. A. Farhadi, I. Endres, D. Hoiem and D. Forsyth. CVPR 2009. [data]

Extra:

Relative Attributes.  D. Parikh and K. Grauman. ICCV 2011. [code and data]
 
Peng (papers)
Shrenik (experiments)
03/27 Mid-semester project presentations
 
TBD
04/01 Mid-semester project presentations

 
TBD
04/03 Human-in-the-loop

Detail:
Visual Recognition with Humans in the Loop.  S. Branson, C. Wah, B. Babenko, F. Schroff, P. Welinder, P. Perona and S. Belongie. ECCV 2010.

High-level:
Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. S. Vijayanarasimhan and K. Grauman.  CVPR 2011.

Extra:

iCoseg: Interactive Co-segmentation with Intelligent Scribble Guidance. D. Batra, A. Kowdle, D. Parikh, J. Luo and T. Chen. CVPR 2010. [project page]
 
Jason (papers)
04/08 Crowdsourcing

Detail:
Labeling Images with a Computer Game. L. von Ahn and L. Dabbish. CHI 2004.

High-level:
Adaptively Learning the Crowd Kernel.  O. Tamuz, C. Liu, S. Belongie, O. Shamir and A. Kalai. ICML 2011.

Extra:

Crowdclustering. R. Gomes, P. Welinder, A. Krause and P. Perona. NIPS 2011.
 
Shrenik
04/10 Big data

Detail:
IM2GPS: Estimating Geographic Information From a Single Image. J. Hays and A. Efros. CVPR 2008. [project page with data and Flickr download scripts]

High-level:
Unbiased Look at Dataset Bias. A. Torralba and A. Efros. CVPR 2011. [project page]

Extra:

Scene Completion using Millions of Photographs. J. Hays and A. Efros. SIGGRAPH 2007. [project page]

80 Million Tiny Images: A Large Dataset for Non-Parametric Object and Scene Recognition. A. Torralba, R. Fergus and W. Freeman.  PAMI 2008. [project page]
 
John
04/15 Applications

Detail:
Photo Tourism: Exploring Photo Collections in 3D. N. Snavely, S. Seitz and R. Szeliski. SIGGRAPH 2006. [project page]

High-level:
LeafSnap: A Computer Vision System for Automatic Plant Species Identification.  N. Kumar, P. Belhumeur, A. Biswas, D. Jacobs, W. Kress, I. Lopez, J. Soares. ECCV 2012.

Extra:

FaceTracer: A Search Engine for Large Collections of Images with Faces.  N. Kumar, P. Belhumeur and S. Nayar.  ECCV 2008. [code, data, demo]
 
Lisa (papers)
Peng (experiments)
04/17 Human abilities

Detail:
Rapid natural scene categorization in the near absence of attention. L. Fei-Fei, R. VanRullen, C. Koch and P. Perona. PNAS 2002.

High-level:
What Do We Perceive in a Glance of a Real-World Scene? L. Fei-Fei, A. Iyer, C. Koch and P. Perona. Journal of Vision, 2007.
 
No class
(Reviews are still due)
04/22 Language and images

Detail:
Every Picture Tells a Story: Generating Sentences for Images. A. Farhadi, M. Hejrati, A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier and D. Forsyth. ECCV 2010. [UIUC sentence dataset]

High-level:
Baby Talk: Understanding and Generating Simple Image Descriptions. G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg and T. L. Berg. CVPR 2011.

Extra:

Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers. A. Gupta and L. S. Davis. ECCV 2008.
 
Rama
04/24 Images of people

Detail:
Names and Faces in the News. T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth. CVPR 2004. [project page]

High-level:
Estimating Age, Gender and Identity using First Name Priors. A. Gallagher and T. Chen. CVPR 2008. [project page]

Extra:

Exploring Photobios.  I. Kemelmacher-Shlizerman, E. Shechtman, R. Garg and S. Seitz. SIGGRAPH 2011. [project page]

Autotagging Facebook: Social Network Context Improves Photo Annotation. Z. Stone, T. Zickler and T. Darrell.  CVPR Internet Vision Workshop 2008.

 
Lisa
04/29 No class; work on projects.
 
N/A
05/01 No class; work on projects.
 
N/A
05/06 No class; work on projects.
 
N/A

05/10 Final project presentations (10:00 am to 1:30 pm)
 
All
05/12 Final project reports due
 
N/A

Resources

Other code and data:

Tutorials, workshops, summer schools:

Similar courses:

This course has been inspired by the following two courses:

Other similar courses: