ECE 6554 Spring 2016

ECE 6554: Advanced Computer Vision, Spring 2016

Electrical and Computer Engineering Department, Virginia Tech

Meets: TR 5:00 pm to 6:15 pm in Lavery Hall Room 345

Instructor: Devi Parikh
Email: parikh@vt.edu
Forum: https://scholar.vt.edu/portal/site/s16ece6554


Course overview:

This is a graduate course in computer vision. The focus of this course is to survey and critique published approaches in computer vision. We will read and analyze the strengths and weaknesses of research papers on a variety of important topics pertaining to visual recognition and identify open research questions. See the schedule for a list of topics we will cover.

Project video:

    S(h)ortStory: Sorting Images & Captions into Stories
    Harsh Agrawal, Arjun Chandrasekaran
    Video Spotlight



    Saliency based driver assistance
    Graham Cantor-Cooke
    Video Spotlight



    Affordable Automated Technology for Tuberculosis Screening with Ziehl-Neelsen Stained Sputum.
    Swazoo Claybon
    Video Spotlight



    Identifying Relevant vs. Irrelevant Questions for VQA
    Arijit Ray, Christopher Dusold
    Video Spotlight



    Actively Learning Visual Question Answering (VQA)
    Xiao Lin
    Video Spotlight



    Sequential Question Answering (SVQA)
    Latha Pemula
    Video Spotlight



    On Helping Stroke Rehabilitation with Computer Vision
    Jinwoo Choi
    Video Spotlight



    Enforcing Consistent Predictions in a VQA model
    Aroma Mahendru
    Video Spotlight



    Improving PCA, IPCA, and JTC Performance in Face Detection and Recognition Using Object Discovery Given Multiple Segmentations
    Abdulaziz A Alorf
    Video Spotlight



    A Siamese Network based model for binary VQA
    Sneha Mehta
    Video Spotlight

Pre-requisite:

An introduction to computer vision or equivalent course. A machine learning or pattern recognition course may be beneficial.

Requirements:

Following are the requirements to successfully complete this course:

Paper reviews, Leading Discussion, Presentations, Project

Paper reviews (25% of your grade): Students will be required to write a detailed review of one assigned paper before each class. Each review should be no more than one page (11-point Times New Roman, 1-inch margins). Please submit the review as firstname_lastname_MM_DD.pdf (all lowercase, where MM is the month and DD is the day).
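The naming convention above can be sketched in a few lines of Python. This is only an illustration: the function name review_filename is hypothetical, and it assumes that MM and DD are zero-padded to two digits, which the convention suggests but does not state outright.

```python
from datetime import date


def review_filename(first_name: str, last_name: str, class_date: date) -> str:
    """Build firstname_lastname_MM_DD.pdf in all lowercase.

    Assumption: month and day are zero-padded to two digits.
    """
    return f"{first_name.lower()}_{last_name.lower()}_{class_date:%m_%d}.pdf"


# Example: a review for the class on January 26th.
print(review_filename("Jane", "Doe", date(2016, 1, 26)))
```

For the hypothetical student Jane Doe, this prints jane_doe_01_26.pdf.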

Each review should summarize the paper in 2-3 sentences, describe the approach taken, and clearly identify the main contribution of the paper. The review should describe the strengths and weaknesses of the paper. Comments on how convincing the experiments were, if the material was well presented, and how the paper can be improved and extended should be included. A good review also comments on how the paper relates to other papers we have read or you know of. Finally, identify any interesting open research questions or applications that arise from reading the paper.
Most interesting thought: From your review, pull out what you believe is your most interesting thought about the paper, and put this in the body of your email to the instructor.
Due dates: The reviews should be emailed to the instructor by 12:00 pm (noon) on the day of the class (i.e., on Tuesdays and Thursdays). Late reviews will not be accepted. The three reviews with the lowest scores will be dropped at the end of class. You need not submit paper reviews for the classes in which you are presenting the papers (see below) or leading discussions on the paper (see below).
You can find some good example paper summaries/reviews here:

Example 1

Example 2

Example 3
Tips on how to read a paper: http://ccr.sigcomm.org/online/files/p83-keshavA.pdf
Tips on how to write a review: http://people.inf.ethz.ch/troscoe/pubs/review-writing.pdf (this is a bit specific to systems papers, but hopefully the high-level messages are useful)

Leading Discussion (15% of your grade): About twice during the semester (estimated), you will be assigned to lead the discussion on the paper that you have read. In one case you will be asked to argue in favor of the paper. In the other case you will be asked to argue against the paper. In each case, come prepared with 5 points of discussion (in favor of or against the paper). You need not submit a review for the paper you are leading a discussion on.

Presentations (25% of your grade): Each student will be asked to present the topic associated with a class once (estimated) over the course of the semester. Each presentation should be 30 minutes long. Students should practice their talks ahead of time to make sure they are 30 minutes long -- not shorter by more than a few minutes, and certainly not longer. The talks should be well organized and polished.

The student should read the assigned papers and background papers, and look up related older and more recent papers independently to gain a good perspective on the topic as a whole. In addition to presenting this perspective and background information, students should present a paper or two in detail. For each paper being presented in detail, students should clearly state the problem and motivate why it is interesting and important. The key technical ideas should be presented. The student should describe the experimental set-up and present the results obtained. The strengths and weaknesses of the paper should be discussed. The student should discuss how the different papers relate to each other (similarities and differences). Finally, interesting open research questions should be identified. The slides should be made as visual (with videos, images, animations) and clear as possible. The student should look at the links provided next to each paper, as well as the authors' webpages, for extra material such as slides, videos, extra results, etc. Students are encouraged to search for relevant material online; the links provided here are not comprehensive. Please clearly cite the source of each slide that is not your own. Even if you use slides made by the author, it is your responsibility to make sure your presentation as a whole flows well.
IMPORTANT:

Your presentation should be on the topic as a whole, and not just individual papers. Be sure to provide a cohesive overview (past and current) of the topic.
Do not present the paper that students have read and reviewed. You can mention it briefly if necessary to place it in context of the rest of your presentation.

The student can also conduct some (small-scale) experiments on one of the papers to analyze an interesting and meaningful aspect of the approach that the paper has not analyzed (e.g. different datasets, sensitivity to any parameters in the approach, etc.) to gain a more complete understanding of the paper, and to see if the approach "really works". A distilled demo version of the main idea of the approach presented in the paper could be implemented. Authors often make their code publicly available. The goal is not to regenerate the results already present in the paper. You may implement it yourself, or download code if available. Again, the experimental setup, any non-trivial implementation choices you made, results obtained and conclusions one can draw from them should be described in the presentation (within the 30 minutes). Please cite any existing code or data you use for your experiments.
Example presentations:
Example 1

Example 2

Please sign up for 6 topics on this sheet by 11:59 pm Wednesday, January 20th. First-come, first-served. Write your name in the leftmost set of columns available for a topic. Indicate your preference for that topic. 1 = most preferred, 6 = least preferred. Finally, also indicate the chances that you might drop this class. Once you have been assigned to a topic, if you drop the class, it can severely disrupt the class schedule. So please sign up only once you are fairly certain you won’t be dropping the class. See the schedule for the semester.

Project (35% of your grade): Your project can be about extending a technique we studied in class, or empirically analyzing it. Comparisons between two approaches are also welcome. It is wonderful if you design and evaluate a novel approach to an important vision problem. Look at the schedule to get ideas on what topics might be of interest to you for your project. Perhaps the experiments part of your presentation above might give you project ideas. Be creative! If you need help with ideas for your project please come talk to me. You can work with a partner if you like. Email the proposal as a PDF to the instructor by 11:59 pm on March 1st. The following are deliverables for your project.

Proposal (25% of your project grade, due March 1st, 11:59 pm, email PDF to instructor): A 1-page description that describes the following:

Problem statement: Clearly state the goal of your project.
Related work: Briefly describe existing related work (with citations) and what your project brings to the table that these other works do not. The most relevant papers may not necessarily be papers listed on the schedule, so be sure to also look beyond the list.
Approach: Describe the technical approach you plan to employ
Experiments and results: Describe the experimental setup you will follow, which datasets you will use, which existing code you will exploit, what you will implement yourself, and what you would define as a success for the project. If you plan on collecting your own data, describe what data collection protocol you will follow. Specify if you plan on experimentally analyzing different characteristics of your approach, or if you will compare to existing techniques. Provide a list of experiments you will perform. Describe what you expect the experiments to reveal, or what is uncertain about the potential outcomes. If you have any preliminary results, please summarize those as well.

Final presentation (40% of your project grade, May 8th): A 10-minute presentation that describes the same points as the proposal in as much detail as 10 minutes allow. Describe any challenges you faced. Clearly state what has been accomplished. Any assumptions of your approach should be clearly stated. Any insights on future extensions of this project should be discussed.

Project video (35% of your project grade, due May 11th 11:59 am, email YouTube link to instructor): A 1-minute YouTube video summarizing the project. The video is a teaser to convey the main points, and gain the viewer's interest in wanting to know more. It should be understandable by anyone familiar with computer vision. See examples below:

The class will vote on best paper presentation, best project, and best discussion-participant!

Feedback is very welcome. If you have any questions or concerns about the class or the requirements, please be sure to discuss them with the instructor early on.

No laptops, cell phones, or other distractions in class please.
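As a rough illustration of how the weights above combine, here is a minimal Python sketch. It rests on assumptions the syllabus does not state: every component is scored on a 0 to 100 scale, the paper-review component is a plain average after the three lowest scores are dropped, and the function names are hypothetical. The actual grading procedure may differ.

```python
def review_average(scores, drop_lowest=3):
    """Average the review scores after dropping the lowest few (assumed plain mean)."""
    kept = sorted(scores)[drop_lowest:]
    if not kept:
        raise ValueError("need more scores than the number dropped")
    return sum(kept) / len(kept)


def project_score(proposal, presentation, video):
    """Project grade: 25% proposal, 40% final presentation, 35% video."""
    return (25 * proposal + 40 * presentation + 35 * video) / 100


def course_score(reviews, discussion, presentation, project):
    """Course grade: 25% reviews, 15% discussion, 25% presentation, 35% project."""
    return (25 * reviews + 15 * discussion + 25 * presentation + 35 * project) / 100


# Hypothetical scores, for illustration only.
reviews = review_average([90, 85, 70, 95, 60, 88, 92, 50])
project = project_score(proposal=80, presentation=90, video=85)
print(round(course_score(reviews, discussion=85, presentation=90, project=project), 2))
```

Both sets of weights sum to 100%, so a student scoring 100 on every component ends up with 100 overall.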

Important dates:

January 20th 11:59 pm: Sign up for 6 topics on this sheet. Please see the schedule for the semester.
March 1st 11:59 pm: Project proposals due
May 8th: Final project presentations.
May 11th 11:59 am: Project video due.

Schedule (topic and papers)

The list of papers, particularly "Seeds / pointers for presenters:", is being updated.

Date	Topic and papers	Presenter, Discussion leads
01/19	Introduction	Devi [slides]
01/21	Convolutional Neural Networks (and the deep learning "revolution") See: Dhruv Batra's Deep Learning Class (Fall 2015) https://computing.ece.vt.edu/~f15ece6504/ for more details.	Presenter: Ram [slides]
01/26	Local features-based image descriptions Write review for and discuss: Fisher kernels on visual vocabularies for image categorization. F. Perronnin and C. Dance. CVPR 2007. Seeds / pointers for presenters: Local Convolutional Features With Unsupervised Training for Image Retrieval. M. Paulin, M. Douze, Z. Harchaoui, J. Mairal, F. Perronnin and C. Schmid. ICCV 2015. Object Categorization by Learned Universal Visual Dictionary. J. Winn, A. Criminisi and T. Minka. ICCV 2005. [project page] A Performance Evaluation of Local Descriptors. K. Mikolajczyk and C. Schmid. CVPR 2003. Seminal paper: Object Recognition from Local Scale-Invariant Features. D. Lowe. ICCV 1999. [code] [other implementations of SIFT] [IJCV paper] Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid and J. Ponce. CVPR 2006. [15 scenes dataset] [pyramid match toolkit] [Matlab code]	Presenter: Mincan, Orson “For” discussion lead: nuo “Against” discussion lead: Dusold
01/28	Object discovery Write review for and discuss: Using Multiple Segmentations to Discover Objects and their Extent in Image Collections. B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman and A. Zisserman. CVPR 2006. [project page] Seeds / pointers for presenters: Unsupervised joint object discovery and segmentation in internet images. M. Rubinstein, A. Joulin, J. Kopf, and C. Liu. CVPR 2013. Unsupervised object discovery and localization in the wild: part-based matching with bottom-up region proposals. M. Cho, S. Kwak, C. Schmid, and J. Ponce. CVPR 2015. Foreground Focus: Finding Meaningful Features in Unlabeled Images. Y. J. Lee and K. Grauman. BMVC 2008. [project page]	Presenter: Swazoo “For” discussion lead: Murat “Against” discussion lead: Latha
02/02	Object detection Write review for and discuss: A Discriminatively Trained, Multiscale, Deformable Part Model. P. Felzenszwalb, D. McAllester and D. Ramanan. CVPR 2008. [code] Seeds / pointers for presenters: Faster R-CNN: Towards real-time object detection with region proposal networks. S.Ren, K. He, R. Girshick and J. Sun NIPS 2015. [code] Histograms of Oriented Gradients for Human Detection. N. Dalal and B. Triggs. CVPR 2005. [video] [PASCAL datasets] Rapid Object Detection Using a Boosted Cascade of Simple Features. P. Viola and M. Jones. CVPR 2001. Diagnosing Error in Object Detectors. D. Hoiem, Y. Chodpathumwan and Q. Dai. ECCV 2012. [code and data]	Presenter: nuo “For” discussion lead: Alorf “Against” discussion lead: Mincan
02/04	Object proposals Write review for and discuss: What is an Object? B. Alexe, T. Deselaers and V. Ferrari. CVPR 2010. [code] Seeds / pointers for presenters: Edge boxes: Locating object proposals from edges. C. L. Zitnick and P. Dollar. ECCV 2014. [code] Category Independent Object Proposals. I. Endres and D. Hoiem. ECCV 2010. [project] Constrained Parametric Min-Cuts for Automatic Object Segmentation. J. Carreira and C. Sminchisescu. CVPR 2010. [code]	Presenter: Latha “For” discussion lead: Swazoo “Against” discussion lead: Murat
02/09	Segmentation Write review for and discuss: Learning a Classification Model for Segmentation. X. Ren and J. Malik. ICCV 2003. Seeds / pointers for presenters: DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection. G. Bertasius, J. Shi, L. Torresani. CVPR 2015 Combining Top-down and Bottom-up Segmentation. E. Borenstein, E. Sharon and S. Ullman. CVPR workshop 2004. [data]	Presenter: Yash “For” discussion lead: Xiao “Against” discussion lead: Abraham
02/11	Pose Write review for and discuss: Real-Time Human Pose Recognition in Parts from a Single Depth Image. J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman and A. Blake. CVPR 2011. [video] [project page] Seeds / pointers for presenters: DeepPose: Human pose estimation via deep neural networks. A. Toshev and C. Szegedy. CVPR 2014. Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations. L. Bourdev and J. Malik. ICCV 2009. [code] Articulated Pose Estimation using Flexible Mixtures of Parts. Y. Yang and D. Ramanan. CVPR 2011. [code]	Presenter: Rich F “For” discussion lead: Orson “Against” discussion lead: Xiao
02/16	Context Write review for and discuss: Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. CVPR 2010. [code] Seeds / pointers for presenters: Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks. S. Bell, C.L. Zitnick, K. Bala, R. Girshick. arXiv:1512.04143 2015 The role of context for object detection and semantic segmentation in the wild. R. Mottaghi, X. Chen, X. Liu, N.G. Cho, S.W. Lee, S. Fidler and R. Urtasun. CVPR 2014 An Empirical Study of Context in Object Detection. S. Divvala, D. Hoiem, J. Hays, A. Efros and M. Hebert. CVPR 2009. [project page]	Presenter: Aroma “For” discussion lead: Arjun “Against” discussion lead: Mostafa
02/18	3D layout Write review for and discuss: Recovering the Spatial Layout of Cluttered Rooms. V. Hedau, D. Hoiem and D. Forsyth. ICCV 2009. [project page] Seeds / pointers for presenters: Single Image 3D Without a Single 3D Image. D. Fouhey, W. Hussain, A. Gupta, M. Hebert. ICCV 2015. Unfolding an Indoor Origami World. D. Fouhey, A. Gupta, and M. Hebert. ECCV 2014. [project page] Box In the Box: Joint 3D Layout and Object Reasoning from Single Images. A. Schwing, S. Fidler, M. Pollefeys and R. Urtasun. ICCV 2013.	Presenter: Pooja “For” discussion lead: Abraham “Against” discussion lead: Rich F
02/23	Holistic scene understanding Write review for and discuss: TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation. J. Shotton, J. Winn, C. Rother and A. Criminisi. ECCV 2006. [project page] [data] [code] Seeds / pointers for presenters: Describing the Scene as a Whole: Joint Object Detection, Scene Classification and Semantic Segmentation. J. Yao, S. Fidler and R. Urtasun. CVPR 2012.	Presenter: Murat “For” discussion lead: Mostafa “Against” discussion lead: Jinwoo
02/25	Groups of objects Write review for and discuss: Recognition Using Visual Phrases. M. Sadeghi and A. Farhadi. CVPR 2011. Seeds / pointers for presenters: Automatic Discovery of Groups of Objects for Scene Understanding. C. Li, D. Parikh and T. Chen. CVPR 2012. [project page]	Presenter: Alorf “For” discussion lead: Aroma “Against” discussion lead: Arjun
03/01	Project proposals due Saliency Write review for and discuss: Learning to Predict Where Humans Look. T. Judd, K. Ehinger, F. Durand, and A. Torralba. ICCV 2009. [project page] Seeds / pointers for presenters: What makes a patch distinct. R. Margolin, A. Tal and L. Zelnik-Manor. CVPR 2013. Learning to Detect a Salient Object. T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum. CVPR 2007. [results] [data] [code] A Model of Saliency-based Visual Attention for Rapid Scene Analysis. L. Itti, C. Koch, and E. Niebur. PAMI 1998.	Presenter: Xiao “For” discussion lead: Graham “Against” discussion lead: Aroma
03/03	Importance Write review for and discuss: Reading Between the Lines: Object Localization Using Implicit Cues from Image Tags. S. J. Hwang and K. Grauman. CVPR 2010. Seeds / pointers for presenters: VIP: Finding Important People in Images. C.S. Mathialagan, A.C. Gallagher and D. Batra. CVPR 2015. [demo] Understanding and Predicting Importance in Images. A. Berg, T. Berg, H. Daume, J. Dodge, A. Goyal, X. Han, A. Mensch, M. Mitchell, A. Sood, K. Stratos and K. Yamaguchi. CVPR 2012. [UIUC sentence dataset] [ImageClef dataset] Some Objects are More Equal Than Others: Measuring and Predicting Importance. M. Spain and P. Perona. ECCV 2008. What Makes an Image Memorable? P. Isola, J. Xiao, A. Torralba, A. Oliva. CVPR 2011. [project page] [code and data]	Presenter: Arjun “For” discussion lead: Arijit “Against” discussion lead: Adithya
03/08	Spring break: no class	N/A
03/10	Spring break: no class	N/A
03/15	Images of people Write review for and discuss: Names and Faces in the News. T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth. CVPR 2004. [project page] Seeds / pointers for presenters: Estimating Age, Gender and Identity using First Name Priors. A. Gallagher and T. Chen. CVPR 2008. [project page] Exploring Photobios. I. Kemelmacher-Shlizerman, E. Shechtman, R. Garg and S. Seitz. SIGGRAPH 2011. [project page] Autotagging Facebook: Social Network Context Improves Photo Annotation. Z. Stone, T. Zickler and T. Darrell. CVPR Internet Vision Workshop 2008.	Presenter: Arijit “For” discussion lead: Mincan “Against” discussion lead: nuo
03/17	Action Recognition Write review for and discuss: Recognizing realistic actions from videos “in the wild”. J. Liu, J. Luo, M. Shah. CVPR 2009. Seeds / pointers for presenters: Action recognition with trajectory-pooled deep-convolutional descriptors. L. Wang, Y. Qiao, and X. Tang. arXiv preprint arXiv:1505.04868 (2015). Action recognition by dense trajectories. H. Wang, A. Kläser, C. Schmid and C.-L. Liu. CVPR 2011. Action Recognition from a Distributed Representation of Pose and Appearance. S. Maji, L. Bourdev and J. Malik. CVPR 2011. [code]	Presenter: Graham “For” discussion lead: Jinwoo “Against” discussion lead: Alorf
03/22	Global and high-level image descriptors Write review for and discuss: Efficient Object Category Recognition Using Classemes. L. Torresani, M. Szummer and A. Fitzgibbon. ECCV 2010. [code and data] Seeds / pointers for presenters: CNN features off-the-shelf: an astounding baseline for recognition. A. S. Razavian, H. Azizpour, J. Sullivan, S. Carlsson. CVPRW 2014. Objects as Attributes for Scene Classification. L.-J. Li, H. Su, Y. Lim and L. Fei-Fei. 1st International Workshop on Parts and Attributes, ECCV 2010. Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope. A. Oliva and A. Torralba. IJCV 2001. [Gist code] Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification. L.-J. Li, H. Su, E. Xing, L. Fei-Fei. NIPS 2010. [code]	Presenter: Adithya “For” discussion lead: Harsh “Against” discussion lead: Orson
03/24	No class because Devi is traveling	N/A
03/29	Attributes Write review for and discuss: Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer. C. Lampert, H. Nickisch and S. Harmeling. CVPR 2009. [project page with data] Seeds / pointers for presenters: Describing Objects by Their Attributes. A. Farhadi, I. Endres, D. Hoiem and D. Forsyth, CVPR 2009. [data] Relative Attributes. D. Parikh and K. Grauman. ICCV 2011. [code and data] Panda: Pose aligned networks for deep attribute modeling, N. Zhang, M. Paluri, MA. Ranzato, T. Darrell, L. Bourdev. CVPR 2014	Presenter: Jinwoo “For” discussion lead: Adithya “Against” discussion lead: Yash
03/31	No class because Devi is traveling	N/A
04/05	No class because Devi is traveling	N/A
04/07	No class because Devi is traveling	N/A
04/12	Language and images Write review for and discuss: Every Picture Tells a Story: Generating Sentences for Images. A. Farhadi, M. Hejrati, A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier and D. Forsyth. ECCV 2010. [UIUC sentence dataset] Seeds / pointers for presenters: Show and tell: A neural image caption generator. O. Vinyals, A. Toshev, S. Bengio and D. Erhan. arXiv 2014. Deep Visual-Semantic Alignments for Generating Image Descriptions. A. Karpathy, L. Fei-Fei. CVPR 2015. VQA: Visual Question Answering. S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. ICCV 2015. [Project Page] [code] [demo] Baby Talk: Understanding and Generating Simple Image Descriptions. G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg and T. L. Berg. CVPR 2012. Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers. A. Gupta and L. S. Davis. ECCV 2008.	Presenter: Harsh “For” discussion lead: Yash “Against” discussion lead: Arijit
04/14	Human-in-the-loop Write review for and discuss: Visual Recognition with Humans in the Loop. S. Branson, C. Wah, B. Babenko, F. Schroff, P. Welinder, P. Perona and S. Belongie. ECCV 2010. Seeds / pointers for presenters: Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. S. Vijayanarasimhan and K. Grauman. CVPR 2011. iCoseg: Interactive Co-segmentation with Intelligent Scribble Guidance. D. Batra, A. Kowdle, D. Parikh, J. Luo and T. Chen. CVPR 2010. [project page]	Presenter: Sneha “For” discussion lead: Aishwarya “Against” discussion lead: Swazoo
04/19	Crowdsourcing Write review for and discuss: Labeling Images with a Computer Game. L. von Ahn and L. Dabbish. CHI 2004. Seeds / pointers for presenters: Adaptively Learning the Crowd Kernel. O. Tamuz, C. Liu, S. Belongie, O. Shamir and A. Kalai. ICML 2011. Crowdclustering. R. Gomes, P. Welinder, A. Krause and P. Perona. NIPS 2011. Cost-Effective HITs for Relative Similarity Comparisons. MJ. Wilber, IS. Kwak, SJ. Belongie. AAAI Human Computation and Crowdsourcing 2014	Presenter: Dusold “For” discussion lead: Latha “Against” discussion lead: Sneha
04/21	Applications Write review for and discuss: Photo Tourism: Exploring Photo Collections in 3D. N. Snavely, S. Seitz and R. Szeliski. SIGGRAPH 2006. [project page] Seeds / pointers for presenters: LeafSnap: A Computer Vision System for Automatic Plant Species Identification. N. Kumar, P. Belhumeur, A. Biswas, D. Jacobs, W. Kress, I. Lopez, J. Soares. ECCV 2012. FaceTracer: A Search Engine for Large Collections of Images with Faces. N. Kumar, P. Belhumeur and S. Nayar. ECCV 2008. [code, data, demo]	Presenter: Abraham “For” discussion lead: Rich F “Against” discussion lead: Aishwarya
04/26	No class because Devi is traveling	N/A
04/28	Human abilities Write review for and discuss: Rapid natural scene categorization in the near absence of attention. L. Fei-Fei, R. VanRullen, C. Koch and P. Perona. PNAS 2002. Seeds / pointers for presenters: What Do We Perceive in a Glance of a Real-World Scene? L. Fei-Fei, A. Iyer, C. Koch and P. Perona. Journal of Vision, 2007.	Presenter: Aishwarya “For” discussion lead: Dusold “Against” discussion lead: Graham
05/03	Big data Write review for and discuss: IM2GPS: Estimating Geographic Information From a Single Image. J. Hays and A. Efros. CVPR 2008. [project page with data and Flickr download scripts] Seeds / pointers for presenters: Unbiased Look at Dataset Bias. A. Torralba and A. Efros. CVPR 2011. [project page] Scene Completion using Millions of Photographs. J. Hays and A. Efros. SIGGRAPH 2007. [project page] 80 Million Tiny Images: A Large Dataset for Non-Parametric Object and Scene Recognition. A. Torralba, R. Fergus and W. Freeman. PAMI 2008. [project page]	Presenter: Mostafa “For” discussion lead: Sneha “Against” discussion lead: Harsh
05/08	Final project presentations 9:00 am to 1:00 pm in Whittemore 654. (Whittemore is locked on weekends. Please arrive between 8:45 and 8:55 am. Someone will be at the main entrance off of Perry St. to let you in.)
05/11	Project video due by 11:59 am (noon)

Resources

Other code and data:

Visual Object Recognition synthesis lecture by Grauman and Leibe (short book on object recognition methods)
Compiled list of recognition datasets
OpenCV (open source computer vision library)
Weka (Java data mining software)
Netlab (Matlab toolbox for data analysis techniques, written by Ian Nabney and Christopher Bishop)
CV Online
Annotated Computer Vision Bibliography
Oxford group interest point software
Andrea Vedaldi's VLFeat code, including SIFT, MSER, hierarchical k-means.
INRIA LEAR team's software, including interest points, shape features
FLANN - Fast Library for Approximate Nearest Neighbors. Marius Muja et al.
Google Goggles
Kooaba
LSH homepage
Code for downloading Flickr images, by James Hays
UW Community Photo Collections homepage
INRIA Holiday images dataset
NUS-WIDE tagged image dataset of 269K images
MIRFlickr dataset
LIBPMK feature extraction code, includes dense sampling
LIBSVM library for support vector machines
PASCAL VOC Visual Object Classes Challenge
Fast SLIC superpixels
Greg Mori's superpixel code
Berkeley Segmentation Dataset and code
Pedro Felzenszwalb's graph-based segmentation code
Mean-shift: a Robust Approach Towards Feature Space Analysis [pdf] [code, Matlab interface by Shai Bagon]
David Blei's Topic modeling code
Berkeley 3D object dataset (kinect)
Labelme Database
Stanford Event Dataset
SUN Scene and object dataset
ImageNet dataset of 15K objects and ImageNet challenge
Animals with Attributes dataset
aYahoo and aPascal attributes datasets
Attribute discovery dataset of shopping categories
Public Figures Face database with attributes
Relative attributes data
WhittleSearch relative attributes data
SUN Scenes attribute dataset
Cross-category object recognition (CORE) dataset
Leeds Butterfly Dataset
FaceTracer database from Columbia
Caltech-UCSD Birds dataset
Database of human attributes
Face detection code in OpenCV
Gallagher's Person Dataset
Face data from Buffy episode, from Oxford Visual Geometry Group
CALVIN upper-body detector code
UMass Labeled Faces in the Wild
Ivan Laptev's Space-Time Interest Points code
Hollywood activity dataset
Stanford 40 Actions still image dataset
Stanford People Playing Musical Instrument dataset
UCF activity datasets
PASCAL VOC action recognition taster challenge
TRECVID video retrieval challenge
UMich Collective Activity dataset
Egovision workshop at CVPR 2012
Amazon Mechanical Turk
Using Mechanical Turk with LabelMe
Point Cloud Library
Robot Operating System
KITTI Benchmark

Tutorials, workshops, summer schools:

Similar courses:

This course has been inspired by the following two courses:

Visual Recognition (Kristen Grauman, Texas-Austin, Fall 2012)
Learning-Based Methods in Vision (Alyosha Efros, CMU, Spring 2012)

Other similar courses:

Grounding Object Recognition and Scene Understanding (Antonio Torralba, MIT, Fall 2011)
Visual Scene Understanding (Derek Hoiem, UIUC, Spring 2009)
Statistical Models for Visual Recognition (Deva Ramanan, UCI, Winter 2009)
Object Recognition and Scene Understanding (Antonio Torralba, MIT, Fall 2008)
Scene Understanding Seminar (Aude Oliva, MIT, Fall 2006)
Selected Topics in Vision & Learning (Serge Belongie, UCSD, Spring 2011)
Learning and Inference in Vision (Bill Freeman, MIT)
Cutting Edge of Computer Vision (Fei-Fei Li, Stanford)
Recognizing People, Objects, and Scenes (Jitendra Malik, Berkeley)
Recognition Problems in Computer Vision (Greg Mori, SFU)
Vision and Learning (Jianbo Shi, UPenn)