ECE 6554: Advanced Computer Vision Spring 2017

Class Schedule

Date Topic Reading and Links Presenter
Jan 19 (Thursday) Introduction [PPT] [PDF]
Visual Recognition
Jan 24 (Tuesday) Instance recognition *Object Recognition from Local Scale-Invariant Features. D. Lowe, ICCV 1999 [PDF] [code]

Video Google: A Text Retrieval Approach to Object Matching in Videos. J. Sivic and A. Zisserman, ICCV 2003 [PDF] [demo]

Local Invariant Feature Detectors: A Survey. T. Tuytelaars and K. Mikolajczyk, Foundations and Trends in Computer Graphics and Vision, 2008 [PDF] [code] (pp. 178-188, 216-220, 254-255)

[PPT] [PDF]
Jan 26 (Thursday) Category recognition *ImageNet Classification with Deep Convolutional Neural Networks. A. Krizhevsky, I. Sutskever, G. E. Hinton. NIPS 2012 [PDF]

Very Deep Convolutional Networks for Large-Scale Visual Recognition. K. Simonyan and A. Zisserman, ICLR 2015 [PDF] [code]

Deep Residual Learning for Image Recognition. K. He, X. Zhang, S. Ren, J. Sun. CVPR 2016 [PDF] [code]

Densely Connected Convolutional Networks. G. Huang, Z. Liu, K. Q. Weinberger, L. van der Maaten. arXiv 2016 [PDF] [code]

[PPT] [PDF]
Jan 31 (Tuesday) Supervised pretraining *How transferable are features in deep neural networks? J. Yosinski, J. Clune, Y. Bengio, and H. Lipson. NIPS 2014. [PDF] [code]

What makes ImageNet good for transfer learning? M. Huh, P. Agrawal, A. A. Efros, arXiv 2016 [PDF]

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, T. Darrell, JMLR 2014 [PDF] [code]

[PPT] [PDF]

Akrit (exp)
Feb 2 (Thursday) Understanding deep networks *Visualizing and Understanding Convolutional Networks. M. D. Zeiler, R. Fergus, ECCV 2014 [PDF]

Understanding Neural Networks Through Deep Visualization. J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, H. Lipson, ICML workshop 2015 [PDF] [code]

Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. A. Nguyen, J. Yosinski, J. Clune, CVPR 2015 [PDF] [code]

Understanding Deep Image Representations by Inverting Them. A. Mahendran, A. Vedaldi, CVPR 2015 [PDF] [code]

Pranav (topic)
Pranav (exp)
[PPT]
Rammohan (“for” discussion lead)
Sanket (“against” discussion lead)
Feb 7 (Tuesday) Detection *Rich feature hierarchies for accurate object detection and semantic segmentation. R. Girshick, J. Donahue, T. Darrell, J. Malik. CVPR 2014 [PDF] [code]

R-FCN: Object Detection via Region-based Fully Convolutional Networks. J. Dai, Y. Li, K. He, J. Sun, NIPS 2016 [PDF] [code]

SSD: Single Shot MultiBox Detector. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, ECCV 2016 [PDF] [code]

YOLO9000: Better, Faster, Stronger. J. Redmon, A. Farhadi, arXiv 2016 [PDF] [code]

[PPT] [PDF]
Yingzhou (exp) [PPT] [PDF]
Subhashree (“for” discussion lead)
Yousi (“against” discussion lead)
Feb 9 (Thursday) Segmentation *Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille. ICLR 2015 [PDF] [code]

Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell, CVPR 2015 [PDF] [code]

Instance-sensitive Fully Convolutional Networks. J. Dai, K. He, Y. Li, S. Ren, J. Sun, ECCV 2016 [PDF] [code]

Xiaolong (topic, exp) [PDF]
Shuangfei (“for” discussion lead)
Shruti (“against” discussion lead)
Feb 14 (Tuesday) Siamese/Triplet Networks *FaceNet: A unified embedding for face recognition and clustering. F. Schroff, D. Kalenichenko, J. Philbin, CVPR 2015 [PDF] [code]

Computing the stereo matching cost with a convolutional neural network. J. Žbontar, Y. LeCun, CVPR 2015 [PDF] [code]

Learning Deep Representations for Ground-to-Aerial Geolocalization. T.-Y. Lin, Y. Cui, S. Belongie, J. Hays, CVPR 2015 [PDF] [code]

Efficient deep learning for stereo matching. W. Luo, A. G. Schwing, R. Urtasun, CVPR 2016 [PDF] [code]

[PPT] [PDF]

Feb 16 (Thursday) Pose *Convolutional Pose Machines. S.-E. Wei, V. Ramakrishna, T. Kanade, Y. Sheikh, CVPR 2016 [PDF] [code]

DeepPose: Human pose estimation via deep neural networks. A. Toshev, C. Szegedy, CVPR 2014 [PDF]

Articulated pose estimation with flexible mixtures-of-parts. Y. Yang, D. Ramanan, CVPR 2011 [PDF] [code]

Sujay (topic) [PPT] [PDF]
Amruta (“for” discussion lead)
Yingzhou (“against” discussion lead)
Representation Learning
Feb 21 (Tuesday) Attributes *Describing Objects by their Attributes, A. Farhadi, I. Endres, D. Hoiem and D. Forsyth, CVPR, 2009. [PDF] [code]

Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer. C. H. Lampert, H. Nickisch, S. Harmeling, CVPR 2009 [PDF] [code]

Relative Attributes. D. Parikh, K. Grauman, ICCV 2011 [PDF] [code]

Shuangfei (topic, exp) [PPT] [PDF]
Vikram (“for” discussion lead)
Sujay (“against” discussion lead)
Feb 23 (Thursday) Self-supervised learning *Unsupervised Visual Representation Learning by Context Prediction. C. Doersch, A. Gupta, A. A. Efros. ICCV 2015 [PDF] [code]

Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles. M. Noroozi and P. Favaro, ECCV 2016 [PDF] [code]

Context Encoders: Feature Learning by Inpainting. D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, A. A. Efros, CVPR 2016 [PDF] [code]

Unsupervised Learning of Visual Representations using Videos. X. Wang, A. Gupta, ICCV 2015 [PDF] [code]

Badour (topic, exp) [PPT] [PDF]
Ashish (“for” discussion lead)
Xiaolong (“against” discussion lead)
Feb 28 (Tuesday) Generative adversarial networks *Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. A. Radford, L. Metz, S. Chintala, ICLR, 2016 [PDF] [code]

Generative Adversarial Nets. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, NIPS 2014 [PDF] [code]

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, P. Abbeel, NIPS 2016 [PDF] [code]

Akrit (topic) [PPT] [PDF]
Ben (“for” discussion lead)
Shuangfei (“against” discussion lead)
Mar 2 (Thursday) Conditional adversarial networks *Image-to-Image Translation with Conditional Adversarial Networks. P. Isola, J.-Y. Zhu, T. Zhou, A. A. Efros, arXiv 2016 [PDF] [code]

Scribbler: Controlling Deep Image Synthesis with Sketch and Color. P. Sangkloy, J. Lu, C. Fang, F. Yu, J. Hays, arXiv 2016 [PDF] [code]

Generative Adversarial Text to Image Synthesis. S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, H. Lee, ICML 2016 [PDF] [code]

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space. A. Nguyen, J. Yosinski, Y. Bengio, A. Dosovitskiy, and J. Clune, arXiv 2016 [PDF] [code]

StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks. H. Zhang, T. Xu, H. Li, S. Zhang, X. Huang, X. Wang, D. Metaxas, arXiv 2016 [PDF] [code]

Jia-Bin (topic) [PPT] [PDF]
Sanket (exp) [PPT] [PDF]
Yousi (“for” discussion lead)
Pranav (“against” discussion lead)
Mar 7 (Tuesday) Spring Break (no class)
Mar 9 (Thursday) Spring Break (no class)
Mar 14 (Tuesday) Style and content *Image style transfer using convolutional neural networks. L. A. Gatys, A. S. Ecker, M. Bethge, CVPR 2016 [PDF] [code]

Perceptual Losses for Real-Time Style Transfer and Super-Resolution. J. Johnson, A. Alahi, L. Fei-Fei, ECCV 2016 [PDF] [code]

Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis. C. Li, M. Wand, CVPR 2016 [PDF] [code]

Kevin (topic) [PPT] [PDF]
Sujay (exp) [PPT] [PDF]
Sanket (“for” discussion lead)
Rammohan (“against” discussion lead)
Mar 16 (Thursday) ICCV (no class)
Activity and Event
Mar 21 (Tuesday) Research Skills and Final Project
Mar 23 (Thursday) Action recognition *Action recognition by dense trajectories. H. Wang, A. Kläser, C. Schmid, CVPR 2011 [PDF]

Two-stream convolutional networks for action recognition in videos. K. Simonyan, A. Zisserman, NIPS 2014 [PDF] [code]

Large-scale video classification with convolutional neural networks. A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, L. Fei-Fei, CVPR 2014 [PDF] [code]

Learning Spatiotemporal Features with 3D Convolutional Networks. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. ICCV 2015 [PDF] [code]

Yingzhou (topic) [PPT] [PDF]
Kevin (exp) [PPT] [PDF]
Sujay (“for” discussion lead)
Amruta (“against” discussion lead)
Mar 28 (Tuesday) Active perception *Look-Ahead Before You Leap: End-to-End Active Recognition by Forecasting the Effect of Motion. D. Jayaraman and K. Grauman. ECCV 2016. [PDF]

The Curious Robot: Learning Visual Representations via Physical Interactions. L. Pinto, D. Gandhi, Y. Han, Y-L. Park, and A. Gupta. ECCV 2016. [PDF]

Learning to Poke by Poking: Experiential Learning of Intuitive Physics. P. Agrawal, A. Nair, P. Abbeel, J. Malik, S. Levine. 2016 [PDF] [project]

Shruti (topic) [PPT] [PDF]
Kevin (“for” discussion lead)
Ashish (“against” discussion lead)
Mar 30 (Thursday) Group of objects *Visual relationship detection with language priors. C. Lu, R. Krishna, M. Bernstein, L. Fei-Fei, ECCV 2016 [PDF] [code]

Recognition using Visual Phrases. A. Farhadi, M. A. Sadeghi, CVPR 2011 [PDF] [code]

Where are they looking? A. Recasens, A. Khosla, C. Vondrick, A. Torralba, NIPS 2015 [PDF] [demo]

Yousi (exp) [PDF]
Apr 4 (Tuesday) First-person vision *Force from Motion: Decoding Physical Sensation from a First Person Video. H. S. Park, J-J. Hwang and J. Shi, CVPR 2016 [PDF] [code]

Learning to Predict Gaze in Egocentric Video. Y. Li, A. Fathi, and J. Rehg. ICCV 2013. [PDF] [data]

Yousi (topic) [PDF]
Ben (exp)
Badour (“for” discussion lead)
Subhashree (“against” discussion lead)
Multi-modality
Apr 6 (Thursday) Recurrent neural networks *Visualizing and Understanding Recurrent Networks. A. Karpathy, J. Johnson, L. Fei-Fei. ICLR 2016 Workshop. [PDF] [code]

Recurrent neural network based language model. T. Mikolov, M. Karafiát, L. Burget, J. Černocký, S. Khudanpur. Interspeech 2010. [PDF] [code]

Vikram (topic) [PPT] [PDF]
Vikram (exp) [PPT] [PDF]
Shruti (“for” discussion lead)
Ben (“against” discussion lead)
Apr 11 (Tuesday) Language and vision *Show and Tell: A Neural Image Caption Generator. O. Vinyals, A. Toshev, S. Bengio, D. Erhan, CVPR 2015 [PDF] [code]

VQA: Visual Question Answering. S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, D. Parikh, ICCV 2015 [PDF] [code]

Sequence to Sequence - Video to Text. S. Venugopalan et al. ICCV 2015 [PDF] [code]

Ben (topic)
Shuangfei (exp) [PPT] [PDF]
Xiaolong (“for” discussion lead)
Badour (“against” discussion lead)
Apr 13 (Thursday) Sketches *The Sketchy Database: Learning to Retrieve Badly Drawn Bunnies. P. Sangkloy, N. Burnell, C. Ham, and J. Hays. SIGGRAPH 2016 [PDF] [code]

How Do Humans Sketch Objects? M. Eitz, J. Hays, M. Alexa. SIGGRAPH 2012 [PDF] [project]

Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup. E. Simo-Serra, S. Iizuka, K. Sasaki, H. Ishikawa. SIGGRAPH 2016 [PDF] [code]

Amruta (topic) [PPT] [PDF]
Amruta (exp) [PPT] [PDF]
Yingzhou (“for” discussion lead)
Vikram (“against” discussion lead)
Apr 18 (Tuesday) Cross-modal learning *Cross-Modal Scene Networks. Y. Aytar, L. Castrejón, C. Vondrick, H. Pirsiavash, A. Torralba, arXiv 2016 [PDF]

Visually Indicated Sounds. A. Owens, P. Isola, J. McDermott, A. Torralba, E. H. Adelson, W. T. Freeman, CVPR 2016 [PDF]

Learning Aligned Cross-Modal Representations from Weakly Aligned Data. L. Castrejón, Y. Aytar, C. Vondrick, H. Pirsiavash, A. Torralba, CVPR 2016 [PDF]

Joint Embeddings of Shapes and Images via CNN Image Purification. Y. Li, H. Su, C. R. Qi, N. Fish, D. Cohen-Or, L. Guibas, SIGGRAPH Asia 2015 [PDF] [code]

Ashish (topic)
Pranav (“for” discussion lead)
Akrit (“against” discussion lead)
Applications
Apr 20 (Thursday) Robotics: reinforcement learning *Playing Atari with Deep Reinforcement Learning. V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, NIPS workshop 2013 [PDF] [code]

Human-level control through deep reinforcement learning. Mnih et al. Nature 2015 [PDF] [code]

Sanket (topic) [PPT] [PDF]
Ashish (exp) [PPT] [PDF]
Akrit (“for” discussion lead)
Kevin (“against” discussion lead)
Apr 25 (Tuesday) Graphics: view interpolation *DeepStereo: Learning to Predict New Views from the World's Imagery. J. Flynn, I. Neulander, J. Philbin, N. Snavely, CVPR 2016 [PDF]

Learning-Based View Synthesis for Light Field Cameras. N. K. Kalantari, T.-C. Wang, R. Ramamoorthi, SIGGRAPH Asia 2016 [PDF] [code]

Subhashree (topic)
Subhashree (exp)
Ashish (discussion lead)
Data
Apr 27 (Thursday) Big data *Learning Everything about Anything: Webly-Supervised Visual Concept Learning. S. Divvala, A. Farhadi, and C. Guestrin. CVPR 2014. [PDF] [demo]

Scene Completion using Millions of Photographs. J. Hays and A. Efros. SIGGRAPH 2007. [PDF] [code]

IM2GPS: estimating geographic information from a single image. J. Hays and A. Efros. CVPR 2008 [PDF] [project]

May 2 (Tuesday)