ECE 6554: Advanced Computer Vision Spring 2017
Class Schedule
Date | Topic | Reading and Links | Presenter |
Jan 19 (Thursday) | Introduction | | [PPT] [PDF] |
| Visual Recognition | | |
Jan 24 (Tuesday) | Instance recognition | *Object Recognition from Local Scale-Invariant Features. D. Lowe, ICCV 1999 [PDF] [code]
Video Google: A Text Retrieval Approach to Object Matching in Videos. J. Sivic and A. Zisserman, ICCV 2003 [PDF] [demo]
Local Invariant Feature Detectors: A Survey. T. Tuytelaars and K. Mikolajczyk, Foundations and Trends in Computer Graphics and Vision, 2008 [PDF] [code] (pp. 178-188, 216-220, 254-255)
| [PPT] [PDF] |
Jan 26 (Thursday) | Category recognition | *ImageNet Classification with Deep Convolutional Neural Networks. A. Krizhevsky, I. Sutskever, G. E. Hinton. NIPS 2012 [PDF]
Very Deep Convolutional Networks for Large-Scale Image Recognition. K. Simonyan and A. Zisserman, ICLR 2015 [PDF] [code]
Deep Residual Learning for Image Recognition. K. He, X. Zhang, S. Ren, J. Sun. CVPR 2016 [PDF] [code]
Densely Connected Convolutional Networks. G. Huang, Z. Liu, K. Q. Weinberger, L. van der Maaten. arXiv 2016 [PDF] [code]
| [PPT] [PDF] |
Jan 31 (Tuesday) | Supervised pretraining | *How transferable are features in deep neural networks? J. Yosinski, J. Clune, Y. Bengio, and H. Lipson. NIPS 2014. [PDF] [code]
What makes ImageNet good for transfer learning? M. Huh, P. Agrawal, A. A. Efros, arXiv 2016 [PDF]
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, T. Darrell, JMLR 2014 [PDF] [code]
| [PPT] [PDF]
Akrit (exp) |
Feb 2 (Thursday) | Understanding deep networks | *Visualizing and Understanding Convolutional Networks. M. D. Zeiler, R. Fergus, ECCV 2014 [PDF]
Understanding Neural Networks Through Deep Visualization. J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, H. Lipson, ICML workshop 2015 [PDF] [code]
Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. A Nguyen, J Yosinski, J Clune, CVPR 2015 [PDF] [code]
Understanding Deep Image Representations by Inverting Them. A. Mahendran, A. Vedaldi, CVPR 2015 [PDF] [code]
| Pranav (topic) Pranav (exp) [PPT] Rammohan (“for” discussion lead) Sanket (“against” discussion lead)
|
Feb 7 (Tuesday) | Detection | *Rich feature hierarchies for accurate object detection and semantic segmentation. R. Girshick, J. Donahue, T. Darrell, J. Malik. CVPR 2014 [PDF] [code]
R-FCN: Object Detection via Region-based Fully Convolutional Networks. J. Dai, Y. Li, K. He, J. Sun, NIPS 2016 [PDF] [code]
SSD: Single Shot MultiBox Detector. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, ECCV 2016 [PDF] [code]
YOLO9000: Better, Faster, Stronger. J. Redmon, A. Farhadi, arXiv 2016 [PDF] [code]
| [PPT] [PDF] Yingzhou (exp) [PPT] [PDF] Subhashree (“for” discussion lead) Yousi (“against” discussion lead)
|
Feb 9 (Thursday) | Segmentation | *Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille. ICLR 2015 [PDF] [code]
Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell, CVPR 2015 [PDF] [code]
Instance-sensitive Fully Convolutional Networks. J. Dai, K. He, Y. Li, S. Ren, J. Sun, ECCV 2016 [PDF] [code]
| Xiaolong (topic, exp) [PDF] Shuangfei (“for” discussion lead) Shruti (“against” discussion lead)
|
Feb 14 (Tuesday) | Siamese/Triplet Networks | *FaceNet: A unified embedding for face recognition and clustering. F. Schroff, D. Kalenichenko, J. Philbin, CVPR 2015 [PDF] [code]
Computing the stereo matching cost with a convolutional neural network. J. Žbontar, Y. LeCun, CVPR 2015 [PDF] [code]
Learning Deep Representations for Ground-to-Aerial Geolocalization. T.-Y. Lin, Y. Cui, S. Belongie, J. Hays, CVPR 2015 [PDF] [code]
Efficient deep learning for stereo matching. W Luo, AG Schwing, R Urtasun, CVPR 2016 [PDF] [code]
| [PPT] [PDF]
|
Feb 16 (Thursday) | Pose | *Convolutional Pose Machines. S.-E. Wei, V. Ramakrishna, T. Kanade, Y. Sheikh, CVPR 2016 [PDF][code]
DeepPose: Human pose estimation via deep neural networks. A. Toshev, C. Szegedy, CVPR 2014 [PDF]
Articulated pose estimation with flexible mixtures-of-parts. Y. Yang, D. Ramanan, CVPR 2011 [PDF] [code]
| Sujay (topic) [PPT] [PDF] Amruta (“for” discussion lead) Yingzhou (“against” discussion lead)
|
| Representation Learning | | |
Feb 21 (Tuesday) | Attributes | *Describing Objects by their Attributes, A. Farhadi, I. Endres, D. Hoiem and D. Forsyth, CVPR, 2009. [PDF] [code]
Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer. C. H. Lampert, H. Nickisch, S. Harmeling, CVPR 2009 [PDF] [code]
Relative Attributes. D. Parikh, K. Grauman, ICCV 2011 [PDF] [code]
| Shuangfei (topic, exp) [PPT] [PDF] Vikram (“for” discussion lead) Sujay (“against” discussion lead)
|
Feb 23 (Thursday) | Self-supervised learning | *Unsupervised Visual Representation Learning by Context Prediction. C. Doersch, A. Gupta, A. A. Efros. ICCV 2015 [PDF] [code]
Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles. M. Noroozi and P. Favaro, ECCV 2016 [PDF] [code]
Context Encoders: Feature Learning by Inpainting. D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, A. A. Efros, CVPR 2016 [PDF] [code]
Unsupervised Learning of Visual Representations using Videos. X. Wang, A. Gupta, ICCV 2015 [PDF] [code]
| Badour (topic, exp) [PPT] [PDF] Ashish (“for” discussion lead) Xiaolong (“against” discussion lead)
|
Feb 28 (Tuesday) | Generative adversarial networks | *Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. A. Radford, L. Metz, S. Chintala, ICLR, 2016 [PDF] [code]
Generative Adversarial Nets. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, NIPS 2014 [PDF] [code]
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, P. Abbeel, NIPS 2016 [PDF] [code]
| Akrit (topic) [PPT] [PDF] Ben (“for” discussion lead) Shuangfei (“against” discussion lead)
|
Mar 2 (Thursday) | Conditional adversarial networks | *Image-to-Image Translation with Conditional Adversarial Networks. P. Isola, J.-Y. Zhu, T. Zhou, A. A. Efros, arXiv 2016 [PDF] [code]
Scribbler: Controlling Deep Image Synthesis with Sketch and Color. P. Sangkloy, J. Lu, C. Fang, F. Yu, J. Hays, arXiv 2016 [PDF] [code]
Generative Adversarial Text to Image Synthesis. S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, H. Lee, ICML 2016 [PDF] [code]
Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space. A. Nguyen, J. Yosinski, Y. Bengio, A. Dosovitskiy, and J. Clune, arXiv 2016 [PDF] [code]
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks. H. Zhang, T. Xu, H. Li, S. Zhang, X. Huang, X. Wang, D. Metaxas, arXiv 2016 [PDF] [code]
| Jia-Bin (topic) [PPT] [PDF] Sanket (exp) [PPT] [PDF] Yousi (“for” discussion lead) Pranav (“against” discussion lead)
|
Mar 7 (Tuesday) | Spring Break (no class) | | |
Mar 9 (Thursday) | Spring Break (no class) | | |
Mar 14 (Tuesday) | Style and content | *Image style transfer using convolutional neural networks. L. A. Gatys, A. S. Ecker, M. Bethge, CVPR 2016 [PDF] [code]
Perceptual Losses for Real-Time Style Transfer and Super-Resolution. J. Johnson, A. Alahi, L. Fei-Fei, ECCV 2016 [PDF] [code]
Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis. C. Li, M. Wand, CVPR 2016 [PDF] [code]
| Kevin (topic) [PPT] [PDF] Sujay (exp) [PPT] [PDF] Sanket (“for” discussion lead) Rammohan (“against” discussion lead)
|
Mar 16 (Thursday) | ICCV (no class) | | |
| Activity and Event | | |
Mar 21 (Tuesday) | Research Skills and Final Project | | |
Mar 23 (Thursday) | Action recognition | *Action recognition by dense trajectories. H Wang, A Kläser, C Schmid, CVPR 2011 [PDF]
Two-stream convolutional networks for action recognition in videos. K Simonyan, A Zisserman, NIPS 2014 [PDF] [code]
Large-scale video classification with convolutional neural networks. A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, L. Fei-Fei, CVPR 2014 [PDF] [code]
Learning Spatiotemporal Features with 3D Convolutional Networks. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. ICCV 2015 [PDF] [code]
| Yingzhou (topic) [PPT] [PDF] Kevin (exp) [PPT] [PDF] Sujay (“for” discussion lead) Amruta (“against” discussion lead)
|
Mar 28 (Tuesday) | Active perception | *Look-Ahead Before You Leap: End-to-End Active Recognition by Forecasting the Effect of Motion. D. Jayaraman and K. Grauman. ECCV 2016. [PDF]
The Curious Robot: Learning Visual Representations via Physical Interactions. L. Pinto, D. Gandhi, Y. Han, Y-L. Park, and A. Gupta. ECCV 2016. [PDF]
Learning to Poke by Poking: Experiential Learning of Intuitive Physics. P. Agrawal, A. Nair, P. Abbeel, J. Malik, S. Levine. 2016 [PDF] [project]
| Shruti (topic) [PPT] [PDF] Kevin (“for” discussion lead) Ashish (“against” discussion lead)
|
Mar 30 (Thursday) | Group of objects | *Visual relationship detection with language priors. C Lu, R Krishna, M Bernstein, L Fei-Fei, ECCV 2016 [PDF] [code]
Recognition using Visual Phrases. A. Farhadi, M. A. Sadeghi, CVPR 2011 [PDF] [code]
Where are they looking? A. Recasens, A. Khosla*, C. Vondrick, A. Torralba, NIPS 2015 [PDF] [demo]
| Yousi (exp) [PDF]
|
Apr 4 (Tuesday) | First-person vision | *Force from Motion: Decoding Physical Sensation from a First Person Video. H.S. Park, J-J. Hwang and J. Shi, CVPR 2016 [PDF] [code]
Learning to Predict Gaze in Egocentric Video. Y. Li, A. Fathi, and J. Rehg. ICCV 2013. [PDF] [data]
| Yousi (topic) [PDF] Ben (exp) Badour (“for” discussion lead) Subhashree (“against” discussion lead)
|
| Multi-modality | | |
Apr 6 (Thursday) | Recurrent neural networks | *Visualizing and Understanding Recurrent Networks. A. Karpathy, J. Johnson, L. Fei-Fei. ICLR 2016 Workshop. [PDF][code]
Recurrent neural network based language model. T. Mikolov, M. Karafiát, L. Burget, J. Černocký, S. Khudanpur. Interspeech 2010. [PDF] [code]
| Vikram (topic) [PPT][PDF] Vikram (exp) [PPT][PDF] Shruti (“for” discussion lead) Ben (“against” discussion lead) |
Apr 11 (Tuesday) | Language and vision | *Show and Tell: A Neural Image Caption Generator. O. Vinyals, A. Toshev, S. Bengio, D. Erhan, CVPR 2015 [PDF] [code]
VQA: Visual Question Answering. S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, D. Parikh, ICCV 2015 [PDF][code]
Sequence to Sequence - Video to Text. S. Venugopalan et al. ICCV 2015 [PDF] [code]
| Ben (topic) Shuangfei (exp) [PPT][PDF] Xiaolong (“for” discussion lead) Badour (“against” discussion lead)
|
Apr 13 (Thursday) | Sketches | *The Sketchy Database: Learning to Retrieve Badly Drawn Bunnies. P. Sangkloy, N. Burnell, C. Ham, and J. Hays. SIGGRAPH 2016 [PDF] [code]
How Do Humans Sketch Objects? M. Eitz, J. Hays, M. Alexa. SIGGRAPH 2012 [PDF] [project]
Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup. E. Simo-Serra, S. Iizuka, K. Sasaki, H. Ishikawa. SIGGRAPH 2016 [PDF] [code]
| Amruta (topic) [PPT] [PDF] Amruta (exp) [PPT] [PDF] Yingzhou (“for” discussion lead) Vikram (“against” discussion lead)
|
Apr 18 (Tuesday) | Cross-modal learning | *Cross-Modal Scene Networks. Y. Aytar, L. Castrejon, C. Vondrick, H. Pirsiavash, A. Torralba, arXiv 2016 [PDF]
Visually Indicated Sounds. A. Owens, P. Isola, J. McDermott, A. Torralba, E. H. Adelson, W. T. Freeman, CVPR 2016 [PDF]
Learning Aligned Cross-Modal Representations from Weakly Aligned Data. L. Castrejón, Y. Aytar, C. Vondrick, H. Pirsiavash, A. Torralba, CVPR 2016 [PDF]
Joint Embeddings of Shapes and Images via CNN Image Purification. Y. Li, H. Su, C. R. Qi, N. Fish, D. Cohen-Or, L. Guibas, SIGGRAPH 2015 [PDF] [code]
| Ashish (topic) Pranav (“for” discussion lead) Akrit (“against” discussion lead)
|
| Applications | | |
Apr 20 (Thursday) | Robotics: reinforcement learning | *Playing Atari with Deep Reinforcement Learning. V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, NIPS workshop 2013 [PDF] [code]
Human-level control through deep reinforcement learning. Mnih et al. Nature 2015 [PDF] [code]
| Sanket (topic) [PPT][PDF] Ashish (exp) [PPT][PDF] Akrit (“for” discussion lead) Kevin (“against” discussion lead) |
Apr 25 (Tuesday) | Graphics: view interpolation | *DeepStereo: Learning to Predict New Views from the World's Imagery. J. Flynn, I. Neulander, J. Philbin, N. Snavely, CVPR 2016 [PDF]
Learning-Based View Synthesis for Light Field Cameras. N. K. Kalantari, T.-C. Wang, R. Ramamoorthi, SIGGRAPH Asia 2016 [PDF] [code]
| Subhashree (topic) Subhashree (exp) Ashish (discussion lead)
|
| Data | | |
Apr 27 (Thursday) | Big data | *Learning Everything about Anything: Webly-Supervised Visual Concept Learning. S. Divvala, A. Farhadi, and C. Guestrin. CVPR 2014. [PDF] [demo]
Scene Completion using Millions of Photographs. J. Hays and A. Efros. SIGGRAPH 2007. [PDF] [code]
IM2GPS: estimating geographic information from a single image. J. Hays and A. Efros. CVPR 2008 [PDF] [project]
|
|
| May 2 (Tuesday) |
|
|
|