ECE 6554: Advanced Computer Vision Spring 2017

Department of Electrical and Computer Engineering, Virginia Tech

Class Schedule

Date	Topic	Reading and Links	Presenter
Jan 19 (Thursday)	Introduction		[PPT] [PDF]
	Visual Recognition
Jan 24 (Tuesday)	Instance recognition	*Object Recognition from Local Scale-Invariant Features. D. Lowe, ICCV 1999 [PDF] [code] Video Google: A Text Retrieval Approach to Object Matching in Videos. J. Sivic and A. Zisserman, ICCV 2003 [PDF] [demo] Local Invariant Feature Detectors: A Survey. T. Tuytelaars and K. Mikolajczyk, Foundations and Trends in Computer Graphics and Vision, 2008 [PDF] [code] (pp. 178-188, 216-220, 254-255)	[PPT] [PDF]
Jan 26 (Thursday)	Category recognition	*ImageNet Classification with Deep Convolutional Neural Networks. A. Krizhevsky, I. Sutskever, G. E Hinton. NIPS 2012 [PDF] Very Deep Convolutional Networks for Large-Scale Visual Recognition. K. Simonyan and A. Zisserman, ICLR 2015 [PDF] [code] Deep Residual Learning for Image Recognition. K. He, X. Zhang, S. Ren, J. Sun. CVPR 2016 [PDF] [code] Densely Connected Convolutional Networks. G. Huang, Z. Liu, K. Q. Weinberger, L. Maaten. arXiv 2016 [PDF] [code]	[PPT] [PDF]
Jan 31 (Tuesday)	Supervised pretraining	*How transferable are features in deep neural networks? J. Yosinski, J. Clune, Y. Bengio, and H. Lipson. NIPS 2014. [PDF] [code] What makes ImageNet good for transfer learning? M. Huh, P. Agrawal, A. A. Efros, arXiv 2016 [PDF] DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, T. Darrell, JMLR 2014 [PDF] [code]	[PPT] [PDF] Akrit (exp)
Feb 2 (Thursday)	Understanding deep networks	*Visualizing and Understanding Convolutional Networks. M. D Zeiler, R. Fergus, ECCV 2014 [PDF] Understanding Neural Networks Through Deep Visualization. J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, H. Lipson, ICML workshop 2015 [PDF] [code] Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. A Nguyen, J Yosinski, J Clune, CVPR 2015 [PDF] [code] Understanding Deep Image Representations by Inverting Them. A. Mahendran, A. Vedaldi, CVPR 2015 [PDF] [code]	Pranav (topic) Pranav (exp) [PPT] Rammohan (“for” discussion lead) Sanket (“against” discussion lead)
Feb 7 (Tuesday)	Detection	*Rich feature hierarchies for accurate object detection and semantic segmentation. R. Girshick, J. Donahue, T. Darrell, J. Malik. CVPR 2014 [PDF] [code] R-FCN: Object Detection via Region-based Fully Convolutional Networks. J. Dai, Y. Li, K. He, J. Sun, NIPS 2016 [PDF] [code] SSD: Single Shot MultiBox Detector. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, ECCV 2016 [PDF] [code] YOLO9000: Better, Faster, Stronger. J. Redmon, A. Farhadi, arXiv 2016 [PDF] [code]	[PPT] [PDF] Yingzhou (exp) [PPT] [PDF] Subhashree (“for” discussion lead) Yousi (“against” discussion lead)
Feb 9 (Thursday)	Segmentation	*Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille. ICLR 2015 [PDF] [code] Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell, CVPR 2015 [PDF] [code] Instance-sensitive Fully Convolutional Networks. J. Dai, K. He, Y. Li, S. Ren, J. Sun, ECCV 2016 [PDF] [code]	Xiaolong (topic, exp) [PDF] Shuangfei (“for” discussion lead) Shruti (“against” discussion lead)
Feb 14 (Tuesday)	Siamese/Triplet Networks	*FaceNet: A unified embedding for face recognition and clustering. F. Schroff, D. Kalenichenko, J. Philbin, CVPR 2015 [PDF] [code] Computing the stereo matching cost with a convolutional neural network. J. Žbontar, Y. LeCun, CVPR 2015 [PDF] [code] Learning Deep Representations for Ground-to-Aerial Geolocalization. T.-Y. Lin, Y. Cui, S. Belongie, J. Hays, CVPR 2015 [PDF] [code] Efficient deep learning for stereo matching. W. Luo, A. G. Schwing, R. Urtasun, CVPR 2016 [PDF] [code]	[PPT] [PDF]
Feb 16 (Thursday)	Pose	*Convolutional Pose Machines. S.-E. Wei, V. Ramakrishna, T. Kanade, Y. Sheikh, CVPR 2016 [PDF][code] DeepPose: Human pose estimation via deep neural networks. A. Toshev, C. Szegedy, CVPR 2014 [PDF] Articulated pose estimation with flexible mixtures-of-parts. Y. Yang, D. Ramanan, CVPR 2011 [PDF] [code]	Sujay (topic) [PPT] [PDF] Amruta (“for” discussion lead) Yingzhou (“against” discussion lead)
	Representation Learning
Feb 21 (Tuesday)	Attributes	*Describing Objects by their Attributes, A. Farhadi, I. Endres, D. Hoiem and D. Forsyth, CVPR, 2009. [PDF] [code] Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer. C. H. Lampert, H. Nickisch, S. Harmeling, CVPR 2009 [PDF] [code] Relative Attributes. D. Parikh, K. Grauman, ICCV 2011 [PDF] [code]	Shuangfei (topic, exp) [PPT] [PDF] Vikram (“for” discussion lead) Sujay (“against” discussion lead)
Feb 23 (Thursday)	Self-supervised learning	*Unsupervised Visual Representation Learning by Context Prediction. C. Doersch, A. Gupta, A. A. Efros. ICCV 2015 [PDF] [code] Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles. M. Noroozi and P. Favaro, ECCV 2016 [PDF] [code] Context Encoders: Feature Learning by Inpainting. D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, A. A. Efros, CVPR 2016 [PDF] [code] Unsupervised Learning of Visual Representations using Videos. X. Wang, A. Gupta, ICCV 2015 [PDF] [code]	Badour (topic, exp) [PPT] [PDF] Ashish (“for” discussion lead) Xiaolong (“against” discussion lead)
Feb 28 (Tuesday)	Generative adversarial networks	*Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. A. Radford, L. Metz, S. Chintala, ICLR 2016 [PDF] [code] Generative Adversarial Nets. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, NIPS 2014 [PDF] [code] InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, P. Abbeel, NIPS 2016 [PDF] [code]	Akrit (topic) [PPT] [PDF] Ben (“for” discussion lead) Shuangfei (“against” discussion lead)
Mar 2 (Thursday)	Conditional adversarial networks	*Image-to-Image Translation with Conditional Adversarial Networks. P. Isola, J.-Y. Zhu, T. Zhou, A. A. Efros, arXiv 2016 [PDF] [code] Scribbler: Controlling Deep Image Synthesis with Sketch and Color. P. Sangkloy, J. Lu, C. Fang, F. Yu, J. Hays, arXiv 2016 [PDF] [code] Generative Adversarial Text to Image Synthesis. S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, H. Lee, ICML 2016 [PDF] [code] Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space. A. Nguyen, J. Yosinski, Y. Bengio, A. Dosovitskiy, and J. Clune, arXiv 2016 [PDF] [code] StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks. H. Zhang, T. Xu, H. Li, S. Zhang, X. Huang, X. Wang, D. Metaxas, arXiv 2016 [PDF] [code]	Jia-Bin (topic) [PPT] [PDF] Sanket (exp) [PPT] [PDF] Yousi (“for” discussion lead) Pranav (“against” discussion lead)
Mar 7 (Tuesday)	Spring Break (no class)
Mar 9 (Thursday)	Spring Break (no class)
Mar 14 (Tuesday)	Style and content	*Image style transfer using convolutional neural networks. L. A. Gatys, A. S. Ecker, M. Bethge, CVPR 2016 [PDF] [code] Perceptual Losses for Real-Time Style Transfer and Super-Resolution. J. Johnson, A. Alahi, L. Fei-Fei, ECCV 2016 [PDF] [code] Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis. C. Li, M. Wand, CVPR 2016 [PDF] [code]	Kevin (topic) [PPT] [PDF] Sujay (exp) [PPT] [PDF] Sanket (“for” discussion lead) Rammohan (“against” discussion lead)
Mar 16 (Thursday)	ICCV (no class)
	Activity and Event
Mar 21 (Tuesday)	Research Skills and Final Project
Mar 23 (Thursday)	Action recognition	*Action recognition by dense trajectories. H Wang, A Kläser, C Schmid, CVPR 2011 [PDF] Two-stream convolutional networks for action recognition in videos. K Simonyan, A Zisserman, NIPS 2014 [PDF] [code] Large-scale video classification with convolutional neural networks. A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, L. Fei-Fei, CVPR 2014 [PDF] [code] Learning Spatiotemporal Features with 3D Convolutional Networks. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. ICCV 2015 [PDF] [code]	Yingzhou (topic) [PPT] [PDF] Kevin (exp) [PPT] [PDF] Sujay (“for” discussion lead) Amruta (“against” discussion lead)
Mar 28 (Tuesday)	Active perception	*Look-Ahead Before You Leap: End-to-End Active Recognition by Forecasting the Effect of Motion. D. Jayaraman and K. Grauman. ECCV 2016. [PDF] The Curious Robot: Learning Visual Representations via Physical Interactions. L. Pinto, D. Gandhi, Y. Han, Y-L. Park, and A. Gupta. ECCV 2016. [PDF] Learning to Poke by Poking: Experiential Learning of Intuitive Physics. P. Agrawal, A. Nair, P. Abbeel, J. Malik, S. Levine. 2016 [PDF] [project]	Shruti (topic) [PPT] [PDF] Kevin (“for” discussion lead) Ashish (“against” discussion lead)
Mar 30 (Thursday)	Group of objects	*Visual relationship detection with language priors. C Lu, R Krishna, M Bernstein, L Fei-Fei, ECCV 2016 [PDF] [code] Recognition using Visual Phrases. A. Farhadi, M. A. Sadeghi, CVPR 2011 [PDF] [code] Where are they looking? A. Recasens, A. Khosla*, C. Vondrick, A. Torralba, NIPS 2015 [PDF] [demo]	Yousi (exp) [PDF]
Apr 4 (Tuesday)	First-person vision	*Force from Motion: Decoding Physical Sensation from a First Person Video. H. S. Park, J.-J. Hwang, and J. Shi, CVPR 2016 [PDF] [code] Learning to Predict Gaze in Egocentric Video. Y. Li, A. Fathi, and J. Rehg. ICCV 2013. [PDF] [data]	Yousi (topic) [PDF] Ben (exp) Badour (“for” discussion lead) Subhashree (“against” discussion lead)
	Multi-modality
Apr 6 (Thursday)	Recurrent neural networks	*Visualizing and Understanding Recurrent Networks. A. Karpathy, J. Johnson, L. Fei-Fei. ICLR 2016 Workshop. [PDF] [code] Recurrent neural network based language model. T. Mikolov, M. Karafiát, L. Burget, J. Černocký, S. Khudanpur. Interspeech 2010. [PDF] [code]	Vikram (topic) [PPT] [PDF] Vikram (exp) [PPT] [PDF] Shruti (“for” discussion lead) Ben (“against” discussion lead)
Apr 11 (Tuesday)	Language and vision	*Show and Tell: A Neural Image Caption Generator. O. Vinyals, A. Toshev, S. Bengio, D. Erhan, CVPR 2015 [PDF] [code] VQA: Visual Question Answering. S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, D. Parikh, ICCV 2015 [PDF] [code] Sequence to Sequence - Video to Text. S. Venugopalan et al. ICCV 2015 [PDF] [code]	Ben (topic) Shuangfei (exp) [PPT] [PDF] Xiaolong (“for” discussion lead) Badour (“against” discussion lead)
Apr 13 (Thursday)	Sketches	*The Sketchy Database: Learning to Retrieve Badly Drawn Bunnies. P. Sangkloy, N. Burnell, C. Ham, and J. Hays. SIGGRAPH 2016 [PDF] [code] How Do Humans Sketch Objects? M. Eitz, J. Hays, M. Alexa. SIGGRAPH 2012 [PDF] [project] Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup. E. Simo-Serra, S. Iizuka, K. Sasaki, H. Ishikawa. SIGGRAPH 2016 [PDF] [code]	Amruta (topic) [PPT] [PDF] Amruta (exp) [PPT] [PDF] Yingzhou (“for” discussion lead) Vikram (“against” discussion lead)
Apr 18 (Tuesday)	Cross-modal learning	*Cross-Modal Scene Networks. Y. Aytar, L. Castrejón, C. Vondrick, H. Pirsiavash, A. Torralba, arXiv 2016 [PDF] Visually Indicated Sounds. A. Owens, P. Isola, J. McDermott, A. Torralba, E. H. Adelson, W. T. Freeman, CVPR 2016 [PDF] Learning Aligned Cross-Modal Representations from Weakly Aligned Data. L. Castrejón, Y. Aytar, C. Vondrick, H. Pirsiavash, A. Torralba, CVPR 2016 [PDF] Joint Embeddings of Shapes and Images via CNN Image Purification. Y. Li, H. Su, C. R. Qi, N. Fish, D. Cohen-Or, L. Guibas, SIGGRAPH 2015 [PDF] [code]	Ashish (topic) Pranav (“for” discussion lead) Akrit (“against” discussion lead)
	Applications
Apr 20 (Thursday)	Robotics: reinforcement learning	*Playing Atari with Deep Reinforcement Learning. V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, NIPS workshop 2013 [PDF] [code] Human-level control through deep reinforcement learning. Mnih et al. Nature 2015 [PDF] [code]	Sanket (topic) [PPT][PDF] Ashish (exp) [PPT][PDF] Akrit (“for” discussion lead) Kevin (“against” discussion lead)
Apr 25 (Tuesday)	Graphics: view interpolation	*DeepStereo: Learning to Predict New Views from the World's Imagery. J. Flynn, I. Neulander, J. Philbin, N. Snavely, CVPR 2016 [PDF] Learning-Based View Synthesis for Light Field Cameras. N. K. Kalantari, T.-C. Wang, R. Ramamoorthi, SIGGRAPH Asia 2016 [PDF] [code]	Subhashree (topic) Subhashree (exp) Ashish (discussion lead)
	Data
Apr 27 (Thursday)	Big data	*Learning Everything about Anything: Webly-Supervised Visual Concept Learning. S. Divvala, A. Farhadi, and C. Guestrin. CVPR 2014. [PDF] [demo] Scene Completion using Millions of Photographs. J. Hays and A. Efros. SIGGRAPH 2007. [PDF] [code] IM2GPS: Estimating Geographic Information from a Single Image. J. Hays and A. Efros. CVPR 2008. [PDF] [project]
May 2 (Tuesday)