I am currently a Ph.D. student in the Virginia Tech Computer Vision Lab, advised by Prof. Devi Parikh, and I also work closely with Prof. Dhruv Batra. My research interests lie at the intersection of computer vision and natural language processing. My current research focuses on vision and language modeling using deep learning.

Prior to that, I received an M.S. degree in Computer Science from the University at Buffalo, where I worked with Prof. Jason Corso. I earned my Bachelor's degree from Nanjing University of Posts and Telecommunications, Nanjing, China. Here is my full CV.

Email: jiasenlu at vt dot edu


One paper has been accepted to CVPR 2017! Source code is on the way!
I will join Facebook AI Research for an internship in Spring 2017!
One paper has been accepted to NIPS 2016!
I joined MetaMind for an internship in Summer 2016, working with Dr. Richard Socher and Dr. Caiming Xiong.
The Torch code for our arXiv paper Hierarchical Co-Attention has been released!
Our new work on visual question answering is posted on arXiv; source code is on the way!

Publications [google scholar]

Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning.
Jiasen Lu, Caiming Xiong, Devi Parikh, Richard Socher.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
[pdf] [more visualization demo] [code]
VQA: Visual Question Answering.
Aishwarya Agrawal*, Jiasen Lu*, Stanislaw Antol*, Margaret Mitchell, C. Lawrence Zitnick, Devi Parikh, Dhruv Batra.
International Journal of Computer Vision (IJCV)
[pdf] [code] [project page]
Hierarchical Question-Image Co-Attention for Visual Question Answering.
Jiasen Lu, Jianwei Yang, Dhruv Batra, Devi Parikh.
Neural Information Processing Systems (NIPS), 2016
[pdf] [code]
VQA: Visual Question Answering.
Stanislaw Antol*, Aishwarya Agrawal*, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh.
International Conference on Computer Vision (ICCV), 2015
[pdf] [code] [project page]
Human Action Segmentation with Hierarchical Supervoxel Consistency.
Jiasen Lu, Ran Xu, Jason J. Corso.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
[pdf] [code]
Improving Word Representations via Global Visual Context.
Ran Xu, Jiasen Lu, Caiming Xiong, Zhi Yang, Jason J. Corso.
NIPS workshop on Learning Semantics, 2014
[pdf] [code]

Projects [github]

An implementation of the deeper LSTM and normalized CNN Visual Question Answering model.

An implementation of the (Convolutional) Deep Structured Semantic Model in Torch, with new Sparse Linear and Sparse Temporal Convolution layers.


Reviewer for Neural Information Processing Systems (NIPS) 2016, June-July 2016
Student Organizer for the VQA Challenge Workshop at CVPR 2016, June 2016
Student Volunteer for IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015