Textbook Question Answering


The TQA dataset encourages work on the Multi-Modal Machine Comprehension (M3C) task. M3C builds on the popular Visual Question Answering (VQA) and Machine Comprehension (MC) paradigms by framing question answering as a machine comprehension task in which the context needed to answer each question is provided and consists of both text and images. The dataset built to showcase this task is drawn from a middle school science curriculum and pairs each question with the limited span of knowledge needed to answer it.


The training and validation sets can be downloaded here. The test set will be made available in June 2017.


If you find TQA helpful in your work, please cite:

title={Are You Smarter Than A Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension},
author={Aniruddha Kembhavi and Minjoon Seo and Dustin Schwenk and Jonghyun Choi and Ali Farhadi and Hannaneh Hajishirzi},
booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},