ViGIL Spotlight Talk @ NeurIPS 2019

Abstract

This talk introduces our recent paper, VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering. We investigate the feasibility of EQA-type tasks by building a novel benchmark that contains pairs of questions and videos generated in the House3D environment. While removing the navigation and action selection requirements from EQA, we increase the difficulty of the visual reasoning component via a much larger question space, tackling the sort of complex reasoning questions that make QA tasks challenging. By designing and evaluating several VQA-style models on the dataset, we establish a novel way of evaluating EQA feasibility given existing methods, while highlighting the difficulty of the problem even in the most ideal setting.

Date
Dec 13, 2019 10:30 AM
Location
Vancouver Convention Center
1055 Canada Pl, Vancouver, BC, V6C 0C3, Canada
Dr Cătălina Cangea
Staff Research Scientist

Staff Research Scientist with a decade of ML experience, former co-lead of Generative Music at Google DeepMind, with a PhD from the University of Cambridge, and inhaler of music :)