Seminar by Peter Vickers, PhD student at the University of Sheffield
Location: IC2 Boardroom, online
Speaker: Peter Vickers
Multimodal Reasoning: Datasets, Models, and even some commonsense?
My PhD project investigates multimodal question answering over text, vision, and knowledge graphs. How can we pose difficult questions for AI to answer, questions that force complex reasoning over multiple modalities? What biases come into play when we create datasets to capture these questions? And do the state-of-the-art models for answering these questions actually attend to the data, or do they exploit shortcuts? I attempt to answer these questions and outline how we can ask genuinely challenging questions that require deep, grounded multimodal understanding and complex reasoning.