
ZurichNLP #16

Wed 21 May | Zürich

Talks by Ivan Vulić (University of Cambridge/Google DeepMind) on vision-language models for spatial reasoning and Catalina Torres (University of Zurich) on Swiss German's unique grammar.


Time & Location

21 May 2025, 18:00 – 20:00

Zürich, OAT ETH Zurich (14th floor), Andreasstrasse 5, 8050 Zürich, Switzerland

About the Event

Ivan Vulić (University of Cambridge/Google DeepMind): Guiding Vision-Language Models to Climb the Mountain of Spatial Reasoning


Large Vision-Language Models (VLMs) have demonstrated impressive performance in general vision-language tasks. However, even the most recent and most powerful VLMs still struggle with simple spatial understanding and reasoning. In this talk, I will first provide a brief overview of our recent work on creating new benchmarks and improving the evaluation of VLMs on a range of spatial reasoning tasks. I will then outline our novel methods for enhancing the spatial reasoning capabilities of VLMs, such as multi-modal visualization-of-thought and purely visual planning, with a focus on spatial navigation tasks.

Catalina Torres (University of Zurich): How Swiss German Helps Us Understand Grammar Better


© ZurichAI 2022-2100
