Our Research Mission

We aim to advance the foundations of robust and reliable artificial intelligence. Our research explores how AI systems can perceive, reason, and generalize in the presence of real-world complexity, ambiguity, and uncertainty. We are driven by the belief that robust perception is essential for AI to interact safely and meaningfully with the world around it.

Research Areas

Robust AI

We study how to build AI systems that remain reliable, consistent, and safe when deployed in the real world, where data is messy, environments change, and uncertainty is the norm. Our research focuses on the foundations of robust perception: how machines can interpret sensory input accurately and generalize their understanding across diverse and shifting conditions. We aim to identify the failure modes of modern AI systems and develop principled methods to improve their resilience, from model design and training paradigms to evaluation protocols. Ultimately, our goal is to enable AI systems that can perceive and reason effectively in the open world.

Multi-modal AI

We explore how to build AI systems that can perceive and reason across multiple modalities, such as images, text, and audio, to form a coherent understanding of the world. Inspired by the human brain's ability to integrate diverse sensory inputs into unified representations, we aim to develop AI systems that exhibit similar multi-modal coherence. Our research focuses on learning aligned representations across modalities, developing models that ground language in perception, and enabling complex reasoning that spans visual and linguistic understanding. We are especially interested in the challenges of ambiguity, grounding, and cross-modal generalization, all of which are fundamental to building AI that can interact naturally and intelligently with the world.

Generative AI

We investigate how to build generative AI systems that can produce and manipulate content across a wide range of modalities, including text, images, audio, and video. Generative models hold enormous potential for creativity, communication, and problem-solving, but realizing that potential requires addressing key challenges in coherence, controllability, and safety. Our research explores how to design models that generate content with consistency and fidelity, follow user intent accurately, and remain reliable across diverse and complex prompts. We are particularly interested in the foundations of generative alignment: ensuring that model outputs are not only plausible and creative but also grounded, trustworthy, and appropriate for real-world use.