IC Colloquium: Generative AI Models that Learn from Bad Data

Event details

Date 09.02.2026
Hour 10:15-11:15
Location Online
Category Conferences - Seminars
Event Language English
By: Giannis Daras - MIT
IC Faculty candidate

Abstract
In this talk, we will introduce a principled and practical framework for training generative models with imperfect samples. Recent progress in Generative AI is fuelled by the availability of large-scale, high-quality datasets. However, in many practical applications, high-quality samples are scarce, expensive, or altogether impossible to obtain. Even in data-rich settings (such as the image domain), we are approaching the limits of high-quality human-generated data. We will show how to leverage imperfect data sources, including low-quality, corrupted, synthetic, and out-of-distribution samples, which are cheaper and more widely available. We will instantiate the framework for diffusion models, one of the most powerful classes of generative models, and highlight applications in Computer Vision and Computational Biology that achieve state-of-the-art results for image generation and de novo protein design, respectively. Finally, we will discuss extensions to the textual domain, implications for memorization, privacy, data pricing and data collection, and a future where generative models and datasets co-evolve, refining each other to transcend the original data and explore the solution space.

Bio
Giannis Daras is a Postdoctoral Associate at the Massachusetts Institute of Technology (MIT) supervised by Antonio Torralba and Costis Daskalakis. Giannis obtained his Ph.D. from the Computer Science department of UT Austin under the supervision of Alex Dimakis. Giannis works on important practical and theoretical questions around deep generative models with a focus on training and sampling generative models in the presence of data corruption. His work has found applications across scientific fields such as Computer Vision, Computational Biology, Medical Imaging, Robotics, Economics, and Neuroscience.

Giannis has been nominated as a Rising Star in AI by the University of Michigan, has earned the Best Contribution Award at the Biomedical and Astronomical Signal Processing (BASP) conference, and has published 23 research papers (including 13 first-author works) at top-tier Machine Learning venues, including one Oral Presentation (top 0.3%) and two Spotlight presentations (top 3.2%) at NeurIPS. Giannis has also been supported by multiple fellowships, including the Graduate Dean’s Prestigious Fellowship (UT Austin) and the Onassis, Bodossakis, Leventis, and Gerondellis Ph.D. Fellowships. His work has sparked the interest of the public following coverage by news outlets such as The Independent and the New York Post.

Practical information

  • General public
  • Free

Contact

  • Hosts: Nicolas Flammarion and Michael Gastpar
