Mapping Materials Science: a multi-modal toolbox to curate broad synthesis procedure databases from scientific literature

Event details

Date 28.04.2026
Hour 15:15 - 16:15
Speaker Magdalena Lederbauer
Location Online
Category Conferences - Seminars
Event Language English
Abstract
Predicting how to synthesize a material is a fundamental challenge in materials discovery, since procedural knowledge is scattered across decades of literature in formats inaccessible to data-driven methods. Without linking how a material is made to how well it performs, predictive models cannot learn to optimize synthesis procedures for target properties.
Here, we present LeMat-Synth, a modular open-source toolbox that transforms unstructured materials science literature into linked, machine-readable synthesis-performance databases at scale. Applied to 81k open-access papers, LeMat-Synth curates structured synthesis procedures spanning 35 synthesis methods and 16 material classes. We demonstrate generalizability through two case studies in thermocatalysis for ammonia decomposition and superconductor discovery, where linked synthesis protocols and digitized performance figures enable data-driven insights previously invisible at scale. Together, these results position LeMat-Synth as a generalizable data infrastructure layer that aims to contribute to autonomous materials discovery.

Biography
Magdalena Lederbauer is a PhD student in chemical engineering and computer science at MIT, working at the intersection of machine learning and the chemical sciences. Her research spans data infrastructure for materials discovery, structure elucidation from mass spectrometry and reaction discovery in catalysis. Trained as a chemist at ETH Zurich, she received the ETH Medal, S.&N. Blank Prize and the Willi Studer Prize for her MSc. As an Entalpic Research Fellow, she leads the working group for large-language-model-driven synthesis data extraction at LeMaterial, an open-source initiative in collaboration with Hugging Face that builds a community-driven ecosystem for materials science data and artificial intelligence.

Practical information

  • Informed public
  • Free

Organizer

  • Philippe Schwaller

Contact

  • Sarina Kopf

Tags

MLSeminar1
