Molecular set representation learning

Event details

Date 24.10.2023
Hour 15:15-16:15
Speaker Daniel received his BSc in computer science from the Bern University of Applied Sciences in 2013 and his MSc in Bioinformatics and Computational Biology from the University of Bern in 2016. In 2020 he received his PhD in Chemistry and Molecular Sciences from the University of Bern for his thesis “Scalable Methods for the Exploration and Visualization of Large Chemical Spaces”, under the supervision of Prof. Jean-Louis Reymond. His main research interest is efficient machine learning and data visualisation applied to the natural sciences, focusing on the intersection of chemistry and biology. After two years as a permanent research staff member at IBM Research, working on machine learning for biocatalysis in the team of Teodoro Laino, he joined the group of Prof. Pierre Vandergheynst at EPFL as a postdoctoral researcher.
Location
Category Conferences - Seminars
Event Language English

Computational representations of molecules can take many forms, including graphs, string encodings of graphs, binary vectors, or learned embeddings in the form of real-valued vectors. These representations are then used in downstream classification and regression tasks with a wide range of machine-learning models. However, existing models come with limitations, such as the requirement for clearly defined chemical bonds, which often do not represent the true underlying nature of a molecule. Here, we propose a framework for molecular machine learning tasks based on set representation learning. We show that learning on sets of atomic invariants alone reaches the performance of state-of-the-art graph-based models on the most-used chemical benchmark data sets, and that introducing a set representation layer into graph neural networks can surpass the performance of established methods in the domains of chemistry, biology, and materials science. We introduce specialised set representation-based neural network architectures for reaction yield and protein-ligand binding affinity prediction. Overall, we show that the technique we call molecular set representation learning is both an alternative and an extension to graph neural network architectures for machine learning tasks on molecules, molecule complexes, and chemical reactions.
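
To make the idea of learning on a set of atomic invariants more concrete, the following is a minimal sketch of a permutation-invariant set encoder that uses sum pooling. It is not the architecture presented in the talk. The choice of invariants, the function names `phi` and `encode_set`, the embedding size, and the random weights are all assumptions made for illustration only.

```python
import math
import random


def phi(atom, weights):
    """Embed one atom's invariant vector with a single tanh layer."""
    return [math.tanh(sum(w * x for w, x in zip(row, atom))) for row in weights]


def encode_set(atoms, weights):
    """Sum-pool the per-atom embeddings; no bond list is used."""
    pooled = [0.0] * len(weights)
    for atom in atoms:
        for i, value in enumerate(phi(atom, weights)):
            pooled[i] += value
    return pooled


# Hypothetical invariants per atom:
# (atomic number, heavy-atom degree, attached hydrogens, formal charge)
ethanol = [(6, 1, 3, 0), (6, 2, 2, 0), (8, 1, 1, 0)]

# Untrained random weights (8 output dimensions, 4 input invariants), illustration only.
rng = random.Random(0)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(4)] for _ in range(8)]

embedding = encode_set(ethanol, weights)
shuffled = encode_set(list(reversed(ethanol)), weights)

# The two embeddings agree up to floating-point rounding.
same = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(embedding, shuffled))
print("order-independent:", same)
```

Because the pooling step is a sum, the embedding does not depend on the order in which atoms are listed. In a trained model the pooled vector would typically be passed to a further network for classification or regression.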

Practical information

  • General public
  • Free

Organizer

  • Andres M Bran, Rebecca Neeser, Yannick Calvino, Philippe Schwaller

Contact

  • Andres M Bran, Rebecca Neeser, Yannick Calvino, Philippe Schwaller

Tags

MLSeminar1
