Current challenges in the alignment techniques of foundation models
Event details
Date | 31.10.2024
Hour | 19:00 – 20:00
Speaker | Charbel-Raphaël Ségerie
Location |
Category | Conferences - Seminars
Event Language | English
Are current AI safety measures enough? Join us to examine their effectiveness against future threats!
Description:
- Introducing a framework to examine the progression of AI development, focusing on the growth in agency and generality among AI models. This trend implies that future iterations may exhibit novel types of malfunctions not seen in present-day models.
- Reviewing existing technical safety measures for AI: how they mitigate current failure modes, and their potential to address future issues.
- Presenting safety as a characteristic of the socio-technical system in which technical development takes place, covering defense-in-depth strategies, organizational safety culture, and the role of third-party auditors.
- Introducing BELLS: a practical assessment tool for evaluating the resilience of large language model supervision systems. This is our main technical project, which we presented at the ICML conference.
Practical information
- General public
- Registration required
Organizer
- Safe AI Lausanne
Contact
- Ines Altemir, Agatha Duzan