lunch&LEARN: Challenges for AI in Multimodal STEM Assessments: a Human-AI Comparison

Event details

Date 10.04.2025
Hour 12:15 - 13:00
Speaker Anna Sotnikova
Location Online
Category Conferences - Seminars
Event Language English

Generative AI systems have rapidly advanced, with multimodal input capabilities enabling reasoning beyond text-based tasks. In education, these advancements could influence assessment design and question answering, presenting both opportunities and challenges.

To investigate these effects, Anna Sotnikova and her team introduced a high-quality dataset of 201 university-level STEM questions, manually annotated with features such as image type, image role, problem complexity, and question format.

Their study analyzed how these features affect generative AI performance relative to student performance. They assessed the GPT model family using five prompting strategies and compared the results to an average of 546 student responses per question.

While models correctly answered on average 58.5% of questions using majority vote aggregation, human participants consistently outperformed AI on questions involving visual components.
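
As a rough illustration of majority vote aggregation, the sketch below assumes each question receives one answer per prompting strategy and that the most frequent answer is taken as the model's final answer. The function names, the tie-breaking rule, and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def accuracy(predictions, gold):
    """Fraction of questions whose aggregated answer matches the answer key."""
    correct = sum(majority_vote(p) == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical data: five answers per question, one per prompting strategy.
predictions = [
    ["A", "A", "B", "A", "C"],
    ["B", "C", "C", "C", "B"],
    ["D", "A", "A", "B", "D"],
]
gold = ["A", "C", "A"]

print(f"Majority-vote accuracy: {accuracy(predictions, gold):.1%}")
```

How ties are broken is a design choice the abstract does not specify; here the answer that appears first among the tied options wins.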

Interestingly, human performance remained stable across question features but varied by subject, whereas AI performance was sensitive to both subject matter and question features.

Closing the session, Anna will provide actionable insights for educators, demonstrating how question design can enhance academic integrity by leveraging features that challenge current AI systems without increasing the cognitive burden for students.

Practical information

  • General public
  • Registration required
  • This event is internal

Organizer

Tags

lunch&LEARN AI assessment