Events

Upcoming

📍 Location: Maria-von-Linden-Str. 1, 4th floor meeting room (A-401)
🕑 Time: Mondays at 18:00
💬 Open to Everyone – feel free to join any meetup that sounds interesting to you.


02.03.2026 [Talk] “From Machine Psychology to AI Safety: Studying Deception, Emergent Misalignment, and Self-Awareness in LLMs” by Thilo Hagendorff (Group Leader at University of Stuttgart)
09.03.2026 [Paper Discussion] “Who’s in Charge? Disempowerment Patterns in Real-World LLM Usage” by Sharma et al. (2026) https://arxiv.org/pdf/2601.19062
16.03.2026 [Talk] by Daniel Tan, Topic TBA 😉

Note on Background Knowledge: While we don’t require you to know a lot about AI Safety, our current events are targeted towards people who are already familiar with the basics of Machine Learning / AI Safety (e.g., having taken an AI Safety Fundamentals course by BlueDot or read about it). At the start of the next semester, we will have more beginner-friendly events as well.

Get Involved

Want to present a paper, lead a discussion, or simply learn more about AI safety? We’d love to hear from you.
👉 Let us know what topics/papers you want to discuss, or sign up to present your research.
👉 Join our WhatsApp group for updates on upcoming events.
👉 Check out our resources for an introduction to AI safety topics.
👉 New to AI Safety or Tübingen? Reach out if you’d like to schedule a one-on-one meeting with one of our organizers. We’ll invite you for a coffee and give you targeted advice! ☕️

Past Events

23.02.2026 [Social] Speed-Friending & What are you working on?
Feel free to prepare a slide about what you are currently working on or what your research interests are! This event is also well suited for people who are new to AI Safety, and you are welcome even if you haven’t attended previous meetups 🙂
16.02.2026 [Social] Casual Dinner @ Irish Pub Tübingen
09.02.2026 [Talk] by David Schmotz on SKILL-INJECT: Measuring Agent Vulnerability to Skill File Attacks
02.02.2026 [Talk] by Jeanne Salle on Measuring and Forecasting Autonomous Capabilities (if you want to read something beforehand, you can check out this blog post)
26.01.2026 [Paper Discussion] Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment
19.01.2026 [Paper Discussion] 🫘 Spilling the Beans – Teaching LLMs to Self-Report Their Hidden Objectives, and its predecessor paper 💬 Teaching Models to Verbalize Reward Hacking in Chain-of-Thought Reasoning
12.01.2026 [Paper Discussion] Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers
15.12.2025 [Talk] Intro to Scalable Oversight by Ameya Prabhu (Bethgelab)
01.12.2025 Quick introduction + discussion of Anthropic’s recent paper on “Natural Emergent Misalignment in Production RL”