Sitemap

Pages

Publications

Forecasting AI progress: A research agenda

A survey of expert opinions on forecasting AI progress.

Ross Gruetzemacher, Florian E. Dorner, Niko Bernaola-Alvarez, Charlie Giattino, David Manheim

Technological Forecasting and Social Change 170, 120909 (2021)

Do Personality Tests Generalize to Large Language Models?

Language models’ answers to personality tests markedly deviate from typical human responses.

Florian E. Dorner, Tom Sühr, Samira Samadi, Augustin Kelava (Equal contribution)

Socially Responsible Language Modelling Research Workshop at NeurIPS 2023

On Evaluating Methods vs. Evaluating Models

Should benchmarks evaluate LLMs, or the methods used to train them?

Olawale Elijah Salaudeen, Florian E. Dorner, Peter Hase

Evaluating the Evolving LLM Lifecycle Workshop at NeurIPS 2025 (Oral)

Tools