Scott Emmons

I am a research scientist at Google DeepMind focused on AI safety and alignment. I am wrapping up my PhD at UC Berkeley’s Center for Human-Compatible AI, advised by Stuart Russell. I previously cofounded a 501(c)(3) research nonprofit that incubates and accelerates beneficial AI research agendas.

I am interested in both the theory and practice of AI alignment. I have helped characterize how RLHF can lead to deception when the AI sees more than the human, develop multimodal attacks and benchmarks for open-ended agents, and use mechanistic interpretability to find evidence of learned look-ahead in a chess-playing neural network.

Curriculum Vitae

scott at scottemmons dot com


Open-Source Software