Peter Henderson
Where the technical findings about language-model behavior actually matter — in audits, regulation, and legal liability — and what the gap between "we measured this" and "this changes what's allowed" looks like.
Henderson trained as both a computer scientist and a lawyer, which gives his published work an unusual posture: each empirical result about LLM behavior is traced through to the regulatory or liability consequence it implies, or fails to imply. His 2018 paper "Deep Reinforcement Learning that Matters" was an early demonstration that reproducibility failures in ML are not just an academic embarrassment: when reported results do not replicate, the downstream claims that policy then has to act on cannot be trusted either. The POLARIS Lab at Princeton continues that line for the LLM era, with foundation-model risk assessments, legal-compliance audits, and methodologically careful empirical work that holds up when regulators rely on it.