Ashish Sabharwal
What classical formal-reasoning research can contribute to evaluating whether modern LLMs actually reason, and how to distinguish genuine reasoning from reasoning-shaped language that happens to land on the right answer.
Sabharwal came to LLM research from a background in formal reasoning and satisfiability, areas where "does the system reason correctly" has a precise meaning: correctness is measured against logical specifications and can be verified by operational tests. That background shows through in his more recent work on LLM reasoning evaluation, which tends to ask whether the structure of the reasoning matches the structure the problem requires, not only whether the final answer happens to be correct. ARC (the AI2 Reasoning Challenge) and similar benchmarks are a product of that posture: they are designed so that surface fluency cannot substitute for actual inference.