Torsten Hoefler — Whom to read in AI

Chain-of-Thought prompting (2022) showed that a language model gives better answers when it writes out its intermediate reasoning, and Tree of Thoughts (2023) extended that into a branching search. Graph of Thoughts, which Hoefler's ETH group posted as a preprint in 2023 and published in 2024, takes the next step: it lets the model treat reasoning as a directed graph in which intermediate states can be merged, scored, and backtracked from. The framework shows the systems-engineering instincts you'd expect from someone whose other day job is architecting ML on a national supercomputer: it tracks the computational cost of each reasoning expansion, not just its accuracy gain.
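
To make the graph idea concrete, here is a toy sketch of reasoning states as nodes that can be branched, merged, and scored, with a cost charged per step. It is an illustration under loud assumptions, not the Graph of Thoughts framework's API: the names Thought, expand, aggregate, score, and total_cost are hypothetical, sorted() stands in for a model call, and "one unit per step" is an invented cost model.

```python
from dataclasses import dataclass, field
from heapq import merge


@dataclass
class Thought:
    """One node in the reasoning graph: a partial solution plus its provenance."""
    state: list
    parents: list = field(default_factory=list)  # incoming edges
    cost: int = 0  # hypothetical units charged for the step that produced this node


def expand(parent, parts):
    """Branch: split the parent's state into sub-problems and solve each one.

    sorted() is a stand-in for a model call; each call is charged one unit.
    """
    size = -(-len(parent.state) // parts)  # ceiling division
    chunks = [parent.state[i:i + size] for i in range(0, len(parent.state), size)]
    return [Thought(sorted(chunk), [parent], cost=1) for chunk in chunks]


def aggregate(thoughts):
    """Merge: combine several thoughts into one node with several parents."""
    state = list(merge(*(t.state for t in thoughts)))
    return Thought(state, list(thoughts), cost=1)


def score(thought):
    """Score: count adjacent out-of-order pairs (0 means fully sorted)."""
    s = thought.state
    return sum(1 for a, b in zip(s, s[1:]) if a > b)


def total_cost(thought):
    """Sum step costs over all ancestors, visiting each shared node once."""
    seen, stack, total = set(), [thought], 0
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        total += node.cost
        stack.extend(node.parents)
    return total


root = Thought([5, 2, 9, 1, 7, 3])
branches = expand(root, parts=3)   # three independent sub-solutions
merged = aggregate(branches)       # one node with three parents
print(merged.state, score(merged), total_cost(merged))
```

The merge step is what a tree cannot express: the final node has three parents. Counting cost over the graph rather than per path is one simple way to treat reasoning compute as a budget.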

Worth following when: you want to think about LLM reasoning as a structured search problem where compute spent on reasoning is a budget you have to allocate, not a free resource.
Topics: structured reasoning over LLM intermediate states (Chain → Tree → Graph of Thoughts); compute-aware design of reasoning prompts; ML systems engineering at HPC scale.
Key works: Graph of Thoughts (2024, senior author); broader work on ML systems and scaling at ETH Zürich and CSCS; HPC-side contributions to large-scale model training infrastructure.