Before Dodge's work, most LLM-efficiency publications compared results obtained under incompatible conditions: different hyperparameter-search budgets, different dataset variants, and undocumented normalization steps, all of which made head-to-head numbers misleading. The "Show Your Work" reporting standard, which he formulated and pushed as a reviewer requirement at NeurIPS and ACL, turned reproducibility from a convention into an enforceable requirement. The same instinct drives his Green AI work: report not just accuracy but compute cost, because "percent correct" is not comparable across labs without it.
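To make the budget problem concrete, here is a minimal sketch in the spirit of "Show Your Work": it estimates the best validation score one would expect after a given number of random hyperparameter trials. The function name `expected_max_performance` and the sample scores are hypothetical and not taken from the paper's released code. It also assumes trials are drawn independently, with replacement, from the empirical distribution of observed scores.

```python
def expected_max_performance(scores, budget):
    """Expected best validation score among `budget` trials.

    Assumes each trial is an independent draw (with replacement) from the
    empirical distribution of `scores`. Illustrative helper only.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if budget < 1:
        raise ValueError("budget must be at least 1")

    ordered = sorted(scores)
    n = len(ordered)
    total = 0.0
    for i, value in enumerate(ordered, start=1):
        # Probability that the maximum of `budget` draws is the i-th
        # smallest observed score: F(i)^budget - F(i-1)^budget.
        prob = (i / n) ** budget - ((i - 1) / n) ** budget
        total += value * prob
    return total


# Hypothetical validation accuracies from four random-search trials.
trial_scores = [0.6, 0.7, 0.8, 0.9]

for budget in (1, 2, 8):
    estimate = expected_max_performance(trial_scores, budget)
    print(f"budget={budget}: expected best score = {estimate:.4f}")
```

Reporting this curve rather than a single best number lets a reader compare two methods at the same search budget, which is the comparability that the paragraph above says was missing.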

Worth following when
you want to know what would actually have to be true of an LLM publication for the result it reports to be reproducible by someone else.
Topics
reporting standards for ML research (Show Your Work); Green AI methodology and energy-cost accounting; what open-model evaluation infrastructure looks like (OLMo, Dolma, DataDecide).
Key works
"Show Your Work: Improved Reporting of Experimental Results" (2019); "Green AI" position paper (2019, co-author); OLMo and Dolma open-model evaluation tooling (2024).