Jian-Guang Lou — Whom to read in AI

Lou's research line at Microsoft Research Asia (MSR Asia) has worked on the rare LLM-evaluation problem where correctness is unambiguous: code generation. Either the generated program compiles, runs, and produces the right output, or it doesn't, so the evaluation methodology can sidestep most of the LLM-as-judge problems that plague the evaluation of open-ended generation. His group's contributions to in-context-learning evaluation and program-synthesis benchmarks build on this advantage. The result is a literature that knows what "correct" means and uses that as leverage to interrogate other parts of model behavior.

Worth following when: you want to study LLM evaluation methodology in a setting where the ground truth is unambiguous, and to see what that clarity makes possible that open-ended evaluation cannot offer.
Topics: program-synthesis evaluation methodology; in-context-learning rigor for code generation; the contrast between unambiguous-ground-truth and open-ended evaluation.
Key works: body of work on program synthesis and code generation evaluation at Microsoft Research Asia (2018 onward); in-context-learning evaluation publications; ACM Distinguished Member contributions to applied LLM research.