Diyi Yang — Whom to read in AI

Yang co-authored the 2023 study "Is ChatGPT a General-Purpose Natural Language Processing Task Solver?", which gave one of the first sober, task-by-task answers to a question whose answer many had simply assumed: across 20 established NLP datasets spanning seven task categories, ChatGPT was strong on a few, mediocre on most, and bad on some, a pattern that complicated the narrative of broad generality. Her SALT Lab continues the harder line of inquiry: language models in roles where the right answer depends on social context (counselor, conflict mediator, persuasion target) and where the failure modes look different from what standard benchmarks reveal.

Worth following when: you want to know how LLMs behave outside the kinds of tasks they were tuned for, especially tasks where the human stakes are higher than benchmark scores.
Topics: task-level evaluation of LLMs on established NLP benchmarks; computational social science with language models; social context as a dimension of evaluation.
Key works: "Is ChatGPT a General-Purpose Natural Language Processing Task Solver?" (2023); SALT Lab publications on socially grounded NLP (2022, ongoing).