Maarten Sap — Whom to read in AI

Sap's earlier work on COMET (Commonsense Transformers, 2019) was one of the field's first attempts at making everyday social inferences machine-readable: that someone who borrowed money will probably want to repay it, that an apology implies prior wrongdoing. The same lens runs through his LLM-era work: which kinds of social and commonsense knowledge do LLMs handle reliably versus merely mimic on the surface, and where do the failures cluster? RealToxicityPrompts and adjacent benchmarks made that question quantitative for safety-relevant cases: models trained on the open web inherit that web's toxic patterns in measurable ways, even when alignment training tries to paper over the inheritance.

Worth following when: you want to evaluate LLM behavior on social and commonsense reasoning, not just factual QA, with benchmarks that distinguish surface compliance from underlying capability.
Topics: commonsense knowledge representation in neural models (COMET); LLM behavior on social and moral reasoning; toxicity benchmarks (RealToxicityPrompts) and what they actually measure.
Key works: COMET commonsense transformers (2019, co-author); RealToxicityPrompts (2020, co-author); ongoing CMU and AI2 publications on social and safety NLP.