Sap's earlier work on COMET (Commonsense Transformers, 2019) was one of the field's first attempts at building machine-readable representations of everyday social inferences: that someone who borrowed money will probably want to repay it, or that an apology implies prior wrongdoing. The same lens runs through his LLM-era work, which asks what kinds of social and commonsense knowledge LLMs genuinely command versus superficially mimic, and where the failures cluster. RealToxicityPrompts and adjacent benchmarks made that question quantitative for safety-relevant cases: models trained on the open web inherit that web's toxic patterns in measurable ways, even when alignment training tries to paper over the inheritance.

Worth following when
you want to evaluate LLM behavior on social and commonsense reasoning, not just factual QA, with benchmarks that distinguish surface compliance from underlying capability.
Topics
commonsense knowledge representation in neural models (COMET); LLM behavior on social and moral reasoning; toxicity benchmarks (RealToxicityPrompts) and what they actually measure.
Key works
COMET: Commonsense Transformers (2019, co-author); RealToxicityPrompts (2020, co-author); ongoing CMU and AI2 publications on social and safety NLP.