David Jurgens
Can language models that handle factual questions cleanly also handle the social knowledge that determines what people actually mean by what they say?
The 2023 paper "Do LLMs Understand Social Knowledge?", with Jurgens as senior co-author, ran modern LLMs through a battery of tests built from sociolinguistics research: implicature, indirect speech acts, conventionalized politeness markers, and the social inferences a competent human listener makes without thinking about them. The results were mixed enough to matter: LLMs that scored high on factual QA benchmarks made systematic errors on social inferences that any reasonably socialized human gets right, especially when the social context was non-American or non-mainstream. For ai100, this connects to a question we have to think about: when an engine recommends one brand instead of another, what social signal is it reading from the query, and would a human reading the same query interpret that signal the same way?