Christopher Potts
What linguistic structure language models actually represent — and what they only seem to.
Potts comes at language models from a linguistics-and-philosophy background, which gives him an unusual angle on what current evaluation methodology lets us conclude. The Stanford Sentiment Treebank, which his group co-built more than a decade ago, is still cited in nearly every paper about sentiment classification, partly for the dataset itself and partly because it made compositional structure, rather than word polarity, the unit of evaluation. His more recent work on dynamic adversarial benchmarks like DynaSent argues a quieter point: a static test set goes stale the moment it appears, because subsequent models train on its echo in the corpus.