The standard pipeline for studying bias in language models has been English-first by default: train on English, evaluate on English, draw conclusions, then port the findings to other languages as an afterthought. Tsvetkov's group at the University of Washington (UW) runs the pipeline in reverse: start with the languages and communities that English-centric methods don't reach, ask which bias and harm categories actually surface in those settings, and let the multilingual data shape the analytical categories from the start. For ai100, which evaluates engines in five language regions, this work is the methodological backing for treating each locale's bias profile as its own object of study.

Worth following when
you need to evaluate model bias or harm in a non-English context and want the literature that takes multilingual ethics as a primary concern.
Topics
multilingual bias and harm evaluation; low-resource language NLP; ethics in NLP research as it intersects with language diversity.
Key works
"From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models" (ACL 2023, co-author); long body of work on low-resource and multilingual NLP; UW publications on bias evaluation across languages.