Most public LLM safety evaluation uses category schemes developed by Western labs: bias, toxicity, hate speech, and jailbreak resistance, all benchmarked against English-language datasets and US/EU regulatory expectations. Huang's CoAI group at Tsinghua, together with the more recent AISafetyLab framework, runs an explicit parallel effort for Chinese-language LLMs: what counts as "harmful" or "biased" output looks structurally different when the regulatory environment, the cultural taboos, and the expectations of the users a deployed model reaches all differ. For ai100, which evaluates engines that serve five language regions, this kind of locale-aware safety evaluation is the only honest approach: pretending one safety taxonomy fits all five locales would be its own form of bias.

Worth following when
you need to evaluate LLM safety in non-Anglophone deployment contexts, especially Chinese-language environments where the regulatory and cultural categories diverge from the Western defaults.
Topics
locale-aware LLM safety evaluation; dialogue-system safety and abuse detection; the Chinese-language open-LLM ecosystem (ChatGLM lineage).
Key works
ChatGLM open language model series (2022 onward, key contributor); AISafetyLab safety-evaluation framework (2023, ongoing); CoAI group publications on conversational AI safety.