Shi co-authored the 2023 "Siren's Song" hallucination survey with Yue Zhang, but came at the problem from the deployment side: the NLP center of a major Chinese tech platform whose products reach roughly a billion users. The survey's taxonomy of hallucination types (input-conflicting, context-conflicting, fact-conflicting) reads differently when the question is operational: which categories actually show up in production traffic, which can be screened by a downstream filter, and which require changing the underlying training run. Worth reading alongside the academic side of the same survey because the deployment perspective surfaces failure modes that pure benchmark work misses.

Worth following when
you want to know which hallucination categories break things in production versus which only matter in academic benchmarks.
Topics
hallucination categories in deployed LLM products; industrial NLP at scale; the gap between benchmark behavior and production behavior.
Key works
"Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models" (2023, co-author); ongoing Tencent AI Lab NLP Center publications.