Xing Xie
Connecting the recommender-systems tradition of measuring user-facing AI behavior with the new evaluation challenges modern LLMs pose.
Xie's published work spans several adjacent fields (recommender systems, spatial data mining, responsible AI infrastructure) that all had to answer the same operational question: what does this AI system actually do for the user, and how do you measure that against the user's real interests rather than against a benchmark you defined yourself? His co-authorship of the 2023 "A Survey on Evaluation of Large Language Models" reads as that long career meeting the LLM moment: many of the methodological frames for measuring fairness, drift, or unintended user-facing effects in recommender systems transfer directly to language models, often without modification. For ai100, which evaluates how language models shape what users hear about brands, this is the closest precedent literature: recommender-system evaluation methodology applied to a new substrate.