Zhaochun Ren
Whether a language model can be trusted with the job currently done by an information retrieval system, and which parts of that job it actually does well.
"Is ChatGPT Good at Search?" (2023, with Ren as senior author) is one of the cleanest empirical studies of whether large language models can replace traditional ranking components in a retrieval pipeline. The answer turns out to be split: for re-ranking a small candidate set, yes, quite well; for first-stage retrieval over a large corpus, no, not really. The gap between those two tasks is one that most LLM-centric papers gloss over. Ren's broader work in conversational IR pushes the question further: when retrieval happens inside a multi-turn conversation, the system is doing several different things at once, and they should not all be benchmarked the same way.
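To make the distinction between the two tasks concrete, here is a minimal sketch of a retrieve-then-rerank pipeline. It is an illustration under stated assumptions, not the method from the paper: `first_stage`, `rerank`, `toy_scorer`, and the three-document corpus are all hypothetical names and data, and `toy_scorer` is a simple word-overlap stand-in for the place where an LLM relevance judgment would plug in.

```python
from collections import Counter
from typing import Callable, List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (deliberately naive)."""
    return text.lower().split()


def first_stage(query: str, corpus: Sequence[str], k: int) -> List[int]:
    """Cheap retrieval over the WHOLE corpus: sum of query-term frequencies.

    This stage must touch every document, so it has to be inexpensive.
    """
    q_terms = set(tokenize(query))
    scored = []
    for doc_id, doc in enumerate(corpus):
        tf = Counter(tokenize(doc))
        scored.append((sum(tf[t] for t in q_terms), doc_id))
    # Highest score first; ties broken by lower document id.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in scored[:k]]


def toy_scorer(query: str, passage: str) -> float:
    """Hypothetical stand-in for an LLM relevance score (Jaccard overlap)."""
    q, p = set(tokenize(query)), set(tokenize(passage))
    return len(q & p) / len(q | p) if q | p else 0.0


def rerank(
    query: str,
    candidates: Sequence[int],
    corpus: Sequence[str],
    scorer: Callable[[str, str], float],
) -> List[int]:
    """Expensive scoring applied ONLY to the small candidate set."""
    return sorted(candidates, key=lambda i: -scorer(query, corpus[i]))


# Assumed toy data, for illustration only.
CORPUS = [
    "search search search search engines",
    "llm ranking for search",
    "cooking pasta at home",
]
QUERY = "llm search ranking"

candidates = first_stage(QUERY, CORPUS, k=2)
final = rerank(QUERY, candidates, CORPUS, toy_scorer)
print("first stage:", candidates)
print("after re-ranking:", final)
```

The design point is the cost asymmetry: the first stage scores every document, while the re-ranker only ever sees `k` candidates, which is why an expensive model is plausible in the second slot and much harder to justify in the first.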