Furu Wei
Using a language model's own parametric knowledge to make retrieval find the document the user was actually looking for.
The standard direction of information flow in retrieval-augmented systems is from retrieval to generation: fetch documents first, then write the answer. Query2doc (2023, with Wei as senior author) inverts that order at the front end: given the user's query, the language model first generates a plausible-looking answer document, and that synthetic document is then used as additional signal in the retrieval step. The trick works because such generated documents, while unreliable on specific facts, are usually reliable about what kind of document the user implicitly expects to find, and that is exactly the signal vector retrievers need.
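The generate-then-retrieve order can be illustrated with a toy sketch. It is not the Query2doc implementation. It assumes the generated text is simply appended to the query, and it uses a bag-of-words cosine score in place of a real retriever. The names `fake_llm`, `expand_query`, `retrieve`, and the `query_repeats` parameter are hypothetical, and `fake_llm` returns canned text where a real system would call a language model.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def fake_llm(query):
    """Hypothetical stand-in for a language model call (canned output)."""
    canned = {
        "why is the sky blue": (
            "Sunlight is scattered by air molecules in the atmosphere. "
            "Short wavelengths scatter more strongly, a process called "
            "Rayleigh scattering."
        )
    }
    return canned.get(query, "")

def expand_query(query, generate, query_repeats=1):
    """Append a generated pseudo-document to the query.

    query_repeats is an illustrative knob: repeating the query keeps the
    user's own words from being swamped by the longer generated text.
    """
    pseudo_doc = generate(query)
    return " ".join([query] * query_repeats + [pseudo_doc])

def retrieve(query_text, corpus):
    """Return the index of the corpus document most similar to the query."""
    q = Counter(tokenize(query_text))
    return max(
        range(len(corpus)),
        key=lambda i: cosine(q, Counter(tokenize(corpus[i]))),
    )

QUERY = "why is the sky blue"
CORPUS = [
    "Blue Sky Cafe: why our menu is seasonal.",
    "Rayleigh scattering: air molecules scatter short wavelengths of "
    "sunlight more strongly than long wavelengths.",
]

plain_hit = retrieve(QUERY, CORPUS)
expanded_hit = retrieve(expand_query(QUERY, fake_llm), CORPUS)
print(plain_hit, expanded_hit)
```

In this toy corpus the raw query shares words only with the cafe listing, while the expanded query shares the vocabulary of the explanatory passage. The generated text never has to be factually right; it only has to sound like the kind of document that answers the question.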