Most academic IR research runs on web-scale or news-scale collections: corpora where the retrieval problem is well defined and the documents are reasonably clean. Huang's IR&KM Lab at York has spent two decades on a more representative case: retrieval over the mixed, heterogeneous, sometimes structured data that enterprises actually hold, where the failure modes look different from anything a benchmark captures. His recent LLM-evaluation work extends the same lens, asking how retrieval-augmented language models behave when the corpus they retrieve from is the kind of repository a working organization runs on, not a curated research collection.

Worth following when
you want IR research that takes seriously the gap between benchmark corpora and the kind of data RAG systems actually encounter once deployed.
Topics
information retrieval over heterogeneous and enterprise-scale data; the corpus-side conditions of RAG behavior; systematic evaluation of LLM-IR hybrids on non-web data.
Key works
body of work on IR for big-data and enterprise corpora (2000s onward, York IR&KM Lab); ongoing systematic LLM-IR evaluation publications.