The 2023 "LM vs LM" paper from Globerson's group set up a cross-examination protocol: one model (the examinee) produces a claim, a second model (the examiner) asks follow-up questions designed to probe for inconsistencies, and the original claim is flagged if the follow-ups expose a contradiction. The setup gives the field something it had been missing: a factuality-detection method that requires no labeled ground truth, no access to model internals, and no trust in either model on its own. Globerson's longer arc is in machine-learning theory, which shapes how the work reads: LLM evaluation as a problem about adversarial sampling and information geometry, rather than as a problem about prompt engineering.
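
The loop described above can be sketched in a few lines of Python. This is a toy illustration of the general shape, not the paper's implementation: the function `cross_examine`, its parameters, and the three `toy_*` stand-ins are all hypothetical names, and the "models" here are hard-coded lookups rather than calls to any real LLM API. In practice each callable would wrap a model, and the contradiction check would itself be a model judgment rather than a string test.

```python
from typing import Callable, List, Optional, Tuple

Transcript = List[Tuple[str, str]]  # (question, answer) pairs


def cross_examine(
    claim: str,
    examinee: Callable[[str], str],
    next_question: Callable[[str, Transcript], Optional[str]],
    contradicts: Callable[[str, Transcript], bool],
    max_turns: int = 3,
) -> Tuple[bool, Transcript]:
    """Return (flagged, transcript); flagged is True if a contradiction surfaced."""
    transcript: Transcript = []
    for _ in range(max_turns):
        question = next_question(claim, transcript)
        if question is None:  # the examiner has nothing left to ask
            break
        transcript.append((question, examinee(question)))
        if contradicts(claim, transcript):
            return True, transcript
    return False, transcript


# Hypothetical stand-ins so the sketch runs without any model API.
QUESTIONS = ["Which city is the Eiffel Tower in?", "Which country is that city in?"]
ANSWERS = {QUESTIONS[0]: "Paris", QUESTIONS[1]: "France"}


def toy_examinee(question: str) -> str:
    return ANSWERS.get(question, "I don't know")


def toy_next_question(claim: str, transcript: Transcript) -> Optional[str]:
    # Ask the scripted questions in order, then stop.
    return QUESTIONS[len(transcript)] if len(transcript) < len(QUESTIONS) else None


def toy_contradicts(claim: str, transcript: Transcript) -> bool:
    # Crude rule: the first answer names a city; flag if the claim omits it.
    return transcript[0][1] not in claim


flagged_bad, log_bad = cross_examine(
    "The Eiffel Tower is in Berlin.", toy_examinee, toy_next_question, toy_contradicts
)
flagged_ok, log_ok = cross_examine(
    "The Eiffel Tower is in Paris.", toy_examinee, toy_next_question, toy_contradicts
)
print(flagged_bad, flagged_ok)
```

Passing the examinee, the question generator, and the contradiction check as separate callables keeps the point of the protocol visible: nothing in the loop consults labeled answers or model internals, only the text of the exchange.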

Worth following when
you want factuality detection methodology grounded in theoretical first principles rather than empirical recipe-hunting.
Topics
adversarial LM-vs-LM evaluation protocols; factuality detection without labeled data; learning theory as a lens on LLM behavior.
Key works
"LM vs LM: Detecting Factual Errors via Cross Examination" (2023); broader ML-theory work informing the evaluation methodology.