Nathan Lambert
What happens to a language model between "trained on the internet" and "answering your question the way it does" — and how to study that step in public.
Most of the post-training stack (RLHF, preference learning, instruction tuning) is developed inside closed labs and rarely described publicly. Lambert maintains an open counterweight: Tülu is the open post-training pipeline with published recipes; the RLHF Book is the only systematic treatment of the subject available without an NDA; and his Interconnects newsletter is read across the industry as the primary public source on what actually goes on inside model labs during post-training. For ai100 this matters because post-training determines what the model says: pre-training fixes the corpus of knowledge, and post-training decides which parts of it surface in any given answer.