Goldberg wrote one of the standard NLP textbooks for the neural era (2017) and has spent the years since pushing on a question that gets harder as models get more capable: is the explanation an LLM gives for its answer the actual cause of that answer, or a plausible-sounding cover story produced after the fact? His work on faithfulness in generated explanations is a steady reminder that "the model said why it did X" and "we know why the model did X" are different claims, and much current evaluation methodology conflates them.

Worth following when
you want someone who treats LLM-generated explanations with the same skepticism applied to the answers themselves.
Topics
faithfulness of generated explanations and chains of thought; the limits of post-hoc model interpretability; NLP as a field — its history and its current confusions.
Key works
Neural Network Methods for Natural Language Processing (2017); ongoing publications on faithfulness in NLG; public NLP commentary as a longer body of work.