Jure Leskovec
What graph structure adds to the kinds of reasoning and retrieval problems language models currently handle without it — and what gets missed when relationships in data are flattened into text.
node2vec (2016), which Leskovec developed with Aditya Grover, was among the early methods that made graph-structured data usable as input to standard ML pipelines: embed nodes as vectors that preserve neighborhood information, then use those embeddings as features anywhere a vector goes. The Stanford Network Analysis Project (SNAP), with its datasets and tools, became a default substrate for graph-ML research. For ai100, the question of "how does a model know what brand to mention" sits in graph-structured-reasoning territory, because the answer depends on which entities are connected to which contexts in the training data. The graph perspective is therefore a complementary angle on relationships that current LLM evaluation tends to measure only at the surface.
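To make the "embed nodes so that neighborhoods are preserved" idea concrete, here is a minimal sketch of the biased second-order random walk that node2vec uses to sample node neighborhoods. It covers only the walk-generation step; in the full method the resulting walks are fed to a skip-gram model (as in word2vec) to learn the vectors, which is not shown here. The function name `node2vec_walk`, the toy graph `adj`, and the chosen `p` and `q` values are illustrative assumptions, and the graph is assumed to be undirected and unweighted.

```python
import random

def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Sample one biased random walk in the style of node2vec.

    adj:    dict mapping node -> set of neighbors (undirected, unweighted).
    p:      return parameter; larger p makes stepping back less likely.
    q:      in-out parameter; larger q keeps the walk near the previous node.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])  # sorted for reproducibility
        if not neighbors:
            break  # isolated node: nothing to walk to
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay one hop from the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

# Hypothetical toy graph: a triangle (a, b, c) with a tail (d, e).
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

# One walk per start node; q > 1 biases walks toward local neighborhoods.
walks = [
    node2vec_walk(adj, node, length=6, p=1.0, q=2.0, rng=random.Random(0))
    for node in adj
]
```

Each walk is a sequence of node IDs that can be treated like a sentence of tokens, which is what lets an off-the-shelf skip-gram trainer turn graph neighborhoods into feature vectors.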