11:30 am–12:30 pm
Jones 303, 5747 S. Ellis Ave.
Yiqiao Zhong
Department of Statistics
University of Wisconsin-Madison
Title: Compositionality in Large Language Models: Emergence, Generalization, and Geometry
Abstract: Large language models (LLMs) have demonstrated remarkable reasoning abilities through novel techniques such as in-context learning and chain-of-thought (CoT) reasoning. Empirically, key reasoning skills often emerge only at larger scales or after prolonged training. Yet the underlying mechanism of LLM reasoning (how compositional representations are formed and organized) remains poorly understood.
In this talk, I present recent progress toward uncovering emergent compositional structure through controlled synthetic experiments on small transformers and targeted intervention studies on modern LLMs. First, I show that learning a key compositional structure is essential for out-of-distribution generalization, and that this process undergoes sharp phase transitions during training. At a critical stage, an intermediate low-dimensional “bridge subspace” emerges, serving as a shared representation connecting multiple layers. Second, using arithmetic composition as a minimal testbed for CoT reasoning, I demonstrate that autoregressive training on reasoning traces exhibits distinct reasoning phases. In particular, causally faithful reasoning emerges only when training noise lies below a critical threshold.
Together, these findings suggest that core statistical principles such as low-dimensional subspaces and causality may provide key foundations for advancing the interpretability and transparency of LLMs.
Bio: Yiqiao Zhong is an assistant professor in the Department of Statistics at the University of Wisconsin-Madison. He obtained his PhD from Princeton University, advised by Prof. Jianqing Fan, and was a postdoc at Stanford University, advised by Prof. Andrea Montanari and Prof. David Donoho. His research interests center on the scientific foundations of large language models, including interpretability, visualization, and statistical theory.