Statistics Colloquium: Yiqiao Zhong

11:30 am–12:30 pm, Jones 303

5747 S. Ellis Ave.

Yiqiao Zhong
Department of Statistics
University of Wisconsin-Madison

Title: Compositionality in Large Language Models: Emergence, Generalization, and Geometry

Abstract: Large language models (LLMs) have demonstrated remarkable reasoning abilities through novel techniques such as in-context learning and chain-of-thought (CoT) reasoning. Empirically, key reasoning skills often emerge only at larger scales or after prolonged training. Yet the underlying mechanism of LLM reasoning, namely how compositional representations are formed and organized, remains poorly understood.

In this talk, I present recent progress toward uncovering emergent compositional structure through controlled synthetic experiments on small transformers and targeted intervention studies on modern LLMs. First, I show that learning a key compositional structure is essential for out-of-distribution generalization, and that this process undergoes sharp phase transitions during training. At a critical stage, an intermediate low-dimensional “bridge subspace” emerges, serving as a shared representation connecting multiple layers. Second, using arithmetic composition as a minimal testbed for CoT reasoning, I demonstrate that autoregressive training on reasoning traces exhibits distinct reasoning phases. In particular, causally faithful reasoning emerges only when training noise lies below a critical threshold.

Together, these findings suggest that core statistical principles such as low-dimensional subspaces and causality may provide key foundations for advancing the interpretability and transparency of LLMs.

Bio: Yiqiao Zhong is an assistant professor in the Department of Statistics at the University of Wisconsin-Madison. He obtained his PhD from Princeton University, advised by Prof. Jianqing Fan, and was a postdoc at Stanford University, advised by Prof. Andrea Montanari and Prof. David Donoho. His research focuses on the scientific foundations of large language models, including interpretability, visualization, and statistical theory.

Apr 20