2026

Statistics Colloquium: Yiqiao Zhong
11:30 am–12:30 pm Jones 303
Yiqiao Zhong
Department of Statistics
University of Wisconsin-Madison
Title: Compositionality in Large Language Models: Emergence, Generalization, and Geometry
Abstract: Large language models (LLMs) have demonstrated remarkable reasoning abilities through novel techniques such as in-context learning and chain-of-thought (CoT) reasoning. Empirically, key reasoning skills often emerge only at larger scales or after prolonged training. Yet the underlying mechanism of LLM reasoning, namely how compositional representations are formed and organized, remains poorly understood.
In this talk, I present recent progress toward uncovering emergent compositional structure through controlled synthetic experiments on small transformers and targeted intervention studies on modern LLMs. First, I show that learning a key compositional structure is essential for out-of-distribution generalization, and that this process undergoes sharp phase transitions during training. At a critical stage, an intermediate low-dimensional “bridge subspace” emerges, serving as a shared representation connecting multiple layers. Second, using arithmetic composition as a minimal testbed for CoT reasoning, I demonstrate that autoregressive training on reasoning traces exhibits distinct reasoning phases. In particular, causally faithful reasoning emerges only when training noise lies below a critical threshold.
Together, these findings suggest that core statistical principles such as low-dimensional subspaces and causality may provide key foundations for advancing the interpretability and transparency of LLMs.
Student Seminars: Jinwen Yang
4:30–6:00 pm Jones 111
Friday, April 17, 2026, at 4:30 PM, in Jones 111, 5747 S. Ellis Ave.
Dissertation Defense Presentation
Jinwen Yang, Department of Statistics, The University of Chicago
“Scaling Up and Speeding Up Classical Optimization on Modern Computing Architectures”
Student Seminars: Jeonghwan Lee
2:30–3:00 pm Jones 111
Friday, April 17, 2026, at 2:30 PM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Jeonghwan Lee, Department of Statistics, The University of Chicago
“TBA”
Student Seminars: Xiaohan Zhu
11:00 am–12:30 pm DSI Room 103
Thursday, April 16, 2026, at 11:00 AM, in DSI Room 103, 5460 S University Ave.
Dissertation Defense Presentation
Xiaohan Zhu, Department of Statistics, The University of Chicago
“Overfitting and Generalizing with MDL and (PAC) Bayesian Learning in Supervised Classification”
Student Seminar: Yuguan Wang
10:00–11:00 am Jones 111
Wednesday, April 15, 2026, at 10:00 AM, in Jones 111, 5747 S. Ellis Ave.
Dissertation Defense Presentation
Yuguan Wang, Department of Statistics, The University of Chicago
“Fast Algorithms via Compressed Moment Representations”

Statistics Colloquium: Ping-Shou Zhong
11:30 am–12:30 pm Jones 303
Ping-Shou Zhong
Professor
University of Illinois Chicago
Title: On the Adaptivity and Scalability of Kernel Methods for Testing and Prediction
Abstract: Kernel methods are widely used for prediction and hypothesis testing. In this talk, I will introduce two approaches that enhance the scalability and adaptivity of kernel methods. In the first part, we introduce a new family of adaptive, distribution-free tests of independence based on binary expansion coefficients. By characterizing independence through cross-covariances of multiscale interaction terms, the proposed method is applicable in a general setting and does not require the reproducing kernel Hilbert space assumption. The resulting tests admit an explicit kernel representation, enabling efficient computation while reducing sensitivity to kernel choice. In the second part, we propose an informative sub-data selection method for large-scale kernel learning. This method identifies a representative subset of observations, enabling model training on a substantially reduced yet informative sample. The approach provides a principled form of data reduction and integrates naturally with existing kernel approximation and sketching techniques.
Student Seminar: Syris Shao
10:30–11:00 am Jones 111
Monday, April 13, 2026, at 10:30 AM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Syris Shao, Department of Statistics, The University of Chicago
“Comparing Univariate and Multivariate Models to Forecast Portfolio Value-at-Risk During COVID-19”
Student Seminar: Linzhe Teng
1:30–2:00 pm Jones 111
Friday, April 10, 2026, at 1:30 PM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Linzhe Teng, Department of Statistics, The University of Chicago
“GLIMES-Based Differential Expression with Comparative Low-Dimensional Representations in Paired scRNA-seq of BBIBP-CorV Vaccination”
DSI Distinguished Speaker Series: Lillian Lee
12:00–1:30 pm DSI 105
Lillian Lee
Charles Roy Davis Professor of Computer Science
Cornell University
Title: Taking a turn for the better? Pivoting and pivotal moments in consequential conversations
Abstract: So much of human interaction occurs as conversations, and it is both fascinating and imperative to analyze them. Recently, my co-authors and I have turned to texting-based conversations between mental-health therapists or crisis counselors and their clients, seeking to identify “key” moments in these exchanges:
(1) A “pivoting” moment corresponds to a *redirection* of the conversation introduced by one party that is accepted/followed by the other. We develop a probabilistic measure of how much an utterance immediately redirects the flow of the conversation, accounting for both the intention and the actual realization of such a change.
(2) In a *pivotal* moment, the conversation’s outcome hangs in the balance: how one responds can put the conversation on substantially diverging trajectories leading to significantly different results. We formalize this intuition by estimating the variance in expectation of outcome depending on what might be said next.
We find significant correlates of our measures in real human conversations on widely used platforms. For example, the patients in our longer-term mental-health-therapy data who redirected less in their first few sessions were significantly more likely to eventually express dissatisfaction with their therapist and terminate the relationship; and the staff responses in our crisis-counseling data had greater estimated impact on disengagement rates during pivotal moments than in non-pivotal ones.
Joint work with Vivian Nguyen, Cristian Danescu-Niculescu-Mizil, Thomas D. Hull, and Sang Min (Dave) Jung.
Student Seminar: Claire Tseng
10:00–11:00 am Jones 111
Wednesday, April 8, 2026, at 10:00 AM, in Jones 111, 5747 S. Ellis Avenue
Dissertation Proposal Presentation
Claire Tseng, Department of Statistics, The University of Chicago
“Bias and Implied Beliefs in Large Language Models for Economic Expectations”