Past Events

2026

Statistics Colloquium: Yiqiao Zhong

11:30 am–12:30 pm Jones 303

Yiqiao Zhong
Department of Statistics
University of Wisconsin-Madison

Title: Compositionality in Large Language Models: Emergence, Generalization, and Geometry

Abstract: Large language models (LLMs) have demonstrated remarkable reasoning abilities through novel techniques such as in-context learning and chain-of-thought (CoT) reasoning. Empirically, key reasoning skills often emerge only at larger scales or after prolonged training. Yet the underlying mechanism of LLM reasoning, namely how compositional representations are formed and organized, remains poorly understood.

In this talk, I present recent progress toward uncovering emergent compositional structure through controlled synthetic experiments on small transformers and targeted intervention studies on modern LLMs. First, I show that learning a key compositional structure is essential for out-of-distribution generalization, and that this process undergoes sharp phase transitions during training. At a critical stage, an intermediate low-dimensional “bridge subspace” emerges, serving as a shared representation connecting multiple layers. Second, using arithmetic composition as a minimal testbed for CoT reasoning, I demonstrate that autoregressive training on reasoning traces exhibits distinct reasoning phases. In particular, causally faithful reasoning emerges only when training noise lies below a critical threshold.

Together, these findings suggest that core statistical principles such as low-dimensional subspaces and causality may provide key foundations for advancing the interpretability and transparency of LLMs.

Apr 20

Student Seminars: Jinwen Yang

4:30–6:00 pm Jones 111

Friday, April 17, 2026, at 4:30 PM, in Jones 111, 5747 S. Ellis Ave.
Dissertation Defense Presentation
Jinwen Yang, Department of Statistics, The University of Chicago
“Scaling Up and Speeding Up Classical Optimization on Modern Computing Architectures”

Apr 17

Student Seminars: Jeonghwan Lee

2:30–3:00 pm Jones 111

Friday, April 17, 2026, at 2:30 PM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Jeonghwan Lee, Department of Statistics, The University of Chicago
“TBA”

Apr 17

Student Seminars: Xiaohan Zhu

11:00 am–12:30 pm DSI Room 103

Thursday, April 16, 2026, at 11:00 AM, in DSI Room 103, 5460 S University Ave.
Dissertation Defense Presentation
Xiaohan Zhu, Department of Statistics, The University of Chicago
“Overfitting and Generalizing with MDL and (PAC) Bayesian Learning in Supervised Classification”

Apr 16

Student Seminar: Yuguan Wang

10:00–11:00 am Jones 111

Wednesday, April 15, 2026, at 10:00 AM, in Jones 111, 5747 S. Ellis Ave.
Dissertation Defense Presentation
Yuguan Wang, Department of Statistics, The University of Chicago
“Fast Algorithms via Compressed Moment Representations”

Apr 15

Statistics Colloquium: Ping-Shou Zhong

11:30 am–12:30 pm Jones 303

Ping-Shou Zhong
Professor
University of Illinois Chicago

Title: On the Adaptivity and Scalability of Kernel Methods for Testing and Prediction

Abstract: Kernel methods are widely used for prediction and hypothesis testing. In this talk, I will introduce two approaches that enhance the scalability and adaptivity of kernel methods. In the first part, we introduce a new family of adaptive, distribution-free tests of independence based on binary expansion coefficients. By characterizing independence through cross-covariances of multiscale interaction terms, the proposed method is applicable in a general setting and does not require the reproducing kernel Hilbert space assumption. The resulting tests admit an explicit kernel representation, enabling efficient computation while reducing sensitivity to kernel choice. In the second part, we propose an informative sub-data selection method for large-scale kernel learning. This method identifies a representative subset of observations, enabling model training on a substantially reduced yet informative sample. The approach provides a principled form of data reduction and integrates naturally with existing kernel approximation and sketching techniques.

Apr 13

Student Seminar: Syris Shao

10:30–11:00 am Jones 111

Monday, April 13, 2026, at 10:30 AM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Syris Shao, Department of Statistics, The University of Chicago
“Comparing Univariate and Multivariate Models to Forecast Portfolio Value-at-Risk During COVID-19”

Apr 13

Student Seminar: Linzhe Teng

1:30–2:00 pm Jones 111

Friday, April 10, 2026, at 1:30 PM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Linzhe Teng, Department of Statistics, The University of Chicago
“GLIMES-Based Differential Expression with Comparative Low-Dimensional Representations in Paired scRNA-seq of BBIBP-CorV Vaccination”

Apr 10

DSI Distinguished Speaker Series: Lillian Lee

12:00–1:30 pm DSI 105

Lillian Lee
Charles Roy Davis Professor of Computer Science
Cornell University

Title: Taking a turn for the better? Pivoting and pivotal moments in consequential conversations

Abstract: So much of human interaction occurs as conversations, and it is both fascinating and imperative to analyze them. Recently, my co-authors and I have turned to texting-based conversations between mental-health therapists or crisis counselors and their clients, seeking to identify “key” moments in these exchanges:

(1) A “pivoting” moment corresponds to a *redirection* of the conversation introduced by one party that is accepted/followed by the other. We develop a probabilistic measure of how much an utterance immediately redirects the flow of the conversation, accounting for both the intention and the actual realization of such a change.

(2) In a *pivotal* moment, the conversation’s outcome hangs in the balance: how one responds can put the conversation on substantially diverging trajectories leading to significantly different results. We formalize this intuition by estimating the variance in expectation of outcome depending on what might be said next.

We find significant correlates of our measures in real human conversations on widely used platforms. For example, the patients in our longer-term mental-health-therapy data who redirected less in their first few sessions were significantly more likely to eventually express dissatisfaction with their therapist and terminate the relationship; and the staff responses in our crisis-counseling data had greater estimated impact on disengagement rates during pivotal moments than during non-pivotal ones.

Joint work with Vivian Nguyen, Cristian Danescu-Niculescu-Mizil, Thomas D. Hull, and Sang Min (Dave) Jung.

Apr 9

Student Seminar: Claire Tseng

10:00–11:00 am Jones 111

Wednesday, April 8, 2026, at 10:00 AM, in Jones 111, 5747 S. Ellis Avenue
Dissertation Proposal Presentation
Claire Tseng, Department of Statistics, The University of Chicago
“Bias and Implied Beliefs in Large Language Models for Economic Expectations”

Apr 8