Events

Statistics Colloquium: Cynthia Rudin

11:30 am–12:30 pm Jones 303

Cynthia Rudin
Department of Computer Science
Duke University

Title: Many Good Models Lead To…

Abstract: As it turns out, many good models lead to amazing things! The Rashomon Effect, coined by Leo Breiman, describes the phenomenon that there exist many equally good predictive models for the same dataset. This phenomenon occurs for many real datasets, and when it does, it sparks both magic and consternation, but mostly magic. In light of the Rashomon Effect, my collaborators and I propose to reshape the way we think about machine learning, particularly for tabular data problems in the nondeterministic (noisy) setting. I’ll address how the Rashomon Effect impacts (1) the existence of simple-yet-accurate models, (2) flexibility to address user preferences, such as fairness and monotonicity, without losing performance, (3) algorithm choice, specifically, providing advance knowledge of which algorithms might be suitable for a given problem, (4) public policy, and (5) scientific discovery. I’ll also discuss a theory of when the Rashomon Effect occurs and why: interestingly, noise in data leads to a large Rashomon Effect. My goal is to illustrate how the Rashomon Effect can have a massive impact on the use of machine learning for complex problems in society.

I’ll be mainly discussing the paper “Amazing Things Come From Having Many Good Models” (ICML spotlight, 2024) which is joint work with Chudi Zhong, Lesia Semenova, Margo Seltzer, Ronald Parr, Jiachang Liu, Srikar Katta, Jon Donnelly, Harry Chen, and Zachery Boner.
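
For readers unfamiliar with the Rashomon Effect, the sketch below illustrates the idea of an empirical "Rashomon set": fit several off-the-shelf classifiers and keep those whose validation loss is within a small factor of the best. It is not code from the talk or paper; the dataset, candidate models, and 5% tolerance are illustrative assumptions.

```python
# Minimal sketch of an empirical Rashomon set (illustrative assumptions throughout).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

candidates = {
    "logistic": LogisticRegression(max_iter=5000),
    "tree_depth3": DecisionTreeClassifier(max_depth=3, random_state=0),
    "tree_depth6": DecisionTreeClassifier(max_depth=6, random_state=0),
    "forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Validation loss for each candidate model.
losses = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    losses[name] = log_loss(y_va, model.predict_proba(X_va))

# Keep models within (1 + epsilon) of the best loss: "equally good" models.
best = min(losses.values())
epsilon = 0.05  # assumed tolerance defining "equally good"
rashomon_set = {n: l for n, l in losses.items() if l <= (1 + epsilon) * best}
print("validation losses:", {n: round(l, 4) for n, l in losses.items()})
print("Rashomon set (within 5% of best):", sorted(rashomon_set))
```

If several structurally different models land in the set, a practitioner can choose among them on secondary criteria (simplicity, monotonicity, fairness) without sacrificing predictive performance, which is the flexibility the abstract emphasizes.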

Mar 23

Statistics Colloquium: Csaba Szepesvari

11:30 am–12:30 pm Jones 303

Csaba Szepesvari
Department of Computing Science
University of Alberta

Title: TBA

Abstract: TBA

Mar 30

Statistics Colloquium: Marco Avella Medina

11:30 am–12:30 pm Jones 303

Marco Avella Medina
Department of Statistics
Columbia University

Title: TBA

Abstract: TBA

Apr 6

Student Seminar: Claire Tseng

10:00–11:00 am Jones 111

Wednesday, April 8, 2026, at 10:00 AM, in Jones 111, 5747 S. Ellis Avenue
Dissertation Proposal Presentation
Claire Tseng, Department of Statistics, The University of Chicago
“TBA”

Apr 8

Student Seminar: Yuguan Wang

10:00–11:00 am Jones 111

Wednesday, April 15, 2026, at 10:00 AM, in Jones 111, 5747 S. Ellis Ave.
Dissertation Defense Presentation
Yuguan Wang, Department of Statistics, The University of Chicago
“TBA”

Apr 15

Statistics Colloquium: Yiqiao Zhong

11:30 am–12:30 pm Jones 303

Yiqiao Zhong
Department of Statistics
University of Wisconsin-Madison

Title: Compositionality in Large Language Models: Emergence, Generalization, and Geometry

Abstract: Large language models (LLMs) have demonstrated remarkable reasoning abilities through novel techniques such as in-context learning and chain-of-thought (CoT) reasoning. Empirically, key reasoning skills often emerge only at larger scales or after prolonged training. Yet the underlying mechanism of LLM reasoning (how compositional representations are formed and organized) remains poorly understood.

In this talk, I present recent progress toward uncovering emergent compositional structure through controlled synthetic experiments on small transformers and targeted intervention studies on modern LLMs. First, I show that learning a key compositional structure is essential for out-of-distribution generalization, and that this process undergoes sharp phase transitions during training. At a critical stage, an intermediate low-dimensional “bridge subspace” emerges, serving as a shared representation connecting multiple layers. Second, using arithmetic composition as a minimal testbed for CoT reasoning, I demonstrate that autoregressive training on reasoning traces exhibits distinct reasoning phases. In particular, causally faithful reasoning emerges only when training noise lies below a critical threshold.

Together, these findings suggest that core statistical principles such as low-dimensional subspaces and causality may provide key foundations for advancing the interpretability and transparency of LLMs.
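
As a rough illustration of the low-dimensional-subspace theme in the abstract, the sketch below (a hypothetical diagnostic, not the speaker's code) counts how many principal directions are needed to capture most of the variance of a batch of synthetic "activations"; with a real model, one would run the same computation on hidden states collected from a given layer.

```python
# Sketch: estimate the effective dimensionality of a set of representations.
# Synthetic activations stand in for transformer hidden states (an assumption).
import numpy as np

rng = np.random.default_rng(0)
n_tokens, width, latent_dim = 2000, 256, 8

# Activations that truly lie (up to noise) in a latent_dim-dimensional subspace.
basis = rng.normal(size=(latent_dim, width))
acts = rng.normal(size=(n_tokens, latent_dim)) @ basis
acts += 0.05 * rng.normal(size=acts.shape)

# Singular values of the centered activation matrix reveal the effective rank.
centered = acts - acts.mean(axis=0, keepdims=True)
svals = np.linalg.svd(centered, compute_uv=False)
var_ratio = svals**2 / np.sum(svals**2)
effective_rank = int(np.searchsorted(np.cumsum(var_ratio), 0.95) + 1)
print(f"components for 95% variance: {effective_rank} (true latent dim {latent_dim})")
```

A sharp drop in the singular-value spectrum of this kind is one simple way to quantify claims that a shared "bridge subspace" carries the representation across layers.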

Apr 20

Student Seminar: Kiho Park

1:00–3:00 pm Room 103

Thursday, April 23, 2026, at 1:00 PM, in Room 103, 5460 S. University Ave.
Dissertation Defense Presentation
Kiho Park, Department of Statistics, The University of Chicago
“TBA”

Apr 23

Statistics Colloquium: Stefan Wager

11:30 am–12:30 pm Jones 303

Stefan Wager
Department of Statistics
Stanford University

Title: TBA

Abstract: TBA

Apr 27

Statistics Colloquium: Aravindan Vijayaraghavan

11:30 am–12:30 pm Jones 303

Aravindan Vijayaraghavan
Department of Computer Science
Northwestern University

Title: TBA

Abstract: TBA

May 11

Statistics Colloquium: David Blei

11:30 am–12:30 pm Jones 303

David Blei
Departments of Statistics and Computer Science
Columbia University

Title: TBA

Abstract: TBA

May 18