Events: Lectures

Statistics Colloquium: Marco Avella Medina

11:30 am–12:30 pm Jones 303

Marco Avella Medina
Department of Statistics
Columbia University

Title: A Theoretical Framework for M-Posteriors: Frequentist Guarantees and Robustness Properties

Abstract: We provide a theoretical framework for a wide class of generalized posteriors that can be viewed as the natural Bayesian posterior counterpart of the class of M-estimators in the frequentist world. We call the members of this class M-posteriors and show that they are asymptotically normally distributed under mild conditions on the M-estimation loss and the prior. In particular, an M-posterior contracts in probability around a normal distribution centered at an M-estimator, showing frequentist consistency and suggesting some degree of robustness depending on the reference M-estimator. We formalize the robustness properties of the M-posteriors by a new characterization of the posterior influence function and a novel definition of breakdown point adapted for posterior distributions. We demonstrate the wide applicability of our theory in various popular models and discuss extensions to variational inference. We illustrate the empirical relevance of our results in several numerical examples.

This is based on joint work with Juraj Marusic and Cynthia Rush.
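As a rough numerical caricature of the idea (not the speaker's construction): an M-posterior replaces the log-likelihood with a negative M-estimation loss, so the density is proportional to prior(theta) * exp(-sum_i loss(x_i - theta)). The sketch below, under the assumption of a one-dimensional location parameter with a normal prior and a Huber loss, shows the robustness one would hope for: a single gross outlier drags the ordinary (squared-loss) posterior mean away from zero but barely moves the Huber M-posterior.

```python
import numpy as np

def huber(r, c=1.345):
    """Huber loss: quadratic near zero, linear in the tails (bounds outlier influence)."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r**2, c * a - 0.5 * c**2)

def m_posterior(data, grid, loss, prior_sd=10.0):
    """Grid approximation of pi(theta | x) proportional to
    pi(theta) * exp(-sum_i loss(x_i - theta)), with a N(0, prior_sd^2) prior."""
    log_prior = -0.5 * (grid / prior_sd) ** 2
    log_lik = -np.array([loss(data - t).sum() for t in grid])
    log_post = log_prior + log_lik
    w = np.exp(log_post - log_post.max())   # subtract max for numerical stability
    dx = grid[1] - grid[0]
    return w / (w.sum() * dx)               # normalize to a density on the grid

rng = np.random.default_rng(0)
data = np.append(rng.normal(0.0, 1.0, size=50), 20.0)  # N(0,1) sample plus one gross outlier
grid = np.linspace(-5.0, 5.0, 2001)
dx = grid[1] - grid[0]

post_means = {}
for name, loss in [("squared loss (ordinary posterior)", lambda r: 0.5 * r**2),
                   ("Huber loss (M-posterior)", huber)]:
    dens = m_posterior(data, grid, loss)
    post_means[name] = (grid * dens).sum() * dx
    print(f"{name}: posterior mean = {post_means[name]:+.3f}")
```

The bounded tails of the Huber loss cap the outlier's contribution to the generalized likelihood, which is the informal mechanism behind the breakdown-point results the abstract mentions.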

Apr 6

DSI Distinguished Speaker Series: Lillian Lee

12:00–1:30 pm DSI 105

Lillian Lee
Charles Roy Davis Professor of Computer Science
Cornell University

Title: Taking a turn for the better? Pivoting and pivotal moments in consequential conversations

Abstract: So much of human interaction occurs as conversations, and it is both fascinating and imperative to analyze them. Recently, my co-authors and I have turned to texting-based conversations between mental-health therapists or crisis counselors and their clients, seeking to identify “key” moments in these exchanges:

(1) A “pivoting” moment corresponds to a *redirection* of the conversation introduced by one party that is accepted/followed by the other. We develop a probabilistic measure of how much an utterance immediately redirects the flow of the conversation, accounting for both the intention and the actual realization of such a change.

(2) In a *pivotal* moment, the conversation’s outcome hangs in the balance: how one responds can put the conversation on substantially diverging trajectories leading to significantly different results. We formalize this intuition by estimating the variance in expectation of outcome depending on what might be said next.
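The pivotal-moment intuition in (2) can be caricatured in a few lines: given a distribution over plausible next utterances and any model of the expected outcome conditional on each, pivotality is the variance of those conditional expectations. Everything in this sketch (the candidate replies, their probabilities, and the outcome numbers) is invented for illustration, not the authors' model or data:

```python
import numpy as np

def pivotality(candidates, probs, expected_outcome):
    """Variance, over plausible next utterances u ~ p, of the conditional
    expected outcome E[outcome | context, u]. High variance = pivotal moment."""
    mu = np.array([expected_outcome(u) for u in candidates], dtype=float)
    p = np.asarray(probs, dtype=float)
    p = p / p.sum()                      # normalize to a probability distribution
    mean = (p * mu).sum()
    return float((p * (mu - mean) ** 2).sum())

# Toy outcome model (a stipulated probability that the client stays engaged
# after each candidate reply) -- purely hypothetical numbers.
outcomes = {"validate feelings": 0.9, "change topic": 0.4, "give advice": 0.6}
candidates = list(outcomes)
probs = [0.5, 0.3, 0.2]

score = pivotality(candidates, probs, outcomes.get)
print(f"pivotality = {score:.4f}")
```

When every plausible reply leads to the same expected outcome, the variance is zero and the moment is not pivotal; large spread across replies marks a fork in the conversation's trajectory.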

We find significant correlates of our measures in real human conversations on widely used platforms. For example, the patients in our longer-term mental-health-therapy data who redirected less in their first few sessions were significantly more likely to eventually express dissatisfaction with their therapist and terminate the relationship; and the staff responses in our crisis-counseling data had greater estimated impact on disengagement rates during pivotal moments than in non-pivotal ones.

Joint work with Vivian Nguyen, Cristian Danescu-Niculescu-Mizil, Thomas D. Hull, and Sang Min (Dave) Jung.

Apr 9

Statistics Colloquium: Yiqiao Zhong

11:30 am–12:30 pm Jones 303

Yiqiao Zhong
Department of Statistics
University of Wisconsin-Madison

Title: Compositionality in Large Language Models: Emergence, Generalization, and Geometry

Abstract: Large language models (LLMs) have demonstrated remarkable reasoning abilities through novel techniques such as in-context learning and chain-of-thought (CoT) reasoning. Empirically, key reasoning skills often emerge only at larger scales or after prolonged training. Yet the underlying mechanism of LLM reasoning, namely how compositional representations are formed and organized, remains poorly understood.

In this talk, I present recent progress toward uncovering emergent compositional structure through controlled synthetic experiments on small transformers and targeted intervention studies on modern LLMs. First, I show that learning a key compositional structure is essential for out-of-distribution generalization, and that this process undergoes sharp phase transitions during training. At a critical stage, an intermediate low-dimensional “bridge subspace” emerges, serving as a shared representation connecting multiple layers. Second, using arithmetic composition as a minimal testbed for CoT reasoning, I demonstrate that autoregressive training on reasoning traces exhibits distinct reasoning phases. In particular, causally faithful reasoning emerges only when training noise lies below a critical threshold.

Together, these findings suggest that core statistical principles such as low-dimensional subspaces and causality may provide key foundations for advancing the interpretability and transparency of LLMs.
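As a generic sketch of how low-dimensional representational structure of the kind described above can be probed (synthetic data standing in for transformer hidden states; this is not the speaker's methodology), PCA via the SVD reveals how few directions carry most of the variance:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for hidden states: 500 "tokens" in a 64-dim residual stream
# that actually concentrate on a 3-dim subspace, plus small isotropic noise.
basis = np.linalg.qr(rng.normal(size=(64, 3)))[0]          # orthonormal 3-dim basis
coords = 5.0 * rng.normal(size=(500, 3))                   # signal coordinates
hidden = coords @ basis.T + 0.1 * rng.normal(size=(500, 64))

# Effective dimensionality via PCA: how many principal directions are needed
# to explain 95% of the variance? A small count indicates a low-dim subspace.
centered = hidden - hidden.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)
explained = s**2 / (s**2).sum()
k95 = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
print(f"directions needed for 95% variance: {k95} of {hidden.shape[1]}")
```

Applied to actual intermediate activations, a sharp drop in the singular-value spectrum at a small index is the kind of signature that would be consistent with a shared low-dimensional "bridge" representation.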

Apr 20

Statistics Colloquium: Stefan Wager

11:30 am–12:30 pm Jones 303

Stefan Wager
Department of Statistics
Stanford University

Title: TBA

Abstract: TBA

Apr 27

Statistics Colloquium: Aravindan Vijayaraghavan

11:30 am–12:30 pm Jones 303

Aravindan Vijayaraghavan
Department of Computer Science
Northwestern University

Title: TBA

Abstract: TBA

May 11

Statistics Colloquium: David Blei

11:30 am–12:30 pm Jones 303

David Blei
Departments of Statistics and Computer Science
Columbia University

Title: TBA

Abstract: TBA


May 18