Events: Statistics Colloquium

Joint Statistics and DSI Colloquium: Jiaqi Zhang

4:00–5:00 pm, DSI 105

Jiaqi Zhang
PhD Candidate
Massachusetts Institute of Technology

Title: Modeling Large-Scale Interventions

Abstract: Complex causal mechanisms among genes govern cellular functions in health and disease. Understanding these mechanisms can accelerate therapeutic discovery but remains challenging due to the large number of genes and their intricate dependencies. Recent advances in experimental technologies are making this problem increasingly tractable: it is now possible to systematically intervene on individual genes or gene combinations in single cells and measure their downstream effects, enabling empirical identification and validation of causal relationships. However, interventional data are costly to collect and, being high-dimensional, challenging to interpret.

In this talk, I will present our work tackling these challenges from three angles. First, we introduced causal representation learning theories and algorithms with identifiability guarantees that uncover the latent variables behind high-dimensional data. Second, we developed a method for modeling interventional data that predicts the effects of novel interventions with high accuracy, incorporating both distributional shifts and prior domain knowledge. Finally, we showed how predictive intervention modeling can improve future experimental design, illustrated by an application in which we predicted and validated previously unknown T-cell regulators with therapeutic potential for cancer immunotherapy.
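
For readers less familiar with interventional data, here is a minimal sketch, using standard do-intervention conventions rather than the speaker's models, of why such data are informative: a hypothetical three-gene linear structural causal model in which clamping one gene shifts its downstream target but not its upstream cause. All genes, coefficients, and names are illustrative.

    import numpy as np

    # Hypothetical 3-gene linear SCM: X1 -> X2 -> X3.
    # A do-intervention clamps X2, severing its dependence on X1.
    rng = np.random.default_rng(0)

    def sample(n, do_x2=None):
        x1 = rng.normal(size=n)
        x2 = 0.8 * x1 + rng.normal(size=n) if do_x2 is None else np.full(n, do_x2)
        x3 = -1.2 * x2 + rng.normal(size=n)
        return np.stack([x1, x2, x3], axis=1)

    obs = sample(10_000)             # observational regime
    itv = sample(10_000, do_x2=2.0)  # interventional regime: do(X2 = 2)

    # Under do(X2 = 2), the mean of X3 shifts to about -1.2 * 2 = -2.4,
    # while X1 is untouched: the downstream/upstream asymmetry that makes
    # interventional data informative about causal direction.
    print(obs.mean(axis=0), itv.mean(axis=0))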

Feb 26

Billingsley Lectures on Probability: Christophe Garban

5:00–6:00 pm, Kent 120

Christophe Garban
Université Lyon 1 / Courant Institute, NYU

Title: Continuous Symmetry and Phase Transitions in Lattice Spin Systems

Abstract: A central problem in statistical physics is to understand how spins placed on the lattice Z^d interact and collectively organize at different temperatures. When the spins take values in a discrete set — for instance in the celebrated Ising model, where \sigma_x \in \{-1,+1\} — the mechanisms governing phase transitions are by now relatively well understood.

The situation changes dramatically when the spins take values in a continuous space, such as the unit circle S^1 in the XY model or the unit sphere S^2 in the classical Heisenberg model. In this setting, new phenomena appear, and the behavior depends strongly on whether the underlying symmetry is Abelian or non-Abelian. In particular, the non-Abelian case remains far more mysterious.
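
For orientation (standard background, not a preview of the lecture's results): all three models share the same nearest-neighbor Hamiltonian on Z^d, and only the spin space, and with it the symmetry group, changes:

    H(\sigma) = -\sum_{x \sim y} \sigma_x \cdot \sigma_y,    \mu_\beta(\sigma) \propto \exp(-\beta H(\sigma)),

where the sum runs over nearest-neighbor pairs x \sim y in Z^d, and \sigma_x \cdot \sigma_y is the ordinary product for Ising spins and the scalar product for spins on S^1 or S^2. The global symmetry group is the spin flip Z/2Z for Ising, the rotation group SO(2) for the XY model (Abelian), and SO(3) for the Heisenberg model (non-Abelian).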

In this talk, I will introduce the mathematics of spin systems with continuous symmetry, emphasizing their deep connections with analysis, including harmonic functions, harmonic maps, and geometric analysis. I will also describe some recent results and open problems in the area.

No prior background in statistical physics or probability will be assumed. Based on joint works with J. Aru, D. van Engelenburg, P. Dario, N. de Montgolfier, A. Sepúlveda and T. Spencer.

Reception immediately following the lecture at 6:10 pm, in Jones 111, 5747 S Ellis Ave.

Feb 26

Joint Statistics and DSI Colloquium: Soledad Villar

2:00–3:00 pm, DSI 105

Soledad Villar
Assistant Professor
Johns Hopkins University

Title: Machine Learning and Symmetries

Abstract: Symmetries play a significant role in machine learning. In scientific applications, they often arise as constraints imposed by physical laws. More broadly, symmetries emerge whenever the same object admits several equivalent representations (for example, a graph encoded under different node orderings in graph machine learning). In addition, modern machine learning models are heavily overparameterized, so many distinct parameter settings represent the same function, revealing further underlying symmetries.

In this talk, we describe methods for incorporating symmetries into machine learning models using classical tools from algebra, including invariant theory and Galois theory. A particularly interesting feature of symmetry-preserving models is that they can be defined independently of the size or dimension of the input. The formalization of this setting, known as any-dimensional machine learning, is inspired by ideas from representation stability. We present a theoretical framework for understanding the assumptions such models impose, which allows us to align learning models with data of varying sizes, and with different learning tasks, in a principled way.
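
As a toy illustration of these two features (symmetry invariance and a size-independent parameterization), and not of the speaker's actual constructions, the sketch below builds a permutation-invariant model in the Deep Sets style: a fixed set of parameters, sum pooling for invariance, and inputs of any size n. All shapes and names are hypothetical.

    import numpy as np

    # Hypothetical permutation-invariant, any-dimensional model
    # (Deep Sets style): fixed parameters, inputs of any size n.
    rng = np.random.default_rng(0)
    phi = rng.normal(size=(3, 8))  # per-element map, R^3 -> R^8
    rho = rng.normal(size=(8,))    # readout on the pooled features

    def f(points):                  # points: (n, 3) array, any n
        h = np.tanh(points @ phi)   # same map applied to every element
        pooled = h.sum(axis=0)      # sum pooling: order cannot matter
        return float(np.tanh(pooled) @ rho)

    x = rng.normal(size=(5, 3))
    assert np.isclose(f(x), f(x[rng.permutation(5)]))  # invariance check
    print(f(x), f(rng.normal(size=(12, 3))))           # same parameters, n = 5 and n = 12

Sum pooling is the simplest invariant aggregation; the representation-stability viewpoint mentioned above concerns which such constructions remain well-behaved as n grows.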

Any-dimensional models use a fixed set of parameters and can be evaluated on data of varying sizes. Hyperparameter transfer considers the complementary setting, in which the data are fixed while the model size varies, and studies how optimal hyperparameters (such as the learning rate) can be transferred from smaller models to larger ones. If time permits, we will also discuss recent connections between any-dimensional machine learning and hyperparameter transfer.

Mar 5