Past Events

2026

Joint Computer Science and Data Science Institute Seminar: Shreya Shankar

2:30–3:30 pm DSI 105

Shreya Shankar
PhD Candidate in the Data Systems and Foundations Group
University of California, Berkeley

Title: Building Effective Unstructured Data Systems

Abstract: Databases and other data systems have successfully democratized data-oriented computation across domains, thanks to decades of research in system internals and end-user interfaces. However, such systems center on structured (i.e., tabular) data; unstructured data—the vast majority of data—has largely been ignored. Large language models (LLMs) now give us a building block for unstructured data analysis, and we face the same questions as in the early days of data systems—e.g., how should users author queries? How do we efficiently execute queries at scale?—but many well-established tenets from traditional data systems no longer hold. In my talk, I will present DocETL, a system I developed for unstructured data analysis. I will discuss how we had to rethink query optimization under these new assumptions, optimizing user-written pipelines for both accuracy and efficiency—as well as end-user interfaces for authoring, iterating on, and debugging pipelines. DocETL is open-source with 3.5k+ GitHub stars; our hosted interface has supported 4.1k+ pipelines across 30+ S&P-500 industries. Query optimization ideas from our work have been adopted in databases such as Snowflake and BigQuery, and our interface design principles have been adopted by companies like LangChain and OpenAI.

Feb 18

Student Seminar: Yushuo Li

2:00–2:30 pm Jones 111

Wednesday, February 18, 2026, at 2:00 PM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Yushuo Li, Department of Statistics, The University of Chicago
“Asymptotically Optimal Conformal Prediction for Classification”

Feb 18

Student Seminar: Buning (Erica) Fan

1:30–2:00 pm Jones 111

Wednesday, February 18, 2026, at 1:30 PM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Buning Fan, Department of Statistics, The University of Chicago
“Comparing Bayesian Software Platforms for Three-Level Mixed Effects Location Scale Models”

Feb 18

Student Seminar: Zixuan Qin

1:00–1:30 pm Jones 111

Wednesday, February 18, 2026, at 1:00 PM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Zixuan Qin, Department of Statistics, The University of Chicago
“Operator Learning and Bispectrum-Guided Diffusion for Functional Multi-Reference Alignment”

Feb 18

Joint Statistics and DSI Colloquium: Ana-Andreea Stoica

2:00–3:00 pm DSI 105

Ana-Andreea Stoica
Research Group Leader in the Social Foundations of Computation Department
Max Planck Institute for Intelligent Systems

Title: Designing for Society: AI in Networks, Markets, and Platforms

Abstract: AI systems increasingly mediate how people access information, economic opportunities, and essential services. Yet when deployed in social environments—online platforms, labor markets, and information ecosystems—AI interacts with complex human behavior, strategic incentives, and structural inequality. This talk focuses on foundational challenges and opportunities for AI systems: how to design and evaluate algorithmic interventions in complex social environments. I will present recent work on causal inference under competing treatments, which formalizes how competition for user attention and strategic behavior among experimenters distort experimental data and invalidate naïve estimates of algorithmic impact. By modeling experimentation as a strategic data acquisition problem, we show how evaluation itself becomes an optimization problem, and we derive mechanisms that recover meaningful estimates despite interference and competition. I connect this problem to deriving foundational properties of AI systems that enable responsible and efficient algorithmic design. Beyond this case study, the talk highlights broader implications for the design and evaluation of AI systems in networks, markets, and platforms. I argue that responsible deployment requires rethinking evaluation methodologies to account for incentives, feedback loops, and system-level effects, and I outline how algorithmic and statistical tools can support more accountable and socially aligned AI systems.

Feb 17

Student Seminar: Jose Cruzado

2:00–2:30 pm Jones 111

Monday, February 16, 2026, at 2:00 PM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Jose Cruzado, Department of Statistics, The University of Chicago
“Expected Gradient Outer Product Reparameterization in Deep Convolutional Networks”

Feb 16

Student Seminar: Kaushik Kancharla

9:00–9:30 am Jones 111

Wednesday, February 11, 2026, at 9:00 AM, in Jones 111, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Kaushik Kancharla, Department of Statistics, The University of Chicago
“The Intraday Dynamics of the Volatility Term Structure”

Feb 11

Student Seminar: Hunter Chen

1:00–1:30 pm Kent 106

Tuesday, February 10, 2026, at 1:00 PM, in Kent 106, 1427 East 60th Street
Master’s Thesis Presentation
Hunter Chen, Department of Statistics, The University of Chicago
“Empirical Bayes learning from selectively reported confidence intervals”

Feb 10

Department of Computer Science and Data Science Institute Presents: Weijia Shi

2:30–3:30 pm DSI 105

Weijia Shi
PhD Candidate
University of Washington

Title: Breaking the Language Model Monolith

Abstract: Language models (LMs) are typically monolithic: a single model storing all knowledge and serving every use case. This design presents significant challenges: such models often generate factually incorrect statements, require costly retraining to add or remove information, and face serious privacy and copyright issues. In this talk, I will discuss how to break this monolith by introducing modular architectures and training algorithms that separate capabilities across composable components. I’ll cover two forms of modularity: (1) external modularity, which augments LMs with external tools like retrievers to improve factuality and reasoning; and (2) internal modularity, which builds inherently modular LMs from decentrally trained components to enable flexible composition and an unprecedented level of control.

Feb 9

Student Seminar: Haewon Hwang

1:00–1:30 pm Jones 226

Monday, February 9, 2026, at 1:00 PM, in Jones 226, 5747 S. Ellis Avenue
Master’s Thesis Presentation
Haewon Hwang, Department of Statistics, The University of Chicago
“Media Violence and Criminal Behavior: evidence from local movie demand”

Feb 9