Mark your calendars! [RI PhD Thesis Proposal - Willa Potosnak]
Artur Dubrawski
awd at cs.cmu.edu
Mon Apr 27 14:09:19 EDT 2026
Willa will be giving her proposal presentation on Monday next week during
our usual brainstorming session time slot, but in Gates Hall. Please come
and join. Intellectual fun guaranteed!
Cheers,
Artur
---------- Forwarded message ---------
From: RI PhD Program Manager <ri-phd-manager at andrew.cmu.edu>
Date: Mon, Apr 27, 2026, 1:52 PM
Subject: RI PhD Thesis Proposal - Willa Potosnak
To: RI People <ri-people at andrew.cmu.edu>
*RI EVENT CALENDAR*
<https://www.ri.cmu.edu/event/forecasting-at-scale-with-efficient-deep-learning-architectures/>
*Date:* May 4th, 2026
*Time:* 4:30 PM-6:00 PM
*Location:* GHC 4405
*Zoom Link*
<https://cmu.zoom.us/j/93920625206?pwd=WsTYphnRXRfhUaBpKKbbeIaTgDMoro.1>
*Type:* RI PhD Thesis Proposal
*Who:* Willa Potosnak
*Title:* Forecasting at Scale with Efficient Deep Learning Architectures
*Abstract:*
Time Series Foundation Models (TSFMs) have scaled rapidly, with publicly
reported pretraining corpora growing from 1.23 billion to 1 trillion data
points between 2024 and 2026, an approximately 800× increase in two years.
Recent work has further supplemented real-world data with synthetic data to
expose models to broader time series patterns. Yet, this data-centric
paradigm raises a fundamental question: *must intelligent forecasting rely
solely on scale, or can intentional architectural design unlock better
generalization?* This thesis proposes that leveraging existing data more
intelligently and efficiently, rather than relying on scale alone, is key to
achieving better forecasting generalization. We pursue this through three
parallel architectural themes: exploiting cross-channel structure beyond
temporal patterns, enabling zero-shot generalization through structured
composition, and reducing gradient and forecast variance by design. Each
theme aims to enhance generalization with available data while treating
computational efficiency as a core design principle.
In this thesis, we demonstrate that scale is not the only path to
generalization by: developing multivariate architectures that leverage
cross-channel dependencies efficiently while reducing forecast error;
showing that architectures can generalize beyond their training
distribution in both patterns and concepts; and verifying variance-aware
architectural designs that extract richer training signals from existing
data, provably reducing gradient variance while reducing forecast error and
improving calibration.
Within the first theme, we further propose pretraining strategies for
multivariate TSFMs to investigate whether data balancing and curriculum
learning can improve downstream generalization given the same pretraining
corpora. Within the second theme, we propose an additional dimension of
generalization, extending beyond pattern and concept generalization to
horizon generalization, an important consideration for TSFMs applied across
diverse tasks and domains. Overall, this work contributes new insights into
advancing time series forecasting generalization through efficient
architectural design.
*Committee:*
Artur Dubrawski, Chair
John Dolan
Barnabás Póczos
Michael W. Mahoney (University of California, Berkeley)
*Thesis Link*
<https://drive.google.com/file/d/1oYYdgxvLaW4iF-LWHqzpWHMuXA677QYv/view?usp=sharing>