Fwd: RI PhD Thesis Proposal - Willa Potosnak

Mon May 4 10:44:28 EDT 2026

It is impossible to diminish the intellectual feast we will partake in when we
come to see Willa's proposals, but the cookies she makes are more than
delicious!

Please join us today at 4:30 in Gates 4405.

Cheers
Artur

---------- Forwarded message ---------
From: Willa Potosnak <wpotosna at andrew.cmu.edu>
Date: Mon, May 4, 2026 at 9:50 AM
Subject: Fwd: RI PhD Thesis Proposal - Willa Potosnak
To: RI People <ri-people at andrew.cmu.edu>

Hi everyone,

Just a reminder that this is at 4:30pm today!

Coffee and pastries will be provided.

Best,
Willa Potosnak

---------- Forwarded message ---------
From: RI PhD Program Manager <ri-phd-manager at andrew.cmu.edu>
Date: Mon, Apr 27, 2026 at 1:52 PM
Subject: RI PhD Thesis Proposal - Willa Potosnak
To: RI People <ri-people at andrew.cmu.edu>

*RI EVENT CALENDAR*
<https://www.ri.cmu.edu/event/forecasting-at-scale-with-efficient-deep-learning-architectures/>

*Date:* May 4th, 2026
*Time:* 4:30 PM-6:00 PM
*Location:* GHC 4405
*Zoom Link*
<https://cmu.zoom.us/j/93920625206?pwd=WsTYphnRXRfhUaBpKKbbeIaTgDMoro.1>
*Type:* RI PhD Thesis Proposal
*Who:* Willa Potosnak

*Title:* Forecasting at Scale with Efficient Deep Learning Architectures

*Abstract:*
Time Series Foundation Models (TSFMs) have scaled rapidly, with publicly
reported pretraining corpora growing from 1.23 billion to 1 trillion data
points between 2024 and 2026, an approximately 800× increase in two years.
Recent work has further supplemented real-world data with synthetic data to
expose models to broader time series patterns. Yet, this data-centric
paradigm raises a fundamental question: *must intelligent forecasting rely
solely on scale, or can intentional architectural design unlock better
generalization?* This thesis proposes that more intelligently and
efficiently leveraging existing data, rather than scale alone, is key to
achieving better forecasting generalization. We pursue this through three
parallel architectural themes: exploiting cross-channel structure beyond
temporal patterns, enabling zero-shot generalization through structured
composition, and reducing gradient and forecast variance by design. Each
theme aims to enhance generalization with available data while treating
computational efficiency as a core design principle.

In this thesis, we demonstrate that scale is not the only path to
generalization by: developing multivariate architectures that leverage
cross-channel dependencies efficiently while reducing forecast error;
showing that architectures can generalize beyond their training
distribution in both patterns and concepts; and verifying variance-aware
architectural designs that extract richer training signals from existing
data, provably reducing gradient variance while reducing forecast error and
improving calibration.

Within the first theme, we further propose pretraining strategies for
multivariate TSFMs to investigate whether data balancing and curriculum
learning can improve downstream generalization given the same pretraining
corpora. Within the second theme, we propose an additional dimension of
generalization, extending beyond pattern and concept generalization to
horizon generalization, an important consideration for TSFMs applied across
diverse tasks and domains. Overall, this work contributes new insights into
advancing time series forecasting generalization through efficient
architectural design.

*Committee:*
Artur Dubrawski, Chair
John Dolan
Barnabás Póczos
Michael W. Mahoney (University of California, Berkeley)

*Thesis Link*
<https://drive.google.com/file/d/1oYYdgxvLaW4iF-LWHqzpWHMuXA677QYv/view?usp=sharing>