Feb 18 at 12pm (GHC 6115) -- Keegan Harris (CMU) -- Should You Use Your Large Language Model to Explore or Exploit?
Victor Akinwande
vakinwan at andrew.cmu.edu
Tue Feb 18 11:30:28 EST 2025
Reminder: Keegan's talk is happening in about 30 minutes.
On Thu, Feb 13, 2025 at 3:42 PM Victor Akinwande <vakinwan at andrew.cmu.edu>
wrote:
> Dear all,
>
> We look forward to seeing you next *Tuesday (02/18) from 12:00-1:00 PM
> (ET)* for the next talk of the CMU AI Seminar, sponsored by SambaNova
> Systems <https://sambanova.ai/>. The seminar will be held in *GHC 6115*,
> with pizza provided, and will be streamed on Zoom.
>
> To learn more about the seminar series or to see the future schedule,
> please visit the seminar website (http://www.cs.cmu.edu/~aiseminar/).
>
> Next Tuesday (02/18), Keegan Harris (CMU) will give a talk titled:
> "Should You Use Your Large Language Model to Explore or Exploit?".
>
>
> *Abstract*
> In-context (supervised) learning is the ability of an LLM to perform new
> prediction tasks by conditioning on examples provided in the prompt,
> without any updates to its internal parameters. Although supervised
> learning is an important capability, many applications demand the use of
> ML models for downstream decision making. Thus, in-context reinforcement
> learning (ICRL) is a natural next frontier. In this talk, we investigate
> the extent to which contemporary LLMs can solve ICRL tasks. We begin by
> deploying LLMs as agents in simple multi-armed bandit environments,
> specifying the environment description and interaction history entirely
> in-context. We experiment with several frontier models and find that they
> do not exhibit robust decision-making behavior without substantial
> task-specific mitigations. Motivated by this observation, we then use
> LLMs to explore and to exploit in isolation, across various (contextual)
> bandit tasks. We find that while the current generation of LLMs often
> struggles to exploit, in-context mitigations can be used to improve
> performance on small-scale tasks. On the other hand, we find that LLMs do
> help with exploring large action spaces that have inherent semantics, by
> suggesting suitable candidates to explore. This talk is based on joint
> work with Alex Slivkins, Akshay Krishnamurthy, Dylan Foster, and Cyril
> Zhang.
>
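> As a rough illustration of the in-context bandit setup mentioned in the
> abstract, here is a minimal sketch (not code from the talk; the prompt
> format and the query_llm stub are hypothetical placeholders) of how an
> LLM might be queried as a bandit agent, with the environment description
> and the full interaction history supplied entirely in the prompt:
>
>   import random
>
>   ARMS = ["A", "B", "C"]
>   TRUE_MEANS = {"A": 0.2, "B": 0.5, "C": 0.8}  # unknown to the agent
>
>   def query_llm(prompt: str) -> str:
>       # Placeholder for a call to a frontier LLM; here it just picks an
>       # arm uniformly at random so the sketch runs on its own.
>       return random.choice(ARMS)
>
>   def build_prompt(history):
>       # Environment description plus interaction history, all in-context.
>       lines = ["You are choosing among slot machines " + ", ".join(ARMS) + ".",
>                "Your goal is to maximize total reward. Past interactions:"]
>       lines += [f"round {t}: pulled {a}, observed reward {r}"
>                 for t, (a, r) in enumerate(history, 1)]
>       lines.append("Reply with the single arm to pull next.")
>       return "\n".join(lines)
>
>   history = []
>   for t in range(20):
>       arm = query_llm(build_prompt(history))
>       reward = int(random.random() < TRUE_MEANS[arm])  # Bernoulli reward
>       history.append((arm, reward))
>
>   print("total reward:", sum(r for _, r in history))
>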
>
> *Speaker bio: *
> Keegan Harris is a final-year Machine Learning PhD candidate at CMU,
> where he is advised by Nina Balcan and Steven Wu; his research focuses on
> machine learning for decision making. He has been recognized as a Rising
> Star in Data Science and his research is supported by an NDSEG Fellowship.
> He is also the head editor of the ML at CMU blog. Previously, Keegan spent
> two summers as an intern at Microsoft Research and graduated from Penn
> State with BS degrees in Computer Science and Physics.
>
>
> *In person: GHC 6115*
> *Zoom Link:* https://cmu.zoom.us/j/93599036899?pwd=oV45EL19Bp3I0PCRoM8afhKuQK7HHN.1
>
>