Oct 29 at 12pm (GHC 6115) -- Virginia Smith (CMU) -- A Reality Check for Vibes-Based ML Safety

Victor Akinwande vakinwan at andrew.cmu.edu
Wed Oct 23 12:28:23 EDT 2024


Dear all,

We look forward to seeing you next *Tuesday (10/29) from 12:00-1:00 PM (ET)*
for the next talk of this semester's CMU AI Seminar, sponsored by SambaNova
Systems <https://sambanova.ai>. The seminar will be held in *GHC 6115* with
pizza provided and will be streamed on Zoom.

To learn more about the seminar series or to see the future schedule,
please visit the seminar website (http://www.cs.cmu.edu/~aiseminar/).

Next Tuesday (10/29), Virginia Smith (CMU) will be giving a talk titled "A
Reality Check for Vibes-Based ML Safety".

*Abstract:* Machine learning applications are increasingly reliant on
black-box pretrained models. To ensure safe use of these models, techniques
such as unlearning, guardrails, and watermarking have been proposed to curb
model behavior and audit usage. Unfortunately, while these post-hoc
approaches give positive safety ‘vibes’ when evaluated in isolation, our
work shows that existing techniques are quite brittle when deployed as part
of larger systems. In a series of recent works, we show that: (a) small
amounts of auxiliary data can be used to 'jog' the memory of unlearned
models; (b) current unlearning benchmarks obscure deficiencies in both
finetuning and guardrail-based approaches; and (c) simple, scalable attacks
erode existing LLM watermarking systems and reveal fundamental trade-offs
in watermark design. Taken together, these results highlight major
deficiencies in the practical use of post-hoc ML safety methods. We end by
discussing promising alternative approaches to ML safety, which instead aim
to ensure safety by design during the development of ML systems.

*Bio:* Virginia Smith is the Leonardo Associate Professor of Machine
Learning at Carnegie Mellon University. Her current work addresses
challenges related to safety and efficiency in large-scale machine learning
systems.


*In person:* GHC 6115
*Zoom Link:* https://cmu.zoom.us/j/99510233317?pwd=ZGx4aExNZ1FNaGY4SHI3Qlh0YjNWUT09


- Victor