Jan 23 at 12pm (NSH 3305) -- Ashique KhudaBukhsh (RIT) -- Down the Toxicity Rabbit Hole: A Novel Framework to Bias Audit LLMs -- AI Seminar sponsored by SambaNova Systems

Tue Jan 16 10:44:56 EST 2024

Dear all,

We look forward to seeing you *next Tuesday (01/23) from 12:00-1:00 PM (ET)
*for the first talk of this semester's CMU AI Seminar, sponsored by
SambaNova Systems (https://sambanova.ai). The seminar will be held in *NSH
3305 *with pizza provided and will be streamed on Zoom.

To learn more about the seminar series or to see the future schedule,
please visit the seminar website (http://www.cs.cmu.edu/~aiseminar/).

Next Tuesday (01/23), Ashique KhudaBukhsh (RIT) will be giving a talk
titled "Down the Toxicity Rabbit Hole: A Novel Framework to Bias Audit
LLMs".

///////////////////
*Talk Abstract:* How safe is generative AI for disadvantaged groups? This
paper conducts a bias audit of large language models (LLMs) through a
novel toxicity rabbit hole framework introduced here. Starting with a
stereotype, the framework instructs the LLM to generate more toxic content
than the stereotype. Every subsequent iteration it continues instructing
the LLM to generate more toxic content than the previous iteration until
the safety guardrails (if any) throw a safety violation or it meets some
other halting criteria (e.g., identical generation or rabbit hole depth
threshold). Our experiments reveal highly disturbing content, including but
not limited to antisemitic, misogynistic, racist, Islamophobic, and
homophobic generated content, perhaps shedding light on the underbelly of
LLM training data, prompting deeper questions about AI equity and alignment.

*Speaker Bio: *Ashique KhudaBukhsh is an assistant professor at the
Golisano College of Computing and Information Sciences, Rochester Institute
of Technology (RIT). His current research lies at the intersection of NLP
and AI for Social Impact as applied to: (i) globally important events
arising in linguistically diverse regions requiring methods to tackle
practical challenges involving multilingual, noisy, social media texts;
(ii) polarization in the context of the current US political crisis; and
(iii) auditing AI systems and platforms for unintended harms. In addition
to having his research been accepted at top artificial intelligence
conferences and journals, his work has also received widespread
international media attention that includes coverage from The New York
Times, BBC, Wired, Times of India, The Indian Express, The Daily Mail,
VentureBeat, and Digital Trends.
///////////////////

*In person: NSH 3305*
*Zoom Link:*
https://cmu.zoom.us/j/99510233317?pwd=ZGx4aExNZ1FNaGY4SHI3Qlh0YjNWUT09

- Victor Akinwande
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.srv.cs.cmu.edu/pipermail/ai-seminar-announce/attachments/20240116/65ad305e/attachment.html>