Connectionists: Call for Anecdotes -- AI Finds A Way

Aaron Dharna aadharna at gmail.com
Sat Feb 10 18:10:37 EST 2024


Dear colleagues,

tl;dr Please submit (to aifindsaway at gmail.com) any stories you know of
where AI acted in a way that surprised its creators, especially if it could
be seen as unsafe (e.g. hacking a reward function, finding a loophole in an
environment or experimental design, goal misgeneralization, etc.).

Ever encountered an AI that cleverly outmaneuvered your experimental
design, or revealed unexpected flaws in your reward functions?

We (Aaron Dharna, Joel Lehman, Victoria Krakovna, and Jeff Clune) are
writing a paper about how AI Finds A Way to surprise us.

We're gathering such stories to extend our previous work, "The Surprising
Creativity of Digital Evolution" (https://arxiv.org/abs/1803.03453), to the
deep learning setting, highlighting the importance of AI safety and the
unpredictable nature of our work.

We aim to record the true accounts of as many anecdotes as possible
regarding AI (of any type, including RL, ML, etc.) surprising its creators
and users. Therefore, your experiences are crucial for this endeavor.

We hope you can help create a definitive account of these fascinating and
sometimes ominous anecdotes, whether by submitting your own or by spreading
the word about this Call for Anecdotes, so that we can inform AI safety
discussions.

Please send your anecdotes to aifindsaway at gmail.com by March 1st, 2024.

Please feel free to share the following call far and wide:
https://docs.google.com/document/d/1BhRWzkIYRUDjU5zon-ILXINPL4VqZp2JZXNsTjekBPk/edit?usp=sharing

Let's illuminate the path forward together with insights from our
collective research adventures.

Cheers