<div dir="ltr"><div><span id="gmail-m_-8226866263796245709gmail-docs-internal-guid-de01524d-7fff-4447-d846-0991c222fc82"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">It's a great title! I believe the word that characterizes it is "sardonic."</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Yet speaking of not actually solving the stated problem, how is it possible that connectionists failed so miserably to get AI working?</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Check out Gary's other publications: </span><a href="https://scholar.google.com/citations?user=5Aut7EEAAAAJ&hl=en&oi=sra" target="_blank" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">https://scholar.google.com/citations?user=5Aut7EEAAAAJ&hl=en&oi=sra</span></a></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">He seriously knows what's up. 
Gary knows how to compute on the edge of chaos.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Speaking personally, having trained in Randy O'Reilly's lab, I watched ten years of failure to get models working. </span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">And then suddenly, Eureka! It was basically LSTM all along. But didn't Mozer have BPTT before that, and, guys, seriously, didn't we have the SVD in the late 1800s? “Oh, but you have to train it in a very special way! You need dropout and layer-wise unsupervised pre-training.” I think the papers published before and since then have shown this not to be true. 
</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">"The Unreasonable Effectiveness of Data" by Halevy, Norvig, and Pearl: </span><a href="https://research.google.com/pubs/archive/35179.pdf" target="_blank" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">https://research.google.com/pubs/archive/35179.pdf</span></a></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Um, OK, let's run with that. We are </span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-style:italic;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">still</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> messing around with MNIST (tiny data) and, surprisingly, those methods tend to generalize. So while what Norvig is getting at regarding the effectiveness of simple models on big data is still a good point, why wasn't AI solved long before 1990? Overclocking CPUs is not a new thing. 
You had the speed, you had the data, and 640K of RAM is literally enough to get by on.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">In fact, the problem is exactly this simple: I recently had the idea to take the backprop implementation from scikit-learn and implement the following pseudocode to solve MNIST:</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Generate a random vector the size of an image, with numbers of any scale or distribution (I did not try bignums, but did try very big ints). Train on this vector. Test on every image in the training set to see if any new images were correctly classified. If yes, keep the model; if no, revert to the previous model and generate a new random vector. Repeat until you master the training set, or get bored. Then observe that you generalize very well to the test set.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">That method is totally brain-dead. It <i>literally</i> worked on my first try. And I think we can do the same thing with random matrix multiplies. 
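</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">For the record, here is a minimal sketch of that random-search idea. It is one plausible reading of the pseudocode, with assumptions that are mine, not the original: a plain linear model in place of scikit-learn's backprop implementation, tiny synthetic "images" in place of MNIST, and random perturbations of the weights playing the role of the random vector.</span></p>

```python
# Random-search "training" sketch (hypothetical, not the original script):
# keep a candidate weight change only if training accuracy improves,
# otherwise revert. Synthetic data stands in for MNIST so it runs anywhere.
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic stand-in for MNIST: 3 classes of 16-"pixel" images.
n_classes, n_pixels, n_train = 3, 16, 300
centers = rng.normal(size=(n_classes, n_pixels))
y = rng.integers(0, n_classes, size=n_train)
X = centers[y] + 0.5 * rng.normal(size=(n_train, n_pixels))

def accuracy(W):
    return float((np.argmax(X @ W.T, axis=1) == y).mean())

W = np.zeros((n_classes, n_pixels))
best = accuracy(W)
for _ in range(3000):
    # "Numbers of any scale or distribution": vary the perturbation scale.
    candidate = W + rng.normal(scale=rng.choice([0.01, 0.1, 1.0]), size=W.shape)
    score = accuracy(candidate)
    if score > best:  # keep the model only if new images were classified
        W, best = candidate, score
    # else: revert (we simply never assign the candidate)

print(f"training accuracy after random search: {best:.2f}")
```

<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">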
In fact, I have seen O'Reilly train brain models to approximate a correlation matrix by fiddling with what amounts to a single floating-point round-off error and checking whether the correlations increased.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">And Sutton & Barto recently published that RL is </span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-style:italic;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">basically</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> just brute force. "RL" boils down to: when you get lucky, don't be an idiot and forget what you saw. This turns out not to be pushing the problem back into the human brain. It's actually that easy. 
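</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The "when you get lucky, don't forget what you saw" view can be sketched as a toy bandit loop. This is my own illustration, not anything from Sutton & Barto: pull arms blindly, remember the average return of each, and read off the best one.</span></p>

```python
# "RL as brute force" sketch (hypothetical illustration): pull arms at
# random, remember what you saw, and don't forget the lucky ones.
import random

random.seed(0)
true_means = [0.1, 0.5, 0.9, 0.3]   # payoff probabilities, unknown to the agent

value = [0.0] * len(true_means)     # remembered average reward per arm
count = [0] * len(true_means)

for _ in range(2000):
    arm = random.randrange(len(true_means))            # brute-force exploration
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    count[arm] += 1
    value[arm] += (reward - value[arm]) / count[arm]   # running mean: remember it

best_arm = max(range(len(value)), key=value.__getitem__)
print("best arm:", best_arm)
```

<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">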
And it's not just for MDPs; POMDPs feel the same love.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Personally, and I just feel the need to get this on the record, I think there's a third option, and it's literally The Third Option: </span><a href="https://en.wikipedia.org/wiki/The_Third_Option" target="_blank" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">https://en.wikipedia.org/wiki/The_Third_Option</span></a></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">I believe we had AI in the '70s, and probably, to a point, in the '50s, the '40s, and maybe earlier. Even an "SVD" on punch cards would not have been that onerous. 
Given that I was funded by the CIA (technically IARPA) and nearly every other defense org that funds academics, I'm wondering how it's possible that O'Reilly got all that defense funding in the first place while we seemingly thrashed our weights trying to solve an easy problem.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">In case I die tomorrow, let me just put the question out there, as your diligent and friendly connectionist moderator for over ten years now: Was the whole "connectionism"/"PDP" enterprise actually a CIA operation? Is DeepMind a staged rollout of AI that's been backdoored by NIST-level mathematical geniuses? Is Schmidhuber then a spy as well?</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Actually, how many spies are on this list? Raise your hands :) I'm serious.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">It's actually an important question. The problem turned out to be so easy, and connectionists apparently failed so badly, that if we are really just now getting a handle on it, we might actually be looking at a singularity. 
On the other hand, it would be </span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">amazingly good news</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> if the CIA got, say, 30 years or more ahead of this problem. That, in and of itself, is an excellent reason for me to ask. Are we getting some help here from Uncle Sam or not?</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Sincerely,</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Brian Mingus</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><a href="http://linkedin.com/in/brianmingus" target="_blank">http://linkedin.com/in/brianmingus</a></span></p><a href="https://scholar.google.com/citations?user=T_sFnwoAAAAJ&hl=en" target="_blank">https://scholar.google.com/citations?user=T_sFnwoAAAAJ&hl=en</a><br></span></div><div><br></div><div>PS: Is anyone looking to hire a strong generalist in, say, the AI Safety space, or anything fun that doesn't lead to the end of the world? If so, hit me up!</div></div>