<div dir="ltr"><div class="gmail_default" style=""><div class="gmail_default" style="color:rgb(11,83,148)">Dear faculty and students:<br><br>We look forward to seeing you next Tuesday, Nov. 5th, at noon in <b>NSH 3305 </b>for our <span class="gmail-il">AI</span> <span class="gmail-il">Seminar</span> sponsored by Apple. To learn more about the <span class="gmail-il">seminar</span> series, please visit the <a href="http://www.cs.cmu.edu/~aiseminar/" target="_blank">website</a>. </div><div class="gmail_default" style="color:rgb(11,83,148)">On Tuesday, Vaishnavh Nagarajan will give the following talk:</div><div class="gmail_default" style=""><div style="color:rgb(11,83,148)"><b>Title: </b>Uniform Convergence May Be Unable to Explain Generalization in Deep Learning</div><div style=""><br><b style="color:rgb(11,83,148)">Abstract:</b><font color="#0b5394"> In this talk, I will present our work that casts doubt on the ongoing pursuit of using uniform convergence to explain generalization in deep learning.<br><br>Over the last couple of years, research in deep learning theory has focused on developing newer and more refined generalization bounds (using Rademacher complexity, covering numbers, PAC-Bayes, etc.) to help us understand why overparameterized deep networks generalize well. Although these bounds are quite different on the surface, essentially, they are 'implementations' of a single learning-theoretic technique called uniform convergence.<br><br>While it is well known that many of these existing bounds are numerically large, through a variety of experiments, we first bring to light another crucial and more concerning aspect of these bounds: in practice, these bounds can increase with the dataset size. Guided by these observations, we then present specific scenarios where uniform convergence provably fails to explain generalization in deep learning. 
That is, in these scenarios, even though a deep network trained by stochastic gradient descent (SGD) generalizes well, any uniform convergence bound would be vacuous, however carefully it is applied.<br><br>Through our work, we call for going beyond uniform convergence to explain generalization in deep learning.<br><br>This is joint work with Zico Kolter.</font></div></div></div>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><span style="font-size:13px;border-collapse:collapse;color:rgb(136,136,136)"><b>Han Zhao<br>Machine Learning Department</b></span></div><div><span style="font-size:13px;border-collapse:collapse;color:rgb(136,136,136)"><b>School of Computer Science<br>Carnegie Mellon University<br>Mobile: +1-</b></span><b style="color:rgb(136,136,136);font-size:13px">412-652-4404</b></div></div></div></div></div></div>