<div dir="ltr">A gentle reminder that the talk will be tomorrow (Tuesday).</div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Apr 28, 2017 at 6:16 PM, Adams Wei Yu <span dir="ltr">&lt;<a href="mailto:weiyu@cs.cmu.edu" target="_blank">weiyu@cs.cmu.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Dear faculty and students,</div><div><br></div><div>We look forward to seeing you next Tuesday, May 2, at noon in NSH 3305 for AI Lunch. To learn more about the seminar and lunch, please visit the AI Lunch webpage.</div><div><br></div><div>On Tuesday, <a href="http://www.cs.cmu.edu/~tdick/" target="_blank">Travis Dick</a> will give a talk:</div><div><br></div><div>Title: Data Driven Resource Allocation for Distributed Learning</div><div><br></div><div>Abstract: </div><div><div>The goal of distributed machine learning is to build useful models from more data than can be processed by a single machine. In this talk I will present a new data-dependent approach for partitioning large datasets onto multiple machines, motivated by the fact that similar data points often belong to the same or similar classes and, more generally, that classification rules of high accuracy tend to be "locally simple but globally complex" (Vapnik and Bottou, 1993). We present an in-depth analysis of our approach, provide new algorithms with provable worst-case guarantees, give an analysis proving that existing scalable heuristics perform well under natural non-worst-case conditions, and describe techniques for extending the partitioning of a small sample to the entire dataset. We overcome novel technical challenges to satisfy important conditions for accurate distributed learning, including fault tolerance and balancedness. 
We empirically compare our approach with baselines based on random partitioning, balanced partition trees, and locality-sensitive hashing, showing that we achieve significantly higher accuracy on both synthetic and real-world image and advertising datasets. We also demonstrate that our technique scales strongly with the available computing power.</div><div><br></div><div>This is joint work with Mu Li, Krishna Pillutla, Colin White, Nina Balcan, and Alex Smola. In Partial Fulfillment of the Speaking Requirement.</div></div><div><br></div><div><br></div></div>

</blockquote></div><br></div>