From awd at cs.cmu.edu Mon Dec 3 16:54:04 2018
From: awd at cs.cmu.edu (Artur Dubrawski)
Date: Mon, 3 Dec 2018 16:54:04 -0500
Subject: Fwd: RI Ph.D. Thesis Defense: Matt Barnes
In-Reply-To:
References:
Message-ID:

Team,

Do not miss this joyful event!

Artur

---------- Forwarded message ---------
From: Suzanne Muth
Date: Mon, Dec 3, 2018 at 1:42 PM
Subject: RI Ph.D. Thesis Defense: Matt Barnes
To: ri-people at cs.cmu.edu

Date: 10 December 2018
Time: 9:00 a.m.
Place: GHC 8102
Type: Ph.D. Thesis Defense
Who: Matt Barnes
Topic: Learning with Clusters

Abstract:
Clustering, the problem of grouping similar data, has been extensively
studied since at least the 1950s. As machine learning becomes more
prominent, clustering has evolved from primarily a data analysis tool
into an integrated component of complex robotic and machine learning
systems, including those involving dimensionality reduction, anomaly
detection, network analysis, image segmentation, and classification of
grouped data. With this integration into multi-stage systems comes a
need to better understand interactions between pipeline components.
Changing the parameters of the clustering algorithm will impact
downstream components and, quite unfortunately, it is usually not
possible to simply backpropagate through the entire system. Instead, it
is common practice to take the output of the clustering algorithm as
ground truth at the next module of the pipeline. We show this false
assumption causes subtle and dangerous behavior for even the simplest
systems -- empirically biasing results by upwards of 25%.

We address this gap by developing scalable estimators and methods to
both quantify and compensate for the impact of clustering errors on
downstream learners. Our work is agnostic to the choice of the other
components of the machine learning system, and requires few assumptions
about the clustering algorithm. Theoretical and empirical results
demonstrate that our methods and estimators are superior to the current
naive approaches, which do not account for clustering errors. We also
develop several new clustering algorithms and prove theoretical bounds
for existing algorithms, to be used as inputs to our error-correction
methods. Not surprisingly, we find that learning on clusters of data is
both theoretically and empirically easier as the number of clustering
errors decreases. Thus, our work is twofold: we attempt to provide the
best clustering possible, as well as establish how to effectively learn
on inevitably noisy clusters.

Thesis Committee Members:
Artur Dubrawski, Chair
Geoff Gordon
Kris Kitani
Beka Steorts, Duke University

A copy of the thesis document is available at: goo.gl/cNPSfY
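To make the pipeline pattern described in the abstract concrete: the naive
setup it criticizes is a clustering step whose labels are then treated as
ground truth by a downstream learner. The toy sketch below is illustrative
only -- it is not code from the thesis, and the data, the script name
(pipeline_sketch.py), and the scikit-learn models are all made up for the
example.

# pipeline_sketch.py -- toy two-stage pipeline: cluster first, then let a
# downstream learner treat the (noisy) cluster labels as if they were truth.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)

# Two overlapping groups; the true group labels are hidden from the pipeline.
X = np.vstack([rng.normal(loc=c, scale=1.2, size=(200, 2)) for c in (0.0, 3.0)])
true_groups = np.repeat([0, 1], 200)

# Stage 1: clustering. Its mistakes are silently passed downstream.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Stage 2: a downstream learner trained on the cluster labels.
X_tr, X_te, c_tr, c_te, g_tr, g_te = train_test_split(
    X, clusters, true_groups, test_size=0.5, random_state=0)
clf = LogisticRegression().fit(X_tr, c_tr)

print("clustering vs. true groups (ARI):", adjusted_rand_score(true_groups, clusters))
print("downstream vs. cluster labels   :", clf.score(X_te, c_te))
print("downstream vs. true groups (ARI):", adjusted_rand_score(g_te, clf.predict(X_te)))

Scored against the cluster labels, the downstream model can look nearly
perfect, while its agreement with the true groups is capped by the quality
of the clustering -- the kind of silent bias the abstract says the thesis
quantifies and corrects for.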
From eyolcu at cs.cmu.edu Tue Dec 4 13:28:03 2018
From: eyolcu at cs.cmu.edu (Emre Yolcu)
Date: Tue, 4 Dec 2018 13:28:03 -0500
Subject: GPU1
Message-ID:

Hi,

Has anybody had any luck running things on GPU1? Judging by the 8 GPUs
sitting idle I assume there's a problem.

Emre

From predragp at andrew.cmu.edu Fri Dec 7 18:38:54 2018
From: predragp at andrew.cmu.edu (Predrag Punosevac)
Date: Fri, 7 Dec 2018 18:38:54 -0500
Subject: GPU2 rebooted due to the kernel crash
Message-ID:

Somebody has done it again by launching too many jobs at once, and GPU2
had to be hard rebooted.

Best,
Predrag

From awd at cs.cmu.edu Sun Dec 9 14:42:29 2018
From: awd at cs.cmu.edu (Artur Dubrawski)
Date: Sun, 9 Dec 2018 14:42:29 -0500
Subject: Fwd: RI Ph.D. Thesis Defense: Matt Barnes
In-Reply-To:
References:
Message-ID:

Team,

This is a reminder about Matt's thesis defense tomorrow (Monday) at 9am
in Gates Hall 8102.

See you there!

Artur

---------- Forwarded message ---------
From: Artur Dubrawski
Date: Mon, Dec 3, 2018 at 4:54 PM
Subject: Fwd: RI Ph.D. Thesis Defense: Matt Barnes
To:

Team,

Do not miss this joyful event!

Artur

---------- Forwarded message ---------
From: Suzanne Muth
Date: Mon, Dec 3, 2018 at 1:42 PM
Subject: RI Ph.D. Thesis Defense: Matt Barnes
To: ri-people at cs.cmu.edu

Date: 10 December 2018
Time: 9:00 a.m.
Place: GHC 8102
Type: Ph.D. Thesis Defense
Who: Matt Barnes
Topic: Learning with Clusters

Abstract:
Clustering, the problem of grouping similar data, has been extensively
studied since at least the 1950s. As machine learning becomes more
prominent, clustering has evolved from primarily a data analysis tool
into an integrated component of complex robotic and machine learning
systems, including those involving dimensionality reduction, anomaly
detection, network analysis, image segmentation, and classification of
grouped data. With this integration into multi-stage systems comes a
need to better understand interactions between pipeline components.
Changing the parameters of the clustering algorithm will impact
downstream components and, quite unfortunately, it is usually not
possible to simply backpropagate through the entire system. Instead, it
is common practice to take the output of the clustering algorithm as
ground truth at the next module of the pipeline. We show this false
assumption causes subtle and dangerous behavior for even the simplest
systems -- empirically biasing results by upwards of 25%.

We address this gap by developing scalable estimators and methods to
both quantify and compensate for the impact of clustering errors on
downstream learners. Our work is agnostic to the choice of the other
components of the machine learning system, and requires few assumptions
about the clustering algorithm. Theoretical and empirical results
demonstrate that our methods and estimators are superior to the current
naive approaches, which do not account for clustering errors. We also
develop several new clustering algorithms and prove theoretical bounds
for existing algorithms, to be used as inputs to our error-correction
methods. Not surprisingly, we find that learning on clusters of data is
both theoretically and empirically easier as the number of clustering
errors decreases. Thus, our work is twofold: we attempt to provide the
best clustering possible, as well as establish how to effectively learn
on inevitably noisy clusters.

Thesis Committee Members:
Artur Dubrawski, Chair
Geoff Gordon
Kris Kitani
Beka Steorts, Duke University

A copy of the thesis document is available at: goo.gl/cNPSfY

From predragp at andrew.cmu.edu Wed Dec 12 14:57:22 2018
From: predragp at andrew.cmu.edu (Predrag Punosevac)
Date: Wed, 12 Dec 2018 14:57:22 -0500
Subject: Clock on compute nodes
In-Reply-To:
References:
Message-ID: <20181212195722.v-RRWyPhT%predragp@andrew.cmu.edu>

Emre Yolcu wrote:

> Hi Predrag,
>
> This is a very minor annoyance, so feel free to ignore if you're busy.
> Somebody brought up before that time was set incorrectly on some compute
> nodes.
> For instance, it seems gpu9 is 3 minutes ahead of Pittsburgh time, and
> gpu8 is 9 minutes ahead. When you have time could you check whether the
> sync is working correctly?

Fixed! Thank you so much for the report. Clock drift is not a minor
issue: it is a very serious security problem and the best indication of
failing hardware. Unfortunately, the CMU School of Computer Science
blocks the NTP protocol, so I run two internal NTP servers which
synchronize time via the HTTP protocol, as well as some other tricks.
In this particular instance the culprits were dead chronyd clients on
the GPU8 and GPU9 machines. I have fixed them now.

It is also interesting to remind everyone how the general theory of
relativity affects our computing infrastructure. The biggest scientific
leap the general theory of relativity made was abandoning the concept
of absolute time. The concept of absolute space had already been
abandoned in the special theory of relativity. The concept of absolute
space-time was introduced by Isaac Newton in order to preserve the
Galilean principle of relativity, which was supposed to be Newton's
fourth law (originally he wanted to have six laws before reducing them
to three). One may refer to Arnold's Mathematical Methods of Classical
Mechanics to learn about those delicate points. Most graduate texts in
physics are mute about those points, which makes them seem irrelevant.

Only Poincare was able to demonstrate the invariance of the laws of
nature without the Galilean principle of relativity, thereby giving a
major push to the theory of relativity.

Long story short: time depends on the inertial frame, which means that
no two computers have the same time. Clock synchronization is
essentially averaging, in order to enable us to work on the network.

Cheers,
Predrag

>
> Best,
>
> Emre
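The HTTP trick mentioned above is easy to illustrate: every HTTP response
carries a Date header, so comparing it with the local clock gives a coarse
(roughly one-second) offset estimate, which is the idea behind tools such as
htpdate. Below is a rough sketch of that check; the URL is a placeholder,
not one of the lab's internal time servers, and http_clock_check.py is a
made-up name.

# http_clock_check.py -- estimate local clock offset from an HTTP Date header.
import time
from email.utils import parsedate_to_datetime
from urllib.request import urlopen

def http_offset(url="http://www.example.com/"):
    """Return (server time - local time) in seconds, to about 1 s resolution."""
    t0 = time.time()
    with urlopen(url, timeout=5) as resp:
        date_header = resp.headers["Date"]   # e.g. 'Wed, 12 Dec 2018 19:57:22 GMT'
    t1 = time.time()
    server = parsedate_to_datetime(date_header).timestamp()
    return server - (t0 + t1) / 2.0          # midpoint roughly cancels network delay

if __name__ == "__main__":
    print(f"clock offset vs HTTP Date header: {http_offset():+.1f} s")

On a node that already runs chronyd, "chronyc tracking" and "chronyc sources"
remain the quicker way to confirm that the daemon itself is alive and
synchronized.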
From awd at cs.cmu.edu Wed Dec 12 15:42:24 2018
From: awd at cs.cmu.edu (Artur Dubrawski)
Date: Wed, 12 Dec 2018 15:42:24 -0500
Subject: Clock on compute nodes
In-Reply-To: <20181212195722.v-RRWyPhT%predragp@andrew.cmu.edu>
References: <20181212195722.v-RRWyPhT%predragp@andrew.cmu.edu>
Message-ID:

You never know what you are going to get when you ask a sysadmin to
please fix a computer clock :)

Thanks Predrag and Emre.

Artur

On Wed, Dec 12, 2018 at 2:58 PM Predrag Punosevac wrote:

> Emre Yolcu wrote:
>
> > Hi Predrag,
> >
> > This is a very minor annoyance, so feel free to ignore if you're busy.
> > Somebody brought up before that time was set incorrectly on some compute
> > nodes. For instance, it seems gpu9 is 3 minutes ahead of Pittsburgh
> > time, and gpu8 is 9 minutes ahead. When you have time could you check
> > whether the sync is working correctly?
>
> Fixed! Thank you so much for the report. Clock drift is not a minor
> issue: it is a very serious security problem and the best indication of
> failing hardware. Unfortunately, the CMU School of Computer Science
> blocks the NTP protocol, so I run two internal NTP servers which
> synchronize time via the HTTP protocol, as well as some other tricks.
> In this particular instance the culprits were dead chronyd clients on
> the GPU8 and GPU9 machines. I have fixed them now.
>
> It is also interesting to remind everyone how the general theory of
> relativity affects our computing infrastructure. The biggest scientific
> leap the general theory of relativity made was abandoning the concept
> of absolute time. The concept of absolute space had already been
> abandoned in the special theory of relativity. The concept of absolute
> space-time was introduced by Isaac Newton in order to preserve the
> Galilean principle of relativity, which was supposed to be Newton's
> fourth law (originally he wanted to have six laws before reducing them
> to three). One may refer to Arnold's Mathematical Methods of Classical
> Mechanics to learn about those delicate points. Most graduate texts in
> physics are mute about those points, which makes them seem irrelevant.
>
> Only Poincare was able to demonstrate the invariance of the laws of
> nature without the Galilean principle of relativity, thereby giving a
> major push to the theory of relativity.
>
> Long story short: time depends on the inertial frame, which means that
> no two computers have the same time. Clock synchronization is
> essentially averaging, in order to enable us to work on the network.
>
> Cheers,
> Predrag
>
> >
> > Best,
> >
> > Emre

From awd at cs.cmu.edu Thu Dec 13 15:01:30 2018
From: awd at cs.cmu.edu (Artur Dubrawski)
Date: Thu, 13 Dec 2018 15:01:30 -0500
Subject: Fwd: Introducing the AI Transformation Playbook
In-Reply-To:
References:
Message-ID:

Cool stuff coming from a former Autonian!

Artur

---------- Forwarded message ---------
From: Andrew Ng
Date: Thu, Dec 13, 2018 at 2:10 PM
Subject: Introducing the AI Transformation Playbook
To:

Dear friends,

I'm excited to share with you the new AI Transformation Playbook. Drawn
from my experience leading Google Brain, Baidu's AI Group, and Landing
AI, this 5-step Playbook provides a roadmap for your company to
transform into a great AI company.

You might already realize from *Machine Learning Yearning* that AI
teams need to develop new workflows to build successful AI projects.
The Playbook will teach you how to build upon single AI projects to
help a whole company use AI. I hope that this Playbook will help you
usher your company into the AI era.

Andrew Ng

Download a free copy of the AI Transformation Playbook

Copyright © 2018 Landing AI, All rights reserved.
2445 Faber Place, Suite 200, Palo Alto, CA 94303

From predragp at andrew.cmu.edu Thu Dec 20 21:27:53 2018
From: predragp at andrew.cmu.edu (Predrag Punosevac)
Date: Thu, 20 Dec 2018 21:27:53 -0500
Subject: GPU10 and LOV6 added
Message-ID: <20181221022753.nJ8E9VD3g%predragp@andrew.cmu.edu>

Dear Autonians,

I just semi-provisioned two more computing nodes, GPU10 and LOV6, and
added them to our computing infrastructure. You are welcome to log in
and wander around; however, they are not fully functional at this time.
This is the full status report.

GPU10 is provisioned with 192 GB of RAM, 2 x Intel Xeon Silver 4110
(2.10 GHz, 8-core, 11 MB Smart Cache), and 4 GeForce GTX 1080 Ti GPU
cards. It has a 400 GB fast SSD scratch directory. At this time I am
getting the following output from the NVIDIA driver:

root@gpu10$ lspci | grep NVIDIA
3b:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)
3b:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller (rev a1)
86:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)
86:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller (rev a1)
af:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)
af:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller (rev a1)

root@gpu10$ nvidia-smi
Unable to determine the device handle for GPU 0000:3B:00.0: Unable to
communicate with GPU because it is insufficiently powered. This may be
because not all required external power cables are attached, or the
attached cables are not seated properly.

This indicates that the server is not functional yet. I will have to
open the server and check whether the cables are properly seated before
I can say if we are dealing with a real problem or not.

LOV6 is a CPU computing node which currently has 192 GB of RAM and one
Intel(R) Xeon(R) Gold 6152 CPU @ 2.10 GHz with 22 cores (44 threads).
Yesterday I ordered another CPU, which will arrive after the holidays.
I also have an additional 192 GB of RAM. Once the server is fully
functional it will have 88 threads and 384 GB of RAM. The OS runs off a
fast SSD.

Happy Holidays!
Predrag
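The check done by hand above -- the PCI bus sees the cards but the driver
cannot talk to them -- is easy to script for anyone who wants to sanity-check
a node before queueing work on it. A small sketch, assuming only that lspci
and nvidia-smi are installed; gpu_sanity.py is a made-up name, not an
existing lab tool.

# gpu_sanity.py -- compare GPUs visible on the PCI bus with GPUs the driver
# can actually enumerate.
import subprocess

def pci_gpu_count():
    out = subprocess.check_output(["lspci"], text=True)
    return sum(1 for line in out.splitlines()
               if "NVIDIA" in line and "VGA compatible controller" in line)

def driver_gpu_count():
    try:
        out = subprocess.check_output(["nvidia-smi", "-L"], text=True)
    except subprocess.CalledProcessError:
        return 0  # the driver cannot talk to the cards at all
    return sum(1 for line in out.splitlines() if line.startswith("GPU "))

if __name__ == "__main__":
    pci, drv = pci_gpu_count(), driver_gpu_count()
    print(f"NVIDIA GPUs on the PCI bus: {pci}, usable through the driver: {drv}")
    if drv < pci:
        print("Some cards are unreachable; check power cables and driver state.")

If the two counts disagree, the node is in the state described above and is
not ready for jobs.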
From predragp at andrew.cmu.edu Thu Dec 20 21:31:34 2018
From: predragp at andrew.cmu.edu (Predrag Punosevac)
Date: Thu, 20 Dec 2018 21:31:34 -0500
Subject: ARI dead RAM
Message-ID: <20181221023134.iWRsCQ5-d%predragp@andrew.cmu.edu>

Dear Autonians,

When trying to log into ari.int.autonlab.org you might notice a warning.
I am aware of a dead RAM module. I am dealing with the vendor. It will
take some time to get the replacement.

Cheers,
Predrag