gpu24 and gpu25 added to the cluster

Jeff Schneider jeff4 at andrew.cmu.edu
Wed Dec 15 23:16:13 EST 2021


This is great news, Predrag!!  Thanks for all your work getting these 
into production!

Jeff.


On 12/15/2021 10:21 PM, Predrag Punosevac wrote:
> Dear Autonians,
>
> I just finished provisioning two new GPU nodes. The purchase was 
> approved by Dr. Schneider in July, but the order was not placed until 
> late August due to CMU internal issues, just in time to be affected by 
> supply chain disruptions. The servers were finally shipped on 
> 11/24/2021 and received last Wednesday, 12/8/2021. To add insult to 
> injury, the nodes were not tagged until Monday afternoon. I literally 
> had to hunt down people to do the work.
> I spent half a day yesterday getting power cables and other misc 
> supplies, so the nodes were only finished today. However, I think they 
> are definitely worth the trouble.
>
> Each server comes with 8 NVIDIA RTX A6000 GPUs connected by NVIDIA's 
> high-speed GPU interconnect links in addition to PCIe. Each server has 
> 2 AMD EPYC 7502 32-Core Processors for a total of 128 threads per 
> server. These CPUs are almost as fast as your desktop processors 
> (3.5 GHz).
> Each server has 512 GB of RAM and 2 TB of scratch. These servers have 
> 24 2.5" HDD bays, so they could potentially be used for storage. 
> I don't have any 2.5" HDDs in the lab right now to populate the bays.
>
> One thing is set up suboptimally for now. The servers were shipped 
> with a 1 Gb/s copper NIC and a 10 Gb/s fiber-optic NIC. I could not 
> locate long enough optical cables in our lab yesterday, but I will try 
> to address this issue soon. I have exactly 2 optical connectors on the 
> switch, so it is down to cabling.
>
> Have fun, and sorry for the long delay.
>
> Predrag

