GPU10 and LOV6 added

Predrag Punosevac predragp at andrew.cmu.edu
Thu Dec 20 21:27:53 EST 2018


Dear Autonians,

I have just semi-provisioned two more computing nodes for our computing
infrastructure: GPU10 and LOV6. You may log in and wander around.
However, they are not fully functional at this time. Here is the full
status report.

GPU10 is provisioned with 192 GB of RAM, 2 x Intel Xeon Silver 4110
(2.10 GHz, 8-core, 11 MB Smart Cache), and 4 GeForce GTX 1080 Ti GPU
cards. It has a 400 GB scratch directory on fast SSD. At this time I am
getting the following output from lspci and nvidia-smi:

root@gpu10$ lspci | grep NVIDIA
3b:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX
1080 Ti] (rev a1)
3b:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller
(rev a1)
86:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX
1080 Ti] (rev a1)
86:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller
(rev a1)
af:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX
1080 Ti] (rev a1)
af:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller
(rev a1)
root@gpu10$ nvidia-smi
Unable to determine the device handle for GPU 0000:3B:00.0: Unable to
communicate with GPU because it is insufficiently powered.
This may be because not all required external power cables are
attached, or the attached cables are not seated properly.


which indicates that the server is not functional. I will have to open
the server and check whether the power cables are properly seated before
I can say whether we are dealing with a real problem.
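
For anyone who wants to repeat the check once the node is back up, here
is a minimal sketch of counting the NVIDIA VGA controllers in lspci
output and comparing the count with the number of installed cards. The
function name and the EXPECTED_GPUS constant are hypothetical, and the
sample text is simply the listing above pasted in as a string.

```python
import re

# Hypothetical constant: the number of GPU cards stated for GPU10.
EXPECTED_GPUS = 4

# The lspci listing from this message, including the mail line wrapping.
SAMPLE = """\
3b:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX
1080 Ti] (rev a1)
3b:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller
(rev a1)
86:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX
1080 Ti] (rev a1)
86:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller
(rev a1)
af:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX
1080 Ti] (rev a1)
af:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller
(rev a1)
"""


def count_nvidia_gpus(lspci_text: str) -> int:
    """Count lines that describe an NVIDIA VGA controller.

    Audio functions of the same card are ignored, so each physical
    card that lspci reports as a VGA controller is counted once.
    """
    pattern = re.compile(r"VGA compatible controller: NVIDIA")
    return sum(1 for line in lspci_text.splitlines() if pattern.search(line))


if __name__ == "__main__":
    found = count_nvidia_gpus(SAMPLE)
    print(f"lspci lists {found} NVIDIA GPU(s); expected {EXPECTED_GPUS}")
```

On a live node the same function could be fed the text returned by
running lspci, for example through subprocess.run with
capture_output=True and text=True.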


LOV6 is a CPU computing node which currently has 192 GB of RAM and one
Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz with 22 cores (44 threads). I
ordered another CPU yesterday, which will arrive after the holidays.
I also have an additional 192 GB of RAM. Once the server is fully
functional it will have 88 threads and 384 GB of RAM. The OS is running
off a fast SSD.
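
The thread and memory totals above follow from simple arithmetic, which
can be sketched as below. The NodeSpec class is hypothetical, and the
sketch assumes the second CPU is the same 22-core model with two
threads per core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical record of a node's CPU and memory configuration."""
    cpus: int
    cores_per_cpu: int
    threads_per_core: int
    ram_gb: int

    @property
    def threads(self) -> int:
        # Total hardware threads = sockets x cores per socket x threads per core.
        return self.cpus * self.cores_per_cpu * self.threads_per_core


# LOV6 as it stands today: one CPU and 192 GB of RAM.
current = NodeSpec(cpus=1, cores_per_cpu=22, threads_per_core=2, ram_gb=192)

# LOV6 once the second CPU and the additional 192 GB of RAM are installed.
planned = NodeSpec(cpus=2, cores_per_cpu=22, threads_per_core=2, ram_gb=192 + 192)

if __name__ == "__main__":
    print(current.threads, current.ram_gb)  # 44 192
    print(planned.threads, planned.ram_gb)  # 88 384
```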


Happy Holidays!
Predrag

