apc00EDFC power overloading
Predrag Punosevac
predragp at andrew.cmu.edu
Sun Apr 18 23:33:34 EDT 2021
Dear Autonians,
Just a quick heads up. About 10 minutes ago I started getting warnings that
BANK1 on the PDU apc00EDFC is near overload. The following GPU machines are
connected to that bank: gpu21, gpu22, and gpu23. I have never seen this
before. These GPU servers
are 50K machines and are probably drawing more electricity than older GPU
nodes. Until today they must never have been hit hard enough for me to see
an imminent power outage. If you are running anything on those GPU nodes, or
on gpu20, which is connected to BANK2 on the same PDU, please try to scale
down a bit. Otherwise, it is probably safe to assume that the servers will
not live long enough to produce results.
Best,
Predrag