A Lot of Clusters are Non-functional

Predrag Punosevac predragp at andrew.cmu.edu
Fri May 12 15:41:33 EDT 2023


There is nothing to be thankful about. A few days' worth of work was lost
due to misuse, lack of debugging, and poor design of experiments. Just
imagine the loss of revenue if these were iTunes servers running holiday
sales...

Predrag

On Fri, May 12, 2023 at 3:36 PM Chenghui Zhou <chenghuz at andrew.cmu.edu>
wrote:

> Thanks for looking into it! They seem to be working now!
>
> On Fri, May 12, 2023 at 3:34 PM Predrag Punosevac <predragp at andrew.cmu.edu>
> wrote:
>
>> Hi Chenghui,
>>
>>
>> Let me clarify my previous post. A number of machines were zombified by
>> users, not by me or some other supernatural power. I know better than to
>> change any configuration a few days before the major conference.
>>
>> As you might know, the cluster is a shared resource. I have zero control
>> over the behaviour of such a large, diverse group. I can advise,
>> recommend, or educate about the topics I know something about. I don't
>> debug other people's code. I don't have any magic powers.
>>
>> Predrag
>>
>> On Fri, May 12, 2023 at 3:24 PM Chenghui Zhou <chenghuz at andrew.cmu.edu>
>> wrote:
>>
>>> Hello Predrag,
>>>
>>> As NeurIPS is nearing, it is really important for us to have clusters
>>> functioning to run some of the last experiments. However, right now, I
>>> notice that at least 5 clusters that I'm able to log in to do not have my
>>> working directory on them, the same issue Ifi reported. The rest of the
>>> functional clusters are all full. Could you please look into that? Thank
>>> you!
>>>
>>> Chenghui
>>>
>>