<div dir="ltr">The good news is that it involves only the machines you listed. It seems that other machines were not affected. How sure are you that the scripts you and Ian were running don't involve heavy read/write?<div><br></div><div>Predrag</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jan 19, 2023 at 2:52 PM Predrag Punosevac &lt;<a href="mailto:predragp@andrew.cmu.edu">predragp@andrew.cmu.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi Viraj,<div><br></div><div>Sorry for the delay. I was attending some NFS calls for proposals. I did a bit of poking. It looks like a tangled file system to me. I can't get </div><div><br></div><div>df -h</div><div><br></div><div>to produce any output. That is never a good sign. Autofs works as expected. I am surprised that more people didn't report this. Not sure what to do about it, as a reboot is probably unwarranted. Somebody is really messing with the file server. </div><div><br></div><div>Predrag</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jan 19, 2023 at 9:42 AM Viraj Mehta &lt;<a href="mailto:virajm@andrew.cmu.edu" target="_blank">virajm@andrew.cmu.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Predrag,<br>
<br>
Hope you are well this morning. I was kinda shocked to notice that I can’t access GPU nodes 2, 3, 4, 11, 12, 17, and 20. I am not sure what caused this, but I was running jobs on some of these machines, and they all stopped producing output around 8:23 last night. These are super critical for the ICML deadline next Thursday and I would like to restart them ASAP. I am not entirely sure what happened here, as I don’t think the jobs are terribly write-heavy or anything like that. Please let me know if the machines can be restored to normal function, as I urgently need them. <br>
<br>
If I did anything that was responsible for them crashing, please let me know as well so I don’t repeat it. I am under some time pressure so am running a fairly large number of jobs right now.<br>
<br>
Thanks,<br>
Viraj</blockquote></div>
</blockquote></div>