Unable to access /zfsauton/datasets
    Predrag Punosevac 
    predragp at andrew.cmu.edu
       
    Mon Apr 18 15:41:17 EDT 2022
    
    
  
In case anybody cares, the crash was caused by a failing 12 TB HDD.
The failing HDD caused ZFS to work overtime to recover from errors,
which in turn led to excessive memory use. Once all 192 GB of RAM
and 16 GB of swap were used up, the server crashed.
Also, in case anyone cares, the solution was emailed to me by our
sysinfo log monitoring system. I just had to interpret it.
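
For anyone curious how the failure chain above (disk errors, then ZFS
recovery work, then RAM and swap exhaustion) could be caught early, here
is a minimal sketch of a combined RAM-plus-swap check. It is not what our
sysinfo monitoring actually runs. It assumes a Linux host that exposes
/proc/meminfo, which may not match the file server's OS, and the function
names and the 95% threshold are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_pressure(info, threshold=0.95):
    """Return (used_fraction, alarm) for RAM and swap taken together.

    The threshold is a hypothetical value, not a tuned one.
    """
    total = info["MemTotal"] + info.get("SwapTotal", 0)
    free = info["MemAvailable"] + info.get("SwapFree", 0)
    used = 1.0 - free / total
    return used, used >= threshold


def check_host(path="/proc/meminfo"):
    """Read the live counters (Linux only) and report memory pressure."""
    with open(path) as f:
        return memory_pressure(parse_meminfo(f.read()))
```

Calling check_host() from a cron job and mailing the result when the
alarm flag is set would give a warning before the machine runs out of
both RAM and swap.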
Best,
Predrag
On Mon, Apr 18, 2022 at 1:21 PM Predrag Punosevac <predragp at andrew.cmu.edu>
wrote:
> /zfsauton/datasets are now available. I spoke too soon. The datasets NFS
> shares are hosted on the largest file server (Ourea) we have in the lab,
> purchased 2.5 years ago by Dr. Schneider. That is relatively new
> hardware. The server rebooted and it took a long time to clear the file
> system. I didn't find any reason for a reboot in the past 10-15 minutes
> since the server came back online and I had access to it.
>
> I am not going to second-guess today what the problem was. If one of the
> PMx guys wants to join this forensic investigation, they are more than
> welcome.
>
> Predrag
>
> On Mon, Apr 18, 2022 at 11:33 AM Predrag Punosevac <
> predragp at andrew.cmu.edu> wrote:
>
>> I am aware. I am still investigating. The server stopped responding to
>> ping requests about an hour ago. I don't want to second-guess, but it looks
>> like a serious hardware issue. The server is 7 years old.
>>
>> On Mon, Apr 18, 2022, 11:26 AM Swapnil Pande <swapnilp at andrew.cmu.edu>
>> wrote:
>>
>>> Hi Predrag,
>>>
>>> Hope you're doing well.
>>>
>>> I am having trouble accessing datasets stored on `/zfsauton/datasets`.
>>> Running `ls` in /zfsauton seems to hang. Do you know what might be the
>>> problem?
>>>
>>> Thanks for your help!
>>>
>>> Regards,
>>> Swapnil
>>>
>>>
>>>