autofs issues

Predrag Punosevac predragp at andrew.cmu.edu
Tue May 5 14:16:09 EDT 2020


Hi Ifi,

Thank you for bringing this to my attention. The autofs daemon appears to
be broken, at least on the machines running Red Hat 8.1 (gpu15-gpu19).
I have upgraded and rebooted gpu15 and it didn't fix the problem. I
started autofs from the command line. These symptoms are typical when
upstream breaks SELinux policies. Knowing how Red Hat operates, I
would not hold my breath that this will be fixed very soon. No big
deal, I will start autofs manually on all affected machines:

GPU15-19
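
A minimal sketch of how an affected node could be checked before and after
autofs is started by hand. It assumes the systemd unit is named "autofs"
(the stock name on Red Hat) and that getenforce and ausearch are installed;
the function names are made up for illustration, and the SELinux connection
is only the guess stated above, not a confirmed cause.

```shell
#!/bin/sh
# Sketch only: assumes the systemd unit is named "autofs" and that the
# SELinux/audit tools are installed. autofs_report should be run as root.

# Print the service state, the SELinux mode, and any recent SELinux
# denials that mention automount.
autofs_report() {
    systemctl is-active autofs
    getenforce
    ausearch -m avc -ts recent 2>/dev/null | grep -i automount
}

# Return 0 if the given directory can be entered, 1 otherwise.
# Entering a path under an autofs mount point is what triggers the mount.
home_reachable() {
    ( cd "$1" 2>/dev/null )
}
```

For example, `home_reachable /zfsauton3/home/iapostol || echo "home not mounted"`
would flag the "Could not chdir to home directory" condition quoted below.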

All CPU nodes were upgraded last week to Red Hat 7.8 and they
seem to work fine. GPU1-14 mostly run Red Hat 7.7, as the upgrade is
time consuming due to the proprietary Nvidia drivers.

Please report issues like this without a second thought as that is the
only way to fix them.

Best,
Predrag

On Tue, May 5, 2020 at 9:44 AM Ifigeneia Apostolopoulou
<iapostol at andrew.cmu.edu> wrote:
>
> Hi (again!) Predrag and hope you are well!
>
> I think you were right: tmux was the culprit. But I'm still facing a problem on gpu15, gpu17, and gpu18. Any suggestions? See below.
>
> Thanks (again!) :)
>
>
> Last login: Tue May  5 08:40:24 2020 from 192.168.6.115
>
> Could not chdir to home directory /zfsauton3/home/iapostol: No such file or directory
>
> iapostol at gpu15$ pwd
>
> /
>
> iapostol at gp
>
>
>
>
>
> On Wed, Apr 29, 2020 at 8:27 AM Predrag Punosevac <predragp at andrew.cmu.edu> wrote:
>>
>> Ifigeneia Apostolopoulou <iapostol at andrew.cmu.edu> wrote:
>>
>> > Hi Predrag,
>> >
>>
>> Hi Ifi
>>
>> > I just wanted to bring it to your attention:
>> >
>> > no gateway is currently working for me. I may occasionally be able to
>> > login
>>
>> I just checked
>>
>> lop2.autonlab.org
>> lop1.autonlab.org
>> lion.auton.cs.cmu.edu
>>
>> and I have no problem logging in. I first logged in with my regular account
>> to eliminate the possibility that LDAP services are down. Then I used
>> my root account to log in as you to check the autofs daemon. Please see
>> below.
>>
>> root at lop2$ su - iapostol
>> Last login: Wed Apr 29 07:47:06 EDT 2020 from
>> c-73-154-131-241.hsd1.pa.comcast.net on pts/25
>> root at lop2$ pwd
>> /zfsauton3/home/iapostol
>>
>>
>> lop1# su - iapostol
>> -bash-5.0$ pwd
>> /zfsauton3/home/iapostol
>> -bash-5.0$ uname -a
>> OpenBSD lop1.int.autonlab.org 6.6 GENERIC.MP#8 amd64
>>
>>
>>
>>
>> [root at lion ~]# su - iapostol
>> Last failed login: Wed Apr 29 00:34:26 EDT 2020 from
>> c-73-154-131-241.hsd1.pa.comcast.net on ssh:notty
>> There were 4 failed login attempts since the last successful login.
>> -bash-4.2$ pwd
>> /zfsauton3/home/iapostol
>> -bash-4.2$ uname -a
>> Linux lion.auton.cs.cmu.edu 3.10.0-1127.el7.x86_64 #1 SMP Wed Apr 8
>> 08:26:53 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
>>
>>
>> Now lion is showing an interesting output. It shows that you tried 4
>> times to log in with incorrect credentials. That would definitely put you
>> on the banned list, at least for a while. However, if I had to put my
>> money on the cause of your problems, I would guess that it is a DNS
>> problem. I made sure that the Auton Lab DNS servers are working as
>> advertised, so I will point to your personal DNS. There is some remote
>> chance that you are experiencing that weird routing problem, reported by
>> ram and me, when NSA breaks CMU routing tables and blocks a bunch of
>> residential ISPs from reaching CMU.
>>
>>
>> > but still can't find anything in my home directory ://
>> >
>> >
>> > iapostol at lop2.autonlab.org:/zfsauton3/home/iapostol/
>> >
>> > iapostol at lop2.autonlab.org's password:
>> >
>> > Could not chdir to home directory /zfsauton3/home/iapostol: Input/output
>> > error
>>
>>
>> This was actually the more interesting part of your report. I immediately
>> assumed that my auto.nfs file got corrupted or that the autofs daemon was
>> not working properly. I had a problem with autofs on lion.auton.cs.cmu.edu,
>> so I am not running it out of systemd. It is manually started. However,
>> as of this morning autofs works on both lop2.autonlab.org and
>> lion.auton.cs.cmu.edu, which you can see from the above output.
>>
>> lop1.autonlab.org doesn't run the autofs daemon as it runs OpenBSD,
>> which doesn't have a modern autofs daemon. In order for you to log into
>> lop1.autonlab.org I created tiny local home directories, which are
>> needed for you to ssh to computing nodes. I would not expect you
>> to see anything inside your home directory on lop1.autonlab.org.
>>
>>
>> > > Just a quick heads up. bash.autonlab.org (my desktop) just crashed
>> > > again. I have no idea what happened, nor do I care too much about it.
>> > > There are three other shell gateways.
>>
>> I do know what the problem with bash is. Bash is a NUC machine. After
>> the upgrade to Red Hat 7.7 a network driver regression (reported by
>> multiple people including me) was introduced which caused the network
>> interface to crap out. I typically manually select an older stable kernel
>> when I reboot bash, but this time around I realized that somebody else
>> rebooted the machine and the grub boot-loader just picked the new broken
>> kernel. That thing is now going to rot for a while. FYI Red Hat dismissed
>> my bug report since we are not paying customers :-) For Red Hat/IBM the
>> problem doesn't exist.
>>
>>
>> Best,
>> Predrag
>>
>> P.S. I almost forgot. If you had things running out of tmux or screen
>> make sure you log out first before you try to reconnect to the Auton
>> Lab. I have seen all sorts of weird things happening because of that.

