ipython hangs on Auton cluster

Viraj Mehta virajm at andrew.cmu.edu
Wed Aug 19 19:35:07 EDT 2020


Actually, I figured this out. For everyone having trouble with IPython/Jupyter, here’s a solution:

1. Get into a python environment that has ipython installed
2. Run `ipython profile create`
3. Run `cd ~/.ipython/profile_default`
4. Edit the file in there called ipython_config.py: find the option c.HistoryAccessor.hist_file, uncomment it if needed, and set it to ':memory:'. This means your command history isn’t saved between ipython sessions; if you want to keep it, you could instead point the option at a file in your scratch directory.
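
For anyone who would rather script step 4 than edit the file by hand, here is a minimal sketch. The option name c.HistoryAccessor.hist_file is the one from the steps above; the function name and the choice to append a line rather than edit the commented-out default are my own assumptions, and it assumes the config file came from `ipython profile create`, so that the `c` object is available when IPython loads it.

```python
from pathlib import Path

# The option name comes from the steps above; the helper itself
# (its name and the append-instead-of-edit strategy) is illustrative.
SETTING = 'c.HistoryAccessor.hist_file = ":memory:"'


def set_in_memory_history(config_path: Path) -> bool:
    """Append the in-memory history setting to an IPython config file.

    Returns True if the file was changed, False if an uncommented
    hist_file setting was already present (it is left untouched).
    """
    text = config_path.read_text() if config_path.exists() else ""
    for line in text.splitlines():
        # Commented-out defaults start with "#", so they do not match here.
        if line.strip().startswith("c.HistoryAccessor.hist_file"):
            return False
    if text and not text.endswith("\n"):
        text += "\n"
    config_path.write_text(text + SETTING + "\n")
    return True


# Example call, assuming the default profile location from step 3:
# set_in_memory_history(
#     Path.home() / ".ipython" / "profile_default" / "ipython_config.py"
# )
```

To keep history on scratch instead, the same line could hold a file path in place of ":memory:". The right scratch path is specific to each machine and isn't given in this thread, so I have left it out.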

Hope this is helpful — not sure how to add this to the wiki but it might be good to do so.

Viraj

> On Aug 19, 2020, at 5:46 PM, Predrag Punosevac <predragp at andrew.cmu.edu> wrote:
> 
> Your report indicates that my gut feeling that SQLite database is the culprit seems to be correct.  Per our documentation
> 
> https://www.autonlab.org/autonlab_wiki/aetiquette.html#don-ts
> 
> Use your scratch directory to store the Jupyter SQLite database!
> 
> You placed your SQLite database onto the NFS share (zfsauton2) and you are surprised that it is incoherent. I hope you understand now better the lack of urgency in my responses. 
> 
> Predrag
> 
> On Wed, Aug 19, 2020 at 6:03 PM Viraj Mehta <virajm at andrew.cmu.edu> wrote:
> Hi Predrag & Users,
> 
> I have a clue as to what is wrong with our cluster. Had a few processes running which broke due to this sqlite error from ipython: <PastedGraphic-1.png>
> I’d imagine this is what is wrong with all our ipython stuff. No idea how to debug this, but I hope it can be helpful as we try to fix this.
> 
> Thanks,
> Viraj
> 
>> On Aug 18, 2020, at 10:28 PM, Chufan Gao <chufang at andrew.cmu.edu> wrote:
>> 
>> Hi All,
>> 
>> Rachel and I are also facing a similar issue with our Jupyter notebooks. 
>> We also both reinstalled jupyter with no effect.
>> For me, these notebooks are extremely helpful in fast code iteration and testing out concepts.
>> I also have the intuition that it is an upstream issue, as they were running fine (without any changes) before lop2 went down.
>> Would you please take another look?
>> 
>> Worst case, I have to convert my notebooks into .py files, which will slow things down.
>> 
>> Sincerely,
>> Chufan (Andy) Gao
>> From: Autonlab-users <autonlab-users-bounces at autonlab.org> on behalf of Predrag Punosevac <predragp at andrew.cmu.edu>
>> Sent: Tuesday, August 18, 2020 10:35:11 PM
>> To: Viraj Mehta
>> Cc: users at autonlab.org
>> Subject: Re: ipython hangs on Auton cluster
>>
>> Viraj Mehta <virajm at andrew.cmu.edu> wrote:
>> 
>> > I’m pretty sure it’s not an upstream bug, as many environments
>> > (conda and virtualenv) which were working with ipython across several
>> > python versions before are now not working.
>> > 
>> > I understand that ipython and ipdb aren’t typically required for
>> > Python workflows but certain efforts, like stepping through code that
>> > requires a GPU and loads a model from the Auton cluster, are difficult
>> > to debug without ipdb. Is there anything else that has changed that
>> > might have broken it?
>> 
>> Nothing that I am aware of. However, you do understand that the system
>> is very complex and it is like a live organism constantly morphing.
>> 
>> Best,
>> Predrag
>> 
>> 
>> 
>> > 
>> > Thanks,
>> > Viraj
>> > 
>> > > On Aug 18, 2020, at 6:21 PM, Predrag Punosevac <predragp at andrew.cmu.edu> wrote:
>> > > 
>> > > I looked a bit more carefully. It could be an upstream bug. It wouldn't be the first time
>> > > 
>> > > https://github.com/ipython/ipython/issues/11678
>> ipython won't start · Issue #11678 · ipython/ipython · GitHub
>> Now I'm facing that ipython won't start without any error messages. I tried to run it with DEBUG, then the command will be "uninterruptible sleep" after the logs. $ pyenv global system $ python --version Python 2.7.5 $ ipython --version ...
>> 
>> 
>> > > 
>> > > You don't need ipython to run Python code. You could write and debug your code on your local machine and just run production code on the server. Typical Python code is just a script starting with a shebang followed by the path to the interpreter binary. I fail to see how ipython could be useful for that. It is surely useful for interactive work.
>> > > 
>> > > Predrag
>> > > 
>> > > On Tue, Aug 18, 2020 at 5:45 PM Viraj Mehta <virajm at andrew.cmu.edu> wrote:
>> > > Tried this with 3.7 and 3.8 and it still hangs. Also, if it’s a good clue, it doesn’t stop even if I send SIGINT or SIGQUIT. Not really sure what’s going on here.
>> > > 
>> > >> On Aug 18, 2020, at 4:39 PM, Viraj Mehta <virajm at andrew.cmu.edu> wrote:
>> > >> 
>> > >> Yeah, I’ll give it a shot. Thanks!
>> > >> 
>> > >>> On Aug 18, 2020, at 4:38 PM, Predrag Punosevac <predragp at andrew.cmu.edu> wrote:
>> > >>> 
>> > >>> I just upgraded all /opt/conda-py37 and /opt/conda-py38 packages on both GPU9 and GPU11. Could you please try again? Could you also try with py38 which is now recommended and report back. If this works I will upgrade packages across all servers. This could be potentially remotely related to the fact that Ifegenia could not build TensorFlow. Another thought is that the ipython SQLite database is corrupted. 
>> > >>> 
>> > >>> Best,
>> > >>> Predrag
>> > >>> 
>> > >>> On Tue, Aug 18, 2020 at 4:34 PM Viraj Mehta <virajm at andrew.cmu.edu> wrote:
>> > >>> Hi Predrag,
>> > >>> 
>> > >>> Hope you’re doing well. I’ve been running into an issue the last couple of days on the Auton cluster that is blocking my work on code that used to work, and I was hoping to get your thoughts. I have tried to distill this down to a small but replicable issue, as seen in the attachment, which I have seen hang on the ipython call on GPU9 and GPU11 so far. Do you know why this might be? Thanks.
>> > >>> 
>> > >>> Best,
>> > >>> Viraj
>> > >> 
>> > > 
>> > 
> 
