IPython notebook issues leading to breaking the CPU limit

I was doing some data processing in some IPython notebooks. Everything was going fine until I opened a different notebook and started running the same functions I was using in the other notebooks. At that point I was nowhere near my CPU limit (~23%), but then I started getting a bunch of errors in that notebook:

  • unable to import certain module dependencies (even though they imported fine in another notebook)
  • BlockingIOError: Resource temporarily unavailable (one of my data-processing functions uses a ProcessPoolExecutor) -- I got around this by using a different version of the function
  • a seemingly benign code cell hanging (all it did was make a matplotlib figure and plot some data)
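For reference, the BlockingIOError workaround looked roughly like the sketch below -- the square function and data are placeholders, not my real processing code. The idea is just to fall back to threads when the system can't spawn worker processes:

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def square(x):
    # Placeholder for the real per-item work
    return x * x

def process_data(items):
    """Run square() over items with a process pool; if the system can't
    spawn new processes (BlockingIOError: Resource temporarily unavailable),
    fall back to a thread pool, which needs no extra processes."""
    try:
        with ProcessPoolExecutor() as pool:
            return list(pool.map(square, items))
    except (BlockingIOError, OSError):
        # BlockingIOError is an OSError subclass; listed for readability
        with ThreadPoolExecutor() as pool:
            return list(pool.map(square, items))

if __name__ == "__main__":
    print(process_data([1, 2, 3]))  # → [1, 4, 9]
```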

I tried restarting and reconnecting to the IPython kernel, and that didn't seem to do anything. I eventually left that code block running, and now I've apparently blown past my CPU limit.

I have no idea what's going on.

Hi paularcoleo, sorry about that -- did you restart/reconnect just from within the Jupyter notebook?

So one known issue we have right now is that Jupyter notebooks consume a lot of CPU, because Jupyter is essentially running a server in the background that your notebook connects to. To stop this, after you are done (or, in this case, when trying to restart your notebook), go to the Consoles tab, fetch your running processes, and kill the relevant Jupyter processes (there will be a couple).

This will stop your CPUs from running up when you are not using the notebook, and may also help you restart more thoroughly.
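If you'd rather script that than click through the processes table, something along these lines would do the same cleanup from a console. This is a sketch assuming a Linux-style environment with pgrep available; the "jupyter" pattern in the comment is just an example:

```python
import os
import signal
import subprocess

def kill_matching(pattern):
    """Send SIGTERM to every process whose command line matches `pattern`
    (as pgrep -f sees it), skipping this process itself.
    Returns the list of PIDs that were signalled (may be empty)."""
    result = subprocess.run(["pgrep", "-f", pattern],
                            capture_output=True, text=True)
    killed = []
    for word in result.stdout.split():
        pid = int(word)
        if pid == os.getpid():
            continue  # never signal ourselves
        try:
            os.kill(pid, signal.SIGTERM)
            killed.append(pid)
        except ProcessLookupError:
            pass  # process exited between pgrep and kill
    return killed

# Example (run from a console after closing your notebooks):
#   kill_matching("jupyter")
```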

At one point, when I was trying to fix everything, I did look up my running processes, but the list came back empty.

That is quite weird -- I can see that you currently do have Jupyter processes running. Is the issue fixed for you right now? One thing to double-check when you next run your code: is it using the correct version of Python, and are you using the Python inside your virtualenv or the system Python? That might account for the failure to import modules.

Everything seems to be in order now -- I can import the library I was having issues with, I'm not getting the BlockingIOError when running my ProcessPoolExecutor, and the matplotlib code block executed properly.

:)

I'm getting the BlockingIOError again, and I'm unable to run the same kind of matplotlib code block in a new notebook.

EDIT: and it's working again... I feel like I'm going crazy.

Hi paularcoleo, Conrad et al.

This issue seems to be resolved, but not for me... is it "still" solved? :) My Jupyter notebooks are consuming around 1 sec of CPU time per minute of wall time. Killing the corresponding tasks in the Consoles / Running processes list stops the consumption, but:

  • the processes involved report much less CPU time than the whole account does
  • killing one process at a time is not convenient for my use case (I plan to use ipynb's for a bunch of students in my programming class, in and out of the classroom)

Is there something I can do?

Thanks in advance!

Unfortunately the 1 second-per-minute thing is still not fixed :-( What happens is that when a notebook is idle, it spawns a process every minute as part of a heartbeat system that Jupyter uses internally. This process uses up about 1 second of CPU, then exits -- which is why the processes in the list don't show the CPU usage; we only show the CPU used by currently-running processes, and the ones that are using up the 1s/min are ephemeral.
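Here's a tiny self-contained illustration of why that happens -- this is a stand-in, not Jupyter's actual heartbeat code:

```python
import subprocess
import sys

# Stand-in for the heartbeat: a child process that burns a little CPU
# and exits almost immediately (NOT Jupyter's actual code).
child = subprocess.Popen(
    [sys.executable, "-c", "sum(i * i for i in range(10**6))"])
child.wait()

# Any snapshot of "currently running processes" taken now misses the
# child entirely -- its CPU time was real, but the process is already gone.
print("child still running:", child.poll() is None)  # → child still running: False
```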

We're not sure what a good fix for this would be. It's certainly a problem, and does need a solution, but the CPU drain comes from something so deep inside Jupyter that we're not able to change it directly -- at least, without heavily customizing Jupyter, which we're loath to do. Maybe some kind of "shut all Jupyter stuff down" button on the notebook pages?

Hi Giles,

Thanks for your thorough answer. I gather that the problem has been studied and no quick solution is at hand (off the top of my head: the same routine that checks for the kernel's presence could kill the process if no activity is registered for the Nth time). Other than that, the "kill all students' processes" button sounds good. Could the method be exposed through the API (for automation, I mean)? I could craft a short script to manage that more precisely, provided the API exposes the needed methods. If you prefer, we could continue this in another thread.

Thanks again for taking care of this!

Hmm, that's an interesting point. The bulk of our development time is currently being sucked up by a regulatory change in the way we handle credit card payments, but one of the things we're considering for after that is making the API used by the "running processes" table an official one. That already has "list processes" and "kill process" verbs, but it's an internal-only API right now. If it became official and supported (and got a few extra features) then potentially it would make this kind of problem much easier to work around -- you could just say "list all of the notebook processes for this account" then iterate over them saying "kill this" in a really simple script.

I can't promise we'll be able to get something like that done quickly, but it does sound like it could potentially drop out easily from some exploratory work we already have underway.
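To give an idea of what that "really simple script" might look like if the API became official: everything below is an invented sketch -- the base URL, paths, field names and token scheme are all hypothetical placeholders, not our actual internal API.

```python
import json
import urllib.request

# Hypothetical placeholder -- the processes API discussed in this thread
# is internal-only, so this URL and everything below it is invented.
API_BASE = "https://example.com/api/user/{username}/processes/"

def jupyter_pids(process_list):
    """Pick the PIDs of Jupyter-related entries out of a process listing,
    assuming each entry is a dict with 'pid' and 'command' keys."""
    return [p["pid"] for p in process_list if "jupyter" in p.get("command", "")]

def kill_all_jupyter(username, token):
    """List processes via the hypothetical API, then DELETE each Jupyter one."""
    headers = {"Authorization": "Token " + token}
    base = API_BASE.format(username=username)
    with urllib.request.urlopen(
            urllib.request.Request(base, headers=headers)) as resp:
        processes = json.load(resp)
    for pid in jupyter_pids(processes):
        urllib.request.urlopen(urllib.request.Request(
            base + str(pid) + "/", headers=headers, method="DELETE"))
```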