Kernel not restarting sometimes #13336

Closed
OverLordGoldDragon opened this issue Jul 19, 2020 · 23 comments · Fixed by #19411

@OverLordGoldDragon
Contributor

OverLordGoldDragon commented Jul 19, 2020

import tensorflow is one way to reproduce: clip. Further, 4.1.4 prints INFO-level messages (also shown), which didn't occur in 4.1.3; disabling them now requires manually setting os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'.

It also doesn't always reproduce; I'm still unsure what exactly triggers it.
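For readers who want to apply the logging workaround mentioned above, here is a minimal sketch. It assumes the variable has to be set before TensorFlow is first imported in the process, which is how it generally behaves; the helper name set_tf_log_level is made up for illustration and is not part of TensorFlow or Spyder.

```python
import os
import sys


def set_tf_log_level(level="1"):
    """Set TF_CPP_MIN_LOG_LEVEL and report whether it was set in time.

    Levels: "0" shows everything, "1" hides INFO, "2" also hides WARNING,
    "3" also hides ERROR.
    """
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = level
    # Assumption: the setting is only honoured if it is in place before
    # TensorFlow is imported, so report whether the import already happened.
    return "tensorflow" not in sys.modules


in_time = set_tf_log_level("1")
print("set before import:", in_time)
# import tensorflow  # import only after the variable has been set
```

In Spyder this would go at the very top of the script, or in the console before the first import tensorflow after a restart.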

OverLordGoldDragon changed the title from "Kernel not restarting (4.1.4)" to "Kernel not restarting sometimes (4.1.4)" on Jul 19, 2020
@texadactyl

I have the same symptom in the IDE.
Now, when I try it interactively with python3, I see this (Linux):

Python 3.8.2 (default, Apr 27 2020, 15:53:34) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
Illegal instruction (core dumped)

Tensorflow is HUGE. My RAM is limited to 8GB. Also, there might be something going on in the tensorflow package. In any case, this is not a Spyder bug. I realize that this is not the reply you hoped for.

How much RAM do you have? What is your OS?

@texadactyl

Have a look here: https://github.com/tensorflow/tensorflow/issues

@texadactyl

texadactyl commented Jul 19, 2020

Also, you could try this in case pypi is out of date:
pip3 install git+https://github.com/tensorflow/tensorflow

I am doing it myself.

Well, that didn't work out. Blew up another way.

Trying out tensorflow-cpu (omits the GPU portion).

That didn't work out either. Giving up on installing locally.
Works fine here: https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/quickstart/beginner.ipynb

Good luck!

@OverLordGoldDragon
Contributor Author

@texadactyl It's not a TensorFlow bug; this didn't occur in Spyder 4.1.3. So it's either Spyder or its dependencies. And TensorFlow doesn't eat much RAM from simply being imported.

@texadactyl

How do you explain crashing in python3 when Spyder is not involved?

@OverLordGoldDragon
Contributor Author

@texadactyl Bad install or dependencies, most likely. My Task Manager shows a 120MB increase in RAM usage from it - so unless your device can't afford it, import tensorflow alone shouldn't fail.

@texadactyl

Have you tried running python3 interactively in a terminal window?
Just do this:
import tensorflow

What happens?
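The same check can be scripted, which makes it easy to repeat from different environments. The sketch below imports a module in a fresh interpreter and returns the exit code, so a hard crash such as "Illegal instruction" does not take down the calling process. The helper name try_import is hypothetical.

```python
import subprocess
import sys


def try_import(module_name, timeout=120):
    """Import `module_name` in a fresh interpreter and return the exit code.

    0 means the import succeeded, 1 usually means an uncaught Python
    exception, and on POSIX a negative value means the child was killed by
    a signal (for example -4 for SIGILL, "Illegal instruction").
    """
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode


# Example: print(try_import("tensorflow"))
```

If this returns 0 from the same environment that Spyder's console uses, the import itself is probably not what is crashing.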

@texadactyl

This issue looks similar to yours: #12768

@texadactyl

Also, have you tried this in a Jupyter notebook?

@OverLordGoldDragon
Contributor Author

@texadactyl My RAM figure is from running Python in Anaconda Powershell Prompt. The linked issue has an unrelated exception (I don't have any exception). Jupyter works fine.

@texadactyl

So, import tensorflow works fine in Jupyter. Also, it works fine from a DOS command line.
You could try this: In Anaconda, uninstall Spyder then reinstall it.

@ccordoba12
Member

@OverLordGoldDragon, please share with us the specs of your conda environment, so I can try to reproduce this problem on my side.

For that, please run the following command from the Anaconda prompt:

conda env export -n my-env > env.txt

replacing my-env with your env name. Afterwards, please upload the file called env.txt here, which should have been created in the same directory where you run the previous command.

@OverLordGoldDragon
Contributor Author

OverLordGoldDragon commented Jul 21, 2020

@ccordoba12 Better yet, I'll give the exact steps for recreating the environment; base is unchanged from a fresh Anaconda install with Anaconda3-2020.02-Windows-x86_64.exe. Note that this doesn't reproduce with TensorFlow 1.14.0, and that 2.2.0 isn't available through conda; regardless, it was a non-issue with Spyder 4.1.3.

conda create --name tf2_env --clone base
conda activate tf2_env
conda install anaconda
conda update --all
pip install tensorflow==2.2.0

Run Spyder, then import tensorflow, and try to restart the kernel. My env.yml (I already have tf2_env so I added 2).

@ccordoba12
Member

> pip install tensorflow==2.2.0

This is not correct because you're mixing pip and conda packages, and that's what's probably making the kernel crash. Please create a new environment, install tensorflow with conda and try again.
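To see how much pip and conda are actually mixed in a given environment, one option is to read the INSTALLER record that packaging tools usually leave next to each installed distribution. This is only a rough sketch: not every package carries that record, so "unknown" entries are expected, and the function name installers_by_package is made up for illustration.

```python
from importlib import metadata


def installers_by_package():
    """Map each installed distribution name to the tool that installed it."""
    result = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata["Name"]
        except Exception:
            # Broken or incomplete metadata; skip rather than fail.
            name = None
        if not name:
            continue
        installer = dist.read_text("INSTALLER")  # None if the file is absent
        result[name] = installer.strip() if installer else "unknown"
    return result


pip_installed = sorted(
    name for name, tool in installers_by_package().items() if tool == "pip"
)
print(len(pip_installed), "packages report pip as their installer")
```

Run it from the environment's own interpreter (for example in the Spyder console) so that it inspects the right site-packages.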

@OverLordGoldDragon
Contributor Author

@ccordoba12 I understand it's not recommended to do so, but some necessary packages are unavailable with Anaconda and I've been using pip for a long time; currently conda supports only up to TF 2.1.0 on Windows. This same install (and many more pip installs) worked fine with Spyder 4.1.3. I'm fine downgrading, but unsure how.

@ccordoba12
Member

> I'm fine downgrading, but unsure how.

You can create a new env with

conda create -n my-env spyder=4.1.3

@OverLordGoldDragon
Contributor Author

It... fixed itself? Haven't seen it in the past two days despite intensive use, and I haven't installed / uninstalled anything in the env. I'd give it more time to be sure, but still strange.

ccordoba12 added this to the v4.2.0 milestone on Aug 9, 2020
@OverLordGoldDragon
Contributor Author

OverLordGoldDragon commented Aug 13, 2020

@ccordoba12 Never mind, it's been back - on and off, no clue what the deal is. Not a bother so long as I can keep spawning more console windows though. Reproduced with TF 2.2.0 and 2.3.0.

@OverLordGoldDragon
Contributor Author

OverLordGoldDragon commented Oct 24, 2020

@ccordoba12 Found a different way to reproduce: importing a compiled .c module (.pyd) and calling its method.

C code
#define PY_SSIZE_T_CLEAN
#include <Python.h>

static PyObject *
spam_system(PyObject *self, PyObject *args)
{
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    return PyLong_FromLong(sts);
}

static PyMethodDef SpamMethods[] = {
    {"system",  spam_system, METH_VARARGS,
     "Execute a shell command."},
    {NULL, NULL, 0, NULL}        /* Sentinel */
};

static struct PyModuleDef spammodule = {
    PyModuleDef_HEAD_INIT,
    "spam",   /* name of module */
    NULL,     /* module documentation, may be NULL */
    -1,       /* size of per-interpreter state of the module,
                 or -1 if the module keeps state in global variables. */
    SpamMethods
};

PyMODINIT_FUNC
PyInit_spam(void)
{
    return PyModule_Create(&spammodule);
}

setup.py
from distutils.core import setup, Extension

module1 = Extension('spam',
                    sources = ['spammodule.c'])

setup (name = 'PackageName',
       version = '1.0',
       description = 'This is a demo package',
       ext_modules = [module1])

Run python setup.py build, then drag the proper .pyd out of build so that it's visible to Python, then, in a separate .py file, run

import spam
spam.system('h')

TensorFlow does import from .pyd files, so this might explain the problem (though wasn't it working for older TFs? They also import .pyd files. The script above actually errors, so maybe the problem is triggered by errors - dunno).
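One way to narrow this down without compiling anything is to make the same call through the standard library: os.system also passes its argument to the C function system(). If the kernel still refuses to restart after running the lines below, the compiled module itself is probably not the trigger; that inference is an assumption, not something established in this thread.

```python
import os

# Same call as spam.system('h') in the extension, but through the standard
# library, so no .pyd import is involved. 'h' is normally not a valid
# command, so the shell reports an error.
status = os.system("h")
print("exit status of 'h':", status)
```

The exact value of the status is platform dependent, so only compare zero against non-zero.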

ccordoba12 modified the milestones: v4.2.1, v4.2.2 on Dec 9, 2020
@loewenm

loewenm commented Mar 29, 2021

Update as of 2021/03/29

This issue still persists. I realize that there's likely a ton of other bugs to squish and we really appreciate the work you guys are doing. That said, the latest versions of Tensorflow/Keras have some really nice features that I'd (we'd) love to take advantage of...

As far as I can tell, the latest conda build does not have the latest TF/Keras build and still requires pip to install. Here are my dependencies, which are mostly the latest conda build plus pip-installed TF/Keras, SHAP, XGBoost, and qtconsole upgraded from 5.0.2 to 5.0.3 (which isn't available in conda either).

(see attached txt file to save space)
python3_8.txt

##########

Edit: @ccordoba12, I wanted to provide some context/clarity to the above. I believe the issue comes down to conda's availability of Tensorflow versions, which is different depending on the user's operating system.

As of this comment, conda will download and install tensorflow 2.4.1 for Linux only... Windows users receive 2.3.0 and macOS users receive 2.0.0. See the following URL for reference: https://anaconda.org/anaconda/tensorflow (this'll likely change in the future)

Tensorflow 2.4.1 for Windows does exist and can be installed using pip, but not conda.

Also, I conda update --all'd today, which upgraded qtconsole to 5.0.3 on my base environment. That's great, because now we won't get that annoying warning when we boot up Spyder!

ccordoba12 added this to the 4.x milestone on Mar 29, 2021
@loewenm

loewenm commented Apr 5, 2021

@ccordoba12

I realize that the team is busy with the 5.0 release, which won't load due to an issue with No module named 'qdarkstyle.colorsystem'. In the meantime, I have found a workaround for the problem in this thread on v4.x.

After testing the install of Tensorflow with Conda (instead of pip), the kernel continues to hang/not restart when requested. The workaround is to explicitly call exit() in the console. This seems to work (i.e. restart the kernel) in Spyder when using Tensorflow/Keras.

I hope this helps someone else.

@muriloasouza

muriloasouza commented Feb 6, 2022

Problem still persists with Spyder 5.2.0.

@loewenm Good advice! You can also close the console tab (clicking on X) instead of typing exit() in the console window.

I got this message today after trying to restart:

Exception in thread Thread-8:
Traceback (most recent call last):
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\site-packages\spyder_kernels\comms\frontendcomm.py", line 124, in poll_thread
    self.poll_one()
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\site-packages\spyder_kernels\comms\frontendcomm.py", line 144, in poll_one
    self._comm_close(msg)
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\site-packages\spyder_kernels\comms\frontendcomm.py", line 241, in _comm_close
    self.close(comm_id)
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\site-packages\spyder_kernels\comms\frontendcomm.py", line 104, in close
    return super(FrontendComm, self).close(comm_id)
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\site-packages\spyder_kernels\comms\commbase.py", line 173, in close
    self._comms[comm_id]['comm'].close()
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\site-packages\ipykernel\comm\comm.py", line 118, in close
    self.kernel.comm_manager.unregister_comm(self)
  File "C:\Users\Muril\miniconda3\envs\tf-keras-gpu\lib\site-packages\ipykernel\comm\manager.py", line 54, in unregister_comm
    comm = self.comms.pop(comm.comm_id)
KeyError: '7a780c3789cf11ecb08a94e70b68af3c'
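The last frame shows a plain dict.pop on the comm id, which raises KeyError when the id is no longer registered. The sketch below reproduces that pattern with a hypothetical stand-in class; it is not ipykernel's code, and the idea that the comm had already been unregistered once is an assumption (the traceback only shows that the id was missing).

```python
class TinyCommRegistry:
    """Hypothetical stand-in for a comm manager; not ipykernel's real class."""

    def __init__(self):
        self.comms = {}

    def register(self, comm_id, comm):
        self.comms[comm_id] = comm

    def unregister_strict(self, comm_id):
        # Mirrors the `self.comms.pop(comm.comm_id)` line in the traceback:
        # raises KeyError if the id was already removed.
        return self.comms.pop(comm_id)

    def unregister_tolerant(self, comm_id):
        # A default value turns a repeated close into a no-op.
        return self.comms.pop(comm_id, None)


registry = TinyCommRegistry()
registry.register("abc", object())
registry.unregister_strict("abc")      # first close succeeds

try:
    registry.unregister_strict("abc")  # second close on the same id
    second_close_failed = False
except KeyError:
    second_close_failed = True

print("second strict close raised KeyError:", second_close_failed)
print("tolerant close returns:", registry.unregister_tolerant("abc"))
```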

@impact27
Contributor

Fixed by #19411

If the frontend receives an error while restarting, it aborts the restart. The problem is that if the old kernel sends an error while closing, the frontend also aborts, even though the new kernel is fine.
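A toy model of that failure mode, for illustration only: it is not the code from #19411, and every name in it (KernelMessage, abort_on_any_error, abort_on_new_kernel_error) is hypothetical. It contrasts a rule that aborts on any error seen during a restart with one that only counts errors from the newly started kernel.

```python
from dataclasses import dataclass


@dataclass
class KernelMessage:
    kernel_id: str   # which kernel process sent the message
    is_error: bool


def abort_on_any_error(messages):
    """Naive rule: any error seen during a restart aborts it."""
    return any(m.is_error for m in messages)


def abort_on_new_kernel_error(messages, new_kernel_id):
    """Stricter rule: only errors from the newly started kernel count."""
    return any(m.is_error and m.kernel_id == new_kernel_id for m in messages)


# The old kernel complains while shutting down; the new one starts cleanly.
messages = [
    KernelMessage(kernel_id="old", is_error=True),
    KernelMessage(kernel_id="new", is_error=False),
]

print(abort_on_any_error(messages))                # True
print(abort_on_new_kernel_error(messages, "new"))  # False
```

Under the naive rule the restart is abandoned even though the replacement kernel is healthy, which matches the symptom reported in this thread.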

ccordoba12 modified the milestones: 4.x, v6.0alpha1 on Sep 24, 2022
ccordoba12 assigned impact27 and unassigned ccordoba12 on Sep 24, 2022
ccordoba12 changed the title from "Kernel not restarting sometimes (4.1.4)" to "Kernel not restarting sometimes" on Sep 24, 2022
6 participants