Jupyter: "The kernel appears to have died. It will restart automatically." (Keras related)
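I am trying to train ResNet50, but I cannot get it to run: the Jupyter notebook kernel dies ("The kernel appears to have died. It will restart automatically.") the moment training starts (Epoch 1/100). I have a GeForce GTX 1060 Ti. When I run nvidia-smi during training (which lasts only about 1 second), I see just 80 MB of GPU memory allocated, far less than in earlier runs, and then the kernel dies, as if it had tried to start and failed.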
Here are the requirements:
pandas==0.25.1
numpy==1.17.2
opencv-python==4.1.1.26
scikit-image==0.15.0
scikit-learn==0.21.3
tensorflow-gpu==1.14.0
Keras==2.2.5
matplotlib==3.1.1
Pillow==6.1.0
albumentations==0.3.2
tqdm==4.35.0
jupyter
I satisfy all of these. This is how I set up the training session:
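import os
import tensorflow as tf
import keras
from keras.applications.resnet50 import ResNet50
from keras.layers import Conv2D, UpSampling2D
from keras.models import Model
from keras.optimizers import Adam
# jaccard_distance, train_generator and val_generator are defined elsewhere (not shown)
config = tf.ConfigProto()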
config.gpu_options.allow_growth = False
config.gpu_options.per_process_gpu_memory_fraction = 0.9
sess = tf.Session(config=config)
keras.backend.set_session(sess)
keras.__version__
os.environ["CUDA_VISIBLE_DEVICES"] = '0' #yes, this is the ID of my GPU.
# create the FCN model
model_mobilenet = ResNet50(input_shape=(1024, 1024, 3), include_top=False) # use the Resnet
model_x8_output = Conv2D(128, (1, 1), activation='relu')(model_mobilenet.layers[-95].output)
model_x8_output = UpSampling2D(size=(8, 8))(model_x8_output)
model_x8_output = Conv2D(3, (3, 3), padding='same', activation='sigmoid')(model_x8_output)
MODEL_x8 = Model(inputs=model_mobilenet.input, outputs=model_x8_output)
MODEL_x8.compile(loss='binary_crossentropy', optimizer=Adam(lr=1e-3), metrics=[jaccard_distance])
MODEL_x8.fit_generator(train_generator, steps_per_epoch=300, epochs=100, verbose=1, validation_data=val_generator, validation_steps=10)
Epoch 1/100
1/300 [..............................] - ETA: 1:01:59 - loss: 0.7193 - jaccard_distance: 0.1125
I tried:
- Setting config.gpu_options.allow_growth to True.
- Setting config.gpu_options.per_process_gpu_memory_fraction to other arbitrary values, such as 0.1.
- Commenting out the device selection:
#os.environ["CUDA_VISIBLE_DEVICES"] = 0
None of these worked. I would appreciate constructive answers.
Thanks in advance.
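A minimal sketch of one detail in the attempts above, assuming the usual CUDA behaviour that CUDA_VISIBLE_DEVICES is read only when the GPU runtime is first initialized. In the setup code the variable is assigned after tf.Session(...) has already been created, which is probably too late to have any effect. Note also that environment values must be strings: assigning the integer 0 raises a TypeError. The helper name select_gpu is hypothetical.

```python
import os


def select_gpu(device_index):
    """Restrict CUDA to a single device.

    Intended to be called before TensorFlow is imported or a session is
    created, because the variable is read when the GPU runtime starts up.
    """
    # os.environ only accepts strings, so the index is converted explicitly.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


if __name__ == "__main__":
    print("CUDA_VISIBLE_DEVICES =", select_gpu(0))
```

In a notebook this call would go in the first cell, above `import tensorflow as tf`.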
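EDIT: I have now tried running this as a script (rather than as a notebook), and when it reaches the TensorFlow session line, the terminal prints the following: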
2020-01-28 13:44:55.756819: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/username/ros_ws/devel/lib:/opt/ros/melodic/lib
2020-01-28 13:44:55.757047: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/username/ros_ws/devel/lib:/opt/ros/melodic/lib
2020-01-28 13:44:55.757313: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/username/ros_ws/devel/lib:/opt/ros/melodic/lib
2020-01-28 13:44:55.757526: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/username/ros_ws/devel/lib:/opt/ros/melodic/lib
2020-01-28 13:44:55.757736: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/username/ros_ws/devel/lib:/opt/ros/melodic/lib
2020-01-28 13:44:55.757940: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/username/ros_ws/devel/lib:/opt/ros/melodic/lib
2020-01-28 13:44:55.808416: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-01-28 13:44:55.808444: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
This is strange, because I have CUDA 9.0, not CUDA 10, so it should not even be asking for these libraries. Is my TensorFlow version wrong?
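The log lines name the exact shared libraries TensorFlow tried to load. The prebuilt tensorflow-gpu 1.14.0 wheel is built against CUDA 10.0, which would explain why it asks for the `.so.10.0` files on a machine that only has CUDA 9.0; the last log line indicates that the GPU is then skipped entirely. A rough sketch for checking which of these libraries the dynamic loader can find from the current environment, using only the standard library, follows. The function names can_dlopen and probe are hypothetical, and the library list is copied from the log above.

```python
import ctypes
import os

# Library names copied from the log output above.
CUDA_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
    "libcudnn.so.7",
]


def can_dlopen(name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


def probe(names=CUDA_LIBS):
    """Map each library name to whether it could be loaded."""
    return {name: can_dlopen(name) for name in names}


if __name__ == "__main__":
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "(unset)"))
    for name, found in probe().items():
        print(("found    " if found else "MISSING  ") + name)
```

If the 10.0 libraries are reported missing, the usual options are to install CUDA 10.0 alongside the existing toolkit and add its lib64 directory to LD_LIBRARY_PATH, or to use a TensorFlow release built for CUDA 9.0 (tensorflow-gpu 1.12 and earlier).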
Answer
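Most likely there is not enough memory to hold the data and the model. Your input images are also 1024x1024. I suggest trying smaller images, such as 256 or even 128, to see whether training works at all.
Another answer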
OK, got it.
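To put rough numbers on the memory explanation in the first answer, here is a back-of-the-envelope sketch of float32 tensor sizes for the model in the question. It assumes a batch size of 1 and that the tapped layer sits at 1/8 resolution, as the x8 naming and the UpSampling2D factor suggest. The helper tensor_mib is hypothetical, and the figures cover single forward activations only, not gradients or optimizer state.

```python
def tensor_mib(height, width, channels, batch=1, bytes_per_value=4):
    """Size of one dense tensor in MiB (float32 by default)."""
    return height * width * channels * batch * bytes_per_value / 2**20


if __name__ == "__main__":
    for side in (1024, 256, 128):
        print("input side:", side)
        # Input image with 3 channels.
        print("  input          :", tensor_mib(side, side, 3), "MiB")
        # ResNet50's first convolution: stride 2, 64 filters.
        print("  first conv     :", tensor_mib(side // 2, side // 2, 64), "MiB")
        # 128-channel map after the 8x upsampling back to full resolution.
        print("  upsampled x128 :", tensor_mib(side, side, 128), "MiB")
```

At 1024x1024 the upsampled 128-channel map alone is 512 MiB per image, against 32 MiB at 256x256, a factor of 16. Moving the channel-reducing convolution before the upsampling step would shrink that tensor, at the cost of changing the architecture.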