How do I get the TensorFlow Profiler working in TensorFlow 2.5 with 'tensorflow-macos' and 'tensorflow-metal'?
【Posted】: 2021-10-02 13:47:38
【Question】: I am running macOS Big Sur 11.5, TensorFlow 2.5, and Python 3.8.
When I try to open the Profile tab in TensorBoard, I get this error:
W0726 08:25:03.846074 123145487446016 application.py:556] path /data/index.js not found, sending 404
% pip list
Package Version
-------------------------- -------------------
absl-py 0.12.0
anyio 3.2.1
appnope 0.1.2
argon2-cffi 20.1.0
astunparse 1.6.3
async-generator 1.10
attrs 21.2.0
Babel 2.9.1
backcall 0.2.0
bleach 3.3.1
cachetools 4.2.2
certifi 2021.5.30
cffi 1.14.6
charset-normalizer 2.0.1
cycler 0.10.0
Cython 0.29.24
debugpy 1.3.0
decorator 5.0.9
defusedxml 0.7.1
dill 0.3.4
dotmap 1.3.23
entrypoints 0.3
flatbuffers 1.12
future 0.18.2
gast 0.4.0
gensim 4.0.1
google-auth 1.32.1
google-auth-oauthlib 0.4.4
google-pasta 0.2.0
googleapis-common-protos 1.53.0
grpcio 1.34.1
gviz-api 1.9.0
h5py 3.1.0
idna 3.2
importlib-resources 5.2.0
ipykernel 6.0.1
ipython 7.25.0
ipython-genutils 0.2.0
ipywidgets 7.6.3
jedi 0.18.0
Jinja2 3.0.1
json5 0.9.6
jsonschema 3.2.0
jupyter-client 6.1.12
jupyter-core 4.7.1
jupyter-server 1.9.0
jupyterlab 3.0.16
jupyterlab-pygments 0.1.2
jupyterlab-server 2.6.1
jupyterlab-widgets 1.0.0
keras-nightly 2.5.0.dev2021032900
Keras-Preprocessing 1.1.2
kiwisolver 1.3.1
Markdown 3.3.4
MarkupSafe 2.0.1
matplotlib 3.4.2
matplotlib-inline 0.1.2
mistune 0.8.4
nbclassic 0.3.1
nbclient 0.5.3
nbconvert 6.1.0
nbformat 5.1.3
nest-asyncio 1.5.1
notebook 6.4.0
numpy 1.19.5
oauthlib 3.1.1
opt-einsum 3.3.0
packaging 21.0
pandas 1.3.0
pandocfilters 1.4.3
parso 0.8.2
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.3.1
pip 21.1.3
prometheus-client 0.11.0
promise 2.3
prompt-toolkit 3.0.19
protobuf 3.17.3
ptyprocess 0.7.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pybind11 2.6.2
pycparser 2.20
Pygments 2.9.0
pyparsing 2.4.7
pyrsistent 0.18.0
python-dateutil 2.8.2
pytz 2021.1
pyzmq 22.1.0
requests 2.26.0
requests-oauthlib 1.3.0
requests-unixsocket 0.2.0
rsa 4.7.2
scipy 1.7.0
Send2Trash 1.7.1
setuptools 41.2.0
six 1.15.0
smart-open 5.1.0
sniffio 1.2.0
tensorboard 2.5.0
tensorboard-data-server 0.6.1
tensorboard-plugin-profile 2.4.0
tensorboard-plugin-wit 1.8.0
tensorflow-datasets 4.3.0
tensorflow-estimator 2.5.0
tensorflow-hub 0.12.0
tensorflow-macos 2.5.0
tensorflow-metadata 1.1.0
tensorflow-metal 0.1.1
termcolor 1.1.0
terminado 0.10.1
testpath 0.5.0
tornado 6.1
tqdm 4.61.2
traitlets 5.0.5
typing-extensions 3.7.4.3
urllib3 1.26.6
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 1.1.0
Werkzeug 2.0.1
wheel 0.36.2
widgetsnbextension 3.5.1
wrapt 1.12.1
zipp 3.5.0
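The list above shows tensorboard 2.5.0 installed alongside tensorboard-plugin-profile 2.4.0. Nothing in this thread establishes whether that difference is related to the empty Profile tab, but the installed versions are easy to confirm from inside the same Python environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata  # part of the standard library since Python 3.8


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the pip list in the question.
    names = [
        "tensorboard",
        "tensorboard-plugin-profile",
        "tensorflow-macos",
        "tensorflow-metal",
    ]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in the environment that launches TensorBoard (rather than a different virtualenv or Jupyter kernel) shows which plugin version TensorBoard will actually load.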
I followed the steps outlined in the guide below to test the TensorBoard profiler:
https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras
The Profile tab is empty; all the other tabs work fine.
Training output:
from datetime import datetime
import tensorflow as tf

# Create a TensorBoard callback
# (model, ds_train and ds_test are assumed to be defined as in the tutorial linked above)
logs = "logs/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logs,
                                                 histogram_freq=1,
                                                 profile_batch='500,520')
model.fit(ds_train,
          epochs=2,
          validation_data=ds_test,
          callbacks=[tboard_callback])
2021-07-26 08:24:10.046068: I tensorflow/core/profiler/lib/profiler_session.cc:126] Profiler session initializing.
2021-07-26 08:24:10.046080: I tensorflow/core/profiler/lib/profiler_session.cc:141] Profiler session started.
2021-07-26 08:24:10.046398: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session tear down.
2021-07-26 08:24:10.155591: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/2
2021-07-26 08:24:10.350671: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
467/469 [============================>.] - ETA: 0s - loss: 0.3578 - accuracy: 0.9012
2021-07-26 08:24:18.130834: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
469/469 [==============================] - 9s 11ms/step - loss: 0.3571 - accuracy: 0.9014 - val_loss: 0.1861 - val_accuracy: 0.9457
Epoch 2/2
49/469 [==>...........................] - ETA: 3s - loss: 0.1939 - accuracy: 0.9428
2021-07-26 08:24:19.001183: I tensorflow/core/profiler/lib/profiler_session.cc:126] Profiler session initializing.
2021-07-26 08:24:19.001196: I tensorflow/core/profiler/lib/profiler_session.cc:141] Profiler session started.
2021-07-26 08:24:19.182713: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
58/469 [==>...........................] - ETA: 5s - loss: 0.1920 - accuracy: 0.9436
2021-07-26 08:24:19.273235: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session tear down.
2021-07-26 08:24:19.321288: I tensorflow/core/profiler/rpc/client/save_profile.cc:137] Creating directory: logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19
2021-07-26 08:24:19.352372: I tensorflow/core/profiler/rpc/client/save_profile.cc:143] Dumped gzipped tool data for trace.json.gz to logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19/BlueDiamond.local.trace.json.gz
2021-07-26 08:24:19.389321: I tensorflow/core/profiler/rpc/client/save_profile.cc:137] Creating directory: logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19
2021-07-26 08:24:19.389770: I tensorflow/core/profiler/rpc/client/save_profile.cc:143] Dumped gzipped tool data for memory_profile.json.gz to logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19/BlueDiamond.local.memory_profile.json.gz
2021-07-26 08:24:19.394165: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19
Dumped tool data for xplane.pb to logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19/BlueDiamond.local.xplane.pb
Dumped tool data for overview_page.pb to logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19/BlueDiamond.local.overview_page.pb
Dumped tool data for input_pipeline.pb to logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19/BlueDiamond.local.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19/BlueDiamond.local.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19/BlueDiamond.local.kernel_stats.pb
469/469 [==============================] - 5s 10ms/step - loss: 0.1616 - accuracy: 0.9535 - val_loss: 0.1354 - val_accuracy: 0.9601
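The log above reports that profile data was dumped under logs/20210726-082410/train/plugins/profile/2021_07_26_08_24_19. To confirm that those files really exist on disk, which would suggest the problem lies in displaying the profile rather than in collecting it, a small sketch like the following can list them. The helper name `find_profile_files` is hypothetical, and the `plugins/profile/<run>` layout is taken from the paths in the log.

```python
from pathlib import Path


def find_profile_files(logdir):
    """Map each profile run directory found under logdir to its sorted file names."""
    runs = {}
    # "**" also matches zero directories, so logdir may be the run directory itself.
    for run_dir in sorted(Path(logdir).glob("**/plugins/profile/*")):
        if run_dir.is_dir():
            runs[str(run_dir)] = sorted(
                p.name for p in run_dir.iterdir() if p.is_file()
            )
    return runs


if __name__ == "__main__":
    for run_dir, files in find_profile_files("logs").items():
        print(run_dir)
        for name in files:
            print("   ", name)
```

If this prints files such as the `*.xplane.pb` and `*.trace.json.gz` entries named in the log, the profiler did write its output, and the same `logs` directory should be the one passed to TensorBoard.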
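【Answer 1】: Have you tried changing the value of the profile_batch parameter in the TensorBoard callback? As written, it should profile batches 500 through 520. If not enough batches are run, no profile will be collected.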
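Whether a given profile_batch window is ever reached can be checked with simple arithmetic before training. The sketch below assumes the batch index is counted across epochs rather than restarting at each epoch, which appears consistent with the training log in the question (469 steps per epoch, with the profiler starting during epoch 2). The function names are hypothetical and the code does not call TensorFlow.

```python
def parse_profile_batch(profile_batch):
    """Parse a profile_batch value such as 2 or '500,520' into (start, stop), inclusive."""
    parts = [int(part) for part in str(profile_batch).split(",")]
    if len(parts) == 1:
        return parts[0], parts[0]
    start, stop = parts
    return start, stop


def window_is_reached(profile_batch, steps_per_epoch, epochs):
    """Return True if the whole profiling window fits in the batches that will run.

    Assumes batches are numbered from 1 and counted across all epochs.
    """
    start, stop = parse_profile_batch(profile_batch)
    total_batches = steps_per_epoch * epochs
    return 1 <= start <= stop <= total_batches


if __name__ == "__main__":
    # Values from the question: 469 steps per epoch, 2 epochs.
    print(window_is_reached("500,520", steps_per_epoch=469, epochs=2))
    print(window_is_reached("500,520", steps_per_epoch=469, epochs=1))
```

With the question's settings the window 500-520 falls inside the 938 batches that run, so too few batches does not look like the cause in this case; with a single epoch it would not be reached.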
【Comments】:
I tried setting profile_batch = '1,5', but the Profile tab in TensorBoard is still empty.
Do I need to run TensorBoard from Chrome? See, for example, TensorBoard issue #2824.
Could you provide a link to the issue you are referring to?
github.com/tensorflow/tensorboard/issues/2874 ***.com/questions/57835326/…