Inference on GPU with Keras
Posted: 2020-10-25 06:30:10
Question:
I'm trying to run predictions with Keras on my RTX 2060 Super. For some reason, they seem to be running on my CPU.
Here is the test script I'm using for debugging:
import numpy as np
import tensorflow as tf
from keras import Sequential
from keras.layers import Conv2D, Flatten, Dense


def get_model():
    # Small CNN: two conv layers followed by three dense layers.
    model = Sequential()
    model.add(Conv2D(32, (3, 3), input_shape=(6, 7, 3), activation='relu'))
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(Flatten())
    model.add(Dense(16, activation='relu'))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(1, activation='tanh'))
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
    return model


def test_gpu():
    model = get_model()
    arg = np.random.rand(10000, 6, 7, 3)
    with tf.device('gpu'):
        # Run the same prediction repeatedly while logging device placement.
        for i in range(10000):
            print(i)
            model.predict(arg)


if __name__ == '__main__':
    tf.config.experimental.list_physical_devices()
    tf.debugging.set_log_device_placement(True)
    test_gpu()
This is what gets printed to the console:
2020-07-04 16:02:53.476342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-04 16:02:54.750958: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-04 16:02:54.829844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: GeForce RTX 2060 SUPER computeCapability: 7.5
coreClock: 1.71GHz coreCount: 34 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2020-07-04 16:02:54.829996: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-04 16:02:54.833612: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-04 16:02:54.836233: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-04 16:02:54.837132: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-04 16:02:54.840536: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-04 16:02:54.842135: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-04 16:02:54.847975: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-04 16:02:54.848397: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-04 16:02:54.855989: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-04 16:02:54.862690: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x279fb82e950 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-04 16:02:54.862816: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-07-04 16:02:54.863172: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: GeForce RTX 2060 SUPER computeCapability: 7.5
coreClock: 1.71GHz coreCount: 34 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2020-07-04 16:02:54.863317: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-04 16:02:54.863390: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-04 16:02:54.863463: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-04 16:02:54.863531: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-04 16:02:54.863599: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-04 16:02:54.863668: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-04 16:02:54.863737: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-04 16:02:54.864148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-04 16:02:55.380931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-04 16:02:55.381015: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-07-04 16:02:55.381059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-07-04 16:02:55.381623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6650 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060 SUPER, pci bus id: 0000:07:00.0, compute capability: 7.5)
2020-07-04 16:02:55.383791: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x279ab93e810 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-04 16:02:55.383895: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2060 SUPER, Compute Capability 7.5
2020-07-04 16:02:55.385370: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.585261: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.585707: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.585832: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op Add in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.586031: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.586161: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarIsInitializedOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.586336: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op LogicalNot in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.586529: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op Assert in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.586907: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.587688: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.588197: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.595362: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.603863: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.605481: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.611149: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.616445: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.617115: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
0
2020-07-04 16:02:55.623924: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.636035: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RangeDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.636340: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.644954: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op MapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.645358: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op PrefetchDataset in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.653283: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op FlatMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.653830: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op TensorDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.653992: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.654245: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ZipDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.657661: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ParallelMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.658464: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ModelDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.658648: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op AnonymousIteratorV2 in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.658798: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op MakeIterator in device /job:localhost/replica:0/task:0/device:CPU:0
iterator: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
iterator_1: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
sequential_conv2d_conv2d_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_conv2d_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_conv2d_1_conv2d_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_conv2d_1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_dense_matmul_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_dense_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_dense_1_matmul_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_dense_1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_dense_2_matmul_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
sequential_dense_2_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
IteratorGetNext: (IteratorGetNext): /job:localhost/replica:0/task:0/device:CPU:0
sequential/conv2d/Conv2D/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d/Conv2D: (Conv2D): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d_1/Conv2D/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d_1/Conv2D: (Conv2D): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d_1/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d_1/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.721249: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op __inference_predict_function_248 in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.722140: I tensorflow/core/common_runtime/colocation_graph.cc:256] Ignoring device specification /job:localhost/replica:0/task:0/device:GPU:0 for node 'IteratorGetNext' because the input edge from 'iterator' is a reference connection and already has a device field set to /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.722373: I tensorflow/core/common_runtime/placer.cc:114] iterator: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.722471: I tensorflow/core/common_runtime/placer.cc:114] iterator_1: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.722583: I tensorflow/core/common_runtime/placer.cc:114] sequential_conv2d_conv2d_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.722710: I tensorflow/core/common_runtime/placer.cc:114] sequential_conv2d_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.722845: I tensorflow/core/common_runtime/placer.cc:114] sequential_conv2d_1_conv2d_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.722981: I tensorflow/core/common_runtime/placer.cc:114] sequential_conv2d_1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.723118: I tensorflow/core/common_runtime/placer.cc:114] sequential_dense_matmul_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.723245: I tensorflow/core/common_runtime/placer.cc:114] sequential_dense_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.723378: I tensorflow/core/common_runtime/placer.cc:114] sequential_dense_1_matmul_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.723514: I tensorflow/core/common_runtime/placer.cc:114] sequential_dense_1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.723643: I tensorflow/core/common_runtime/placer.cc:114] sequential_dense_2_matmul_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.723780: I tensorflow/core/common_runtime/placer.cc:114] sequential_dense_2_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.723913: I tensorflow/core/common_runtime/placer.cc:114] IteratorGetNext: (IteratorGetNext): /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:55.724038: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d/Conv2D/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.724172: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d/Conv2D: (Conv2D): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.724355: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.724483: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.724589: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.724729: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d_1/Conv2D/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.724882: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d_1/Conv2D: (Conv2D): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.725008: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d_1/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.725137: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d_1/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.731809: I tensorflow/core/common_runtime/placer.cc:114] sequential/conv2d_1/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
sequential/conv2d_1/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
sequential/flatten/Reshape: (Reshape): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense/MatMul/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_1/MatMul/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_1/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_1/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_1/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_1/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_2/MatMul/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_2/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_2/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_2/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
sequential/dense_2/Tanh: (Tanh): /job:localhost/replica:0/task:0/device:GPU:0
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:0
identity_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
sequential/flatten/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.732008: I tensorflow/core/common_runtime/placer.cc:114] sequential/flatten/Reshape: (Reshape): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.732143: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense/MatMul/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.732276: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.732406: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.732537: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.732657: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.732794: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_1/MatMul/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.732937: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_1/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.733070: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_1/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.733199: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_1/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.733323: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_1/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.733455: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_2/MatMul/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.733580: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_2/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.733697: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_2/BiasAdd/ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.733826: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_2/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.733939: I tensorflow/core/common_runtime/placer.cc:114] sequential/dense_2/Tanh: (Tanh): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.734039: I tensorflow/core/common_runtime/placer.cc:114] Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.734141: I tensorflow/core/common_runtime/placer.cc:114] identity_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.734240: I tensorflow/core/common_runtime/placer.cc:114] Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.734339: I tensorflow/core/common_runtime/placer.cc:114] sequential/flatten/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:55.745329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-04 16:02:56.011439: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-04 16:02:57.154757: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
1
2020-07-04 16:02:57.354381: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ConcatV2 in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:57.355353: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op DeleteIterator in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.357613: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RangeDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.357817: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.361556: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op MapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.362044: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op PrefetchDataset in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:57.369283: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op FlatMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.369687: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op TensorDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.369839: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.369993: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ZipDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.373118: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ParallelMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.373590: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ModelDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2
2020-07-04 16:02:57.516787: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RangeDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.516987: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.520518: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op MapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.520843: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op PrefetchDataset in device /job:localhost/replica:0/task:0/device:GPU:0
2020-07-04 16:02:57.528061: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op FlatMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.528482: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op TensorDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.528642: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.528802: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ZipDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.531905: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ParallelMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-07-04 16:02:57.532347: I tensorflow/core/common_runtime/eager/execute.cc:501] Executing op ModelDataset in device /job:localhost/replica:0/task:0/device:CPU:0
3
Here is a screenshot showing my CPU and GPU utilization in Task Manager:
Any help would be greatly appreciated!
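Comments:
It is a common misconception that you can tell whether the GPU is being used by looking at Task Manager; that is not correct. Nothing here indicates that the GPU is not being used.
If you look at the TensorFlow log output I pasted, you can see that most of the operations are done on the CPU, not on the graphics card.
Yes, and that is not unusual at all: not every operation can run on the GPU. Your model is simply too small to put a real load on the GPU.
It turns out I was running on the GPU after all. I switched to the CPU and it was much slower. I guess I just had unrealistic expectations about speed and GPU utilization. Thanks!

Answer 1:
Summarizing here, in the answer section, the points made in the comments, and providing code to check whether TensorFlow is using the GPU, for the benefit of the community.
We should not use Task Manager to check whether the GPU is being used by TensorFlow.
Instead, we can use the code below: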
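import tensorflow as tf
# List the GPUs that TensorFlow can see.
print(tf.config.list_physical_devices('GPU'))
# The format string needs a {} placeholder, otherwise the device name is never printed.
print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
# Report whether this TensorFlow build was compiled with CUDA support.
print(tf.test.is_built_with_cuda())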
If TensorFlow is using the GPU, the output of the above code will be:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Default GPU Device: /device:GPU:0
True
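The checks above confirm that TensorFlow can see the GPU; they do not show where a particular operation actually ran. As a minimal sketch (the tensor names and the matrix size are illustrative and not part of the original answer), the `device` attribute of an eager tensor reports where the result of an operation was placed:

```python
import tensorflow as tf

# Run a small op without an explicit tf.device scope, so TensorFlow
# picks the device itself.
a = tf.random.uniform((256, 256))
b = tf.matmul(a, a)

# In eager mode every tensor records the device that holds it.
print(b.device)
```

On a machine with a working GPU setup the printed string is expected to end in `GPU:0`; on a CPU-only machine it ends in `CPU:0`.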
You can also run the following code to list all the devices available to TensorFlow:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
The output of the above code in Google Colab, with the runtime type set to GPU, looks like this:
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 1364469592146627999
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 1949236974972245157
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 7931601386541220977
physical_device_desc: "device: XLA_GPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 14648777152
locality {
  bus_id: 1
  links {
  }
}
incarnation: 15267718363411873827
physical_device_desc: "device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5"
]
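`device_lib` is imported from `tensorflow.python`, which is not part of the public API. A possible alternative, assuming a reasonably recent TensorFlow 2.x release, is `tf.config.list_logical_devices()`, which returns the devices TensorFlow has actually initialized:

```python
import tensorflow as tf

# Logical devices are the ones the runtime can place ops on.
devices = tf.config.list_logical_devices()
for device in devices:
    print(device.name, device.device_type)
```

Note that calling this initializes the runtime, so settings such as memory growth have to be configured beforehand.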
You can also use the nvidia-smi command.
For more information, see the Stack Overflow Answer.
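The asker settled the question by comparing run times on the CPU and the GPU. A rough sketch of such a comparison follows; the helper names `build_model` and `time_inference`, the batch size, and the repeat count are assumptions made for illustration, and the model only mirrors the layer layout of the script in the question:

```python
import time

import numpy as np
import tensorflow as tf


def build_model():
    # Same layer layout as the model in the question.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(6, 7, 3)),
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(1, activation='tanh'),
    ])


def time_inference(device_name, batch, repeats=5):
    # Build and run the model inside the device scope so that the
    # weights and the computation live on the same device.
    with tf.device(device_name):
        model = build_model()
        x = tf.constant(batch)
        model(x, training=False).numpy()  # warm-up, excluded from timing
        start = time.perf_counter()
        for _ in range(repeats):
            # .numpy() forces the result back to the host, so the
            # measurement waits for the computation to finish.
            model(x, training=False).numpy()
        return (time.perf_counter() - start) / repeats


batch = np.random.rand(1024, 6, 7, 3).astype('float32')
timings = {'/CPU:0': time_inference('/CPU:0', batch)}
if tf.config.list_physical_devices('GPU'):
    timings['/GPU:0'] = time_inference('/GPU:0', batch)

for name, seconds in timings.items():
    print(f'{name}: {seconds * 1000:.2f} ms per call')
```

The sketch calls the model directly instead of using `predict`, because `predict` sets up an input pipeline on every call and that overhead can dominate for a model this small. Absolute numbers depend heavily on hardware, and with a tiny model the GPU may show little utilization even when it is the faster device.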