How to fix "There is at least 1 reference to internal data in the interpreter in the form of a numpy array or slice" and run inference on tf.lite

Posted 2023-02-16

【Posted】: 2019-11-08 16:31:38

【Question】:

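I am trying to run inference with tf.lite on an MNIST Keras model that I optimized with post-training quantization according to this, but I get the following error: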

RuntimeError: There is at least 1 reference to internal data
in the interpreter in the form of a numpy array or slice. Be sure to
only hold the function returned from tensor() if you are using raw
data access.

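It occurs after I resize either my image to 4 dimensions or the interpreter input itself (as in the commented-out line); the error before that was something like "expected 4 dimensions but found 3". Here is the code: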

import tensorflow as tf
tf.enable_eager_execution()
import numpy as np
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt
%matplotlib inline

mnist_train, mnist_test = tf.keras.datasets.mnist.load_data()
images, labels = tf.cast(mnist_test[0], tf.float32)/255.0, mnist_test[1]
images = np.reshape(images,[images.shape[0],images.shape[1],images.shape[2],1])
mnist_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(1, drop_remainder = True)

interpreter = tf.lite.Interpreter(model_path="C:\\Users\\USER\\Documents\\python\\converted_quant_model_cnn_5_100.tflite")
#tf.lite.Interpreter.resize_tensor_input(interpreter, input_index="index" , tensor_size=([1,28,28,1]) )

interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]

for img, label in mnist_ds.take(1):
  break
#print(img.get_shape)
interpreter.set_tensor(input_index, img)
interpreter.invoke()
predictions = interpreter.get_tensor(output_index)

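【Comments】:

Did you solve this? I'm still dealing with the same problem. It seemed to work once at random, but at other times the same error comes back.

The problem is solved. It was a silly mistake that I can't remember exactly. It was probably the dataset or the way it was processed.

For the record, I fixed it by making sure interpreter.allocate_tensors() is called before invoke().

【Answer 1】: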

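I ran into the same problem when running inference on a tflite model. Tracing back, I ended up reading the functions in which this runtime error is raised.

The functions responsible for this error are: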

def _ensure_safe(self)

and

def _safe_to_run(self)

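_ensure_safe() calls _safe_to_run(), which returns True or False. When it returns False, the runtime error above is raised.

It returns False when a numpy array still points to the interpreter's buffers. In that state it is not safe to make a tflite call that could destroy (or change) the internally allocated memory.

So, to keep _ensure_safe() from raising this runtime error, we have to make sure that no numpy array pointing to the internal buffers is still alive.

To make this clearer still: _ensure_safe() should be called from any function that calls something on _interpreter that may reallocate memory. So when you call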

interpreter.allocate_tensors()

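as you do in the code above, the first thing interpreter.allocate_tensors() does internally is call _ensure_safe(), because allocate_tensors() changes the internally allocated memory (here the change is an allocation, as the name suggests). Another function that calls _ensure_safe() is invoke(). There are many more such functions, but you get the idea.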
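To make the mechanism above concrete, here is a toy model in plain Python. It is not TensorFlow's code: ToyInterpreter, _Backend and _View are made-up names, and the sketch assumes the safety check works by counting references to the object that owns the tensor memory, which is how I read the referenced interpreter.py. It also relies on CPython's reference counting.

```python
import sys


class _Backend:
    """Hypothetical stand-in for the native object that owns tensor memory."""

    def __init__(self):
        self.buffer = [0.0, 0.0, 0.0]


class _View:
    """Hypothetical stand-in for a numpy array pointing into that memory."""

    def __init__(self, backend):
        self.base = backend  # keeps a reference to the owner of the memory


class ToyInterpreter:
    def __init__(self):
        self._backend = _Backend()
        # Reference count of the backend while no view exists.
        self._baseline = sys.getrefcount(self._backend)

    def _safe_to_run(self):
        # Any extra reference means some view is still alive.
        return sys.getrefcount(self._backend) == self._baseline

    def _ensure_safe(self):
        if not self._safe_to_run():
            raise RuntimeError("toy check: a view into internal data is still alive")

    def tensor(self):
        # The returned function captures the interpreter, not the backend,
        # so holding the function alone does not trip the check.
        return lambda: _View(self._backend)

    def invoke(self):
        self._ensure_safe()  # checked before anything that may touch memory
        return "ok"


interp = ToyInterpreter()
get_view = interp.tensor()  # holding only the function is fine
view = get_view()           # a live view now references the backend
try:
    interp.invoke()
except RuntimeError as err:
    print("blocked:", err)
del view                    # drop the view
print(interp.invoke())
```

The point of the sketch is only the ordering: the call is refused while a view is alive and goes through once the view has been dropped, which matches the advice in the error message to hold only the function returned from tensor().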

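Now that the root cause and the mechanism are known, getting past this runtime error means clearing every numpy array that points to the internal buffers.

To clear them:

a). Shut down the Jupyter notebook and restart the kernel, since this clears all numpy arrays/slices.

b). Or simply load the model again, i.e. run this line again in your Jupyter notebook: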

interpreter = tf.lite.Interpreter(model_path="C:\\Users\\USER\\Documents\\python\\converted_quant_model_cnn_5_100.tflite")

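I hope this solves your problem; it certainly solved it for me.

If neither option works, the explanation above tells you why this error occurs. So if you find another way to make sure that no numpy array points to the internal buffers, please share it.

Reference: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/python/interpreter.py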

【Comments】:

I'm facing this problem too, but I'm using Flask and the process runs in real time. Here is my question on Stack Overflow: link.

【Answer 2】:

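Just adding what solved it for me. I'm running a script, so it has nothing to do with Jupyter notebooks.

My problem was that I was using predictions = interpreter.tensor(output_index) instead of predictions = interpreter.get_tensor(output_index).

However, the problem showed up as the same error discussed in this thread.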
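As a minimal sketch of the get_tensor() approach, the helper below wraps one inference call. run_inference is a hypothetical name, and the sketch assumes `interpreter` is a tf.lite.Interpreter on which allocate_tensors() has already been called and that the model has a single input and a single output.

```python
def run_inference(interpreter, input_array):
    """Run one inference and return a copy of the first output tensor.

    Assumes allocate_tensors() was already called on `interpreter`.
    """
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    # set_tensor() copies the data into the interpreter's input buffer.
    interpreter.set_tensor(input_index, input_array)
    interpreter.invoke()
    # get_tensor() returns a copy, so no reference to internal memory is kept.
    return interpreter.get_tensor(output_index)
```

With the variables from the question this would be called as predictions = run_inference(interpreter, img). Because nothing returned here aliases the interpreter's buffers, a later invoke() or allocate_tensors() should not hit the reference check.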

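【Comments】:

【Answer 3】:

I'm running a script, and for me the problem was that multiple instances of the same script were running at the same time. Killing the instances solved the problem.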

【Comments】:

【Answer 4】:

I copied the object returned through interpreter.tensor, and then it worked. Hope this helps!

Change

interpreter.set_tensor(input_index, test2)
interpreter.invoke()
output = interpreter.tensor(output_h1)
result_h1 = np.reshape(output(), (224,224))

to

import copy
interpreter.set_tensor(input_index, test2)
interpreter.invoke()
output = interpreter.tensor(output_h1)
result_h1 = np.reshape(copy.copy(output()), (224,224))
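The same idea can be packed into a small helper, sketched here under assumptions: read_output_copy is a hypothetical name, and the value returned by calling the tensor() function is assumed to be a numpy array, so that its .copy() method gives an independent array.

```python
def read_output_copy(interpreter, output_index):
    """Return an independent copy of an output tensor read via tensor()."""
    get_view = interpreter.tensor(output_index)  # hold only this function
    # get_view() gives an array that aliases internal memory. .copy() makes
    # an independent array, and the temporary view is not stored anywhere,
    # so under CPython it is released as soon as this statement finishes.
    return get_view().copy()
```

With the names from this answer the call would be result_h1 = read_output_copy(interpreter, output_h1).reshape(224, 224). The reshape is applied to the copy, so result_h1 no longer points into the interpreter's memory.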

【Comments】:
