Converting a fake-quantized TensorFlow model (.pb) to a TensorFlow Lite model (.tflite) using toco fails

Posted: 2018-09-07 13:33:10


[Question]:

I tried to follow the instructions in tensorflow quantization to generate a quantized TensorFlow Lite model.

First, during training I used tf.contrib.quantize.create_training_graph() and tf.contrib.quantize.create_eval_graph() to insert fake quantization nodes into the graph, and finally generated a frozen pb file (model.pb).
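Roughly, the flow looks like the following sketch (build_model, the checkpoint path, and the training details are placeholders for my actual model; the output node names are the ones passed to toco below):

import tensorflow as tf

# Training: rewrite the graph in place to insert fake quantization ops,
# then train as usual. build_model is a placeholder for the real model.
train_graph = tf.Graph()
with train_graph.as_default():
    loss = build_model(is_training=True)
    tf.contrib.quantize.create_training_graph(input_graph=train_graph)
    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
    # ... run training and save a checkpoint ...

# Eval/export: rebuild the graph, rewrite it for inference, restore the
# trained weights, and freeze everything into model.pb.
eval_graph = tf.Graph()
with eval_graph.as_default():
    build_model(is_training=False)
    tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, 'ckpt/model')  # placeholder checkpoint path
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, eval_graph.as_graph_def(),
            ['Test/Model/output_probs', 'Test/Model/final_state'])
        with open('model.pb', 'wb') as f:
            f.write(frozen.SerializeToString())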

Second, I used the following command to convert the fake-quantized TensorFlow model into a quantized TensorFlow Lite model:

bazel-bin/tensorflow/contrib/lite/toco/toco \
--input_file=model.pb \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--output_file=model.tflite \
--inference_type=QUANTIZED_UINT8 --input_shapes=1,1:1,5002 \
--input_arrays=Test/Model/input,Test/Model/apps \
--output_arrays=Test/Model/output_probs,Test/Model/final_state  \
--mean_values=127.5,127.5 --std_values=127.5,127.5 --allow_custom_ops
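As I understand it, for QUANTIZED_UINT8 inputs toco interprets these flags as real_value = (quantized_value - mean_value) / std_value, so a mean/std of 127.5 maps the uint8 range [0, 255] onto roughly [-1.0, 1.0].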

The conversion failed with the following log:

2018-03-28 18:00:38.348403: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] Before Removing unused ops: 118 operators, 193 arrays (0 quantized)
2018-03-28 18:00:38.349394: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] Before general graph transformations: 118 operators, 193 arrays (0 quantized)
2018-03-28 18:00:38.382854: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] After general graph transformations pass 1: 57 operators, 103 arrays (1 quantized)
2018-03-28 18:00:38.384327: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] After general graph transformations pass 2: 56 operators, 101 arrays (1 quantized)
2018-03-28 18:00:38.385235: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] After general graph transformations pass 3: 55 operators, 100 arrays (1 quantized)
2018-03-28 18:00:38.385995: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] Before pre-quantization graph transformations: 55 operators, 100 arrays (1 quantized)
2018-03-28 18:00:38.386047: W tensorflow/contrib/lite/toco/graph_transformations/hardcode_min_max.cc:131] Skipping min-max setting for TensorFlowSplit operator with output Test/Model/RNN/RNN/multi_rnn_cell/cell_0/basic_lstm_cell/split because output Test/Model/RNN/RNN/multi_rnn_cell/cell_0/basic_lstm_cell/split already has min-max.
2018-03-28 18:00:38.386076: W tensorflow/contrib/lite/toco/graph_transformations/hardcode_min_max.cc:131] Skipping min-max setting for TensorFlowSplit operator with output Test/Model/RNN/RNN/multi_rnn_cell/cell_1/basic_lstm_cell/split because output Test/Model/RNN/RNN/multi_rnn_cell/cell_1/basic_lstm_cell/split already has min-max.
2018-03-28 18:00:38.386328: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] After pre-quantization graph transformations pass 1: 48 operators, 93 arrays (1 quantized)
2018-03-28 18:00:38.386484: W tensorflow/contrib/lite/toco/graph_transformations/hardcode_min_max.cc:131] Skipping min-max setting for TensorFlowSplit operator with output Test/Model/RNN/RNN/multi_rnn_cell/cell_1/basic_lstm_cell/split because output Test/Model/RNN/RNN/multi_rnn_cell/cell_1/basic_lstm_cell/split already has min-max.
2018-03-28 18:00:38.386502: W tensorflow/contrib/lite/toco/graph_transformations/hardcode_min_max.cc:131] Skipping min-max setting for TensorFlowSplit operator with output Test/Model/RNN/RNN/multi_rnn_cell/cell_0/basic_lstm_cell/split because output Test/Model/RNN/RNN/multi_rnn_cell/cell_0/basic_lstm_cell/split already has min-max.
2018-03-28 18:00:38.386778: F tensorflow/contrib/lite/toco/tooling_util.cc:1432] Array Test/Model/embedding_lookup, which is an input to the TensorFlowReshape operator producing the output array Test/Model/Reshape_1, is lacking min/max data, which is necessary for quantization. Either target a non-quantized output format, or change the input graph to contain min/max information, or pass --default_ranges_min= and --default_ranges_max= if you do not care about the accuracy of results.
Aborted

Where is the problem, and what am I doing wrong?


[Answer 1]:

You are not doing anything wrong.

At the moment, create_training_graph and create_eval_graph are not the most robust across different model architectures. We have them working on most CNNs, but RNNs are still a work in progress and pose a different set of challenges.

Depending on the specifics of the RNN, quantizing it today is a more involved process and may require manually placing FakeQuantization operations in the right locations. In particular, your error message suggests you need to add a FakeQuantization op at embedding_lookup. That said, the final quantized RNN may well run, but I can't predict the accuracy; it really ends up depending on the model and the dataset :)
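For illustration only, a manual placement might look like the sketch below when building the eval graph; embedding_table and input_ids are placeholders for your model's tensors, and the [-1, 1] range is an assumption that has to bracket the actual values in your embedding table:

import tensorflow as tf

# Placeholder lookup; embedding_table and input_ids come from your model.
embedded = tf.nn.embedding_lookup(embedding_table, input_ids)
# Attach explicit min/max so toco has range data for this array; the range
# below is an assumed placeholder and must cover the real embedding values.
embedded = tf.fake_quant_with_min_max_args(
    embedded, min=-1.0, max=1.0, num_bits=8)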

I will update this answer once the automatic rewrite supports RNNs properly.
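In the meantime, as the error message itself suggests, you can unblock the conversion at the cost of accuracy by adding blanket default ranges to the toco command above, e.g. --default_ranges_min=-6 --default_ranges_max=6 (the -6/6 range is an arbitrary placeholder); arrays without recorded min/max will then fall back to that range.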

[Discussion]:

- Thanks for the reply. In your development plan, how long will it take to support RNNs properly?
- It will probably take a few weeks or more to get this working well. RNN activations are more susceptible to quantization error, and we have to do some research to make them work at low bit widths. We also have to investigate whether more bits are needed.
- Hi @suharshs, any progress? Maybe this problem is not related to RNNs, since Test/Model/embedding_lookup is not part of the RNN network. After I manually placed FakeQuantization around Test/Model/embedding_lookup, another variable was reported to be missing min/max information.
- @Chaos Could you create a github.com/tensorflow/tensorflow GitHub issue with the details of your model so we can track and prioritize it?
