Title: Can't load TF transformer model with keras.models.load_model()
Posted: 2021-06-04 15:53:37

Question: I have a model trained in SageMaker (a custom training job), saved by my training script with the keras model.save() method, which produces a variables directory (holding the weights and index) and a .pb file. The model is a TFBertForSequenceClassification from the huggingface transformers library, which according to their documentation is a subclass of a keras model. However, when I try to load the model with keras.models.load_model(), I get the following error:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py", line 187, in load_model
return saved_model_load.load(filepath, compile, options)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 121, in load
path, options=options, loader_cls=KerasObjectLoader)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 633, in load_internal
ckpt_options)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 194, in __init__
super(KerasObjectLoader, self).__init__(*args, **kwargs)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 130, in __init__
self._load_all()
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 215, in _load_all
self._layer_nodes = self._load_layers()
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 315, in _load_layers
layers[node_id] = self._load_layer(proto.user_object, node_id)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 341, in _load_layer
obj, setter = self._revive_from_config(proto.identifier, metadata, node_id)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 368, in _revive_from_config
obj, self._proto.nodes[node_id], node_id)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 298, in _add_children_recreated_from_config
obj_child, child_proto, child_id)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 298, in _add_children_recreated_from_config
obj_child, child_proto, child_id)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 250, in _add_children_recreated_from_config
metadata = json_utils.decode(proto.user_object.metadata)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/json_utils.py", line 60, in decode
return json.loads(json_string, object_hook=_decode_helper)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/__init__.py", line 361, in loads
return cls(**kw).decode(s)
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I'm stumped. The transformers library's own save_pretrained() method does save layer information in a .json file, but I don't see why a keras model save would know or care about that (and I don't think that's the problem anyway). Any help?
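One workaround worth noting (a hedged sketch, not confirmed against the asker's setup: the directory name and tokenizer class are assumptions): subclassed huggingface models round-trip reliably through the library's own save_pretrained()/from_pretrained() pair, rather than through the keras SavedModel path that fails above.

```python
def save_hf_model(model, tokenizer, out_dir="./model_out"):
    # model is assumed to be a TFBertForSequenceClassification instance;
    # this writes tf_model.h5 plus config.json (and the tokenizer files)
    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)

def load_hf_model(out_dir="./model_out"):
    # import locally so the sketch stays self-contained
    from transformers import TFBertForSequenceClassification, BertTokenizer
    model = TFBertForSequenceClassification.from_pretrained(out_dir)
    tokenizer = BertTokenizer.from_pretrained(out_dir)
    return model, tokenizer
```

This sidesteps keras.models.load_model() entirely, which may or may not fit a SageMaker deployment flow.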
Comments:
Not sure if you ever solved this, but I got stuck on the same error. I found that saving with a .h5 file extension resolved it. More info here: tensorflow.org/guide/keras/save_and_serialize - save the Keras model with model.save() or tf.keras.models.save_model() and load it with tf.keras.models.load_model(). Thanks!
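The comment's suggestion can be sketched as follows (an untested assumption that the model in question serializes cleanly to HDF5; the file path is made up):

```python
def save_as_h5(model, path="model.h5"):
    # The .h5 suffix selects the HDF5 format instead of the SavedModel
    # directory format, which is what triggered the JSONDecodeError above
    model.save(path)

def load_from_h5(path="model.h5"):
    import tensorflow as tf  # local import keeps the sketch self-contained
    return tf.keras.models.load_model(path)
```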
Answer 1:
Another option is to build your own classifier using the first transformer layer and put your classification head (and output) on top of it. Then use model.save() and tf.keras.models.load_model(model_path) as follows.
Important(!) - note the call into the inner model layer. Thanks to Utpal Chakraborty, who provided the solution in: Issues with saving and loading tensorflow model which uses hugging face transformer model as its first layer
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Input
from transformers import AutoConfig, AutoTokenizer, TFAutoModel

# Use the GPU with memory growth enabled
gpus = tf.config.experimental.list_physical_devices('GPU')
print(gpus)
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

num_labels = 4
config = AutoConfig.from_pretrained('bert-base-uncased', output_hidden_states=True, num_labels=num_labels)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
transformer_layer = TFAutoModel.from_pretrained('bert-base-uncased', config=config, from_pt=False)

# Optional - freeze all transformer layers:
for layer in transformer_layer.layers:
    layer._trainable = False

input_word_ids = Input(shape=(512,), dtype=tf.int32, name="input_ids")
mask = Input(shape=(512,), dtype=tf.int32, name="attention_mask")

# Note this critical call to the inner model layer
embedding = transformer_layer.bert(input_word_ids, mask)[0]

# Take only the [CLS] embedding
hidden = Dense(768, activation='relu')(embedding[:, 0, :])
out = Dense(num_labels, activation='softmax')(hidden)

# Compile the model
model = Model(inputs=[input_word_ids, mask], outputs=out)
print(model.summary())
optimizer = Adam(learning_rate=5e-05)
metric = tf.keras.metrics.CategoricalAccuracy('accuracy')
model.compile(optimizer=optimizer, loss=tf.keras.losses.CategoricalCrossentropy(), metrics=[metric])

# Then fit the model
# .....

# Now save
model_dir = './tmp/model'
model.save(model_dir)

# Test it:
model = tf.keras.models.load_model(model_dir)
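For completeness, inference with the reloaded model might look like the sketch below (hypothetical helper, not from the answer: the input names match the Input layers above, and the 512-token padding matches their fixed shape):

```python
def classify(model, tokenizer, texts):
    # Pad/truncate to the fixed 512-token length the Input layers declare
    enc = tokenizer(texts, padding="max_length", truncation=True,
                    max_length=512, return_tensors="tf")
    # Feed inputs keyed by the names given to the Input layers
    probs = model.predict({"input_ids": enc["input_ids"],
                           "attention_mask": enc["attention_mask"]})
    # Return the index of the highest-probability label per text
    return probs.argmax(axis=-1)
```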
Comments:
Answer 2:
This is a TensorFlow version incompatibility between the version that trained the model and the version you are loading it with. My server had TensorFlow 2.6.2 and my PC had 2.4.1. After training the model on the server, I got "json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)" when I tried to load it on my PC.
Check the TensorFlow versions on the server and the PC and make sure they match. After upgrading TensorFlow on my PC, it loaded the trained model successfully.
Comments:
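A minimal sketch of the check described above: compare the TF version on the training server with the one on the loading machine before digging into the SavedModel itself. The version strings are the ones quoted in the answer; treating matching major.minor as "compatible" is a simplifying assumption.

```python
def versions_compatible(train_ver: str, load_ver: str) -> bool:
    """Treat versions as compatible when major.minor components match."""
    return train_ver.split(".")[:2] == load_ver.split(".")[:2]

server_tf = "2.6.2"  # version on the training server (from the answer)
local_tf = "2.4.1"   # version on the loading PC (from the answer)
print(versions_compatible(server_tf, local_tf))  # False -> upgrade first
```

In practice you would substitute `tf.__version__` from each machine for the hard-coded strings.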