TypeError: Inputs to a layer should be tensors. Got: last_hidden_state
Posted: 2021-12-30 00:20:04

Question: I have been trying to train a sentence-similarity model using BERT, but I keep running into this error. I have searched everywhere and could not find a solution. Could someone help me fix it? The code is attached below for reference.
import tensorflow as tf
import transformers

max_length = 128  # sequence length used by the tokenizer (value assumed; defined earlier in my script)

# Create the model under a distribution strategy scope.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    # Encoded token ids from BERT tokenizer.
    input_ids = tf.keras.layers.Input(
        shape=(max_length,), dtype=tf.int32, name="input_ids"
    )
    # Attention masks indicate to the model which tokens should be attended to.
    attention_masks = tf.keras.layers.Input(
        shape=(max_length,), dtype=tf.int32, name="attention_masks"
    )
    # Token type ids are binary masks identifying different sequences in the model.
    token_type_ids = tf.keras.layers.Input(
        shape=(max_length,), dtype=tf.int32, name="token_type_ids"
    )
    # Loading pretrained BERT model.
    bert_model = transformers.TFBertModel.from_pretrained("bert-base-uncased")
    # Freeze the BERT model to reuse the pretrained features without modifying them.
    bert_model.trainable = False

    # This is the line that raises the TypeError.
    sequence_output, pooled_output = bert_model(
        input_ids, attention_mask=attention_masks, token_type_ids=token_type_ids
    )

    # Add trainable layers on top of frozen layers to adapt the pretrained features on the new data.
    bi_lstm = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)
    )(sequence_output)
    # Applying hybrid pooling approach to bi_lstm sequence output.
    avg_pool = tf.keras.layers.GlobalAveragePooling1D()(bi_lstm)
    max_pool = tf.keras.layers.GlobalMaxPooling1D()(bi_lstm)
    concat = tf.keras.layers.concatenate([avg_pool, max_pool])
    dropout = tf.keras.layers.Dropout(0.3)(concat)
    output = tf.keras.layers.Dense(3, activation="softmax")(dropout)

    model = tf.keras.models.Model(
        inputs=[input_ids, attention_masks, token_type_ids], outputs=output
    )
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss="categorical_crossentropy",
        metrics=["acc"],
    )

print(f"Strategy: {strategy}")
model.summary()
Answer 1: You have to explicitly access the last_hidden_state attribute of the BERT model's output:
with strategy.scope():
    # Encoded token ids from BERT tokenizer.
    input_ids = tf.keras.layers.Input(
        shape=(max_length,), dtype=tf.int32, name="input_ids"
    )
    # Attention masks indicate to the model which tokens should be attended to.
    attention_masks = tf.keras.layers.Input(
        shape=(max_length,), dtype=tf.int32, name="attention_masks"
    )
    # Token type ids are binary masks identifying different sequences in the model.
    token_type_ids = tf.keras.layers.Input(
        shape=(max_length,), dtype=tf.int32, name="token_type_ids"
    )
    # Loading pretrained BERT model.
    bert_model = transformers.TFBertModel.from_pretrained("bert-base-uncased")
    # Freeze the BERT model to reuse the pretrained features without modifying them.
    bert_model.trainable = False

    output = bert_model(
        input_ids, attention_mask=attention_masks, token_type_ids=token_type_ids
    )
    # Index into the model output explicitly instead of tuple-unpacking it.
    bi_lstm = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)
    )(output["last_hidden_state"])
    # Applying hybrid pooling approach to bi_lstm sequence output.
    avg_pool = tf.keras.layers.GlobalAveragePooling1D()(bi_lstm)
    max_pool = tf.keras.layers.GlobalMaxPooling1D()(bi_lstm)
    concat = tf.keras.layers.concatenate([avg_pool, max_pool])
    dropout = tf.keras.layers.Dropout(0.3)(concat)
    output = tf.keras.layers.Dense(3, activation="softmax")(dropout)

    model = tf.keras.models.Model(
        inputs=[input_ids, attention_masks, token_type_ids], outputs=output
    )
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss="categorical_crossentropy",
        metrics=["acc"],
    )

print(f"Strategy: {strategy}")
model.summary()
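For context on why the original unpacking fails: in recent transformers versions (roughly 3.x and later, where return_dict defaults to True, a behavior assumed here), the model returns a TFBaseModelOutputWithPooling object rather than a plain tuple. Tuple-unpacking that object iterates over its string keys, which is where the literal "last_hidden_state" in the error message comes from. A minimal sketch of two equivalent workarounds:

# Option A: access the fields of the output object by name.
outputs = bert_model(
    input_ids, attention_mask=attention_masks, token_type_ids=token_type_ids
)
sequence_output = outputs.last_hidden_state  # same as outputs["last_hidden_state"]
pooled_output = outputs.pooler_output

# Option B: ask for the legacy tuple output, so unpacking works again.
sequence_output, pooled_output = bert_model(
    input_ids, attention_mask=attention_masks, token_type_ids=token_type_ids,
    return_dict=False
)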
If you want to use all of the hidden states rather than just the last one, try the following. Note that you have to set the output_hidden_states parameter of BertConfig to True and pass this config to the BERT model. The output is then a list of 13 hidden states, which is why they are concatenated before being passed to the LSTM layer.
import tensorflow as tf
from transformers import BertConfig
import transformers

# Loading pretrained BERT model, configured to expose all hidden states.
config = BertConfig.from_pretrained("bert-base-uncased", output_hidden_states=True)
bert_model = transformers.TFBertModel.from_pretrained("bert-base-uncased", config=config)
# Freeze the BERT model to reuse the pretrained features without modifying them.
bert_model.trainable = False

output = bert_model(
    input_ids, attention_mask=attention_masks, token_type_ids=token_type_ids
)
# Concatenate the 13 hidden states along the sequence axis before the LSTM.
all_hidden_states = tf.concat(output["hidden_states"], axis=1)
bi_lstm = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(64, return_sequences=True)
)(all_hidden_states)
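A quick shape check may make the concatenation concrete. bert-base-uncased has 12 encoder layers, so hidden_states holds 13 tensors (the embedding output plus one per layer), each of shape (batch_size, max_length, 768); concatenating along axis=1 therefore stacks them end to end along the sequence dimension. This sketch uses a hypothetical toy input whose token id values are assumptions:

# Toy batch of one 4-token sequence; ids assumed to spell "[CLS] this is [SEP]".
dummy_ids = tf.constant([[101, 2023, 2003, 102]])
out = bert_model(dummy_ids)
print(len(out["hidden_states"]))                      # 13
print(out["hidden_states"][0].shape)                  # (1, 4, 768)
print(tf.concat(out["hidden_states"], axis=1).shape)  # (1, 52, 768)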
Comments:

Can you help with ***.com/questions/70911697/… ?