Using a Universal Sentence Encoder Embedding Layer in Keras

Posted 2023-02-16

【Title】Using a Universal Sentence Encoder Embedding Layer in Keras 【Posted】2021-03-13 07:48:40 【Question】:

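I am trying to load the Universal Sentence Encoder (USE) as an embedding layer in my model using Keras. I have tried two approaches. The first is adapted from the code here, as follows: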

import tensorflow as tf
tf.config.experimental_run_functions_eagerly(True)

import tensorflow_hub as hub
from keras import backend as K
from keras.layers import Input, Lambda  # used in the model definition below

module_url = "../emb_models/use/universal-sentence-encoder-large-5"
embed = hub.load(module_url)

# For the keras Lambda
def UniversalEmbedding(x):
    results = embed(tf.squeeze(tf.cast(x, tf.string)))
    # results = embed(tf.squeeze(tf.cast(x, tf.string)))["outputs"] 
    # removed outputs as it gave an error "TypeError: Only integers, slices (`:`), ellipsis (`...`),
    # tf.newaxis (`None`) and scalar tf.int32/tf.int64 tensors are valid indices, got 'outputs'"
    print(results)
    return K.concatenate([results])

# model
embed_size = 512  # assumed value: USE produces 512-dimensional sentence embeddings
sentence_input = Input(shape=(1,), name='sentences', dtype="string")
sentence_embeds = Lambda(UniversalEmbedding, output_shape=(embed_size,))(sentence_input)

The model is created successfully, but during fitting (as soon as training starts) it raises the following error:

2020-12-01 10:45:12.307164: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at lookup_table_op.cc:809 : Failed precondition: Table not initialized.

The second approach is adapted from this issue, as follows:

module_url = "../emb_models/use/universal-sentence-encoder-large-5"
use_embeddings_layer = hub.KerasLayer(module_url, trainable=False, dtype=tf.string)

inputs = tf.keras.layers.Input(shape=(None,), dtype='string')
sentence_input = Input(shape=(1,), name="sentences", dtype="string") 
sentence_input = Lambda(lambda x: K.squeeze(x, axis=1), name='squeezed_input')(sentence_input)    
sentence_embed = use_embeddings_layer(sentence_input)

The model is not created; the following error is raised instead:

AttributeError: 'tuple' object has no attribute 'layer'

Any ideas?

Info: tensorflow-gpu==1.14.0, keras==2.3.1, tensorflow-hub==0.8.0

【Answer 1】:

This thread has a related example that shows how to use hub.KerasLayer with USE. The example uses training=true, but it should also work with training=false (pure inference, with no fine-tuning).
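
A minimal sketch of that pattern, assuming TensorFlow 2.x with tf.keras rather than the standalone keras package used in the question. The function name build_use_model, the Dense classification head, and the num_classes parameter are hypothetical and only there for illustration. The trainable flag of hub.KerasLayer is presumably what the answer's training=true/false refers to. The input is declared with shape=() so that each example is a single string and no squeeze step is needed.

```python
def build_use_model(module_url, num_classes=2, trainable=False):
    """Build a small tf.keras model on top of a USE embedding layer.

    module_url: local path or hub handle of the USE SavedModel.
    num_classes: size of the illustrative softmax head (hypothetical).
    trainable: False for pure inference, True to fine-tune the encoder.
    """
    # Imported lazily so that defining this function does not itself
    # require TensorFlow to be installed.
    import tensorflow as tf
    import tensorflow_hub as hub

    # Each example is one raw sentence, so the per-example shape is a scalar.
    sentences = tf.keras.layers.Input(shape=(), dtype=tf.string, name="sentences")

    # hub.KerasLayer wraps the SavedModel as an ordinary Keras layer.
    use_layer = hub.KerasLayer(module_url, trainable=trainable, name="use")
    embeddings = use_layer(sentences)  # USE returns 512-dimensional vectors

    # Illustrative head only; replace with whatever the task needs.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(embeddings)
    return tf.keras.Model(inputs=sentences, outputs=outputs)


if __name__ == "__main__":
    # Path taken from the question; adjust it to wherever the model is stored.
    model = build_use_model("../emb_models/use/universal-sentence-encoder-large-5")
    model.summary()
```

Every layer here comes from tf.keras; the question's second snippet mixes tf.keras.layers.Input with Input and Lambda from the standalone keras package.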

Also, it is best to try the latest version of TF (e.g. TF 2.5) to rule out any issues caused by older TF versions.
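
Before upgrading, it can help to confirm which versions are actually installed in the active environment. The following is a rough sketch using only the standard library (Python 3.8+); the helper names installed_versions and is_tf2 are hypothetical, and the package list simply mirrors the ones named in the question.

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "tensorflow-gpu", "keras", "tensorflow-hub")):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def is_tf2(version):
    """True if a TensorFlow version string has major version 2 or higher."""
    if version is None:
        return False
    return int(version.split(".")[0]) >= 2


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

With the versions listed in the question, is_tf2("1.14.0") is False, which is the situation the answer suggests ruling out.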
