InvalidArgumentError: input must be a vector, got shape: []
【Posted】: 2020-07-14 09:43:18
【Question】: I am trying to use the universal sentence encoder to save the embeddings of text data in a new column of a pandas DataFrame, but I get an error.
This is what I am trying to do:
module_url = "https://tfhub.dev/google/universal-sentence-encoder/4" #@param ["https://tfhub.dev/google/universal-sentence-encoder/4", "https://tfhub.dev/google/universal-sentence-encoder-large/5"]
model = thub.load(module_url)
print ("module %s loaded" % module_url)
def embed(input):
    return model(input)
Then:
for t in list(df['title'].str.strip().iteritems()):
    df['new'] = np.array(embed(t[1]))
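The eventual goal is a dictionary that maps each text value in the df['title'] column to its embedding, for example 'how are you?': embedding.
But I could not do either, and I get the error in the title.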
InvalidArgumentError Traceback (most recent call last)
<ipython-input-26-79969d6e031c> in <module>
1 for t in list(df['title'].str.strip().iteritems()):
----> 2 df['new'] = np.array(embed(t[1]))
3
<ipython-input-7-c4fca4bebab0> in embed(input)
3 print ("module %s loaded" % module_url)
4 def embed(input):
----> 5 return model(input)
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\tensorflow_core\python\saved_model\load.py in _call_attribute(instance, *args, **kwargs)
436
437 def _call_attribute(instance, *args, **kwargs):
--> 438 return instance.__call__(*args, **kwargs)
439
440
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\tensorflow_core\python\eager\def_function.py in __call__(self, *args, **kwds)
566 xla_context.Exit()
567 else:
--> 568 result = self._call(*args, **kwds)
569
570 if tracing_count == self._get_tracing_count():
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\tensorflow_core\python\eager\def_function.py in _call(self, *args, **kwds)
604 # In this case we have not created variables on the first call. So we can
605 # run the first trace but we should fail if variables are created.
--> 606 results = self._stateful_fn(*args, **kwds)
607 if self._created_variables:
608 raise ValueError("Creating variables on a non-first call to a function"
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\tensorflow_core\python\eager\function.py in __call__(self, *args, **kwargs)
2361 with self._lock:
2362 graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
-> 2363 return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
2364
2365 @property
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\tensorflow_core\python\eager\function.py in _filtered_call(self, args, kwargs)
1609 if isinstance(t, (ops.Tensor,
1610 resource_variable_ops.BaseResourceVariable))),
-> 1611 self.captured_inputs)
1612
1613 def _call_flat(self, args, captured_inputs, cancellation_manager=None):
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\tensorflow_core\python\eager\function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1690 # No tape is watching; skip to running the function.
1691 return self._build_call_outputs(self._inference_function.call(
-> 1692 ctx, args, cancellation_manager=cancellation_manager))
1693 forward_backward = self._select_forward_and_backward_functions(
1694 args,
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\tensorflow_core\python\eager\function.py in call(self, ctx, args, cancellation_manager)
543 inputs=args,
544 attrs=("executor_type", executor_type, "config_proto", config),
--> 545 ctx=ctx)
546 else:
547 outputs = execute.execute_with_cancellation(
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\tensorflow_core\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
65 else:
66 message = e.message
---> 67 six.raise_from(core._status_to_exception(e.code, message), None)
68 except TypeError as e:
69 keras_symbolic_tensors = [
c:\users\sujee\desktop\environments\projectnlp\lib\site-packages\six.py in raise_from(value, from_value)
InvalidArgumentError: input must be a vector, got shape: []
[[node StatefulPartitionedCall/StatefulPartitionedCall/text_preprocessor/tokenize/StringSplit/StringSplit]] [Op:__inference_restored_function_body_5286]
Function call stack:
restored_function_body
I am new to TensorFlow, so I do not know how to fix this.
Here are some of the NumPy array values (embeddings) produced when printing print(np.array(embed(t[1]))):
https://paste.pythondiscord.com/pigaqumuha.py
【Answer 1】: Code that uses the universal sentence encoder to save the embeddings of text data in the New column of a pandas DataFrame is shown below, together with its output:
import tensorflow_hub as hub
import tensorflow as tf
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-large/5")
embeddings = embed([ "The quick brown fox jumps over the lazy dog.", "I am a sentence for which I would like to get its embedding"])
print(embeddings)
import pandas as pd
data = [ ["The quick brown fox jumps over the lazy dog."], ["I am a sentence for which I would like to get its embedding"]]
df = pd.DataFrame(data, columns = ['Title'])
print(df)
df['New'] = list(tf.keras.backend.eval(embeddings))
print(df)
The output is as follows:
tf.Tensor(
[[ 0.01305107 0.02235125 -0.03263275 ... -0.00565093 -0.0479303
-0.11492757]
[ 0.05833393 -0.0818501 0.06890941 ... -0.00923877 -0.08695354
-0.01415738]], shape=(2, 512), dtype=float32)
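The error message in the question says that the model's tokenizer received a scalar string (shape []) where it expects a 1-D batch of strings. The answer's code avoids this by passing a list of sentences. A minimal sketch of the same idea, which also builds the text-to-embedding dictionary the question asks for, could look like the following. The names embed_to_dict and toy_embed are hypothetical, and toy_embed is only a stand-in for the real encoder so the sketch runs without downloading the model.

```python
from typing import Callable, Dict, List, Sequence, Union


def embed_to_dict(
    texts: Union[str, Sequence[str]],
    embed_fn: Callable[[List[str]], Sequence[Sequence[float]]],
) -> Dict[str, List[float]]:
    """Map each text to its embedding, always calling embed_fn with a list."""
    # A bare string is a scalar (shape []); wrap it so the batch is 1-D.
    batch = [texts] if isinstance(texts, str) else list(texts)
    batch = [str(t).strip() for t in batch]
    # One call for the whole batch instead of one call per row.
    vectors = embed_fn(batch)
    # Note: duplicate texts collapse into a single dictionary key.
    return {text: list(vec) for text, vec in zip(batch, vectors)}


def toy_embed(batch: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for the encoder: rejects scalars like the real model."""
    if isinstance(batch, str):
        raise ValueError("input must be a vector, got shape: []")
    # Two made-up features per sentence: length and number of spaces.
    return [[float(len(s)), float(s.count(" "))] for s in batch]


result = embed_to_dict(["how are you?", " hello "], toy_embed)
single = embed_to_dict("how are you?", toy_embed)
print(result)
print(single)
```

With the real encoder, and assuming the model and df names from the question, embed_fn could be something like lambda batch: model(batch).numpy(). The whole column can then be filled in one step, for example df['new'] = list(model(df['title'].str.strip().tolist()).numpy()), rather than assigning df['new'] inside a loop, which overwrites the entire column on every iteration.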