InvalidArgumentError:找到 2 个根错误。 Tensorflow 文本分类模型中的不兼容形状
Posted
技术标签:
【中文标题】InvalidArgumentError:找到 2 个根错误。 Tensorflow 文本分类模型中的不兼容形状【英文标题】:InvalidArgumentError: 2 root error(s) found. Incompatible shapes in Tensorflow text-classification model 【发布时间】:2020-05-09 02:20:46 【问题描述】:我正在尝试从以下repo 获取代码,该代码基于此paper。它有很多错误,但我大部分都能正常工作。但是,我一直遇到同样的问题,我真的不明白如何解决这个问题/甚至出了什么问题。
第二次验证 if 语句标准时发生错误。第一次总是有效,然后在第二次中断。如果它有帮助,我将包括它在中断之前打印的输出。请参阅下面的错误:
step = 1, train_loss = 1204.7784423828125, train_accuracy = 0.13725490868091583
counter = 1, dev_loss = 1188.6639287274584, dev_accuacy = 0.2814199453625912
step = 2, train_loss = 1000.983154296875, train_accuracy = 0.26249998807907104
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py in _do_call(self, fn, *args)
1364 try:
-> 1365 return fn(*args)
1366 except errors.OpError as e:
7 frames
InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Incompatible shapes: [2,185] vs. [2,229]
[[node loss/cond/add_1]]
[[viterbi_decode/cond/rnn_1/while/Switch_3/_541]]
(1) Invalid argument: Incompatible shapes: [2,185] vs. [2,229]
[[node loss/cond/add_1]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
InvalidArgumentError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py in _do_call(self, fn, *args)
1382 '\nsession_config.graph_options.rewrite_options.'
1383 'disable_meta_optimizer = True')
-> 1384 raise type(e)(node_def, op, message)
1385
1386 def _extend_graph(self):
InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Incompatible shapes: [2,185] vs. [2,229]
[[node loss/cond/add_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[viterbi_decode/cond/rnn_1/while/Switch_3/_541]]
(1) Invalid argument: Incompatible shapes: [2,185] vs. [2,229]
[[node loss/cond/add_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'loss/cond/add_1':
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 664, in launch_instance
app.start()
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
ioloop.IOLoop.instance().start()
File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
self._handle_recv()
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-11-90859dc83f76>", line 66, in <module>
main()
File "<ipython-input-11-90859dc83f76>", line 12, in main
model = DAModel()
File "<ipython-input-9-682db36e2a23>", line 148, in __init__
self.logits, self.labels, self.dialogue_lengths)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/crf/python/ops/crf.py", line 257, in crf_log_likelihood
transition_params)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/crf/python/ops/crf.py", line 116, in crf_sequence_score
false_fn=_multi_seq_fn)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/layers/utils.py", line 202, in smart_cond
pred, true_fn=true_fn, false_fn=false_fn, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/smart_cond.py", line 59, in smart_cond
name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/control_flow_ops.py", line 1235, in cond
orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/control_flow_ops.py", line 1061, in BuildCondBranch
original_result = fn()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/crf/python/ops/crf.py", line 104, in _multi_seq_fn
unary_scores = crf_unary_score(tag_indices, sequence_lengths, inputs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/crf/python/ops/crf.py", line 287, in crf_unary_score
flattened_tag_indices = array_ops.reshape(offsets + tag_indices, [-1])
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/math_ops.py", line 899, in binary_op_wrapper
return func(x, y, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/math_ops.py", line 1197, in _add_dispatch
return gen_math_ops.add_v2(x, y, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_math_ops.py", line 549, in add_v2
"AddV2", x=x, y=y, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
这是代码(为了让它运行,它与 repo 略有不同:
版本: Python 3
张量流 == 1.15.0
熊猫 == 0.25.3
numpy == 1.17.5
import glob
import pandas as pd
import tensorflow as tf
import pandas as pd
import numpy as np
# preprocess data
file_list = []
for f in glob.glob('swda/*'):
file_list.append(f)
df_list = []
for i in file_list:
df = pd.read_csv(i)
df_list.append(df)
text_list = []
label_list = []
for df in df_list:
df['utterance_no_specialchar_'] = df.utterance_no_specialchar.astype(str)
text = df.utterance_no_specialchar_.tolist()
labels = df.da_category.tolist()
text_list.append(text)
label_list.append(labels)
### new preprocessing step
text_list = [[[j] for j in i] for i in text_list]
tok_data = [y[0] for x in text_list for y in x]
tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(tok_data)
sequences = []
for x in text_list:
tmp = []
for y in x:
tmp.append(tokenizer.texts_to_sequences(y)[0])
sequences.append(tmp)
def _pad_sequences(sequences, pad_tok, max_length):
"""
Args:
sequences: a generator of list or tuple
pad_tok: the char to pad with
Returns:
a list of list where each sublist has same length
"""
sequence_padded, sequence_length = [], []
for seq in sequences:
seq = list(seq)
seq_ = seq[:max_length] + [pad_tok]*max(max_length - len(seq), 0)
sequence_padded += [seq_]
sequence_length += [min(len(seq), max_length)]
return sequence_padded, sequence_length
def pad_sequences(sequences, pad_tok, nlevels=1):
"""
Args:
sequences: a generator of list or tuple
pad_tok: the char to pad with
nlevels: "depth" of padding, for the case where we have characters ids
Returns:
a list of list where each sublist has same length
"""
if nlevels == 1:
max_length = max(map(lambda x : len(x), sequences))
sequence_padded, sequence_length = _pad_sequences(sequences,
pad_tok, max_length)
elif nlevels == 2:
max_length_word = max([max(map(lambda x: len(x), seq))
for seq in sequences])
sequence_padded, sequence_length = [], []
for seq in sequences:
# all words are same length now
sp, sl = _pad_sequences(seq, pad_tok, max_length_word)
sequence_padded += [sp]
sequence_length += [sl]
max_length_sentence = max(map(lambda x : len(x), sequences))
sequence_padded, _ = _pad_sequences(sequence_padded,
[pad_tok]*max_length_word, max_length_sentence)
sequence_length, _ = _pad_sequences(sequence_length, 0,
max_length_sentence)
return sequence_padded, sequence_length
def minibatches(data, labels, batch_size):
data_size = len(data)
start_index = 0
num_batches_per_epoch = int((len(data) + batch_size - 1) / batch_size)
for batch_num in range(num_batches_per_epoch):
start_index = batch_num * batch_size
end_index = min((batch_num + 1) * batch_size, data_size)
yield data[start_index: end_index], labels[start_index: end_index]
def select(parameters, length):
"""Select the last valid time step output as the sentence embedding
:params parameters: [batch, seq_len, hidden_dims]
:params length: [batch]
:Returns : [batch, hidden_dims]
"""
shape = tf.shape(parameters)
idx = tf.range(shape[0])
idx = tf.stack([idx, length - 1], axis = 1)
return tf.gather_nd(parameters, idx)
class DAModel():
def __init__(self):
with tf.variable_scope("placeholder"):
self.dialogue_lengths = tf.placeholder(tf.int32, shape = [None], name = "dialogue_lengths")
self.word_ids = tf.placeholder(tf.int32, shape = [None,None,None], name = "word_ids")
self.utterance_lengths = tf.placeholder(tf.int32, shape = [None, None], name = "utterance_lengths")
self.labels = tf.placeholder(tf.int32, shape = [None, None], name = "labels")
self.clip = tf.placeholder(tf.float32, shape = [], name = 'clip')
######################## EMBEDDINGS ###########################################
with tf.variable_scope("embeddings"):
_word_embeddings = tf.get_variable(
name = "_word_embeddings",
dtype = tf.float32,
shape = [words, word_dim],
initializer = tf.random_uniform_initializer()
)
word_embeddings = tf.nn.embedding_lookup(_word_embeddings,self.word_ids, name="word_embeddings")
self.word_embeddings = tf.nn.dropout(word_embeddings, 0.8)
with tf.variable_scope("utterance_encoder"):
s = tf.shape(self.word_embeddings)
batch_size = s[0] * s[1]
time_step = s[-2]
word_embeddings = tf.reshape(self.word_embeddings, [batch_size, time_step, word_dim])
length = tf.reshape(self.utterance_lengths, [batch_size])
fw = tf.nn.rnn_cell.LSTMCell(hidden_size_lstm_1, forget_bias=0.8, state_is_tuple= True)
bw = tf.nn.rnn_cell.LSTMCell(hidden_size_lstm_1, forget_bias=0.8, state_is_tuple= True)
output, _ = tf.nn.bidirectional_dynamic_rnn(fw, bw, word_embeddings,sequence_length=length, dtype = tf.float32)
output = tf.concat(output, axis = -1) # [batch_size, time_step, dim]
# Select the last valid time step output as the utterance embedding,
# this method is more concise than TensorArray with while_loop
# output = select(output, self.utterance_lengths) # [batch_size, dim]
output = select(output, length) # [batch_size, dim]
# output = tf.reshape(output, s[0], s[1], 2 * hidden_size_lstm_1)
output = tf.reshape(output, [s[0], s[1], 2 * hidden_size_lstm_1])
output = tf.nn.dropout(output, 0.8)
with tf.variable_scope("bi-lstm"):
cell_fw = tf.contrib.rnn.BasicLSTMCell(hidden_size_lstm_2, state_is_tuple = True)
cell_bw = tf.contrib.rnn.BasicLSTMCell(hidden_size_lstm_2, state_is_tuple = True)
(output_fw, output_bw), _ = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, output, sequence_length = self.dialogue_lengths, dtype = tf.float32)
outputs = tf.concat([output_fw, output_bw], axis = -1)
outputs = tf.nn.dropout(outputs, 0.8)
with tf.variable_scope("proj1"):
output = tf.reshape(outputs, [-1, 2 * hidden_size_lstm_2])
W = tf.get_variable("W", dtype = tf.float32, shape = [2 * hidden_size_lstm_2, proj1], initializer= tf.contrib.layers.xavier_initializer())
b = tf.get_variable("b", dtype = tf.float32, shape = [proj1], initializer=tf.zeros_initializer())
output = tf.nn.relu(tf.matmul(output, W) + b)
with tf.variable_scope("proj2"):
W = tf.get_variable("W", dtype = tf.float32, shape = [proj1, proj2], initializer= tf.contrib.layers.xavier_initializer())
b = tf.get_variable("b", dtype = tf.float32, shape = [proj2], initializer=tf.zeros_initializer())
output = tf.nn.relu(tf.matmul(output, W) + b)
with tf.variable_scope("logits"):
nstep = tf.shape(outputs)[1]
W = tf.get_variable("W", dtype = tf.float32,shape=[proj2, tags], initializer = tf.random_uniform_initializer())
b = tf.get_variable("b", dtype = tf.float32,shape = [tags],initializer=tf.zeros_initializer())
pred = tf.matmul(output, W) + b
self.logits = tf.reshape(pred, [-1, nstep, tags])
with tf.variable_scope("loss"):
log_likelihood, self.trans_params = tf.contrib.crf.crf_log_likelihood(
self.logits, self.labels, self.dialogue_lengths)
self.loss = tf.reduce_mean(-log_likelihood) + tf.nn.l2_loss(W) + tf.nn.l2_loss(b)
#tf.summary.scalar("loss", self.loss)
with tf.variable_scope("viterbi_decode"):
viterbi_sequence, _ = tf.contrib.crf.crf_decode(self.logits, self.trans_params, self.dialogue_lengths)
batch_size = tf.shape(self.dialogue_lengths)[0]
output_ta = tf.TensorArray(dtype = tf.float32, size = 1, dynamic_size = True)
def body(time, output_ta_1):
length = self.dialogue_lengths[time]
vcode = viterbi_sequence[time][:length]
true_labs = self.labels[time][:length]
accurate = tf.reduce_sum(tf.cast(tf.equal(vcode, true_labs), tf.float32))
output_ta_1 = output_ta_1.write(time, accurate)
return time + 1, output_ta_1
def condition(time, output_ta_1):
return time < batch_size
i = 0
[time, output_ta] = tf.while_loop(condition, body, loop_vars = [i, output_ta])
output_ta = output_ta.stack()
accuracy = tf.reduce_sum(output_ta)
self.accuracy = accuracy / tf.reduce_sum(tf.cast(self.dialogue_lengths, tf.float32))
#tf.summary.scalar("accuracy", self.accuracy)
with tf.variable_scope("train_op"):
optimizer = tf.train.AdagradOptimizer(0.1)
#if tf.greater(self.clip , 0):
grads, vs = zip(*optimizer.compute_gradients(self.loss))
grads, gnorm = tf.clip_by_global_norm(grads, self.clip)
self.train_op = optimizer.apply_gradients(zip(grads, vs))
#else:
# self.train_op = optimizer.minimize(self.loss)
#self.merged = tf.summary.merge_all()
### Set model variables
hidden_size_lstm_1 = 200
hidden_size_lstm_2 = 200
tags = 39 # assuming number of classes to predict?
word_dim = 300
proj1 = 200
proj2 = 100
words = 20001
# words = 8759 + 1 # max(num_unique_word_tokens)
batchSize = 2
log_dir = "train"
model_dir = "DAModel"
model_name = "ckpt"
### Run model
def main():
# tokenize and vectorize text data to prepare for embedding
train_data = sequences[:75]
train_labels = label_list[:75]
dev_data = sequences[75:]
dev_labels = label_list[75:]
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
with tf.Session(config = config) as sess:
model = DAModel()
sess.run(tf.global_variables_initializer())
clip = 2
saver = tf.train.Saver()
#writer = tf.summary.FileWriter("D:\\Experimemts\\tensorflow\\DA\\train", sess.graph)
writer = tf.summary.FileWriter("train", sess.graph)
counter = 0
for epoch in range(10):
for dialogues, labels in minibatches(train_data, train_labels, batchSize):
_, dialogue_lengthss = pad_sequences(dialogues, 0)
word_idss, utterance_lengthss = pad_sequences(dialogues, 0, nlevels = 2)
true_labs = labels
labs_t, _ = pad_sequences(true_labs, 0)
counter += 1
train_loss, train_accuracy, _ = sess.run([model.loss, model.accuracy,model.train_op], feed_dict = model.word_ids: word_idss, model.utterance_lengths: utterance_lengthss, model.dialogue_lengths: dialogue_lengthss, model.labels:labs_t, model.clip :clip )
#writer.add_summary(summary, global_step = counter)
print("step = , train_loss = , train_accuracy = ".format(counter, train_loss, train_accuracy))
train_precision_summ = tf.Summary()
train_precision_summ.value.add(
tag='train_accuracy', simple_value=train_accuracy)
writer.add_summary(train_precision_summ, counter)
train_loss_summ = tf.Summary()
train_loss_summ.value.add(
tag='train_loss', simple_value=train_loss)
writer.add_summary(train_loss_summ, counter)
if counter % 1 == 0:
loss_dev = []
acc_dev = []
for dev_dialogues, dev_labels in minibatches(dev_data, dev_labels, batchSize):
_, dialogue_lengthss = pad_sequences(dev_dialogues, 0)
word_idss, utterance_lengthss = pad_sequences(dev_dialogues, 0, nlevels = 2)
true_labs = dev_labels
labs_t, _ = pad_sequences(true_labs, 0)
dev_loss, dev_accuacy = sess.run([model.loss, model.accuracy], feed_dict = model.word_ids: word_idss, model.utterance_lengths: utterance_lengthss, model.dialogue_lengths: dialogue_lengthss, model.labels:labs_t)
loss_dev.append(dev_loss)
acc_dev.append(dev_accuacy)
valid_loss = sum(loss_dev) / len(loss_dev)
valid_accuracy = sum(acc_dev) / len(acc_dev)
dev_precision_summ = tf.Summary()
dev_precision_summ.value.add(
tag='dev_accuracy', simple_value=valid_accuracy)
writer.add_summary(dev_precision_summ, counter)
dev_loss_summ = tf.Summary()
dev_loss_summ.value.add(
tag='dev_loss', simple_value=valid_loss)
writer.add_summary(dev_loss_summ, counter)
print("counter = , dev_loss = , dev_accuacy = ".format(counter, valid_loss, valid_accuracy))
if __name__ == "__main__":
tf.reset_default_graph()
main()
数据来自here,如下所示:
[[['what '],
['do you want to start '],
['f uh laughter you hit you hit f uh '],
['it doesnt matter '],
['f um were discussing the capital punishment i believe '],
['right '],
['you are right '],
['yeah '],
[' i i suppose i should have '],
['f uh which '],
['i am am pro capital punishment except that i dont like the way its done '],
['uhhuh '],
['f uh yeah '],
['f uh i f uh i guess i i hate to see anyone die f uh ']
...
]]
可以在这里找到训练模型的数据集: https://github.com/cmeaton/Hierarchical_BiLSTM-CRF_Encoder/tree/master/swda_parsed
我很难理解这个错误的含义以及如何理解它。任何建议将不胜感激。谢谢。
【问题讨论】:
这是语音识别吗?什么是data_list
和label_list
变量?我知道您已经为其中一个(可能是label_list
)提供了数据,但是有没有办法为另一个获取一些样本?
@thushv89 谢谢。这不是语音识别,而是对话行为分类。 “序列”是我用作 X 的东西。它是标记化的文本数据。 label_list 是我的 y,我试图预测的类。上面的要点中提供了每个示例。
感谢您的数据。你真的可以减少数据的大小(可能是 10 个样本)并格式化它。由于缩进问题,无法复制和粘贴。
@thushv89 没问题,我更新了要点链接。它包括 2 项文本/令牌数据及其标签,每项限制为 5 个样本。
@connor449 该错误不可重现。您能否提供一个最小代码来重现您的数据样本中的错误?
【参考方案1】:
简介
我认为主要问题是您提供给sess.run
的数组(或矩阵或其他结构)的大小不匹配。特别是当你打电话时:
train_loss, train_accuracy, _ = sess.run([model.loss, model.accuracy,model.train_op], feed_dict = model.word_ids: word_idss, model.utterance_lengths: utterance_lengthss, model.dialogue_lengths: dialogue_lengthss, model.labels:labs_t, model.clip :clip )
更具体地说,这里的这个错误暗示这是一个不匹配的问题:
tensorflow.python.framework.errors_impl.InvalidArgumentError:
indices[317] = [317, -1] does not index into param shape [318,39,400]
[[node utterance_encoder/GatherNd]]
我认为在全新安装上运行可能会导致无错误运行。
我收到了类似的错误,但也收到了完整的警告列表。 请注意我在 Windows 7 上运行并使用 python 3.6.1。
版本
我尝试了以下 tensorflow 版本,但没有成功:
tf 1.15 tf 1.14 tf 1.13.1 tf 1.12 tf 1.11 tf 1.10 tf 1.10 将 keras 降级为 2.2.1步骤
已安装 python 3.6.1(支持的 tensorflow 版本)。为所有用户安装。设置路径。安装在 C:\Python36 pip3 install --user --upgrade tensorflow==1.15 pip3 install --user --upgrade pandas == 0.25.3 pip3 install --user --upgrade numpy == 1.17.5 下载以下内容:https://github.com/cmeaton/Hierarchical_BiLSTM-CRF_Encoder/tree/master/swda_parsed 运行提供的代码结果(包括许多警告)
我认为以下可能很重要:
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[317] = [317, -1] does not index into param shape [318,39,400]
[[node utterance_encoder/GatherNd]]
完整跟踪
WARNING:tensorflow:From test.py:313: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.
WARNING:tensorflow:From test.py:256: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From test.py:259: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2020-01-31 12:13:10.096283: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
WARNING:tensorflow:From test.py:119: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From test.py:121: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From test.py:130: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
WARNING:tensorflow:From test.py:137: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From test.py:147: LSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
WARNING:tensorflow:From test.py:150: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\rnn.py:464: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\rnn_cell_impl.py:958: Layer.add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\rnn_cell_impl.py:962: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\rnn.py:244: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From test.py:163: BasicLSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
WARNING:tensorflow:From test.py:223: The name tf.train.AdagradOptimizer is deprecated. Please use tf.compat.v1.train.AdagradOptimizer instead.
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\training\adagrad.py:76: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From test.py:261: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.
WARNING:tensorflow:From test.py:263: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
WARNING:tensorflow:From test.py:265: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
2020-01-31 12:13:16.563989: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at gather_nd_op.cc:47 : Invalid argument: indices[317] = [317, -1] does not index into param shape [318,39,400]
Traceback (most recent call last):
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
return fn(*args)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
target_list, run_metadata)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[317] = [317, -1] does not index into param shape [318,39,400]
[[node utterance_encoder/GatherNd]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 314, in <module>
main()
File "test.py", line 274, in main
train_loss, train_accuracy, _ = sess.run([model.loss, model.accuracy,model.train_op], feed_dict = model.word_ids: word_idss, model.utterance_lengths: utterance_lengthss, model.dialogue_lengths: dialogue_lengthss, model.labels:labs_t, model.clip :clip )
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 956, in run
run_metadata_ptr)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
run_metadata)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[317] = [317, -1] does not index into param shape [318,39,400]
[[node utterance_encoder/GatherNd (defined at D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
Original stack trace for 'utterance_encoder/GatherNd':
File "test.py", line 314, in <module>
main()
File "test.py", line 260, in main
model = DAModel()
File "test.py", line 155, in __init__
output = select(output, length) # [batch_size, dim]
File "test.py", line 114, in select
return tf.gather_nd(parameters, idx)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\array_ops.py", line 4277, in gather_nd
return gen_array_ops.gather_nd(params, indices, name=name)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 3975, in gather_nd
"GatherNd", params=params, indices=indices, name=name)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
【讨论】:
【参考方案2】:让我们关注错误:
Invalid argument: Incompatible shapes: [2,185] vs. [2,229]
问题似乎是两个张量之间的运算失败,因为它们的形状不兼容。
您选择的tensorflow
版本可能比作者使用的版本更宽松。
根据this issue,作者猜测他使用了tensorflow==1.8
。
所以首先我建议你尝试使用这个早期版本,或者之前\之后的其他版本(1.7、1.9、1.10 等)。
此外,早期版本可能没有像今天那样集成keras
包,因此您可能还想使用特定的keras
版本。
例如根据this issue,降级到keras==2.2.2
有帮助。
如果这没有帮助,也许其中之一会:1、2、3、4、5、6
【讨论】:
感谢您的回复。我降级到 tensorflow==1.8 和 keras==2.2.0 和同样的错误以上是关于InvalidArgumentError:找到 2 个根错误。 Tensorflow 文本分类模型中的不兼容形状的主要内容,如果未能解决你的问题,请参考以下文章
InvalidArgumentError: input_1:0 已输入和提取
TensorFlow 2.0: InvalidArgumentError: device CUDA:0 not supported by XLA service while setup XLA_GPU
InvalidArgumentError索引[i,0] = x不在keras的[0,x]中