PyTorch: IndexError: index out of range in self. How to solve?
Posted: 2020-09-16 17:44:58

【Problem description】: This training code is based on the run_glue.py script found here:
# Set the seed value all over the place to make this reproducible.
seed_val = 42
random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)
# Store the average loss after each epoch so we can plot them.
loss_values = []
# For each epoch...
for epoch_i in range(0, epochs):

    # ========================================
    #               Training
    # ========================================

    # Perform one full pass over the training set.

    print("")
    print('======== Epoch {:} / {:} ========'.format(epoch_i + 1, epochs))
    print('Training...')

    # Measure how long the training epoch takes.
    t0 = time.time()

    # Reset the total loss for this epoch.
    total_loss = 0

    # Put the model into training mode. Don't be misled--the call to
    # `train` just changes the *mode*, it doesn't *perform* the training.
    # `dropout` and `batchnorm` layers behave differently during training
    # vs. test (source: https://stackoverflow.com/questions/51433378/what-does-model-train-do-in-pytorch)
    model.train()

    # For each batch of training data...
    for step, batch in enumerate(train_dataloader):

        # Progress update every 100 batches.
        if step % 100 == 0 and not step == 0:
            # Calculate elapsed time in minutes.
            elapsed = format_time(time.time() - t0)
            # Report progress.
            print('  Batch {:>5,}  of  {:>5,}.    Elapsed: {:}.'.format(step, len(train_dataloader), elapsed))

        # Unpack this training batch from our dataloader.
        #
        # As we unpack the batch, we'll also copy each tensor to the GPU using the
        # `to` method.
        #
        # `batch` contains three pytorch tensors:
        #   [0]: input ids
        #   [1]: attention masks
        #   [2]: labels
        b_input_ids = batch[0].to(device)
        b_input_mask = batch[1].to(device)
        b_labels = batch[2].to(device)

        # Always clear any previously calculated gradients before performing a
        # backward pass. PyTorch doesn't do this automatically because
        # accumulating the gradients is "convenient while training RNNs".
        # (source: https://stackoverflow.com/questions/48001598/why-do-we-need-to-call-zero-grad-in-pytorch)
        model.zero_grad()

        # Perform a forward pass (evaluate the model on this training batch).
        # This will return the loss (rather than the model output) because we
        # have provided the `labels`.
        # The documentation for this `model` function is here:
        # https://huggingface.co/transformers/v2.2.0/model_doc/bert.html#transformers.BertForSequenceClassification
        outputs = model(b_input_ids,
                        token_type_ids=None,
                        attention_mask=b_input_mask,
                        labels=b_labels)

        # The call to `model` always returns a tuple, so we need to pull the
        # loss value out of the tuple.
        loss = outputs[0]

        # Accumulate the training loss over all of the batches so that we can
        # calculate the average loss at the end. `loss` is a Tensor containing a
        # single value; the `.item()` function just returns the Python value
        # from the tensor.
        total_loss += loss.item()

        # Perform a backward pass to calculate the gradients.
        loss.backward()

        # Clip the norm of the gradients to 1.0.
        # This is to help prevent the "exploding gradients" problem.
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)

        # Update parameters and take a step using the computed gradient.
        # The optimizer dictates the "update rule"--how the parameters are
        # modified based on their gradients, the learning rate, etc.
        optimizer.step()

        # Update the learning rate.
        scheduler.step()

    # Calculate the average loss over the training data.
    avg_train_loss = total_loss / len(train_dataloader)

    # Store the loss value for plotting the learning curve.
    loss_values.append(avg_train_loss)

    print("")
    print("  Average training loss: {0:.2f}".format(avg_train_loss))
    print("  Training epoch took: {:}".format(format_time(time.time() - t0)))

    # ========================================
    #               Validation
    # ========================================

    # After the completion of each training epoch, measure our performance on
    # our validation set.

    print("")
    print("Running Validation...")

    t0 = time.time()

    # Put the model in evaluation mode--the dropout layers behave differently
    # during evaluation.
    model.eval()

    # Tracking variables
    eval_loss, eval_accuracy = 0, 0
    nb_eval_steps, nb_eval_examples = 0, 0

    # Evaluate data for one epoch
    for batch in validation_dataloader:

        # Add batch to GPU
        batch = tuple(t.to(device) for t in batch)

        # Unpack the inputs from our dataloader
        b_input_ids, b_input_mask, b_labels = batch

        # Telling the model not to compute or store gradients, saving memory and
        # speeding up validation
        with torch.no_grad():
            # Forward pass, calculate logit predictions.
            # This will return the logits rather than the loss because we have
            # not provided labels.
            # token_type_ids is the same as the "segment ids", which
            # differentiates sentence 1 and 2 in 2-sentence tasks.
            # The documentation for this `model` function is here:
            # https://huggingface.co/transformers/v2.2.0/model_doc/bert.html#transformers.BertForSequenceClassification
            outputs = model(b_input_ids,
                            token_type_ids=None,
                            attention_mask=b_input_mask)

        # Get the "logits" output by the model. The "logits" are the output
        # values prior to applying an activation function like the softmax.
        logits = outputs[0]

        # Move logits and labels to CPU
        logits = logits.detach().cpu().numpy()
        label_ids = b_labels.to('cpu').numpy()

        # Calculate the accuracy for this batch of test sentences.
        tmp_eval_accuracy = flat_accuracy(logits, label_ids)

        # Accumulate the total accuracy.
        eval_accuracy += tmp_eval_accuracy

        # Track the number of batches
        nb_eval_steps += 1

    # Report the final accuracy for this validation run.
    print("  Accuracy: {0:.2f}".format(eval_accuracy/nb_eval_steps))
    print("  Validation took: {:}".format(format_time(time.time() - t0)))

print("")
print("Training complete!")
The error below is what I get while training a BERT model for text classification:
~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/sparse.py in forward(self, input)
112 return F.embedding(
113 input, self.weight, self.padding_idx, self.max_norm,
--> 114 self.norm_type, self.scale_grad_by_freq, self.sparse)
115
116 def extra_repr(self):
~/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
1722 # remove once script supports set_grad_enabled
1723 _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1724 return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
1725
1726
IndexError: index out of range in self
How do I solve this?
【Comments】:
Where exactly in the code you provided does the error occur? It's not clear from your post, since you only show an error raised inside the PyTorch package. Since you say this happens during training, I assume it is raised in the forward pass. Make sure `b_input_ids` and `b_input_mask` are what you expect (non-empty lists or numpy arrays?) before passing them to `model(...)`.
【Answer 1】:
I think you have mismatched the declared input dimension of `torch.nn.Embedding` with your actual input. `torch.nn.Embedding` is a simple lookup table that stores embeddings of a fixed dictionary and size. Any input less than zero, or greater than the declared input dimension minus one, raises this error. Compare your input against the dimensions declared for your `torch.nn.Embedding`.
Here is a code snippet that reproduces the problem:
import torch
from torch import nn

input_dim = 10
embedding_dim = 2
embedding = nn.Embedding(input_dim, embedding_dim)

err = True
if err:
    # Any input greater than input_dim - 1 (here input_dim = 10),
    # or any input less than zero, raises the IndexError.
    input_to_embed = torch.tensor([10])
else:
    input_to_embed = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
embed = embedding(input_to_embed)
print(embed)
Hope this solves your issue.
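For the BERT model in the question, the analogue of `input_dim` is the vocabulary size of the word-embedding layer. A quick check, assuming `model` and `b_input_ids` as defined in the question:

# Every token id must lie in [0, vocab_size - 1]; anything outside that
# range triggers the IndexError inside the embedding lookup.
vocab_size = model.config.vocab_size  # 30522 for bert-base-uncased
print(b_input_ids.min().item(), b_input_ids.max().item(), vocab_size)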
【Discussion】:
What is going on here? Where do 10 and 2 come from? Where does this code go? I don't see `torch.nn.Embedding` being declared anywhere.

【Answer 2】:
The last time I got this same `IndexError: index out of range in self` with BERT, it was because my input text was too long and my tokenizer was producing more than 512 tokens for it. I solved it by truncating the token arrays at 512.
encoded_input = tokenizer(text, return_tensors='pt')
# {'input_ids': tensor([[    0, 12350, ...,   363,     2]]),
#  'attention_mask': tensor([[1, 1, ..., 1, 1]])}
encoded_input_trc = {}
for k, v in encoded_input.items():
    v_truncated = v[:, :512]  # keep at most the first 512 tokens
    encoded_input_trc[k] = v_truncated
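Note that a standard Hugging Face tokenizer can also truncate for you; assuming `tokenizer` and `text` as above, something like the following should produce pre-truncated tensors directly:

encoded_input = tokenizer(text, truncation=True, max_length=512, return_tensors='pt')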
【Discussion】:
【Answer 3】:
I found I got this error when there were some invalid label values in my data. Once I fixed those, the error went away as well.
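A quick sanity check along those lines, assuming integer class labels in a list `labels` and a classifier declared with `num_labels` classes (both names are placeholders for whatever your code uses):

import numpy as np
labels = np.asarray(labels)
# For BertForSequenceClassification / CrossEntropyLoss, every label must
# be an integer in [0, num_labels - 1].
assert labels.min() >= 0 and labels.max() < num_labels, "label out of range"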
【Discussion】: