PyTorch: IndexError: index out of range in self. How to solve?
Posted: 2020-09-16 17:44:58

【Problem description】: This training code is based on the run_glue.py script found here:
# Set the seed value all over the place to make this reproducible.
seed_val = 42
random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)
# Store the average loss after each epoch so we can plot them.
loss_values = []
# For each epoch...
for epoch_i in range(0, epochs):

    # ========================================
    #               Training
    # ========================================

    # Perform one full pass over the training set.

    print("")
    print('======== Epoch {:} / {:} ========'.format(epoch_i + 1, epochs))
    print('Training...')

    # Measure how long the training epoch takes.
    t0 = time.time()

    # Reset the total loss for this epoch.
    total_loss = 0

    # Put the model into training mode. Don't be misled--the call to
    # `train` just changes the *mode*, it doesn't *perform* the training.
    # `dropout` and `batchnorm` layers behave differently during training
    # vs. test (source: https://stackoverflow.com/questions/51433378/what-does-model-train-do-in-pytorch)
    model.train()

    # For each batch of training data...
    for step, batch in enumerate(train_dataloader):

        # Progress update every 100 batches.
        if step % 100 == 0 and not step == 0:
            # Calculate elapsed time in minutes.
            elapsed = format_time(time.time() - t0)
            # Report progress.
            print('  Batch {:>5,}  of  {:>5,}.    Elapsed: {:}.'.format(step, len(train_dataloader), elapsed))

        # Unpack this training batch from our dataloader.
        #
        # As we unpack the batch, we'll also copy each tensor to the GPU using the
        # `to` method.
        #
        # `batch` contains three pytorch tensors:
        #   [0]: input ids
        #   [1]: attention masks
        #   [2]: labels
        b_input_ids = batch[0].to(device)
        b_input_mask = batch[1].to(device)
        b_labels = batch[2].to(device)

        # Always clear any previously calculated gradients before performing a
        # backward pass. PyTorch doesn't do this automatically because
        # accumulating the gradients is "convenient while training RNNs".
        # (source: https://stackoverflow.com/questions/48001598/why-do-we-need-to-call-zero-grad-in-pytorch)
        model.zero_grad()

        # Perform a forward pass (evaluate the model on this training batch).
        # This will return the loss (rather than the model output) because we
        # have provided the `labels`.
        # The documentation for this `model` function is here:
        # https://huggingface.co/transformers/v2.2.0/model_doc/bert.html#transformers.BertForSequenceClassification
        outputs = model(b_input_ids,
                        token_type_ids=None,
                        attention_mask=b_input_mask,
                        labels=b_labels)

        # The call to `model` always returns a tuple, so we need to pull the
        # loss value out of the tuple.
        loss = outputs[0]

        # Accumulate the training loss over all of the batches so that we can
        # calculate the average loss at the end. `loss` is a Tensor containing a
        # single value; the `.item()` function just returns the Python value
        # from the tensor.
        total_loss += loss.item()

        # Perform a backward pass to calculate the gradients.
        loss.backward()

        # Clip the norm of the gradients to 1.0.
        # This is to help prevent the "exploding gradients" problem.
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)

        # Update parameters and take a step using the computed gradient.
        # The optimizer dictates the "update rule"--how the parameters are
        # modified based on their gradients, the learning rate, etc.
        optimizer.step()

        # Update the learning rate.
        scheduler.step()

    # Calculate the average loss over the training data.
    avg_train_loss = total_loss / len(train_dataloader)

    # Store the loss value for plotting the learning curve.
    loss_values.append(avg_train_loss)

    print("")
    print("  Average training loss: {0:.2f}".format(avg_train_loss))
    print("  Training epoch took: {:}".format(format_time(time.time() - t0)))

    # ========================================
    #               Validation
    # ========================================

    # After the completion of each training epoch, measure our performance on
    # our validation set.

    print("")
    print("Running Validation...")

    t0 = time.time()

    # Put the model in evaluation mode--the dropout layers behave differently
    # during evaluation.
    model.eval()

    # Tracking variables
    eval_loss, eval_accuracy = 0, 0
    nb_eval_steps, nb_eval_examples = 0, 0

    # Evaluate data for one epoch
    for batch in validation_dataloader:

        # Add batch to GPU
        batch = tuple(t.to(device) for t in batch)

        # Unpack the inputs from our dataloader
        b_input_ids, b_input_mask, b_labels = batch

        # Telling the model not to compute or store gradients, saving memory and
        # speeding up validation
        with torch.no_grad():
            # Forward pass, calculate logit predictions.
            # This will return the logits rather than the loss because we have
            # not provided labels.
            # token_type_ids is the same as the "segment ids", which
            # differentiates sentence 1 and 2 in 2-sentence tasks.
            # The documentation for this `model` function is here:
            # https://huggingface.co/transformers/v2.2.0/model_doc/bert.html#transformers.BertForSequenceClassification
            outputs = model(b_input_ids,
                            token_type_ids=None,
                            attention_mask=b_input_mask)

        # Get the "logits" output by the model. The "logits" are the output
        # values prior to applying an activation function like the softmax.
        logits = outputs[0]

        # Move logits and labels to CPU
        logits = logits.detach().cpu().numpy()
        label_ids = b_labels.to('cpu').numpy()

        # Calculate the accuracy for this batch of test sentences.
        tmp_eval_accuracy = flat_accuracy(logits, label_ids)

        # Accumulate the total accuracy.
        eval_accuracy += tmp_eval_accuracy

        # Track the number of batches
        nb_eval_steps += 1

    # Report the final accuracy for this validation run.
    print("  Accuracy: {0:.2f}".format(eval_accuracy/nb_eval_steps))
    print("  Validation took: {:}".format(format_time(time.time() - t0)))

print("")
print("Training complete!")
The error below is what I get while training a BERT model for text classification:
~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/sparse.py in forward(self, input)
112 return F.embedding(
113 input, self.weight, self.padding_idx, self.max_norm,
--> 114 self.norm_type, self.scale_grad_by_freq, self.sparse)
115
116 def extra_repr(self):
~/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
1722 # remove once script supports set_grad_enabled
1723 _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1724 return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
1725
1726
IndexError: index out of range in self
How do I solve this?
【Comments】:
Where exactly in the code you provided does the error occur? It's not clear from your post, since you only show an error raised inside the PyTorch package. Since you say this happens during training, I assume it is raised in the forward pass. Make sure `b_input_ids` and `b_input_mask` are what you expect (non-empty lists or numpy arrays?) before passing them to `model(...)`.
【Answer 1】:
I think you have mismatched the declared input dimension of `torch.nn.Embedding` with your actual input. `torch.nn.Embedding` is a simple lookup table that stores embeddings of a fixed dictionary and size. Any input less than zero, or greater than the declared input dimension minus one, raises this error. Compare your input against the dimensions declared for your `torch.nn.Embedding`.
Here is a code snippet that reproduces the problem:
import torch
from torch import nn

input_dim = 10
embedding_dim = 2
embedding = nn.Embedding(input_dim, embedding_dim)

err = True
if err:
    # Any input greater than input_dim - 1 (here input_dim = 10),
    # or any input less than zero, raises the IndexError.
    input_to_embed = torch.tensor([10])
else:
    input_to_embed = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
embed = embedding(input_to_embed)
print(embed)
Hope this solves your issue.
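For the BERT model in the question, the analogue of `input_dim` is the vocabulary size of the word-embedding layer. A quick check, assuming `model` and `b_input_ids` as defined in the question:

# Every token id must lie in [0, vocab_size - 1]; anything outside that
# range triggers the IndexError inside the embedding lookup.
vocab_size = model.config.vocab_size  # 30522 for bert-base-uncased
print(b_input_ids.min().item(), b_input_ids.max().item(), vocab_size)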
【Discussion】:
What is going on here? Where do 10 and 2 come from? Where does this code go? I don't see `torch.nn.Embedding` being declared anywhere.

【Answer 2】:
The last time I got this same `IndexError: index out of range in self` with BERT, it was because my input text was too long and my tokenizer was producing more than 512 tokens for it. I solved it by truncating the token arrays at 512.
encoded_input = tokenizer(text, return_tensors='pt')
# {'input_ids': tensor([[    0, 12350, ...,   363,     2]]),
#  'attention_mask': tensor([[1, 1, ..., 1, 1]])}
encoded_input_trc = {}
for k, v in encoded_input.items():
    v_truncated = v[:, :512]  # keep at most the first 512 tokens
    encoded_input_trc[k] = v_truncated
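Note that a standard Hugging Face tokenizer can also truncate for you; assuming `tokenizer` and `text` as above, something like the following should produce pre-truncated tensors directly:

encoded_input = tokenizer(text, truncation=True, max_length=512, return_tensors='pt')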
【Discussion】:
【Answer 3】:
I found I got this error when there were some invalid label values in my data. Once I fixed those, the error went away as well.
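A quick sanity check along those lines, assuming integer class labels in a list `labels` and a classifier declared with `num_labels` classes (both names are placeholders for whatever your code uses):

import numpy as np
labels = np.asarray(labels)
# For BertForSequenceClassification / CrossEntropyLoss, every label must
# be an integer in [0, num_labels - 1].
assert labels.min() >= 0 and labels.max() < num_labels, "label out of range"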
【Discussion】: