How to use the past with HuggingFace Transformers GPT-2?
【Title】: How to use the past with HuggingFace Transformers GPT-2? 【Posted】: 2020-11-23 17:09:01 【Question】: I have:
context = torch.tensor(context, dtype=torch.long, device=self.device)
context = context.unsqueeze(0)
generated = context
with torch.no_grad():
    past_outputs = None
    for i in trange(num_words):
        print(i, num_words)
        inputs = {"input_ids": generated}
        outputs, past_outputs = self.model(
            **inputs,
            past=past_outputs
        )
        next_token_logits = outputs[
            0, -1, :] / (temperature if temperature > 0 else 1.0)
        # repetition penalty from CTRL
        # (https://arxiv.org/abs/1909.05858)
        for _ in set(generated.view(-1).tolist()):
            next_token_logits[_] /= repetition_penalty
        filtered_logits = top_k_top_p_filtering(
            next_token_logits, top_k=top_k, top_p=top_p)
        if temperature == 0:  # greedy sampling:
            next_token = torch.argmax(filtered_logits).unsqueeze(0)
        else:
            next_token = torch.multinomial(
                F.softmax(filtered_logits, dim=-1), num_samples=1)
        generated = torch.cat(
            (generated, next_token.unsqueeze(0)), dim=1)
This works for the first iteration, but on the next iteration I get an error:
File "/Users/shamoon/Sites/wordblot/packages/ml-server/generator.py", line 143, in sample_sequence
past=past_outputs
File "/Users/shamoon/.local/share/virtualenvs/ml-server-EdimT5-E/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/Users/shamoon/.local/share/virtualenvs/ml-server-EdimT5-E/lib/python3.7/site-packages/transformers/modeling_gpt2.py", line 601, in forward
output_hidden_states=output_hidden_states,
File "/Users/shamoon/.local/share/virtualenvs/ml-server-EdimT5-E/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/Users/shamoon/.local/share/virtualenvs/ml-server-EdimT5-E/lib/python3.7/site-packages/transformers/modeling_gpt2.py", line 470, in forward
position_embeds = self.wpe(position_ids)
File "/Users/shamoon/.local/share/virtualenvs/ml-server-EdimT5-E/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/Users/shamoon/.local/share/virtualenvs/ml-server-EdimT5-E/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 114, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "/Users/shamoon/.local/share/virtualenvs/ml-server-EdimT5-E/lib/python3.7/site-packages/torch/nn/functional.py", line 1724, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
Am I doing something wrong?
【Comments】:
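Which line causes the exception? Can you get a fuller traceback?
What are model, generated, and temperature? This answer explains the usage of past. Please post the full stack trace. I assume you exceeded the maximum input length of 1024.
model is gpt2-xl, generated is updated in the code. temperature is 0.5
Could you please include the full stack trace? What is the value of 'num_words'? What is the initial size of context?
Which class do you use to load your model? GPT2LMHeadModel?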
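【Answer 1】:
I think the problem is that context contains integer values that exceed the vocabulary size. My assumption is based on the last lines of the traceback: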
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
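The traceback in the question also passes through `position_embeds = self.wpe(position_ids)`, so a position index that runs past the position table is another candidate besides out-of-vocabulary token ids. The following is a minimal pure-Python sketch for telling the two cases apart. The helper `find_out_of_range` is hypothetical, the sizes 50257 and 1024 are the usual GPT-2 config defaults rather than values stated in the thread, the 600-token context is made up, and the helper assumes position ids continue from the cached length.

```python
# Assumed GPT-2 defaults (not stated in the thread): a vocabulary of 50257
# tokens and 1024 learned positions.
VOCAB_SIZE = 50257
N_POSITIONS = 1024


def find_out_of_range(input_ids, past_length=0,
                      vocab_size=VOCAB_SIZE, n_positions=N_POSITIONS):
    """Report which embedding lookup would fail for one forward call.

    Hypothetical helper: it assumes position ids run from past_length to
    past_length + len(input_ids) - 1.
    """
    problems = []
    # Token embedding lookup: every id must be inside the vocabulary.
    bad_tokens = [t for t in input_ids if not 0 <= t < vocab_size]
    if bad_tokens:
        problems.append(("token", bad_tokens))
    # Position embedding lookup: the last position must fit in the table.
    last_position = past_length + len(input_ids) - 1
    if last_position >= n_positions:
        problems.append(("position", last_position))
    return problems


# First call: a made-up 600-token context with an empty cache.
context = [464] * 600
print(find_out_of_range(context))                          # []
# Second call: all 601 tokens are sent again on top of a 600-entry cache.
print(find_out_of_range(context + [11], past_length=600))  # [('position', 1200)]
```

Under these assumptions the token ids are fine in both calls, and only the position index overflows once the full sequence is sent again together with the cache.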
【Comments】:
@Shamoon What do you mean?
I'm also guessing that next_token may be out of vocabulary.
If I don't pass past=past_outputs, then it works fine.
@Shamoon Did you check the value of past_outputs?
【Answer 2】:
This is what I did:
outputs, past_outputs = self.models[model_name](
    context,
    past=past_outputs
)
context = next_token.unsqueeze(0)
【Comments】:
Doesn't doing this lose the initial context?
You keep the past, so it should be fine. I think?
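The question of whether the initial context is lost can be checked with some bookkeeping. The following toy simulation involves no real model. The function `simulate` is hypothetical, and it assumes that each forward call adds one cache entry per input token. It compares sending only the newest token after the first call (as in this answer) with sending the whole sequence every time (as in the question).

```python
def simulate(context_len, steps, resend_everything):
    """Track how many tokens are fed and cached at each decoding step.

    Toy bookkeeping only. Assumption: every forward call appends one
    cache entry per input token.
    """
    cache_len = 0                 # tokens the cache already covers
    sequence_len = context_len    # context plus tokens generated so far
    history = []
    for _ in range(steps):
        if resend_everything or cache_len == 0:
            fed = sequence_len    # feed the full sequence
        else:
            fed = 1               # feed only the newest token
        cache_len += fed
        sequence_len += 1         # one new token is sampled per step
        history.append((fed, cache_len))
    return history


# Each entry is (tokens fed, cache length after the call).
print(simulate(4, 3, resend_everything=False))  # [(4, 4), (1, 5), (1, 6)]
print(simulate(4, 3, resend_everything=True))   # [(4, 4), (5, 9), (6, 15)]
```

In the first run the cache length always equals the number of tokens seen so far, so the context stays represented through the cache even though only one token is fed per step. In the second run the cache grows much faster than the real sequence, because every call caches tokens that are already in it.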