Implement do_sampling for a custom GPT-NEO model
Posted: 2021-12-21 15:25:22

Problem description:

import numpy as np
import torch
from transformers import GPTNeoForCausalLM, GPT2Tokenizer
import coremltools as ct
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
sentence_fragment = "The Oceans are"
class NEO(torch.nn.Module):
    def __init__(self, model):
        super(NEO, self).__init__()
        self.next_token_predictor = model

    def forward(self, x):
        sentence = x
        predictions, _ = self.next_token_predictor(sentence)
        # Greedy decoding: always append the single highest-probability
        # next token, so the output is deterministic for a given input.
        token = torch.argmax(predictions[-1, :], dim=0, keepdim=True)
        sentence = torch.cat((sentence, token), 0)
        return sentence
token_predictor = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M", torchscript=True).eval()
context = torch.tensor(tokenizer.encode(sentence_fragment))
random_tokens = torch.randint(10000, (5,))
traced_token_predictor = torch.jit.trace(token_predictor, random_tokens)
model = NEO(model=traced_token_predictor)
scripted_model = torch.jit.script(model)
# Custom model
sentence_fragment = "The Oceans are"
for i in range(10):
    context = torch.tensor(tokenizer.encode(sentence_fragment))
    torch_out = scripted_model(context)
    sentence_fragment = tokenizer.decode(torch_out)
print("Custom model: {}".format(sentence_fragment))
# Stock model
model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M", torchscript=True).eval()
sentence_fragment = "The Oceans are"
input_ids = tokenizer(sentence_fragment, return_tensors="pt").input_ids
gen_tokens = model.generate(input_ids, do_sample=True, max_length=20)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print("Stock model: "+gen_text)
Run 1
Output:
Custom model: The Oceans are the most important source of water for the entire world
Stock model: The Oceans are on the rise. The American Southwest is thriving, but the southern United States still
Run 2
Output:
Custom model: The Oceans are the most important source of water for the entire world.
Stock model: The Oceans are the land of man
This is a short video of the Australian government
The custom model always returns the same output. With do_sample=True, however, the stock model.generate returns a different result on every call. I have spent a lot of time trying to figure out how do_sample works in transformers, so I would appreciate your help.
How do I write the custom model so that it produces a different result on each call?
Thanks!
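
For context, the custom forward above always takes torch.argmax, which is deterministic by construction, whereas do_sample=True makes generate draw the next token from the predicted probability distribution. A minimal sketch of the difference, using made-up logits:

import torch

logits = torch.tensor([2.0, 1.0, 0.5])           # pretend vocabulary scores
print(torch.argmax(logits))                      # always tensor(0)
probs = torch.softmax(logits, dim=0)             # convert scores to probabilities
print(torch.multinomial(probs, num_samples=1))   # index drawn at random, varies per call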
Answer 1: So, the answer is to implement sampling :D
class NEO(torch.nn.Module):
    def __init__(self, model):
        super(NEO, self).__init__()
        self.next_token_predictor = model

    def forward(self, x):
        sentence = x
        predictions, _ = self.next_token_predictor(sentence)
        # Get the indices of the top-k (k=2) highest-probability tokens.
        # Two indices are enough: over N generation steps this already
        # gives up to 2^N possible variations.
        _, topK = torch.topk(predictions[-1, :], 2, dim=0)
        # Pick one of those two indices at random and append it to the sentence.
        perm = torch.randperm(topK.size(0))
        idx = perm[:1]
        token = topK[idx.long()]
        sentence = torch.cat((sentence, token), 0)
        return sentence
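
This picks uniformly between the two most likely tokens. For behavior closer to transformers' do_sample=True, one could instead sample from the softmax distribution over the top-k logits. A minimal sketch, where the temperature and top_k values are illustrative assumptions rather than part of the answer above:

import torch

def sample_next_token(logits, temperature=0.9, top_k=50):
    # logits: 1-D tensor of vocabulary scores for the last position
    logits = logits / temperature                     # <1 sharpens, >1 flattens the distribution
    top_vals, top_idx = torch.topk(logits, top_k)     # keep only the k most likely tokens
    probs = torch.softmax(top_vals, dim=0)            # renormalize over the top-k scores
    choice = torch.multinomial(probs, num_samples=1)  # draw one token in proportion to its probability
    return top_idx[choice]

Inside forward, token = sample_next_token(predictions[-1, :]) would then replace the top-2 block.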