Implementing do_sampling for a custom GPT-NEO model

【Title】: implement do_sampling for custom GPT-NEO model 【Posted】: 2021-12-21 15:25:22 【Question】:
import numpy as np
import torch  # required below for torch.nn.Module, torch.tensor and torch.jit
from transformers import GPTNeoForCausalLM, GPT2Tokenizer
import coremltools as ct
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

sentence_fragment = "The Oceans are"

class NEO(torch.nn.Module):
    def __init__(self, model):
        super(NEO, self).__init__()
        self.next_token_predictor = model
    
    def forward(self, x):
        sentence = x
        predictions, _ = self.next_token_predictor(sentence)
        token = torch.argmax(predictions[-1, :], dim=0, keepdim=True)
        sentence = torch.cat((sentence, token), 0)
        return sentence

token_predictor = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M", torchscript=True).eval()

context = torch.tensor(tokenizer.encode(sentence_fragment))
random_tokens = torch.randint(10000, (5,))
traced_token_predictor = torch.jit.trace(token_predictor, random_tokens)

model = NEO(model=traced_token_predictor)
scripted_model = torch.jit.script(model)

# Custom model

sentence_fragment = "The Oceans are"

for i in range(10):
    context = torch.tensor(tokenizer.encode(sentence_fragment))
    torch_out = scripted_model(context)
    sentence_fragment = tokenizer.decode(torch_out)
print("Custom model: {}".format(sentence_fragment))

# Stock model

model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M", torchscript=True).eval()

sentence_fragment = "The Oceans are"

input_ids = tokenizer(sentence_fragment, return_tensors="pt").input_ids
gen_tokens = model.generate(input_ids, do_sample=True, max_length=20)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print("Stock model: "+gen_text)

Run 1

Output:


Custom model: The Oceans are the most important source of water for the entire world
Stock model: The Oceans are on the rise. The American Southwest is thriving, but the southern United States still

Run 2

Output:


Custom model: The Oceans are the most important source of water for the entire world. 
Stock model: The Oceans are the land of man

This is a short video of the Australian government

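The custom model always returns the same output, whereas the stock model.generate with do_sample=True returns a different result on every call. I have spent a lot of time trying to figure out how do_sample works in transformers, so any help would be much appreciated.

How can I write the custom model so that it produces a different result on every call?

Thanks!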


【Answer 1】:

So, the answer is to implement sampling :D

class NEO(torch.nn.Module):
    def __init__(self, model):
        super(NEO, self).__init__()
        self.next_token_predictor = model
    
    def forward(self, x):
        sentence = x
        predictions, _ = self.next_token_predictor(sentence)
        # take the indices of the K (k=2) highest-scoring tokens
        # two candidates are enough: over N steps that still allows 2^N different sequences
        _, topK = torch.topk(predictions[-1, :], 2, dim=0)
        # pick one of those two indices uniformly at random and append it to the sentence
        perm = torch.randperm(topK.size(0))
        idx = perm[:1]
        token = topK[idx.long()]
        sentence = torch.cat((sentence, token), 0)
        return sentence
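
Note that the version above picks uniformly between the two best tokens, while sampling as it is usually described draws each candidate in proportion to its softmax probability. The following is only a framework-free sketch of that idea, not the code used by model.generate; the function name sample_top_k, the temperature parameter and the toy_logits values are all made up for illustration.

```python
import math
import random

def sample_top_k(logits, k=2, temperature=1.0, rng=random):
    """Draw one token index from the k highest-scoring logits.

    logits: list of floats, one score per vocabulary entry
            (the scores of the last position only).
    """
    # Keep the k largest logits together with their vocabulary indices.
    top = sorted(enumerate(logits), key=lambda pair: pair[1], reverse=True)[:k]
    # Temperature-scaled softmax weights, shifted by the max for stability.
    scaled = [score / temperature for _, score in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # Draw one index with probability proportional to its weight.
    indices = [idx for idx, _ in top]
    return rng.choices(indices, weights=weights, k=1)[0]

# Hypothetical scores for a 4-token vocabulary.
toy_logits = [0.1, 2.0, -1.0, 1.5]

# Greedy decoding (argmax) always returns the same index.
greedy = max(range(len(toy_logits)), key=toy_logits.__getitem__)

# Sampling returns either of the two best indices, the better one more often.
draws = [sample_top_k(toy_logits, k=2) for _ in range(1000)]
print("greedy:", greedy, "sampled indices:", sorted(set(draws)))
```

In PyTorch the same three steps would presumably map to torch.topk, torch.softmax and torch.multinomial applied to predictions[-1, :], with the drawn token concatenated to the sentence as before. Whether that combination scripts and converts cleanly in this particular setup is not verified here.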
