Huggingface transformer model returns string instead of logits

Posted: 2021-03-02 05:42:26

Question:

I am trying to run this example from the Hugging Face website (https://huggingface.co/transformers/task_summary.html). It seems the model returns two strings instead of logits! This causes torch.argmax() to raise an error.

    from transformers import AutoTokenizer, AutoModelForQuestionAnswering
    import torch
    
    tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
    
    model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad", return_dict=True)
    
    text = r"""🤗 Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose
    architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural
    Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between
    TensorFlow 2.0 and PyTorch.
    """
    
    questions = ["How many pretrained models are available in 🤗 Transformers?",
    "What does 🤗 Transformers provide?",
    "🤗 Transformers provides interoperability between which frameworks?"]
    
    for question in questions:
      inputs = tokenizer(question, text, add_special_tokens=True, return_tensors="pt")
      input_ids = inputs["input_ids"].tolist()[0] # the list of all indices of words in question + context
    
      text_tokens = tokenizer.convert_ids_to_tokens(input_ids) # Get the tokens for the question + context
      answer_start_scores, answer_end_scores = model(**inputs)
    
      answer_start = torch.argmax(answer_start_scores)  # Get the most likely beginning of answer with the argmax of the score
      answer_end = torch.argmax(answer_end_scores) + 1  # Get the most likely end of answer with the argmax of the score
    
      answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))
    
      print(f"Question: {question}")
      print(f"Answer: {answer}")

Comments:

Had the same problem (and headache) - ***.com/q/67511285/758836, so thank you, because the answer is here!

Answer 1:

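Since a recent update, models return task-specific output objects (which are dictionaries) instead of plain tuples. The website you are using has not been updated to reflect that change. You can force the model to return a tuple by specifying return_dict=False: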

    answer_start_scores, answer_end_scores = model(**inputs, return_dict=False)

Or you can extract the values from the QuestionAnsweringModelOutput object by calling its values() method:

    answer_start_scores, answer_end_scores = model(**inputs).values()

Or even use the QuestionAnsweringModelOutput object directly:

    outputs = model(**inputs)
    answer_start_scores = outputs.start_logits
    answer_end_scores = outputs.end_logits
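
To see why the original unpacking produced two strings, here is a minimal sketch that needs neither the model nor torch. `FakeQAOutput` is a hypothetical stand-in for the real output class, assuming only that the real object behaves like an ordered dictionary whose entries are also reachable as attributes. Plain lists stand in for the logits tensors.

```python
from collections import OrderedDict


class FakeQAOutput(OrderedDict):
    """Hypothetical stand-in for QuestionAnsweringModelOutput (not the real class)."""

    def __getattr__(self, name):
        # Expose dictionary entries as attributes, e.g. outputs.start_logits.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# Plain lists replace the real logits tensors in this sketch.
outputs = FakeQAOutput(start_logits=[0.1, 2.5, 0.3], end_logits=[0.2, 0.1, 3.0])

# Unpacking a dict-like object iterates over its keys, not its values,
# which is where the two strings come from.
start, end = outputs
print(start, end)  # start_logits end_logits

# Unpacking .values() yields the stored scores in insertion order.
start_scores, end_scores = outputs.values()

# Access by name does not depend on the order or number of entries.
assert outputs.start_logits == start_scores
assert outputs["end_logits"] == end_scores
```

Access by name is probably the most robust of the three options: unpacking `.values()` into two names only works while the output holds exactly two entries.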
