ValueError: spacy.strings.StringStore size changed, may indicate binary incompatibility. Expected 80 from C header, got 64 from PyObject

Posted: 2021-07-06 13:35:11

Question:

I am using Python 3.8.5 in a Jupyter notebook, with:

spacy == 3.0.5

neuralcoref == 4.0

Below is the code I ran:

import datetime
import re
import time

import pandas as pd

from formative_assessment.dataset_extractor import ConvertDataType
from formative_assessment.feature_extractor import FeatureExtractor


class AEGrading:
    """
        Automatically evaluates, grades and provides feedback to students' answers of the datasets.
        Provides feedback as dict including total data of the student answer.
    """

    def __init__(self, qid, stu_answer, dataset, dataset_path, score=5):

        self.qid = qid
        self.stu_answer = stu_answer
        self.dataset = dataset
        self.length_ratio = len(stu_answer) / len(dataset[qid]["desired_answer"])
        self.score = score
        self.fe = FeatureExtractor(qid, stu_answer, dataset, dataset_path)
        self.wrong_terms = {}

        self.feedback = {"id": self.qid, "question": self.dataset[self.qid]["question"],
                         "desired_answer": self.dataset[self.qid]["desired_answer"], "student_answer": stu_answer,
                         "length_ratio": self.length_ratio, "is_answered": "-", "is_wrong_answer": "not wrong answer",
                         "interchanged": "-", "missed_topics": "-", "missed_terms": "-", "irrelevant_terms": "-",
                         "score_avg": 0, "our_score": 0}

    def is_answered(self, default="not answered"):
        """
            Checks if the student answered or not given the default evaluator's string. Assigns score to 'zero' if not
            answered.

        :param default: str
            String to be checked if student not answered
        :return: bool
            True if student answered, else False
        """

        re_string = " *" + default + " *"

        if re.match(re_string, self.stu_answer.lower()):
            self.feedback["is_answered"] = "not answered"
            self.score = 0
            return False

        else:
            self.feedback["is_answered"] = "answered"
            return True

    def iot_score(self):
        """
            Checks if there are any interchange of topics or missed topics and deduce the score accordingly. Deduce
            nothing from the score if there are no interchange of topics or missed topics

        :return: None
        """
        iot = self.fe.get_interchanged_topics()

        interchanged = iot["interchanged"]
        missed_topics = iot["missed_topics"]
        total_relations = iot["total_relations"]
        topics_num = iot["total_topics"]

        self.feedback["interchanged"] = interchanged
        self.feedback["missed_topics"] = missed_topics

        if interchanged:
            iot_deduce = len(interchanged) / total_relations
            self.score = self.score - (iot_deduce * self.score)

        if missed_topics:
            missed_deduce = len(missed_topics) / topics_num
            self.score = self.score - (missed_deduce * self.score)

    def missed_terms_score(self):
        """
            Checks if there are any missed terms in the student answer and deduce score accordingly

        :return: None
        """

        missed_terms = self.fe.get_missed_terms()
        self.feedback["missed_terms"] = missed_terms.keys()

        total = round(sum(missed_terms.values()), 3)
        self.score = self.score - (total * self.score)  # self.score/2

    def irrelevant_terms_score(self):
        """
            Checks if there are any irrelevant terms in the student answer. We do not deduce score for this feature, as
            we consider any irrelevant term as noise.

        :return: None
        """
        self.feedback["irrelevant_terms"] = self.fe.get_irrelevant_terms()


if __name__ == '__main__':

    PATH = "dataset/mohler/cleaned/"
    max_score = 5

    # Convert the data into  dictionary with ids, their corresponding questions, desired answers and student answers
    convert_data = ConvertDataType(PATH)
    dataset_dict = convert_data.to_dict()

    id_list = list(dataset_dict.keys())
    data = []

    # random.seed(20)
    for s_no in id_list[:7]:

        # s_no = random.choice(id_list)
        question = dataset_dict[s_no]["question"]
        desired_answer = dataset_dict[s_no]["desired_answer"]

        student_answers = dataset_dict[s_no]["student_answers"]
        scores = dataset_dict[s_no]["scores"]
        # score_me = dataset_dict[s_no]["score_me"]
        # score_other = dataset_dict[s_no]["score_other"]

        for index, _ in enumerate(student_answers):
            # index = random.randint(0, 12)
            start = time.time()
            student_answer = str(student_answers[index])

            print(s_no, student_answer)
            aeg = AEGrading(s_no, student_answer, dataset_dict, PATH, max_score)

            if aeg.is_answered():
                aeg.iot_score()
                aeg.missed_terms_score()
                aeg.irrelevant_terms_score()
                if aeg.score == 0:
                    aeg.feedback["is_wrong_answer"] = "wrong_answer"

            # aeg.feedback["score_me"] = score_me[index] # Only for mohler data
            # aeg.feedback["score_other"] = score_other[index]
            aeg.feedback["score_avg"] = scores[index]
            aeg.feedback["our_score"] = round((aeg.score * 4)) / 4  # Score in multiples of 0.25

            data.append(aeg.feedback)
            print(aeg.feedback)
            print("It took ", time.time() - start, " secs")
            print("----------------------------------------------------------")

            if len(data) % 50 == 0:
                df = pd.DataFrame(data)
                SAVE_PATH = "outputs/automatic_evaluation/II_NN/" + str(datetime.datetime.now()) + ".csv"
                df.to_csv(SAVE_PATH, sep=",")

    df = pd.DataFrame(data)
    SAVE_PATH = "outputs/automatic_evaluation/II_NN/" + str(datetime.datetime.now()) + ".csv"
    df.to_csv(SAVE_PATH, sep=",")

After running the code above, I get the following error:

ValueError                                Traceback (most recent call last)
      9 import pandas as pd
     10
---> 11 from formative_assessment.dataset_extractor import ConvertDataType
     12 from formative_assessment.feature_extractor import FeatureExtractor
     13

~\Desktop\FYP\Automatic-Formative-Assessment-main\formative_assessment\dataset_extractor.py in <module>
      8 import pandas as pd
      9
---> 10 from formative_assessment.utilities.utils import Utilities
     11

~\Desktop\FYP\Automatic-Formative-Assessment-main\formative_assessment\utilities\utils.py in <module>
     11 from typing import List
     12
---> 13 import neuralcoref
     14 import numpy as np
     15 import pytextrank

~\anaconda3\lib\site-packages\neuralcoref\__init__.py in <module>
     12 warnings.filterwarnings("ignore", message="spacy.strings.StringStore size changed,")
     13
---> 14 from .neuralcoref import NeuralCoref
     15 from .file_utils import NEURALCOREF_MODEL_URL, NEURALCOREF_MODEL_PATH, NEURALCOREF_CACHE, cached_path
     16

strings.pxd in init neuralcoref.neuralcoref()

ValueError: spacy.strings.StringStore size changed, may indicate binary incompatibility. Expected 80 from C header, got 64 from PyObject

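I tried uninstalling neuralcoref and reinstalling it from source:

pip uninstall neuralcoref

pip install neuralcoref --no-binary neuralcoref

But the problem is still the same. I hope someone can help me. Thanks a lot.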

Comments:

Same here. Downgrading the spaCy version may help.

Answer 1:

In my case, I had to downgrade to Python 3.7.4, and then it worked. Take a look at https://pypi.org/project/neuralcoref/#files: you can see that neuralcoref only supports Python 3.5, 3.6, and 3.7.

Answer 2:

For me, it worked when I used the method described in the "Install NeuralCoref from source" section of https://github.com/huggingface/neuralcoref.

I installed Cython and spaCy first, and then followed the steps there.

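Comments:

Links to a solution are welcome, but please make sure your answer is useful without them: add context around the link so other users know what it is and why it is there, then quote the most relevant part of the page you are linking to in case the target page becomes unavailable. Answers that are little more than a link may be deleted.

Answer 3: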

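Please take a look at this answer: https://***.com/a/62844213/1264899

To get neuralcoref working, you need spaCy version 2.1.0 and Python version 3.7. This is the only combination that works for neuralcoref on Ubuntu 16.04 and on Mac.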
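Before reinstalling anything, it can help to confirm which versions the notebook kernel is actually running. The following is a minimal sketch, not part of neuralcoref or spaCy: the constants simply restate the combination reported in this answer (Python 3.7, spaCy 2.1.0), and the function names `is_reported_combination` and `installed_spacy_version` are hypothetical helpers made up for illustration.

```python
import sys

# Assumption: these values restate the combination reported in this answer;
# they are not read from any package metadata.
REPORTED_PYTHON = (3, 7)
REPORTED_SPACY = "2.1.0"


def is_reported_combination(python_version, spacy_version):
    """Return True if the (major, minor) Python version and the spaCy
    version string match the combination reported in this answer."""
    return (tuple(python_version[:2]) == REPORTED_PYTHON
            and spacy_version == REPORTED_SPACY)


def installed_spacy_version():
    """Return the installed spaCy version string, or None if spaCy
    cannot be imported in the current environment."""
    try:
        import spacy
    except ImportError:
        return None
    return spacy.__version__


if __name__ == "__main__":
    version = installed_spacy_version()
    ok = version is not None and is_reported_combination(sys.version_info, version)
    print("Python", tuple(sys.version_info[:2]), "spaCy", version,
          "->", "matches" if ok else "does not match")
```

Run this in the same kernel that raises the error. With the setup from the question (Python 3.8.5 and spaCy 3.0.5), the check would report a mismatch.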

Answer 4:

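As others have already pointed out, neuralcoref is not supported by spaCy 3 and later. According to this comment, the spaCy team is actively working on coreference resolution so that it can be included in their library, so stay tuned. Finally, if you need this library right now, you should create a separate environment and do the following: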

git clone https://github.com/huggingface/neuralcoref.git
cd neuralcoref
pip install -r requirements.txt
pip install -e .
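After the install finishes, a quick import check in the new environment shows whether the binary-incompatibility error is gone. This is only a rough sketch: `try_import_neuralcoref` is a hypothetical helper name, and it relies on the fact, visible in the traceback in the question, that the failure is raised as a `ValueError` at import time.

```python
def try_import_neuralcoref():
    """Try to import neuralcoref and report whether it loaded.

    Returns a (ok, message) tuple. The binary-incompatibility error from
    the question is raised as ValueError during import, so it is caught
    here alongside the ordinary ImportError for a missing package.
    """
    try:
        import neuralcoref  # noqa: F401
    except ImportError as exc:
        return False, "neuralcoref is not installed: {}".format(exc)
    except ValueError as exc:
        return False, "neuralcoref failed to load: {}".format(exc)
    return True, "neuralcoref imported successfully"


ok, message = try_import_neuralcoref()
print(message)
```

If the message still reports the StringStore size error, the compiled extension was most likely built against a different spaCy than the one installed in that environment.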
