UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128) [duplicate]

Posted 2023-02-23

【Posted】: 2013-11-18 22:49:12

【Question】:

I have this code:

    printinfo = title + "\t" + old_vendor_id + "\t" + apple_id + '\n'
    # Write file
    f.write (printinfo + '\n')

But I get this error when I run it:

        f.write(printinfo + '\n')
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)

It is having trouble writing out this title:

Identité secrète (Abduction) [VF]

Any ideas, please? I'm not sure how to fix this.

Cheers.

UPDATE: This is the bulk of my code, so you can see what I am doing:

    def runLookupEdit(self, event):
        newpath1 = pathindir + "/"
        errorFileOut = newpath1 + "REPORT.csv"
        f = open(errorFileOut, 'w')

        global old_vendor_id

        for old_vendor_id in vendorIdsIn.splitlines():
            writeErrorFile = 0
            from lxml import etree
            parser = etree.XMLParser(remove_blank_text=True) # makes pretty print work

            path1 = os.path.join(pathindir, old_vendor_id)
            path2 = path1 + ".itmsp"
            path3 = os.path.join(path2, 'metadata.xml')

            # Open and parse the xml file
            cantFindError = 0
            try:
                with open(path3): pass
            except IOError:
                cantFindError = 1
                errorMessage = old_vendor_id
                self.Error(errorMessage)
                break
            tree = etree.parse(path3, parser)
            root = tree.getroot()

            for element in tree.xpath('//video/title'):
                title = element.text
                while '\n' in title:
                    title = title.replace('\n', ' ')
                while '\t' in title:
                    title = title.replace('\t', ' ')
                while '  ' in title:
                    title = title.replace('  ', ' ')
                title = title.strip()
                element.text = title
            print title

            #########################################
            ######## REMOVE UNWANTED TAGS ########
            #########################################

            # Remove the comment tags
            comments = tree.xpath('//comment()')
            q = 1
            for c in comments:
                p = c.getparent()
                if q == 3:
                    apple_id = c.text
                p.remove(c)
                q = q+1

            apple_id = apple_id.split(':',1)[1]
            apple_id = apple_id.strip()
            printinfo = title + "\t" + old_vendor_id + "\t" + apple_id

            # Write file
            # f.write (printinfo + '\n')
            f.write(printinfo.encode('utf8') + '\n')
        f.close()

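【Comments】:

If you look at the right-hand side of the question, you'll notice a column of "Related" questions. I suggest you start by looking at those. While you were writing the question title, you were also given a list of possible duplicates.

@MartijnPieters: You are right, as usual. Comment removed.

【Answer 1】: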

You need to encode Unicode explicitly before writing to a file; otherwise Python does the encoding for you with the default ASCII codec.
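As a minimal sketch of that implicit step (written to run on both Python 2 and 3; the variable names are illustrative and not taken from the question's code), encoding the title with the ASCII codec reproduces the failure at the same position, while an explicit UTF-8 encode succeeds:

```python
# The title from the question, as a Unicode string.
title = u"Identit\xe9 secr\xe8te (Abduction) [VF]"

try:
    # Roughly what Python 2 does when a unicode object is written
    # to a file opened with the built-in open().
    title.encode("ascii")
except UnicodeEncodeError as exc:
    failed_at = exc.start          # index of the first character ASCII cannot represent
    offending = title[exc.start]   # the character itself

# An explicit encoding that covers all of Unicode works.
encoded = title.encode("utf8")
```

Here `failed_at` corresponds to the "position 7" in the error message, and `offending` is the é in "Identité".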

Pick an encoding and stick with it:

    f.write(printinfo.encode('utf8') + '\n')

Or use io.open() to create a file object that encodes for you as you write to the file:

    import io

    f = io.open(filename, 'w', encoding='utf8')
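A slightly fuller sketch of the io.open() approach follows. The temporary path and the vendor and Apple ID values are placeholders invented for illustration; only the title comes from the question.

```python
import io
import os
import tempfile

# Hypothetical output location for this sketch; the question writes REPORT.csv
# into its own working directory.
path = os.path.join(tempfile.mkdtemp(), "REPORT.csv")

# Placeholder row: real title, made-up vendor ID and Apple ID.
printinfo = u"Identit\xe9 secr\xe8te (Abduction) [VF]\tvendor123\t123456"

# io.open() returns a text file object that encodes to UTF-8 on write.
with io.open(path, "w", encoding="utf8") as f:
    f.write(printinfo + u"\n")   # pass Unicode text; no .encode() call here

# Read the raw bytes back to see how the accented letters were stored.
with io.open(path, "rb") as f:
    raw = f.read()
```

Note that in Python 2 a file object from io.open() in text mode expects unicode objects in write(), so the explicit .encode('utf8') call should be dropped when switching to this approach.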

You may want to read:

Python Unicode HOWTO

Pragmatic Unicode by Ned Batchelder

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky

before continuing.

【Comments】:

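Using f.write(printinfo.encode('utf8') + '\n') works, but produces strange characters: Identit√© secr√®te (Abduction) [VF], which should be accented as Identité secrète (Abduction) [VF].

@speedyrazor: Please do read the links I provided. You are opening a UTF-8 file with something that interprets the bytes as a different encoding. Pick the right encoding in your application.

@Martijn Pieters: I have read them, but I don't really understand. If the XML file I'm reading contains "Identité secrète", I pick out the lines and write them to a file, but that line shows up as "Identit√© secr√®te". Sorry to ask, but what code would fix this?

@speedyrazor: Your XML file uses a codec too. It either uses UTF-8, or it specifies a different codec on the first line of the XML file. The XML parser then decodes that data to Unicode values. When you write those values to a file, you need to pick a codec again to produce bytes. I picked UTF-8 for you because that codec can encode all of Unicode, but whatever you are using to view the resulting file is using a different codec to interpret the bytes. The é character is Unicode code point U+00E9. UTF-8 encodes it as two bytes, hex C3 and A9. Misinterpreting those two bytes gives you √©.

@speedyrazor: Without knowing how you are reading the produced file back, I cannot help you further.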
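The byte-level explanation in the last comments can be checked with a short sketch. The thread never says which codec the viewer used; Mac OS Roman is assumed here only because it maps the bytes C3 and A9 to √ and ©, which matches the output the asker reported.

```python
e_acute = u"\xe9"                      # é, Unicode code point U+00E9
utf8_bytes = e_acute.encode("utf8")    # the two bytes C3 A9

# Assumption: the viewing program interprets the file as Mac OS Roman.
misread = utf8_bytes.decode("mac_roman")

# Decoding with the codec that was used for writing restores the character.
restored = utf8_bytes.decode("utf8")
```

Under that assumption `misread` is the two-character string √©, so the file itself is fine; the remedy is to make the viewing program read it as UTF-8, or to write the file in whatever encoding that program expects.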
