python: convert to HTML special characters [duplicate]

Posted 2023-02-23

【Title】: python: convert to HTML special characters [duplicate]

【Posted】: 2012-06-15 03:50:37

【Question】:

Possible duplicates:
Replace html entities with the corresponding utf-8 characters in Python 2.6
What's the easiest way to escape HTML in Python?

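Is there an easy way to convert a string to an HTML string, for example replacing characters such as < and > with &lt; and &gt;, or do I have to write my own conversion routine?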

【Question comments】:

See docs.python.org/library/htmllib.html#module-htmlentitydefs

I think what you need is called "HTML escaping". That is why you did not find the answer yourself. Here is a *** answer.

@TimPietzcker: Oops... the title was not really helpful ;-)

【Answer 1】:

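If you only care about the key special characters &, < and >:

>>> import html  # html.escape replaces cgi.escape, which was removed in Python 3.8
>>> html.escape("<hello&goodbye>")
'&lt;hello&amp;goodbye&gt;'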
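
A minimal sketch of the same escaping with the standard library's html.escape, showing its quote parameter. The sample string and variable names are made up for illustration; the point is only that the default also escapes both quote characters, which matters when the text ends up inside an attribute value.

```python
import html

# Hypothetical sample input containing markup, an ampersand and both quote types.
text = '<a href="x">Tom & Jerry\'s</a>'

# Default (quote=True): &, < and > are escaped, plus " and '.
escaped_all = html.escape(text)

# quote=False: only &, < and > are escaped; quotes are left alone.
escaped_basic = html.escape(text, quote=False)

print(escaped_all)
print(escaped_basic)
```

For text placed between tags, quote=False is usually enough; for attribute values the default is the safer choice.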

For other, non-ASCII characters:

>>> "Übeltäter".encode("ascii", "xmlcharrefreplace")
b'&#220;belt&#228;ter'
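
If named entities such as &Uuml; are preferred over numeric references, something along these lines may work. It is only a sketch: to_named_entities is a hypothetical helper, not part of the answer, and it relies on the codepoint2name table from html.entities (the Python 3 home of the htmlentitydefs module mentioned in the comments). Characters without a name in that table fall back to a numeric reference, and ASCII characters, including < and &, are deliberately left untouched.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Hypothetical helper: replace non-ASCII characters with HTML entities."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain ASCII is passed through unchanged (no escaping of < > &).
            parts.append(ch)
        elif code in codepoint2name:
            # A named entity exists in the HTML 4 table, e.g. 220 -> "Uuml".
            parts.append("&" + codepoint2name[code] + ";")
        else:
            # No name known: fall back to a decimal character reference.
            parts.append("&#%d;" % code)
    return "".join(parts)


print(to_named_entities("Übeltäter"))
```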

Of course, you can combine the two if needed:

>>> import html
>>> html.escape("<Übeltäter>").encode("ascii", "xmlcharrefreplace")
b'&lt;&#220;belt&#228;ter&gt;'
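
The combined call returns bytes. If a str is more convenient, the two steps can be wrapped in a small function and decoded back at the end. This is a sketch only; escape_to_ascii_html is a hypothetical name, and it assumes Python 3 with html.escape as the escaping step.

```python
import html


def escape_to_ascii_html(text):
    """Hypothetical helper: escape markup characters, then replace
    non-ASCII characters with numeric character references."""
    escaped = html.escape(text)  # handles &, <, > and quotes
    # xmlcharrefreplace turns every unencodable character into &#NNN;
    as_bytes = escaped.encode("ascii", "xmlcharrefreplace")
    # The result is pure ASCII, so decoding back to str cannot fail.
    return as_bytes.decode("ascii")


print(escape_to_ascii_html("<Übeltäter>"))
```

The order matters: escaping first keeps the & in the generated &#NNN; references from being escaped a second time.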

【Comments】:

>>> "Übeltäter".encode("ascii", "xmlcharrefreplace") raises UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

cgi.escape() is now deprecated. Use html.escape() instead - check this answer
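
The UnicodeDecodeError reported above is what Python 2 raises when the literal is a byte string rather than a unicode string; a u"..." literal avoids it there. In Python 3 a plain literal is already text, but the same situation arises with bytes read from a file or socket. A sketch, assuming the bytes are UTF-8 encoded (the variable names are illustrative):

```python
# Bytes as they might arrive from a file or socket; UTF-8 is an assumption here.
raw = "Übeltäter".encode("utf-8")

# Decode to text first; calling .encode() directly on undecoded bytes is
# what triggered the implicit ASCII decode (and the error) in Python 2.
text = raw.decode("utf-8")

# Now the non-ASCII characters can be turned into numeric references.
result = text.encode("ascii", "xmlcharrefreplace")

print(result)
```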
