IPython Notebook: What is the default encoding?
[Posted]: 2013-03-03 11:23:55
[Question]:
I created a package that uses utf-8 encoding. Calling one of its functions returns a DataFrame in which one column is utf-8 encoded.
When I use IPython from the command line, displaying the table works without any problem. In the Notebook, it crashes with the error 'utf8' codec can't decode byte 0xe7. I have attached the full traceback below.
What is the correct encoding to use with the Notebook?
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-13-92c0011919e7> in <module>()
3 ver = verif.VerificacaoNA()
4 comp, total = ver.executarCompRealFisica(DT_INI, DT_FIN)
----> 5 comp
c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\core\displayhook.pyc in __call__(self, result)
240 self.update_user_ns(result)
241 self.log_output(format_dict)
--> 242 self.finish_displayhook()
243
244 def flush(self):
c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\displayhook.pyc in finish_displayhook(self)
59 sys.stdout.flush()
60 sys.stderr.flush()
---> 61 self.session.send(self.pub_socket, self.msg, ident=self.topic)
62 self.msg = None
63
c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in send(self, stream, msg_or_type, content, parent, ident, buffers, subheader, track, header)
557
558 buffers = [] if buffers is None else buffers
--> 559 to_send = self.serialize(msg, ident)
560 flag = 0
561 if buffers:
c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in serialize(self, msg, ident)
461 content = self.none
462 elif isinstance(content, dict):
--> 463 content = self.pack(content)
464 elif isinstance(content, bytes):
465 # content is already packed, as in a relayed message
c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in <lambda>(obj)
76
77 # ISO8601-ify datetime objects
---> 78 json_packer = lambda obj: jsonapi.dumps(obj, default=date_default)
79 json_unpacker = lambda s: extract_dates(jsonapi.loads(s))
80
c:\Python27-32\lib\site-packages\pyzmq-13.0.0-py2.7-win32.egg\zmq\utils\jsonapi.pyc in dumps(o, **kwargs)
70 kwargs['separators'] = (',', ':')
71
---> 72 return _squash_unicode(jsonmod.dumps(o, **kwargs))
73
74 def loads(s, **kwargs):
c:\Python27-32\lib\json\__init__.pyc in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, encoding, default, **kw)
236 check_circular=check_circular, allow_nan=allow_nan, indent=indent,
237 separators=separators, encoding=encoding, default=default,
--> 238 **kw).encode(obj)
239
240
c:\Python27-32\lib\json\encoder.pyc in encode(self, o)
199 # exceptions aren't as detailed. The list call should be roughly
200 # equivalent to the PySequence_Fast that ''.join() would do.
--> 201 chunks = self.iterencode(o, _one_shot=True)
202 if not isinstance(chunks, (list, tuple)):
203 chunks = list(chunks)
c:\Python27-32\lib\json\encoder.pyc in iterencode(self, o, _one_shot)
262 self.key_separator, self.item_separator, self.sort_keys,
263 self.skipkeys, _one_shot)
--> 264 return _iterencode(o, 0)
265
266 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe7 in position 199: invalid continuation byte
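The final line of the traceback can be reproduced in isolation. The following is a minimal sketch, assuming the column holds Latin-1 (or cp1252) bytes rather than UTF-8; the sample word is hypothetical and is not taken from the asker's data. In Latin-1, "ç" is the single byte 0xE7, which UTF-8 reads as the start of a three-byte sequence.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample value: a Portuguese word stored as Latin-1 bytes.
raw = u"Verifica\u00e7\u00e3o".encode("latin-1")  # "ç" becomes the byte 0xE7

error_reason = None
bad_byte = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0xE7 announces a three-byte UTF-8 sequence, but the next byte (0xE3)
    # is not a continuation byte, so the decoder gives up here.
    error_reason = exc.reason
    bad_byte = raw[exc.start:exc.start + 1]

# Decoding with the encoding the bytes were actually written in succeeds.
decoded = raw.decode("latin-1")

print(error_reason)
print(repr(bad_byte))
```

If this assumption holds, the Notebook fails where the terminal does not because the Notebook has to serialize the output to JSON, as the `jsonapi.dumps` frame in the traceback shows, and that step decodes byte strings as UTF-8.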
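[Comments]:
I ran into this when smart quotes were embedded as values in an index or in column names. I am not sure which encoding resolves it, but the problem went away once I removed the smart quotes.
I set the column to latin-1 and the error went away, but the strings now display unknown characters.
Can you post a minimal code example that demonstrates the problem?
[Answer 1]:
I ran into the same problem recently, and setting the default encoding to UTF-8 did indeed fix it: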
# Python 2 only: reload(sys) restores setdefaultencoding, which site.py removes at startup
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
In my environment (Python 2.7.3), running sys.getdefaultencoding() yields 'ascii', so I guess that is the default.
See also this related question and Ian Bicking's blog post on the subject.
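The check described in this answer can be run as follows. This is only a small sketch; the variable name is illustrative, and the value printed depends on the interpreter version.

```python
import sys

# Encoding used for implicit conversions between byte strings and text:
# 'ascii' on Python 2, 'utf-8' on Python 3.
default_encoding = sys.getdefaultencoding()

print(default_encoding)
```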
[Discussion]:
Using setdefaultencoding is not a good idea (see here, for example); for instance, it disables the print command.
On Python 2, the default is ascii. Only on Python 3 is the default utf-8.
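One possible alternative to changing the interpreter-wide default is to decode the affected values to unicode explicitly before displaying them. The sketch below is not from the thread: the helper name `to_text`, the candidate encodings, and the sample data are all assumptions made for illustration.

```python
def to_text(value, encodings=("utf-8", "cp1252")):
    """Return value as unicode text, trying each candidate encoding in order."""
    if not isinstance(value, bytes):
        return value  # already text, or not a string at all; leave untouched
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: substitute undecodable bytes so the value stays displayable.
    return value.decode(encodings[0], "replace")


# Hypothetical column contents: Latin-1 bytes, UTF-8 bytes, and plain text.
column = [
    b"Verifica\xe7\xe3o",
    u"Verifica\u00e7\u00e3o".encode("utf-8"),
    u"ok",
]
cleaned = [to_text(item) for item in column]
```

With a DataFrame, the same helper could be applied to a single column with something like `df["name"] = df["name"].map(to_text)`, where `"name"` is a placeholder column name. Trying UTF-8 first matters because single-byte encodings such as cp1252 accept almost any byte sequence and would silently misread valid UTF-8.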