Is json.loads() vulnerable to arbitrary code execution?
Posted: 2016-12-13 06:57:08
Question:
Is json.loads from Python's standard json module vulnerable to arbitrary code execution or any other security problems?
My application can receive JSON messages from untrusted sources.
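Comments:
***.com/questions/6794454/json-vs-pickle-security Take a look at this.
The deserialization source code does not look like it can execute arbitrary code (provided the interpreter itself is sound and you do not pass extra parameters), but the serialization code clearly calls methods on the objects it encodes.
Also check this answer: ***.com/questions/26931919/… It has a section on JSON.
Answer 1:
Note that the following answer relates to a default Python 3.4 installation on Windows 10 64-bit. Also note that this answer only looks at the py scanner, not the c scanner.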
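For the source files, see https://hg.python.org/cpython/file/tip/Lib/json or find them in your local Python installation.
Research
Read this research alongside the reference implementation at the bottom of this post.
The parsing functions called by json.loads(s) are defined in \Lib\json\scanner.py: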
parse_object = context.parse_object
parse_array = context.parse_array
parse_string = context.parse_string
parse_float = context.parse_float
parse_int = context.parse_int
parse_constant = context.parse_constant
context is an instance of the JSONDecoder class, which is defined in \Lib\json\decoder.py and uses the following parsers:
self.parse_float = parse_float or float
self.parse_int = parse_int or int
self.parse_constant = parse_constant or _CONSTANTS.__getitem__
self.parse_string = scanstring
self.parse_object = JSONObject
self.parse_array = JSONArray
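From here we can look at each individual parser to determine whether it is susceptible to arbitrary code execution:
parse_float
This uses the default float function and is therefore safe.
parse_int
This uses the default int function and is therefore safe.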
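As a small illustration of the two points above, the sketch below shows that the defaults are the built-in float and int, and that other code only runs on numbers if the caller passes it in. The sample JSON text and the choice of Decimal and str as replacements are assumptions made for the example.

```python
import json
from decimal import Decimal

text = '{"price": 1.10, "count": 3}'

# Default behaviour: the built-in float and int are applied to the number text.
default = json.loads(text)

# Caller-supplied parsers: each callable receives the number's source text as a str.
exact = json.loads(text, parse_float=Decimal, parse_int=str)

print(default)
print(exact)
```

The attacker controls only the JSON text, not the parse_float or parse_int arguments, so the set of callables that can run is decided entirely by the calling code.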
parse_constant
_CONSTANTS is defined in the same file as:
_CONSTANTS = {
    '-Infinity': NegInf,
    'Infinity': PosInf,
    'NaN': NaN,
}
Only a simple dictionary lookup is performed, so it is safe.
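The lookup can be observed from the outside. The following sketch decodes the three non-standard constants with the default settings, then passes a hypothetical reject_constant function (a name invented for this example) as parse_constant to refuse them.

```python
import json
import math

def reject_constant(name):
    # Hypothetical helper: refuse NaN, Infinity and -Infinity, which are
    # accepted by the json module but are not part of the JSON grammar.
    raise ValueError("non-standard JSON constant: " + name)

# Default behaviour: each name is looked up in the _CONSTANTS table.
value = json.loads("[NaN, Infinity, -Infinity]")

try:
    json.loads("[NaN]", parse_constant=reject_constant)
    rejected = False
except ValueError:
    rejected = True

print(value, rejected)
```

Rejecting these constants is optional hardening for applications that want strictly standard JSON; it is not needed to prevent code execution.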
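parse_string, JSONObject, JSONArray
As can be seen from the implementation at the end of this post, the only external code that can be run is:
From JSONObject:
object_pairs_hook
object_hook
From JSONArray:
scan_once
object_pairs_hook, object_hook
By default, object_pairs_hook and object_hook are set to None in the decoder initializer: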
def __init__(self, object_hook=None, parse_float=None,
        parse_int=None, parse_constant=None, strict=True,
        object_pairs_hook=None):
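To make the role of the hooks concrete, here is a minimal sketch with a hypothetical recording_hook function (not part of the json module). It shows that no hook code runs with the default None, and that a hook supplied by the caller is invoked once per decoded object, innermost first.

```python
import json

calls = []

def recording_hook(obj):
    # Hypothetical hook: remember every dict the decoder hands over.
    calls.append(obj)
    return obj

text = '{"outer": {"inner": 1}}'

# Default: object_hook is None, so no user code is involved.
plain = json.loads(text)
calls_after_plain = list(calls)

# With a hook: called for the inner object first, then for the outer one.
hooked = json.loads(text, object_hook=recording_hook)

print(calls_after_plain)
print(calls)
```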
scan_once
scan_once is defined as:
self.scan_once = scanner.make_scanner(self)
Its source can be found in \Lib\json\scanner.py, from which we can see that scan_once simply calls the appropriate parser for each part of the JSON document.
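The dispatch can be checked from the caller's side. This sketch, using sample inputs chosen for illustration, records which Python type each kind of JSON value is decoded into with the default settings.

```python
import json

samples = ['"text"', '{"k": 1}', '[1, 2]', '12', '1.5', 'true', 'false', 'null']

# Map each JSON snippet to the name of the Python type it decodes to.
kinds = {text: type(json.loads(text)).__name__ for text in samples}

for text, kind in kinds.items():
    print(text, "->", kind)
```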
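Conclusion
From the above and the reference implementation, it can be shown that as long as the JSON decoder uses the default scanner, arbitrary code will not be executed. It is probably possible to make a custom decoder execute arbitrary code through its __init__ parameters, but apart from that I don't think so.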
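A quick way to see this conclusion in practice is to decode a payload that merely looks like code. In the sketch below, the payload and the only_basic_types helper are made up for the example; the check confirms that the result contains nothing but plain data types.

```python
import json

def only_basic_types(value):
    # Hypothetical checker: True if value is built solely from the types
    # that the default decoder produces.
    if type(value) is dict:
        return all(type(k) is str and only_basic_types(v)
                   for k, v in value.items())
    if type(value) is list:
        return all(only_basic_types(v) for v in value)
    return type(value) in (str, int, float, bool, type(None))

# The "code" below is just the contents of a JSON string; it is never evaluated.
untrusted = '{"cmd": "__import__(\'os\').system(\'echo hi\')", "n": [1, 2.5, null, true]}'
data = json.loads(untrusted)

print(data["cmd"])
print(only_basic_types(data))
```

The decoded string is only dangerous if the application itself later passes it to something like eval or a shell, which is outside what json.loads does.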
Implementation
BACKSLASH
BACKSLASH = {
    '"': '"', '\\': '\\', '/': '/',
    'b': '\b', 'f': '\f', 'n': '\n', 'r': '\r', 't': '\t',
}
STRINGCHUNK
STRINGCHUNK = re.compile(r'(.*?)(["\\\x00-\x1f])', FLAGS)
scanstring
def py_scanstring(s, end, strict=True,
        _b=BACKSLASH, _m=STRINGCHUNK.match):
    """Scan the string s for a JSON string. End is the index of the
    character in s after the quote that started the JSON string.
    Unescapes all valid JSON string escape sequences and raises ValueError
    on attempt to decode an invalid string. If strict is False then literal
    control characters are allowed in the string.

    Returns a tuple of the decoded string and the index of the character in s
    after the end quote."""
    chunks = []
    _append = chunks.append
    begin = end - 1
    while 1:
        chunk = _m(s, end)
        if chunk is None:
            raise ValueError(
                errmsg("Unterminated string starting at", s, begin))
        end = chunk.end()
        content, terminator = chunk.groups()
        # Content contains zero or more unescaped string characters
        if content:
            _append(content)
        # Terminator is the end of string, a literal control character,
        # or a backslash denoting that an escape sequence follows
        if terminator == '"':
            break
        elif terminator != '\\':
            if strict:
                #msg = "Invalid control character %r at" % (terminator,)
                msg = "Invalid control character {0!r} at".format(terminator)
                raise ValueError(errmsg(msg, s, end))
            else:
                _append(terminator)
                continue
        try:
            esc = s[end]
        except IndexError:
            raise ValueError(
                errmsg("Unterminated string starting at", s, begin))
        # If not a unicode escape sequence, must be in the lookup table
        if esc != 'u':
            try:
                char = _b[esc]
            except KeyError:
                msg = "Invalid \\escape: {0!r}".format(esc)
                raise ValueError(errmsg(msg, s, end))
            end += 1
        else:
            uni = _decode_uXXXX(s, end)
            end += 5
            if 0xd800 <= uni <= 0xdbff and s[end:end + 2] == '\\u':
                uni2 = _decode_uXXXX(s, end + 1)
                if 0xdc00 <= uni2 <= 0xdfff:
                    uni = 0x10000 + (((uni - 0xd800) << 10) | (uni2 - 0xdc00))
                    end += 6
            char = chr(uni)
        _append(char)
    return ''.join(chunks), end
scanstring = c_scanstring or py_scanstring
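The string scanner can also be exercised directly. The sketch below imports py_scanstring from json.decoder, which is an internal name that is importable but may change between versions, and then shows the effect of the strict flag through the public json.loads. The sample strings are assumptions for the example.

```python
import json
from json.decoder import py_scanstring

# Start scanning at index 1, just after the opening quote.
decoded, end = py_scanstring('"abc\\n" rest', 1)

# A literal tab inside a string is a control character: rejected when strict.
try:
    json.loads('"a\tb"')
    strict_rejected = False
except ValueError:
    strict_rejected = True

# With strict=False the same control character is accepted as-is.
lenient = json.loads('"a\tb"', strict=False)

print(repr(decoded), end, strict_rejected, repr(lenient))
```

Escape sequences are resolved through the BACKSLASH table or chr, so string contents are only ever turned into characters, never evaluated.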
WHITESPACE
WHITESPACE = re.compile(r'[ \t\n\r]*', FLAGS)
WHITESPACE_STR
WHITESPACE_STR = ' \t\n\r'
JSONObject
def JSONObject(s_and_end, strict, scan_once, object_hook, object_pairs_hook,
               memo=None, _w=WHITESPACE.match, _ws=WHITESPACE_STR):
    s, end = s_and_end
    pairs = []
    pairs_append = pairs.append
    # Backwards compatibility
    if memo is None:
        memo = {}
    memo_get = memo.setdefault
    # Use a slice to prevent IndexError from being raised, the following
    # check will raise a more specific ValueError if the string is empty
    nextchar = s[end:end + 1]
    # Normally we expect nextchar == '"'
    if nextchar != '"':
        if nextchar in _ws:
            end = _w(s, end).end()
            nextchar = s[end:end + 1]
        # Trivial empty object
        if nextchar == '}':
            if object_pairs_hook is not None:
                result = object_pairs_hook(pairs)
                return result, end + 1
            pairs = {}
            if object_hook is not None:
                pairs = object_hook(pairs)
            return pairs, end + 1
        elif nextchar != '"':
            raise ValueError(errmsg(
                "Expecting property name enclosed in double quotes", s, end))
    end += 1
    while True:
        key, end = scanstring(s, end, strict)
        key = memo_get(key, key)
        # To skip some function call overhead we optimize the fast paths where
        # the JSON key separator is ": " or just ":".
        if s[end:end + 1] != ':':
            end = _w(s, end).end()
            if s[end:end + 1] != ':':
                raise ValueError(errmsg("Expecting ':' delimiter", s, end))
        end += 1

        try:
            if s[end] in _ws:
                end += 1
                if s[end] in _ws:
                    end = _w(s, end + 1).end()
        except IndexError:
            pass

        try:
            value, end = scan_once(s, end)
        except StopIteration as err:
            raise ValueError(errmsg("Expecting value", s, err.value)) from None
        pairs_append((key, value))
        try:
            nextchar = s[end]
            if nextchar in _ws:
                end = _w(s, end + 1).end()
                nextchar = s[end]
        except IndexError:
            nextchar = ''
        end += 1

        if nextchar == '}':
            break
        elif nextchar != ',':
            raise ValueError(errmsg("Expecting ',' delimiter", s, end - 1))
        end = _w(s, end).end()
        nextchar = s[end:end + 1]
        end += 1
        if nextchar != '"':
            raise ValueError(errmsg(
                "Expecting property name enclosed in double quotes", s, end - 1))
    if object_pairs_hook is not None:
        result = object_pairs_hook(pairs)
        return result, end
    pairs = dict(pairs)
    if object_hook is not None:
        pairs = object_hook(pairs)
    return pairs, end
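The last lines of JSONObject show that object_pairs_hook receives the raw list of (key, value) pairs, while the default path calls dict(pairs) and silently keeps the last value for a repeated key. As a sketch of how a caller might use that, the hypothetical reject_duplicates hook below (not part of the json module) refuses repeated keys in untrusted input.

```python
import json

def reject_duplicates(pairs):
    # Hypothetical hook: build the dict by hand and fail on repeated keys.
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError("duplicate key: " + key)
        seen[key] = value
    return seen

text = '{"a": 1, "a": 2}'

# Default behaviour: dict(pairs) keeps the last value for a repeated key.
last_wins = json.loads(text)

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

clean = json.loads('{"a": 1, "b": 2}', object_pairs_hook=reject_duplicates)
print(last_wins, duplicate_rejected, clean)
```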
JSONArray
def JSONArray(s_and_end, scan_once, _w=WHITESPACE.match, _ws=WHITESPACE_STR):
    s, end = s_and_end
    values = []
    nextchar = s[end:end + 1]
    if nextchar in _ws:
        end = _w(s, end + 1).end()
        nextchar = s[end:end + 1]
    # Look-ahead for trivial empty array
    if nextchar == ']':
        return values, end + 1
    _append = values.append
    while True:
        try:
            value, end = scan_once(s, end)
        except StopIteration as err:
            raise ValueError(errmsg("Expecting value", s, err.value)) from None
        _append(value)
        nextchar = s[end:end + 1]
        if nextchar in _ws:
            end = _w(s, end + 1).end()
            nextchar = s[end:end + 1]
        end += 1
        if nextchar == ']':
            break
        elif nextchar != ',':
            raise ValueError(errmsg("Expecting ',' delimiter", s, end - 1))
        try:
            if s[end] in _ws:
                end += 1
                if s[end] in _ws:
                    end = _w(s, end + 1).end()
        except IndexError:
            pass

    return values, end
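JSONArray takes no hooks: it builds a plain list and raises ValueError on anything malformed. A short sketch of that behaviour follows, using a hypothetical is_rejected helper and sample inputs chosen for illustration.

```python
import json

def is_rejected(text):
    # Hypothetical helper: True if json.loads refuses the text.
    try:
        json.loads(text)
    except ValueError:
        return True
    return False

# Arrays always decode to plain lists, however they are nested.
nested = json.loads('[1, [2, 3], []]')

print(nested)
print(is_rejected('[1 2]'), is_rejected('[1,'), is_rejected('[1, 2'))
```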
scanner.make_scanner
def py_make_scanner(context):
    parse_object = context.parse_object
    parse_array = context.parse_array
    parse_string = context.parse_string
    match_number = NUMBER_RE.match
    strict = context.strict
    parse_float = context.parse_float
    parse_int = context.parse_int
    parse_constant = context.parse_constant
    object_hook = context.object_hook
    object_pairs_hook = context.object_pairs_hook
    memo = context.memo

    def _scan_once(string, idx):
        try:
            nextchar = string[idx]
        except IndexError:
            raise StopIteration(idx)

        if nextchar == '"':
            return parse_string(string, idx + 1, strict)
        elif nextchar == '{':
            return parse_object((string, idx + 1), strict,
                _scan_once, object_hook, object_pairs_hook, memo)
        elif nextchar == '[':
            return parse_array((string, idx + 1), _scan_once)
        elif nextchar == 'n' and string[idx:idx + 4] == 'null':
            return None, idx + 4
        elif nextchar == 't' and string[idx:idx + 4] == 'true':
            return True, idx + 4
        elif nextchar == 'f' and string[idx:idx + 5] == 'false':
            return False, idx + 5

        m = match_number(string, idx)
        if m is not None:
            integer, frac, exp = m.groups()
            if frac or exp:
                res = parse_float(integer + (frac or '') + (exp or ''))
            else:
                res = parse_int(integer)
            return res, m.end()
        elif nextchar == 'N' and string[idx:idx + 3] == 'NaN':
            return parse_constant('NaN'), idx + 3
        elif nextchar == 'I' and string[idx:idx + 8] == 'Infinity':
            return parse_constant('Infinity'), idx + 8
        elif nextchar == '-' and string[idx:idx + 9] == '-Infinity':
            return parse_constant('-Infinity'), idx + 9
        else:
            raise StopIteration(idx)

    def scan_once(string, idx):
        try:
            return _scan_once(string, idx)
        finally:
            memo.clear()

    return _scan_once
make_scanner = c_make_scanner or py_make_scanner
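To try the pure-Python scanner on its own, the sketch below builds one from a default JSONDecoder and calls it with a string and a start index. py_make_scanner is an internal name in json.scanner, so treat its availability and exact behaviour as version-dependent; the sample input is an assumption for the example.

```python
import json
from json.scanner import py_make_scanner

# A default JSONDecoder supplies the parsers and hooks the scanner reads.
scan = py_make_scanner(json.JSONDecoder())

# The scanner decodes one value and reports the index where it stopped.
value, end = scan('[1, {"a": null}] trailing', 0)

# Running off the end of the input is signalled with StopIteration(idx).
try:
    scan('', 0)
    stopped_at = None
except StopIteration as err:
    stopped_at = err.value

print(value, end, stopped_at)
```

Every branch of the scanner either returns a constant or delegates to one of the parsers taken from the decoder, which matches the conclusion reached earlier in this answer.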