Mask out sensitive information in python log


Posted: 2018-07-01 00:25:13

Question:

Consider the following code

import logging
import requests
try:
    r = requests.get('https://sensitive:passw0rd@what.ever/')
    r.raise_for_status()
except requests.HTTPError:
    logging.exception("Failed to what.ever")

Here, if the endpoint returns a non-successful HTTP status code, the following will be logged

Traceback (most recent call last):
  File "a.py", line 5, in <module>
    r.raise_for_status()
  File "venv/lib/python3.5/site-packages/requests/models.py", line 928, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://sensitive:passw0rd@what.ever/

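The problem is that the password is logged. I could create a logging filter that drops this line entirely, but it would be more convenient if the password were just masked out somehow. Since the string containing the password is never passed to logging.exception, filtering on the application side is tricky. Where in the logging framework can I transform a log record?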
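The difficulty described above can be reproduced without any network access. The following is a minimal sketch, assuming a hypothetical filter class `ShowRecordParts` and a `RuntimeError` standing in for the `requests.HTTPError`: the message seen by a filter is only the string passed to `logging.exception`, while the URL arrives separately in `record.exc_info` and only becomes text once a formatter renders it.

```python
import io
import logging


class ShowRecordParts(logging.Filter):
    """Hypothetical filter that records what a filter actually gets to see."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def filter(self, record):
        # getMessage() only holds the string passed to logging.exception();
        # the exception sits in record.exc_info until a Formatter renders it.
        self.seen.append((record.getMessage(), record.exc_info is not None))
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
probe = ShowRecordParts()
handler.addFilter(probe)

log = logging.getLogger("exc_info_demo")  # hypothetical logger name
log.addHandler(handler)
log.propagate = False

try:
    # Stand-in for the HTTPError raised by raise_for_status().
    raise RuntimeError("404 for url: https://sensitive:passw0rd@what.ever/")
except RuntimeError:
    log.exception("Failed to what.ever")

message, has_exc_info = probe.seen[0]
output = stream.getvalue()
```

Here `message` does not contain the password, while `output` (the rendered text) does, which suggests the transformation has to happen at or after formatting.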

Comments:

Is that URL a magic value, or do you build it from some variables?

It's actually kind of a magic value, but I'm all ears if you have any suggestions.

Related: Hiding Sensitive Data from Logs with Python

Answer 1:

Apparently, this is done with a Formatter. An example is below

import logging
import re


class SensitiveFormatter(logging.Formatter):
    """Formatter that removes sensitive information in urls."""

    @staticmethod
    def _filter(s):
        # Drop everything between '://' and the next '@' (the user:password part).
        return re.sub(r'://.*?@', '://', s)

    def format(self, record):
        # Format first, so the rendered traceback is part of the text we filter.
        original = super().format(record)
        return self._filter(original)
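The pattern in the formatter removes everything from `://` up to the next `@` on the same line, so it would also swallow text when a line contains a credential-free URL followed by something like an e-mail address. A possible variant, sketched here under the assumption that any `/` or `@` inside the credentials is percent-encoded, restricts the match to the userinfo part and replaces it with a placeholder instead of deleting it. The helper name `mask_credentials` and the `***` placeholder are illustrative choices, not part of the original answer.

```python
import re

# Assumed pattern: '://' followed by userinfo that contains no '/', '@'
# or whitespace, terminated by '@'.
_USERINFO = re.compile(r'://[^/\s@]+@')


def mask_credentials(text, placeholder='***'):
    """Replace user:password in URLs with a placeholder (hypothetical helper)."""
    return _USERINFO.sub('://' + placeholder + '@', text)


masked = mask_credentials('Not Found for url: https://sensitive:passw0rd@what.ever/')
untouched = mask_credentials('see https://what.ever/ or mail admin@what.ever')
```

Keeping a visible placeholder makes it obvious in the log that credentials were present, which can help when debugging. A password containing an unencoded `@` would only be partially masked by this pattern.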

Use it like this (the import assumes the class above is saved as sensitive_formatter.py)

import logging
import requests

from sensitive_formatter import SensitiveFormatter

LOG_FORMAT = \
    '%(asctime)s [%(threadName)-16s] %(filename)27s:%(lineno)-4d %(levelname)7s| %(message)s'
logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)

# Don't actually configure your logging like this, just to showcase
# the above answer. :)
for handler in logging.root.handlers:
    handler.setFormatter(SensitiveFormatter(LOG_FORMAT))

log.warning('https://not:shown@httpbin.org/basic-auth/expected-user/expected-pass')
try:
    r = requests.get('https://not:shown@httpbin.org/basic-auth/expected-user/expected-pass')
    r.raise_for_status()
except requests.exceptions.RequestException:
    log.exception('boom!')

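The user and password are stripped from the output. See the sample log below; note that it was captured from a run against a slightly different URL and HTTP method than the snippet above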

$ python log_example.py 
2018-05-18 11:59:22,703 [MainThread      ]                      log.py:14   WARNING| https://httpbin.org/basic-auth/user/secret
2018-05-18 11:59:22,747 [MainThread      ]           connectionpool.py:824    DEBUG| Starting new HTTPS connection (1): httpbin.org
2018-05-18 11:59:23,908 [MainThread      ]           connectionpool.py:396    DEBUG| https://httpbin.org:443 "DELETE /basic-auth/user/secret HTTP/1.1" 405 178
2018-05-18 11:59:23,913 [MainThread      ]                      log.py:19     ERROR| boom!
Traceback (most recent call last):
  File "log.py", line 17, in <module>
    r.raise_for_status()
  File "/Users/vidstige/src/so/venv/lib/python3.6/site-packages/requests/models.py", line 935, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 405 Client Error: METHOD NOT ALLOWED for url: https://httpbin.org/basic-auth/user/secret
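The same behaviour can be checked offline. This is a minimal sketch, assuming a `RuntimeError` as a stand-in for the HTTP error, an in-memory stream instead of the console, and a shortened format string. The formatter is repeated in compact form so the snippet runs on its own.

```python
import io
import logging
import re


class SensitiveFormatter(logging.Formatter):
    """Same idea as the formatter in the answer, repeated to run standalone."""

    def format(self, record):
        # super().format() returns the message plus the rendered traceback.
        return re.sub(r'://.*?@', '://', super().format(record))


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(SensitiveFormatter('%(levelname)s| %(message)s'))

log = logging.getLogger('offline_demo')  # hypothetical logger name
log.addHandler(handler)
log.propagate = False

try:
    # Stand-in for the exception raised by r.raise_for_status().
    raise RuntimeError('405 for url: https://not:shown@httpbin.org/basic-auth/user/secret')
except RuntimeError:
    log.exception('boom!')

output = stream.getvalue()
```

Only handlers that use this formatter are protected: the base `Formatter` caches the unmasked traceback on the record as `record.exc_text`, so another handler with a plain formatter would still print the credentials.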

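Comments:

Can you show what the error log looks like after the password is masked?

@StevenVascellaro Absolutely

Can SensitiveFormatter also be instantiated without arguments?

Yes, it can. The default '%(message)s' will then be used.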
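As the last comment notes, the formatter also works without arguments. A short sketch, building a `LogRecord` by hand (the name `demo` and path `demo.py` are placeholders) so that no handler setup is needed:

```python
import logging
import re


class SensitiveFormatter(logging.Formatter):
    """Compact copy of the formatter from the answer, so this runs standalone."""

    def format(self, record):
        return re.sub(r'://.*?@', '://', super().format(record))


# Hand-built record; name, pathname and lineno are placeholder values.
record = logging.LogRecord(
    name='demo',
    level=logging.WARNING,
    pathname='demo.py',
    lineno=1,
    msg='https://not:shown@httpbin.org/',
    args=None,
    exc_info=None,
)

# With no format string, logging.Formatter falls back to '%(message)s'.
formatted = SensitiveFormatter().format(record)
```

With the default format, `formatted` is just the message text with the credentials removed.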
