Grafana pod keeps restarting after helm install

Posted: 2019-04-13 09:02:39

Question:

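I have a clean AKS cluster on which I deployed the prometheus-operator chart. The Grafana pod shows a large number of restarts. My cluster version is 1.11.3. The Grafana logs are below. Has anyone else run into this?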

File in configmap grafana-dashboard-k8s-node-rsrc-use.json ADDED
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 543, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 302, in _error_catcher
    yield
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 598, in read_chunked
    self._update_chunk_length()
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 547, in _update_chunk_length
    raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/sidecar.py", line 58, in <module>
    main()
  File "/app/sidecar.py", line 54, in main
    watchForChanges(label, targetFolder)
  File "/app/sidecar.py", line 23, in watchForChanges
    for event in w.stream(v1.list_config_map_for_all_namespaces):
  File "/usr/local/lib/python3.6/site-packages/kubernetes/watch/watch.py", line 124, in stream
    for line in iter_resp_lines(resp):
  File "/usr/local/lib/python3.6/site-packages/kubernetes/watch/watch.py", line 45, in iter_resp_lines
    for seg in resp.read_chunked(decode_content=False):
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 626, in read_chunked
    self._original_response.close()
  File "/usr/local/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 320, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

Comments:

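It looks like you have a Python sidecar. Do you have the deployment/pod definition for Grafana?

Yes, there are three containers in the pod: kiwigrid/k8s-sidecar:0.0.3, kiwigrid/k8s-sidecar:0.0.3, grafana/grafana:5.3.1

What did you use to install this? The guide I followed had no sidecar.

helm install stable/prometheus-operator

Answer 1: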

Based on the Prometheus operator repository... the sidecar container in the Grafana pod is unable to contact Grafana and reload/refresh the dashboards defined in the configmaps it is watching.

So this is a symptom of the Grafana container failing... Can you check the logs of the Grafana container itself?
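Before reading logs, it can help to confirm which of the pod's containers is actually restarting. The following is a minimal sketch, assuming `kubectl` is on the PATH and already pointed at the cluster. The function names `container_restarts` and `fetch_pod` are made up for this illustration; the fields read (`status.containerStatuses[].restartCount` and `lastState.terminated`) are part of the standard Pod status.

```python
import json
import subprocess


def container_restarts(pod: dict) -> dict:
    """Map each container name to its restart count and last termination details."""
    summary = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is absent for containers that never exited.
        terminated = status.get("lastState", {}).get("terminated") or {}
        summary[status["name"]] = {
            "restarts": status.get("restartCount", 0),
            "last_reason": terminated.get("reason"),
            "last_exit_code": terminated.get("exitCode"),
        }
    return summary


def fetch_pod(name: str, namespace: str) -> dict:
    """Read the pod object as JSON through kubectl (assumed to be installed)."""
    result = subprocess.run(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `container_restarts(fetch_pod(pod_name, namespace))` with the Grafana pod's name and namespace returns one entry per container. If only a sidecar entry has a non-zero restart count, that points at the sidecar rather than Grafana itself.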

Comments:

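The Grafana container's logs look normal, and I can view the dashboards in a browser. The pod restarts have also leveled off: there were 280 in the first 12 hours or so, and none since. The dashboards seem to be working, but it is a bit unsettling that I still see the failure from the original question in the sidecar container's logs.

Answer 2: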

This can be fixed by updating to a newer version of the sidecar container, since this is a known bug that has already been fixed.
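For context, the traceback in the question ends inside the sidecar's watch loop with an uncaught `urllib3.exceptions.ProtocolError`, so a dropped connection terminates the process and the container restarts. How newer sidecar releases deal with this is not shown in this thread. The sketch below only illustrates the general pattern of reopening a watch stream after a connection error. `watch_with_retry` and its parameters are hypothetical names, and `ConnectionError` stands in for whatever exception the real client raises.

```python
import time


def watch_with_retry(open_stream, handle_event, retriable=(ConnectionError,),
                     max_retries=5, backoff_seconds=1.0):
    """Consume events from open_stream(), reopening it after a retriable error.

    open_stream:  zero-argument callable returning an iterable of events
                  (in the sidecar's case, something like the watch stream).
    handle_event: callable invoked once per event.
    retriable:    exception types treated as "connection dropped"
                  (an assumption; pick the types your client raises).
    """
    failures = 0
    while True:
        try:
            for event in open_stream():
                handle_event(event)
                failures = 0  # progress was made, so reset the failure count
            return  # the stream ended without an error
        except retriable:
            failures += 1
            if failures > max_retries:
                raise  # give up and let the caller (or the kubelet) decide
            time.sleep(backoff_seconds * failures)  # simple linear backoff
```

Reopening a stream can deliver events that were already seen, so a handler used with this pattern should tolerate duplicates.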
