Getting error "Get http://localhost:9443/metrics: dial tcp 127.0.0.1:9443: connect: connection refused"

Posted 2023-02-15

【Asked】: 2019-06-21 04:31:09

【Question】:

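I am trying to set up Prometheus and Grafana with my Hyperledger Fabric v1.4 network to analyze peer and chaincode metrics. Following this documentation, I mapped port 9443 of the peer container to port 9443 on my host. I also changed the provider entry to prometheus under the metrics section of the peer's core.yml. I configured prometheus and grafana in docker-compose.yml as follows.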

  prometheus:
    image: prom/prometheus:v2.6.1
    container_name: prometheus
    volumes:
    - ./prometheus/:/etc/prometheus/
    - prometheus_data:/prometheus
    command:
    - '--config.file=/etc/prometheus/prometheus.yml'
    - '--storage.tsdb.path=/prometheus'
    - '--web.console.libraries=/etc/prometheus/console_libraries'
    - '--web.console.templates=/etc/prometheus/consoles'
    - '--storage.tsdb.retention=200h'
    - '--web.enable-lifecycle'
    restart: unless-stopped
    ports:
    - 9090:9090
    networks:
    - basic
    labels:
      org.label-schema.group: "monitoring"

  grafana:
    image: grafana/grafana:5.4.3
    container_name: grafana
    volumes:
    - grafana_data:/var/lib/grafana
    - ./grafana/datasources:/etc/grafana/datasources
    - ./grafana/dashboards:/etc/grafana/dashboards
    - ./grafana/setup.sh:/setup.sh
    entrypoint: /setup.sh
    environment:
    - GF_SECURITY_ADMIN_USER=ADMIN_USER
    - GF_SECURITY_ADMIN_PASSWORD=ADMIN_PASS
    - GF_USERS_ALLOW_SIGN_UP=false
    restart: unless-stopped
    ports:
    - 3000:3000
    networks:
    - basic
    labels:
      org.label-schema.group: "monitoring"

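When I run curl 0.0.0.0:9443/metrics on the remote CentOS machine, I get the full list of metrics. However, when I run Prometheus with the configuration above, it throws the error Get http://localhost:9443/metrics: dial tcp 127.0.0.1:9443: connect: connection refused. This is what my prometheus.yml looks like.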

global:
  scrape_interval:     15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 10s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'peer_metrics'
    scrape_interval: 10s
    static_configs:
      - targets: ['localhost:9443']

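I even get all the metrics when I open the endpoint http://localhost:9443/metrics in my browser. What am I doing wrong here? Why do the Prometheus metrics show up in its interface but not the peer's?

【Comments】:

【Answer 1】:

Your prometheus container is not running on the host network. It runs on its own bridge (the one created by docker-compose), so localhost inside the container refers to the prometheus container itself, not to the host. The scrape config for the peer should therefore point to the peer container's IP.

Recommended fix:

Run prometheus and grafana in the same network as the Fabric network. In the docker-compose file for your prometheus stack, you can reference it like this: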

networks:
  default:
    external:
      name: <your-hyperledger-network>

(Use docker network ls to find the network name.)

Then you can use http://<peer_container_name>:9443 in your scrape config.
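As a rough sketch of what that scrape config could look like once Prometheus shares a network with the peer: the container name peer0.org1.example.com below is only an assumption for illustration, so substitute the name that docker ps reports for your peer. Note that entries under targets are written as host:port, without the http:// prefix.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: 'peer_metrics'
    scrape_interval: 10s
    static_configs:
      # Hypothetical peer container name; replace with the one shown by `docker ps`.
      # Targets are host:port only; the scheme defaults to http.
      - targets: ['peer0.org1.example.com:9443']
```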

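【Comments】:

I added the prometheus and grafana configuration to docker-compose.yml itself, with networks: basic: driver: bridge at the top. Prometheus works fine, and the target shows as up in the Prometheus interface. However, when I add the data source localhost:9443 in Grafana, it shows HTTP Error Bad Gateway.

When I add networks: default: external: name: basic to docker-compose.yml, I get the error 'Network basic declared as external, but could not be found. Please create the network manually using docker network create basic and try again.'

When I check the network with docker network inspect, I can see that the prometheus and grafana containers belong to the same network as the other Fabric containers.

The network used by the "basic-network" Fabric sample is called "net_basic" (not "basic").

@KartikChauhan In Grafana, you should add only prometheus as the data source, i.e. prometheus:9090

【Answer 2】: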

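Since the targets do not run inside the prometheus container, they cannot be reached via localhost. You need to reach them through the host's private IP, or by replacing localhost with docker.for.mac.localhost or host.docker.internal.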
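A minimal sketch of the host.docker.internal variant, under the assumption that you are on Linux with Docker Engine 20.10 or newer, where the name does not resolve by default and can be mapped with the special host-gateway value. On Docker Desktop the extra_hosts entry should not be needed. The job name and port are taken from the question; everything else is illustrative.

```yaml
# docker-compose.yml (fragment): make host.docker.internal resolvable
# inside the prometheus container (assumed to be needed on Linux only).
services:
  prometheus:
    image: prom/prometheus:v2.6.1
    extra_hosts:
      - "host.docker.internal:host-gateway"
---
# prometheus.yml (fragment): scrape the peer through the port published on the host.
scrape_configs:
  - job_name: 'peer_metrics'
    static_configs:
      - targets: ['host.docker.internal:9443']
```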

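【Comments】:

You should only use host.docker.internal. docker.for.mac.localhost and docker.for.win.localhost are deprecated.

【Answer 3】:

Problem: in Prometheus you added a service to scrape, but at http://localhost:9090/targets the endpoint state is Down with the error:

Get http://localhost:9091/metrics: dial tcp 127.0.0.1:9091: connect: connection refused

Solution: for the target configured in prometheus.yml, verify that the metrics endpoint actually responds: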

curl -v http://<serviceip>:<port>/metrics

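Note: if you are pointing at a service in another docker container, your localhost may need to be written not as localhost but as servicename (the service name shown in docker ps) or host.docker.internal (the internal address of the host that runs the docker containers).

For this example, I will use 2 docker containers: prometheus and "myService".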

sudo docker ps

CONTAINER ID   IMAGE                        CREATED       PORTS                    NAMES
abc123         prom/prometheus:latest       2 hours ago   0.0.0.0:9090->9090/tcp   prometheus
def456         myService/myService:latest   2 hours ago   0.0.0.0:9091->9091/tcp   myService

Then edit the file prometheus.yml (and restart prometheus):

- job_name: myService
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /metrics
  static_configs:
    - targets: # three options; keep the one that is reachable in your setup
      - localhost:9091 # plain localhost
      - host.docker.internal:9091 # the host that runs the docker containers
      - myService:9091 # docker container name (worked in my case)
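For the container-name option to resolve, both containers generally have to sit on the same user-defined Docker network, since Docker's built-in DNS does not resolve container names on the default bridge. The following is a sketch of one way to arrange that with docker-compose; the network name monitoring and the image name for myService are placeholders, not taken from the answer.

```yaml
# docker-compose.yml (sketch): both services join one user-defined network,
# so prometheus can reach the other container as myService:9091.
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    networks:
      - monitoring

  myService:
    image: myservice/myservice:latest  # placeholder image name
    ports:
      - "9091:9091"
    networks:
      - monitoring

networks:
  monitoring:  # hypothetical network name
    driver: bridge
```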

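【Comments】:

On Linux, myService:9090 is probably the preferred way.

【Answer 4】:

As far as I remember, I solved this by downloading the Prometheus node exporter for Windows.

See this link: https://medium.com/@facundofarias/setting-up-a-prometheus-exporter-on-windows-b3e45f1235a5

【Comments】:

【Answer 5】:

Note: this solution does not apply to docker swarm. It is for standalone containers (multi-container) that are meant to run on an overlay network.

We got the same error when using an overlay network. Here is the solution (static, not dynamic).

This configuration does not work: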

global:
  scrape_interval:     15s
  evaluation_interval: 15s

  external_labels:
    monitor: 'promswarm'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    static_configs:
      - targets: [ 'localhost:9100' ]

Even though http://docker.for.mac.localhost:9100/ is available, using it does not help either: prometheus still cannot find node-exporter. So the following does not work either:

global:
  scrape_interval:     15s
  evaluation_interval: 15s

  external_labels:
    monitor: 'promswarm'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']


  - job_name: 'node'
    static_configs:
      - targets: [ 'docker.for.mac.localhost:9100'  ]

But simply by using its container ID, we can reach the service through its port number.

docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED          STATUS          PORTS                                       NAMES
a58264faa1a4   prom/prometheus          "/bin/prometheus --c…"   5 minutes ago    Up 5 minutes    0.0.0.0:9090->9090/tcp, :::9090->9090/tcp   unruffled_solomon
62310f56f64a   grafana/grafana:latest   "/run.sh"                42 minutes ago   Up 42 minutes   0.0.0.0:3000->3000/tcp, :::3000->3000/tcp   wonderful_goldberg
7f1da9796af3   prom/node-exporter       "/bin/node_exporter …"   48 minutes ago   Up 48 minutes   0.0.0.0:9100->9100/tcp, :::9100->9100/tcp   intelligent_panini

So the ID of the prom/node-exporter container is 7f1da9796af3, and we can update the yml file to:

global:
  scrape_interval:     15s
  evaluation_interval: 15s

  external_labels:
    monitor: 'promswarm'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']


  - job_name: 'node'
    static_configs:
      - targets: [ '7f1da9796af3:9100'  ]

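Update

I was not happy with this hard-coded solution myself, so after some more searching I found a more reliable approach: start the container with --network-alias NAME on the overlay network, and it becomes reachable by that name. So the yml looks like this: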

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']


  - job_name: 'node'
    static_configs:
      - targets: [ 'node_exporter:9100' ]

where the name node_exporter is an alias created with the run subcommand. For example:

docker run --rm  -d  -v "/:/host:ro,rslave" --network cloud --network-alias node_exporter --pid host -p 9100:9100   prom/node-exporter  --path.rootfs=/host

In short, on the overlay network cloud, you can reach the node exporter using node_exporter:<PORT>.
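If the exporter is started through docker-compose rather than docker run, the same alias can probably be expressed with the aliases key of the service's network entry. This is only a sketch mirroring the command above, assuming the cloud network already exists and is attachable; the service name node-exporter is illustrative.

```yaml
# docker-compose.yml (sketch): compose counterpart of
# `docker run --network cloud --network-alias node_exporter ...`
services:
  node-exporter:
    image: prom/node-exporter
    command:
      - '--path.rootfs=/host'
    pid: host
    volumes:
      - '/:/host:ro,rslave'
    ports:
      - '9100:9100'
    networks:
      cloud:
        aliases:
          - node_exporter  # name used in the Prometheus target node_exporter:9100

networks:
  cloud:
    external: true  # assumes the overlay network was created beforehand
```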

【Comments】:
