Unable to reach Vault server from Airflow application

【Posted】: 2021-09-24 14:39:46

【Question】:

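I am trying to set up Vault as a secrets backend for Airflow on my local machine using docker-compose, but I cannot establish a connection. I am building on top of the official Airflow docker-compose file. I added Vault as a service and added VAULT_ADDR=http://vault:8200 as an environment variable for the Airflow services.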

In my DAG, I try to fetch a secret from Vault, but the connection is refused.

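While the services are running, I can use the Vault CLI and create secrets, which means Vault itself is working. I also ran docker compose exec -- airflow-webserver curl http://vault:8200 to check whether the problem was in the DAG, but I got the same connection refused error. Finally, I ran docker compose exec -- airflow-webserver curl http://flower:5555 just to check that Docker networking works, and it returned a correct response from the flower service.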
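The two curl checks above can also be reproduced from Python inside an Airflow container, which rules out curl itself as a factor. This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper name, not part of Airflow or Vault.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Calling `can_connect("vault", 8200)` and `can_connect("flower", 5555)` from a Python shell in the airflow-webserver container should mirror the curl results: a refused connection for vault and a successful one for flower.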

# example dag
from airflow.decorators import dag, task
from airflow.hooks.base import BaseHook
from airflow.utils.dates import days_ago

default_args = {
    'owner': 'BooHoo',
}


@dag(default_args=default_args, schedule_interval=None, start_date=days_ago(2), tags=['example'])
def get_secrets():
    @task()
    def get():
        conn = BaseHook.get_connection(conn_id='slack_conn_id')
        print(f"Password: {conn.password}, Login: {conn.login}, URI: {conn.get_uri()}, Host: {conn.host}")

    get()


get_secrets_dag = get_secrets()

Here is the docker compose file.

version: '3'
x-airflow-common:
  &airflow-common
  image: apache/airflow:2.1.0-python3.7
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'   # default is true
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: 'true'
    #    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
    AIRFLOW__SECRETS__BACKEND: 'airflow.providers.hashicorp.secrets.vault.VaultBackend'
    AIRFLOW__SECRETS__BACKEND_KWARGS: '{"connections_path": "connections", "variables_path": "variables", "mount_point": "secrets", "token": "${VAULT_DEV_ROOT_TOKEN_ID}"}'
    VAULT_ADDR: 'http://vault:8200'
    SLACK_WEBHOOK_URL: "$SLACK_WEBHOOK_URL"
  volumes:
    - ./src/dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
  user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
  depends_on:
    redis:
      condition: service_healthy
    postgres:
      condition: service_healthy
    vault:
      condition: service_healthy

services:
  vault:
    image: vault:latest
    ports:
      - "8200:8200"
    environment:
      VAULT_ADDR: 'http://0.0.0.0:8200'
      VAULT_DEV_ROOT_TOKEN_ID: "$VAULT_DEV_ROOT_TOKEN_ID"
    cap_add:
      - IPC_LOCK
    command: vault server -dev
    healthcheck:
      test: [ "CMD", "vault", "status" ]
      interval: 5s
      retries: 5
    restart: always

  postgres:
    # service configuration
    

  redis:
    # service configurations

  airflow-webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - "8080:8080"
    healthcheck:
      test: [ "CMD", "curl", "--fail", "http://localhost:8080/health" ]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    healthcheck:
      test: [ "CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$$HOSTNAME"' ]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

  airflow-worker:
    <<: *airflow-common
    command: celery worker
    healthcheck:
      test:
        - "CMD-SHELL"
        - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$$HOSTNAME"'
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

  airflow-init:
    <<: *airflow-common
    command: version
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}

  flower:
    <<: *airflow-common
    # service configuration

volumes:
  postgres-db-volume:
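Since AIRFLOW__SECRETS__BACKEND_KWARGS has to hold a JSON object, one way to avoid quoting mistakes is to generate the value rather than type it by hand. A minimal sketch, assuming the same keys as in the compose file; `build_backend_kwargs` is a hypothetical helper and the token passed to it is a placeholder.

```python
import json


def build_backend_kwargs(token: str) -> str:
    """Return a JSON object string suitable for AIRFLOW__SECRETS__BACKEND_KWARGS."""
    kwargs = {
        "connections_path": "connections",
        "variables_path": "variables",
        "mount_point": "secrets",
        "token": token,  # placeholder; in the compose file this comes from VAULT_DEV_ROOT_TOKEN_ID
    }
    return json.dumps(kwargs)
```

Printing `build_backend_kwargs("my-dev-token")` gives a string that can be pasted into the environment section, wrapped in single quotes.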

【Comments】:

【Answer 1】:

I think you need to specify the dev listen address in the command:

vault server -dev -dev-listen-address="0.0.0.0:8200"

or set the environment variable

VAULT_DEV_LISTEN_ADDRESS=0.0.0.0:8200

Here are the docs: https://www.vaultproject.io/docs/commands/server#dev-options
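As far as I understand, the reason this helps is that a Vault dev server listens on 127.0.0.1:8200 by default, which is only the container's own loopback interface. The Vault CLI inside the Vault container can reach it, but other containers on the Docker network cannot. The sketch below illustrates the distinction; `is_loopback_only` is a hypothetical helper and it assumes the address is written as an IPv4 literal followed by a port.

```python
import ipaddress


def is_loopback_only(listen_address: str) -> bool:
    """Return True if an "IP:port" listen address binds only to loopback."""
    # Split off the port; assumes an IPv4 literal such as "127.0.0.1:8200".
    host, _, _port = listen_address.rpartition(":")
    return ipaddress.ip_address(host).is_loopback
```

Here `is_loopback_only("127.0.0.1:8200")` is true (the dev default), while `is_loopback_only("0.0.0.0:8200")` is false, because 0.0.0.0 means all interfaces, including the one attached to the Docker network.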

【Comments】:

Thank you very much!
