Use Kubernetes provider on Terraform with private GKE cluster

Posted: 2021-05-24 21:15:41

Question:

I want to use the kubernetes provider on Terraform to interact with a private GKE cluster. I can create the cluster successfully, but I cannot create a namespace: I keep getting a timeout error. Authentication is not the problem, since I can run kubectl ... commands locally without issues. I suspect the problem is related to the cluster being private (all the examples I have found deal with public clusters). Does anyone know how to connect the kubernetes provider to a private GKE cluster?

My main.tf file:

provider "google" 
 project = "<PROJECT_ID>"


variable "cluster_name" 
  default = "<CLUSTER_NAME>"


resource "google_container_cluster" "composer_cluster" 
  name      = var.cluster_name
  location  = "europe-west1-b"

  # Node
  initial_node_count = 1
  node_config 
    disk_size_gb  = 100
    disk_type     = "pd-standard"
    machine_type  = "n1-standard-4"
    metadata      = 
      disable-legacy-endpoints= true
    
    oauth_scopes    = ["https://www.googleapis.com/auth/cloud-platform"]
    service_account = "<SERVICE_ACCOUNT>"
  

  # Network
  network     = "<NETWORK>"
  subnetwork  = "<SUBNETWORK>"

  # IP allocation
  private_cluster_config 
    enable_private_endpoint= true
    enable_private_nodes= true
    master_global_access_config 
      enabled= true
    
    master_ipv4_cidr_block= "172.16.32.0/28"
  
  ip_allocation_policy 
    cluster_ipv4_cidr_block= "10.92.0.0/14"
    services_ipv4_cidr_block= "10.82.240.0/20"
  

  # Security
  enable_kubernetes_alpha= false
  enable_legacy_abac= false
  enable_intranode_visibility= true
  master_authorized_networks_config 
  network_policy 
    enabled= true
    provider= "CALICO"
  
  enable_shielded_nodes= true

  # Timeouts
  timeouts 
    create = "30m"
    update = "40m"
  


data "google_client_config" "current" 

provider "kubernetes" 
  host    = google_container_cluster.composer_cluster.private_cluster_config[0].private_endpoint
  token   = data.google_client_config.current.access_token
  client_certificate = base64decode(google_container_cluster.composer_cluster.master_auth[0].client_certificate)
  client_key = base64decode(google_container_cluster.composer_cluster.master_auth[0].client_key)
  cluster_ca_certificate = base64decode(google_container_cluster.composer_cluster.master_auth[0].cluster_ca_certificate)


resource "null_resource" "get-credentials" 
 depends_on = [google_container_cluster.composer_cluster] 
 provisioner "local-exec" 
   command = "gcloud container clusters get-credentials $google_container_cluster.composer_cluster.name --internal-ip --zone europe-west1-b --project <PROJECT_ID>"
 


resource "kubernetes_namespace" "namespace" 
  metadata 
    labels = 
      app = "create-namespace"
    
    name = "<NAMESPACE>"
  
  depends_on = [null_resource.get-credentials]

Output:

google_container_cluster.composer_cluster: Creating...
google_container_cluster.composer_cluster: Still creating... [10s elapsed]
google_container_cluster.composer_cluster: Still creating... [20s elapsed]
google_container_cluster.composer_cluster: Still creating... [30s elapsed]
google_container_cluster.composer_cluster: Still creating... [40s elapsed]
google_container_cluster.composer_cluster: Still creating... [50s elapsed]
google_container_cluster.composer_cluster: Still creating... [1m0s elapsed]
google_container_cluster.composer_cluster: Still creating... [1m10s elapsed]
google_container_cluster.composer_cluster: Still creating... [1m20s elapsed]
google_container_cluster.composer_cluster: Still creating... [1m30s elapsed]
google_container_cluster.composer_cluster: Still creating... [1m40s elapsed]
google_container_cluster.composer_cluster: Still creating... [1m50s elapsed]
google_container_cluster.composer_cluster: Still creating... [2m0s elapsed]
google_container_cluster.composer_cluster: Still creating... [2m10s elapsed]
google_container_cluster.composer_cluster: Still creating... [2m20s elapsed]
google_container_cluster.composer_cluster: Still creating... [2m30s elapsed]
google_container_cluster.composer_cluster: Still creating... [2m40s elapsed]
google_container_cluster.composer_cluster: Still creating... [2m50s elapsed]
google_container_cluster.composer_cluster: Still creating... [3m0s elapsed]
google_container_cluster.composer_cluster: Still creating... [3m10s elapsed]
google_container_cluster.composer_cluster: Still creating... [3m20s elapsed]
google_container_cluster.composer_cluster: Still creating... [3m30s elapsed]
google_container_cluster.composer_cluster: Still creating... [3m40s elapsed]
google_container_cluster.composer_cluster: Still creating... [3m50s elapsed]
google_container_cluster.composer_cluster: Still creating... [4m0s elapsed]
google_container_cluster.composer_cluster: Still creating... [4m10s elapsed]
google_container_cluster.composer_cluster: Still creating... [4m20s elapsed]
google_container_cluster.composer_cluster: Still creating... [4m30s elapsed]
google_container_cluster.composer_cluster: Still creating... [4m40s elapsed]
google_container_cluster.composer_cluster: Still creating... [4m50s elapsed]
google_container_cluster.composer_cluster: Still creating... [5m0s elapsed]
google_container_cluster.composer_cluster: Creation complete after 5m2s [id=projects/<PROJECT_ID>/locations/europe-west1-b/clusters/<CLUSTER_NAME>]
kubernetes_namespace.namespace: Creating...
kubernetes_namespace.namespace: Still creating... [10s elapsed]
kubernetes_namespace.namespace: Still creating... [20s elapsed]
kubernetes_namespace.namespace: Still creating... [30s elapsed]

Error: Post "https://172.16.32.2/api/v1/namespaces": dial tcp 172.16.32.2:443: i/o timeout

Comments:

Answer 1:

It still seems to be an authentication issue. I have managed to authenticate against the GKE cluster by running the gcloud module:

module "gcloud" 
  source  = "terraform-google-modules/gcloud/google"
  version = "~> 0.5"

  platform = "linux"

  create_cmd_entrypoint  = "gcloud"
  create_cmd_body        = "container clusters get-credentials $google_container_cluster.composer_cluster.name --region=$var.zone"


provider "kubernetes" 
# the authorization is handled by running gcloud clusters get-credentials using the gcloud terraform module


resource "kubernetes_deployment" "main" 
....


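One caveat about the empty provider "kubernetes" block above: depending on the provider version (2.x no longer reads ~/.kube/config by default), the kubeconfig written by get-credentials may need to be referenced explicitly. A minimal sketch, assuming the kubeconfig sits in its usual location:

provider "kubernetes" {
  # Point the provider at the kubeconfig produced by
  # "gcloud container clusters get-credentials".
  # Setting the KUBE_CONFIG_PATH environment variable works as well.
  config_path = "~/.kube/config"
}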
Discussion:

Are you using this with a private GKE cluster? I noticed there is no --internal-ip flag in your gcloud command. I tried the gcloud module with internal_ip = "true" but had no luck, it still times out. Any ideas?

Yes, we use a private GKE cluster for every project, but we still have public endpoints with authorised networks, so we don't need the internal_ip = "true" flag.
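To illustrate the setup described in the comment above (private nodes, but a public control-plane endpoint restricted by authorized networks), here is a rough sketch on the cluster resource; the CIDR block and display name are placeholders, not values from the question:

resource "google_container_cluster" "composer_cluster" {
  # ... other arguments as in the question ...

  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false # keep the public endpoint reachable
    master_ipv4_cidr_block  = "172.16.32.0/28"
  }

  master_authorized_networks_config {
    cidr_blocks {
      cidr_block   = "203.0.113.0/24" # placeholder: the network Terraform runs from
      display_name = "terraform-runner"
    }
  }
}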
