Keycloak Kubernetes GKE NGINX Ingress - Session gets lost after pod restart on page reload and returns 502 Bad Gateway

Posted: 2021-01-25 03:34:22

Question:

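I set up a Keycloak cluster in GKE and use nginx as the ingress controller. I used the codecentric Helm chart (Keycloak Helm Chart): https://github.com/codecentric/helm-charts/tree/master/charts/keycloak

I am using JDBC_PING for JGroups, with the CLI script and Ingress configuration below. I set replicas to 2. When I kill a pod, the session stays available and everything works: I can navigate the Keycloak admin console and do everything. But when I press F5 to reload the page, I get a 502 Bad Gateway error. Sometimes it recovers and I can reload and everything is fine, but sometimes I have to delete the cookies completely to get it working again.

I am not sure where the problem is.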

Cookies in the browser: [screenshot]

MySQL table JGROUPSPING: [screenshot]

Ingress annotations:

  annotations: 
    kubernetes.io/ingress.class: "nginx"
    kubernetes.io/tls-acme: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "false"
    nginx.ingress.kubernetes.io/limit-rate: "150"
    nginx.ingress.kubernetes.io/limit-rps: "150"
    nginx.ingress.kubernetes.io/session-cookie-change-on-failure: "true"
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "route"
    nginx.ingress.kubernetes.io/session-cookie-expires: "21600"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "21600"
    nginx.ingress.kubernetes.io/server-snippet: |
      location /auth/realms/master/metrics {
          return 403;
      }

Extra environment:

# Additional environment variables for Keycloak
extraEnv: |
  - name: KEYCLOAK_STATISTICS
    value: all
  - name: PROXY_ADDRESS_FORWARDING
    value: "true"
  - name: KEYCLOAK_USER
    value: '{{ .Values.ADMIN_USER }}'
  - name: KEYCLOAK_PASSWORD
    value: '{{ .Values.ADMIN_PASS }}'
  - name: JAVA_OPTS
    value: >-
      -XX:+UseContainerSupport
      -XX:MaxRAMPercentage=50.0
      -Djava.net.preferIPv4Stack=true
      -Djboss.modules.system.pkgs=$JBOSS_MODULES_SYSTEM_PKGS
      -Djava.awt.headless=true
  - name: JGROUPS_DISCOVERY_PROTOCOL
    value: JDBC_PING  
  - name: CACHE_OWNERS_COUNT
    value: "2"
  - name: CACHE_OWNERS_AUTH_SESSIONS_COUNT
    value: "2"
  - name: DB_VENDOR
    value: mysql
  - name: DB_ADDR
    value: "127.0.0.1"
  - name: DB_PORT
    value: "3306"
  - name: DB_DATABASE
    value: keycloak_prod
  - name: DB_USER
    value: '{{ .Values.SQL_USER }}'
  - name: DB_PASSWORD
    value: '{{ .Values.SQL_PASS }}'

Keycloak CLI script:

embed-server --server-config=standalone-ha.xml --std-out=echo
batch


echo Configuring node identifier

## Sets the node identifier to the node name (= pod name). Node identifiers have to be unique. They can have a
## maximum length of 23 characters. Thus, the chart's fullname template truncates its length accordingly.
/subsystem=transactions:write-attribute(name=node-identifier, value=${jboss.node.name})
echo NodeName: ${jboss.node.name}
echo Finished configuring node identifier

echo CUSTOM_CONFIG: executing CONFIG FOR K8S Failover Support


echo "------------------------------------------------------------------------------------------------------------"
echo "---------------------------------CUSTOM STARTUP CONFIG------------------------------------------------------"
echo "------------------------------------------------------------------------------------------------------------"

## JDBC PING

/subsystem=infinispan/cache-container=keycloak/distributed-cache=sessions:write-attribute(name=owners, value=${env.CACHE_OWNERS_COUNT:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=authenticationSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS_COUNT:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=offlineSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS_COUNT:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=loginFailures:write-attribute(name=owners, value=${env.CACHE_OWNERS_COUNT:2})

/subsystem=jgroups/stack=tcp:remove()
/subsystem=jgroups/stack=tcp:add()
/subsystem=jgroups/stack=tcp/transport=TCP:add(socket-binding="jgroups-tcp")
/subsystem=jgroups/stack=tcp/protocol=JDBC_PING:add()
/subsystem=jgroups/stack=tcp/protocol=JDBC_PING/property=datasource_jndi_name:add(value=java:jboss/datasources/KeycloakDS)

/subsystem=jgroups/stack=tcp/protocol=JDBC_PING/property=initialize_sql:add(value="CREATE TABLE IF NOT EXISTS JGROUPSPING (own_addr varchar(200) NOT NULL, cluster_name varchar(200) NOT NULL, updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, ping_data varbinary(5000) DEFAULT NULL, PRIMARY KEY (own_addr, cluster_name)) ENGINE=InnoDB DEFAULT CHARSET=utf8")

/subsystem=jgroups/stack=tcp/protocol=MERGE3:add()
/subsystem=jgroups/stack=tcp/protocol=FD_SOCK:add(socket-binding="jgroups-tcp-fd")
/subsystem=jgroups/stack=tcp/protocol=FD:add()
/subsystem=jgroups/stack=tcp/protocol=VERIFY_SUSPECT:add()
/subsystem=jgroups/stack=tcp/protocol=pbcast.NAKACK2:add()
/subsystem=jgroups/stack=tcp/protocol=UNICAST3:add()
/subsystem=jgroups/stack=tcp/protocol=pbcast.STABLE:add()
/subsystem=jgroups/stack=tcp/protocol=pbcast.GMS:add()
/subsystem=jgroups/stack=tcp/protocol=pbcast.GMS/property=max_join_attempts:add(value=5)
/subsystem=jgroups/stack=tcp/protocol=MFC:add()
/subsystem=jgroups/stack=tcp/protocol=FRAG3:add()

/subsystem=jgroups/stack=udp:remove()
/subsystem=jgroups/channel=ee:write-attribute(name=stack, value=tcp)
/socket-binding-group=standard-sockets/socket-binding=jgroups-mping:remove()


## Cache Setup for Failover
/subsystem=infinispan/cache-container=keycloak/distributed-cache=sessions:remove()
/subsystem=infinispan/cache-container=keycloak/distributed-cache=authenticationSessions:remove()
/subsystem=infinispan/cache-container=keycloak/distributed-cache=offlineSessions:remove()
/subsystem=infinispan/cache-container=keycloak/distributed-cache=clientSessions:remove()
/subsystem=infinispan/cache-container=keycloak/distributed-cache=offlineClientSessions:remove()
/subsystem=infinispan/cache-container=keycloak/distributed-cache=loginFailures:remove()

/subsystem=infinispan/cache-container=keycloak/replicated-cache=sessions:add()
/subsystem=infinispan/cache-container=keycloak/replicated-cache=authenticationSessions:add()
/subsystem=infinispan/cache-container=keycloak/replicated-cache=offlineSessions:add()
/subsystem=infinispan/cache-container=keycloak/replicated-cache=clientSessions:add()
/subsystem=infinispan/cache-container=keycloak/replicated-cache=offlineClientSessions:add()
/subsystem=infinispan/cache-container=keycloak/replicated-cache=loginFailures:add()

echo "------------------------------------------------------------------------------------------------------------"
echo "---------------------------------CUSTOM STARTUP CONFIG DONE!------------------------------------------------"
echo "------------------------------------------------------------------------------------------------------------"

run-batch

try
    :resolve-expression(expression=${env.JGROUPS_DISCOVERY_EXTERNAL_IP})
    /subsystem=jgroups/stack=tcp/transport=TCP/property=external_addr/:add(value=${env.JGROUPS_DISCOVERY_EXTERNAL_IP})
catch
    echo "JGROUPS_DISCOVERY_EXTERNAL_IP maybe not set."
end-try

stop-embedded-server

Logs of the restarted pod: log-restarted-pod.txt

Logs of the pod that kept running: log-still-running-pod.txt

Comments:

Were you able to resolve this? If so, please share what you did to fix it. I am facing the same issue: when I refresh the Keycloak page, it returns 502 Bad Gateway.

Answer 1:

I managed to resolve this issue. We need to add the following annotation to the ingress.yaml file.

nginx.ingress.kubernetes.io/proxy-buffer-size: "12k"
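
For context, here is a minimal sketch of where that annotation could sit next to the annotations from the question. The `ingress:` wrapper key is an assumption about the values file layout and is not taken from the answer; only the `proxy-buffer-size` line is new. A plausible explanation, offered as an assumption rather than something confirmed in this thread: Keycloak can send large response headers (for example several cookies), and when they exceed the NGINX proxy buffer, NGINX typically logs "upstream sent too big header while reading response header from upstream" and answers with 502.

```yaml
# Sketch only: the `ingress:` key is an assumed values layout, not from the answer.
ingress:
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "route"
    # Value suggested in the answer; it raises the buffer used for upstream response headers.
    nginx.ingress.kubernetes.io/proxy-buffer-size: "12k"
```

If the 502 persists, checking the ingress controller logs for the message quoted above may help confirm whether header size is the cause.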

Comments:

I think the chart itself now ships this as a default: nginx.ingress.kubernetes.io/proxy-buffer-size: "128k". Thanks for your answer, though.
