Google Container Engine (Kubernetes): Websocket (Socket.io) not working on multiple replicas

Posted: 2017-04-06 05:18:31

Question:

I'm new to Google Container Engine (GKE). When running on localhost, everything works fine, but when I deploy to production with GKE, I get a websocket error.

My Node app is built with Hapi.js and Socket.io, and my structure is shown below.

[Figure: Application Architecture]

I'm using Glue to compose the Hapi server. Below is my manifest.json:


...
"connections": [
    {
      "host": "app",
      "address": "0.0.0.0",
      "port": 8000,
      "labels": ["api"],
      "routes": {
        "cors": false,
        "security": {
          "hsts": false,
          "xframe": true,
          "xss": true,
          "noOpen": true,
          "noSniff": true
        }
      },
      "router": {
        "stripTrailingSlash": true
      },
      "load": {
        "maxHeapUsedBytes": 1073741824,
        "maxRssBytes": 1610612736,
        "maxEventLoopDelay": 5000
      }
    },
    {
      "host": "app",
      "address": "0.0.0.0",
      "port": 8099,
      "labels": ["web"],
      "routes": {
        "cors": true,
        "security": {
          "hsts": false,
          "xframe": true,
          "xss": true,
          "noOpen": true,
          "noSniff": true
        }
      },
      "router": {
        "stripTrailingSlash": true
      },
      "load": {
        "maxHeapUsedBytes": 1073741824,
        "maxRssBytes": 1610612736,
        "maxEventLoopDelay": 5000
      }
    },
    {
      "host": "app",
      "address": "0.0.0.0",
      "port": 8999,
      "labels": ["admin"],
      "routes": {
        "cors": true,
        "security": {
          "hsts": false,
          "xframe": true,
          "xss": true,
          "noOpen": true,
          "noSniff": true
        }
      },
      "router": {
        "stripTrailingSlash": true
      },
      "load": {
        "maxHeapUsedBytes": 1073741824,
        "maxRssBytes": 1610612736,
        "maxEventLoopDelay": 5000
      },
      "state": {
        "ttl": null,
        "isSecure": false,
        "isHttpOnly": true,
        "path": null,
        "domain": null,
        "encoding": "none",
        "clearInvalid": false,
        "strictHeader": true
      }
    }
  ],
...

And my nginx.conf:

worker_processes                5; ## Default: 1
worker_rlimit_nofile            8192;
error_log                       /dev/stdout info;

events {
  worker_connections            4096; ## Default: 1024
}

http {
    access_log                  /dev/stdout;

    server {
        listen                  80          default_server;
        listen                  [::]:80     default_server;

        # Redirect all HTTP requests to HTTPS with a 301 Moved Permanently response.
        return                  301         https://$host$request_uri;
    }

    server {
        listen                  443         ssl default_server;
        listen                  [::]:443    ssl default_server;
        server_name             _;

        # Configure ssl
        ssl_certificate         /etc/secret/ssl/myapp.com.csr;
        ssl_certificate_key     /etc/secret/ssl/myapp.com.key;
        include                 /etc/nginx/ssl-params.conf;
    }

    server {
        listen                  443         ssl;
        listen                  [::]:443    ssl;
        server_name             api.myapp.com;

        location / {
            proxy_pass          http://api_app/;
            proxy_set_header    Host                $http_host;
            proxy_set_header    X-Real-IP           $remote_addr;
            proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;

            # Handle Web Socket connections
            proxy_http_version  1.1;
            proxy_set_header    Upgrade     $http_upgrade;
            proxy_set_header    Connection  "upgrade";
        }
    }

    server {
        listen                  443         ssl;
        listen                  [::]:443    ssl;
        server_name             myapp.com;

        location / {
            proxy_pass          http://web_app/;
            proxy_set_header    Host                $http_host;
            proxy_set_header    X-Real-IP           $remote_addr;
            proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;

            # Handle Web Socket connections
            proxy_http_version  1.1;
            proxy_set_header    Upgrade     $http_upgrade;
            proxy_set_header    Connection  "upgrade";
        }
    }

    server {
        listen                  443         ssl;
        listen                  [::]:443    ssl;
        server_name             admin.myapp.com;

        location / {
            proxy_pass          http://admin_app/;
            proxy_set_header    Host                $http_host;
            proxy_set_header    X-Real-IP           $remote_addr;
            proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;

            # Handle Web Socket connections
            proxy_http_version  1.1;
            proxy_set_header    Upgrade     $http_upgrade;
            proxy_set_header    Connection  "upgrade";
        }
    }

    # Define your "upstream" servers - the
    # servers request will be sent to
    upstream api_app {
        server                  localhost:8000;
    }

    upstream web_app {
        server                  localhost:8099;
    }

    upstream admin_app {
        server                  localhost:8999;
    }
}

Kubernetes service app-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: app-nginx
  labels:
    app: app-nginx
spec:
  type: LoadBalancer
  ports:
    # The port that this service should serve on.
    - port: 80
      targetPort: 80
      protocol: TCP
      name: http
    - port: 443
      targetPort: 443
      protocol: TCP
      name: https
  # Label keys and values that must match in order to receive traffic for this service.
  selector:
    app: app-nginx

Kubernetes deployment app-deployment.yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: app-nginx
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: app-nginx
    spec:
      containers:
        - name: nginx
          image: us.gcr.io/myproject/nginx
          ports:
            - containerPort: 80
              name: http
            - containerPort: 443
              name: https
          volumeMounts:
              # This name must match the volumes.name below.
            - name: ssl-secret
              readOnly: true
              mountPath: /etc/secret/ssl
        - name: app
          image: us.gcr.io/myproject/bts-server
          ports:
            - containerPort: 8000
              name: api
            - containerPort: 8099
              name: web
            - containerPort: 8999
              name: admin
          volumeMounts:
              # This name must match the volumes.name below.
            - name: client-secret
              readOnly: true
              mountPath: /etc/secret/client
            - name: admin-secret
              readOnly: true
              mountPath: /etc/secret/admin
      volumes:
        - name: ssl-secret
          secret:
            secretName: ssl-key-secret
        - name: client-secret
          secret:
            secretName: client-key-secret
        - name: admin-secret
          secret:
            secretName: admin-key-secret

I'm using Cloudflare SSL in Full (strict) mode.

Errors from the browser console:

WebSocket connection to 'wss://api.myapp.com/socket.io/?EIO=3&transport=websocket&sid=4Ky-y9K7J0XotrBFAAAQ' failed: WebSocket is closed before the connection is established.
https://api.myapp.com/socket.io/?EIO=3&transport=polling&t=LYByND2&sid=4Ky-y9K7J0XotrBFAAAQ Failed to load resource: the server responded with a status of 400 ()
VM50:35 WebSocket connection to 'wss://api.myapp.com/socket.io/?EIO=3&transport=websocket&sid=FsCGx-UE7ohrsSSqAAAT' failed: Error during WebSocket handshake: Unexpected response code: 502
    WrappedWebSocket @ VM50:35
    WS.doOpen @ socket.io.js:6605
    Transport.open @ socket.io.js:4695
    Socket.probe @ socket.io.js:3465
    Socket.onOpen @ socket.io.js:3486
    Socket.onHandshake @ socket.io.js:3546
    Socket.onPacket @ socket.io.js:3508
    (anonymous function) @ socket.io.js:3341
    Emitter.emit @ socket.io.js:6102
    Transport.onPacket @ socket.io.js:4760
    callback @ socket.io.js:4510
    (anonymous function) @ socket.io.js:5385
    exports.decodePayloadAsBinary @ socket.io.js:5384
    exports.decodePayload @ socket.io.js:5152
    Polling.onData @ socket.io.js:4514
    (anonymous function) @ socket.io.js:4070
    Emitter.emit @ socket.io.js:6102
    Request.onData @ socket.io.js:4231
    Request.onLoad @ socket.io.js:4312
    xhr.onreadystatechange @ socket.io.js:4184
socket.io.js:4196 GET https://api.myapp.com/socket.io/?EIO=3&transport=polling&t=LYByNpy&sid=FsCGx-UE7ohrsSSqAAAT 400 ()

Here are the Nginx logs:

[22/Nov/2016:12:10:19 +0000] "GET /socket.io/?EIO=3&transport=websocket&sid=MGc--oncQbQI6NOZAAAX HTTP/1.1" 101 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
10.8.0.1 - - [22/Nov/2016:12:10:19 +0000] "POST /socket.io/?EIO=3&transport=polling&t=LYByQBw&sid=MGc--oncQbQI6NOZAAAX HTTP/1.1" 200 2 "https://myapp.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
10.128.0.2 - - [22/Nov/2016:12:10:20 +0000] "GET /socket.io/?EIO=3&transport=polling&t=LYByQKp HTTP/1.1" 200 101 "https://myapp.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
10.8.0.1 - - [22/Nov/2016:12:10:21 +0000] "GET /socket.io/?EIO=3&transport=polling&t=LYByQWo&sid=c5nkusT9fEPRsu2rAAAY HTTP/1.1" 200 24 "https://myapp.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
2016/11/22 12:10:21 [error] 6#6: *157 connect() failed (111: Connection refused) while connecting to upstream, client: 10.8.0.1, server: api.myapp.com, request: "GET /socket.io/?EIO=3&transport=polling&t=LYByQaN&sid=c5nkusT9fEPRsu2rAAAY HTTP/1.1", upstream: "http://[::1]:8000/socket.io/?EIO=3&transport=polling&t=LYByQaN&sid=c5nkusT9fEPRsu2rAAAY", host: "api.myapp.com", referrer: "https://myapp.com/"
2016/11/22 12:10:21 [warn] 6#6: *157 upstream server temporarily disabled while connecting to upstream, client: 10.8.0.1, server: api.myapp.com, request: "GET /socket.io/?EIO=3&transport=polling&t=LYByQaN&sid=c5nkusT9fEPRsu2rAAAY HTTP/1.1", upstream: "http://[::1]:8000/socket.io/?EIO=3&transport=polling&t=LYByQaN&sid=c5nkusT9fEPRsu2rAAAY", host: "api.myapp.com", referrer: "https://myapp.com/"
10.8.0.1 - - [22/Nov/2016:12:10:22 +0000] "GET /socket.io/?EIO=3&transport=polling&t=LYByQaN&sid=c5nkusT9fEPRsu2rAAAY HTTP/1.1" 200 4 "https://myapp.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"

Update

When I change replicas to 1 in app-deployment.yaml, it works. But I don't think that is a good solution. I need 3 replicas.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: app-nginx
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: app-nginx

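How can I make it work with 3 replicas?

Comments:

Does NGINX have any ingress rules defined? Keep in mind that the L7 LB doesn't have websocket support yet. Adding kind: Ingress metadata: annotations: kubernetes.io/ingress.class: "nginx" should do it.

Thanks @George. I don't have an Ingress. Could you elaborate on how to make it work in my case?

@George, I use an L4 LB. So I think the problem is sessionAffinity. Can we define the Load distribution algorithm for a Kubernetes service? cloud.google.com/compute/docs/load-balancing/network/…

I apologize for the confusion. The fact that 1 of the 3 replicas is working is a bit odd, but could you rewrite your nginx config to use an IP instead of DNS (e.g. 127.0.0.1 instead of "localhost") and make sure the app is up and listening on the specified port?

Sorry for the late reply, @George. I'll try that. But sessionAffinity can be set in the Service.

Answer 1: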

After I updated the Kubernetes service template to use sessionAffinity: ClientIP, it works now. There is still an error the first time I press Ctrl + F5, but the second time it is fine.

Error during WebSocket handshake: Unexpected response code: 400

However, I still get data from the server, so I think it's okay.

Updated service template:

apiVersion: v1
kind: Service
metadata:
  name: app-nginx
  labels:
    app: app-nginx
spec:
  sessionAffinity: ClientIP
  type: LoadBalancer
  ports:
    # The port that this service should serve on.
    - port: 80
      targetPort: 80
      protocol: TCP
      name: http
    - port: 443
      targetPort: 443
      protocol: TCP
      name: https
  # Label keys and values that must match in order
  # to receive traffic for this service.
  selector:
    app: app-nginx
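
To see why this setting helps, here is a minimal sketch in plain Node.js; it is a toy model, not Socket.io itself. It assumes the behaviour Socket.io documents for its default setup: a client first handshakes over HTTP long-polling, the server that answered keeps the session id (sid) in its own memory, and a request carrying a sid that the server does not know gets a 400. The Replica class, the round-robin picker, and the hash inside clientIpAffinity are hypothetical stand-ins for the three pods and the load balancer; the real balancer's algorithm may differ.

```javascript
// Toy model (assumption): each replica stores its sessions only in its own memory.
class Replica {
  constructor(name) {
    this.name = name;
    this.sessions = new Set();
    this.nextId = 0;
  }
  // Handshake: create a session id that only this replica knows about.
  handshake() {
    const sid = `${this.name}-${this.nextId++}`;
    this.sessions.add(sid);
    return sid;
  }
  // Follow-up polling request: 200 if the sid is known here, 400 otherwise.
  poll(sid) {
    return this.sessions.has(sid) ? 200 : 400;
  }
}

// No affinity (assumed round-robin): every request goes to the next replica.
function roundRobin(replicas) {
  let i = 0;
  return () => replicas[i++ % replicas.length];
}

// Client-IP affinity (hypothetical hash): the same IP always maps to the same replica.
function clientIpAffinity(replicas) {
  return (ip) => {
    let h = 0;
    for (const ch of ip) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
    return replicas[h % replicas.length];
  };
}

// One client: a handshake followed by `polls` requests that reuse the sid.
function runSession(pick, ip, polls) {
  const sid = pick(ip).handshake();
  const statuses = [];
  for (let n = 0; n < polls; n++) statuses.push(pick(ip).poll(sid));
  return statuses;
}

const makeReplicas = () => ['a', 'b', 'c'].map((name) => new Replica(name));

const withoutAffinity = runSession(roundRobin(makeReplicas()), '10.8.0.1', 3);
const withAffinity = runSession(clientIpAffinity(makeReplicas()), '10.8.0.1', 3);

console.log(withoutAffinity); // [ 400, 400, 200 ]
console.log(withAffinity);    // [ 200, 200, 200 ]
```

Without affinity, the requests that follow the handshake land on replicas that never saw the sid, which is consistent with the 400 responses in the browser console. With a single replica, or with every request from one client pinned to the same pod, the sid is always found.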

Comments:

Thanks! I had the same problem, and "sessionAffinity: ClientIP" solved it. Your post saved me time.

@NikhilMaheshwari, glad to hear that.

Awesome!!!!!! Saved my day.
