502 Bad Gateway from relocated instance on AWS EC2

【Posted】2015-06-10 11:27:43 【Question】:

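I was trying to move some instances from the Tokyo region (where everything works fine, by the way) to the São Paulo region, and I followed these basic steps to do it. But when I launch an instance from the generated AMI and open it in the browser, it shows a "502 Bad Gateway" message.

The main components on the relocated server are: nginx, uwsgi, django, supervisor, new relic.

All the configuration on the relocated server is identical to the original, so I restarted every service. nginx seems to be running fine; this is the configuration file it applies for my site: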

nginx/sites-available/mysite

server {
    listen 80;
    server_name mysite.com;

    access_log /var/log/nginx/site_access.log;
    error_log /var/log/nginx/site_error.log;

    location /static {
        alias  /home/ubuntu/apps/site/static/;
    }

    location /media/ {
        alias /home/ubuntu/apps/site/media/;
    }

    location / {
        client_max_body_size 400M;
        proxy_read_timeout 120;
        proxy_connect_timeout 120;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Client-IP $remote_addr;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_pass http://127.0.0.1:8888;
        proxy_buffering off;
    }
}

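To be honest, I expected this to work, because http://127.0.0.1:8888 is up, but I don't understand why the nginx connection is broken. I need some help so I can investigate further. What else am I forgetting?

Update:

Well... following @Michael - sqlbot's suggestion I checked the log files, and this is what I found in this one: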

/var/log/nginx/site_error.log

2015/04/06 15:34:31 [error] 832#0: *12 connect() failed (111: Connection refused)
while connecting to upstream, client:
190.233.157.2, server: mysite.com, request: "GET /favicon.ico HTTP/1.1", upstream:
"http://127.0.0.1:8888/favicon.ico", host: "54.207.136.99"

I verified the connection again, and this is what it showed me:

$ ping 127.0.0.1

PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_req=1 ttl=64 time=0.035 ms
64 bytes from 127.0.0.1: icmp_req=2 ttl=64 time=0.028 ms
64 bytes from 127.0.0.1: icmp_req=3 ttl=64 time=0.028 ms
64 bytes from 127.0.0.1: icmp_req=4 ttl=64 time=0.026 ms
--- 127.0.0.1 ping statistics ---

Then I tried curl and, after about 30 seconds, it printed the following:

$ curl 127.0.0.1:8888

curl: (56) Recv failure: Connection reset by peer

I get this strange error. What does it mean?
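A note on the curl result: "Connection reset by peer" means the TCP connection was opened and then dropped by the other side. That is a different failure from the "Connection refused" in the nginx log, where nothing accepted the connection at all. The Python sketch below is one way to tell these cases apart from the server itself. The function name probe_tcp and its return labels are made up for this illustration; only the standard socket module is used.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Connect to host:port, send a minimal HTTP request, and classify the outcome.

    The returned labels are illustrative names chosen for this sketch.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            conn.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
            data = conn.recv(1024)
    except ConnectionRefusedError:
        # Nothing is listening on the port (errno 111 on Linux).
        return "refused"
    except ConnectionResetError:
        # Something accepted the connection, then dropped it.
        return "reset"
    except socket.timeout:
        # The port is open but no reply arrived within the timeout.
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc
    return "answered" if data else "closed-without-reply"
```

Calling probe_tcp("127.0.0.1", 8888) on the relocated instance would show which of these cases applies to the upstream that nginx proxies to.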

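Update 2:

There is a uwsgi configuration file for mysite along with its log files, but the messages are the same as on the Tokyo server (which works), so I'm ruling out uwsgi as the problem: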

/etc/uwsgi/apps-enabled/mysite.ini

[uwsgi]
vhost = true
plugins = python
socket = /tmp/mysite.sock
master = true
enable-threads = true
processes = 2
wsgi-file = /home/ubuntu/apps/mysite/mysite/wsgi.py
virtualenv = /home/ubuntu/.venv/mysite
chdir = /home/ubuntu/apps/mysite
touch-reload = /home/ubuntu/apps/mysite/reload

/var/log/uwsgi/app/mysite.log

[uWSGI] getting INI configuration from /usr/share/uwsgi/conf/default.ini
[uWSGI] getting INI configuration from /etc/uwsgi/apps-enabled/mysite.ini
Sun Apr 12 18:29:55 2015 - *** Starting uWSGI 1.0.3-debian (64bit) on [Sun Apr 12 18:29:55 2015] ***
Sun Apr 12 18:29:55 2015 - compiled with version: 4.6.3 on 17 July 2012 02:26:54
Sun Apr 12 18:29:55 2015 - current working directory: /
Sun Apr 12 18:29:55 2015 - writing pidfile to /run/uwsgi/app/mysite/pid
Sun Apr 12 18:29:55 2015 - detected binary path: /usr/bin/uwsgi-core
Sun Apr 12 18:29:55 2015 - setgid() to 33
Sun Apr 12 18:29:55 2015 - setuid() to 33
Sun Apr 12 18:29:55 2015 - your memory page size is 4096 bytes
Sun Apr 12 18:29:55 2015 - VirtualHosting mode enabled.
Sun Apr 12 18:29:55 2015 - uwsgi socket 0 bound to UNIX address /run/uwsgi/app/mysite/socket fd 5
Sun Apr 12 18:29:55 2015 - uwsgi socket 1 bound to UNIX address /tmp/mysite.sock fd 6
Sun Apr 12 18:29:55 2015 - Python version: 2.7.3 (default, Aug  1 2012, 05:25:23)  [GCC 4.6.3]
Sun Apr 12 18:29:55 2015 - Set PythonHome to /home/ubuntu/.venv/mysite
Sun Apr 12 18:29:55 2015 - Python main interpreter initialized at 0x916120
Sun Apr 12 18:29:55 2015 - threads support enabled
Sun Apr 12 18:29:55 2015 - your server socket listen backlog is limited to 100 connections
Sun Apr 12 18:29:55 2015 - *** Operational MODE: preforking ***
Sun Apr 12 18:29:57 2015 - WSGI application 0 (mountpoint='') ready on interpreter 0x916120 pid: 1137 (default app)
Sun Apr 12 18:29:57 2015 - *** uWSGI is running in multiple interpreter mode ***
Sun Apr 12 18:29:57 2015 - spawned uWSGI master process (pid: 1137)
Sun Apr 12 18:29:57 2015 - spawned uWSGI worker 1 (pid: 1236, cores: 1)
Sun Apr 12 18:29:57 2015 - spawned uWSGI worker 2 (pid: 1237, cores: 1)
Sun Apr 12 18:29:57 2015 - unable to stat() /home/ubuntu/apps/mysite/reload, reload will be triggered as soon as the file is created

Update 3:

I ran netstat -nap -p | grep 8888 and it showed me:

tcp        0      0 127.0.0.1:8888          0.0.0.0:*               LISTEN      7550/python

Then I ran ps aux | grep 7550 and...

ubuntu    7550  2.4  0.4  65752 15568 ?        S    21:44   0:00 /home/ubuntu/.venv/mysite/bin/python /home/ubuntu/.venv/mysite/bin/gunicorn_django -w 3 --user=ubuntu --group=ubuntu --log-level=debug --timeout 120 --log-file=/var/log/gunicorn/mysite.log -b 127.0.0.1:8888
ubuntu    7585  0.0  0.0   8104   924 pts/1    S+   21:44   0:00 grep --color=auto 7550

Hmm, I checked cat /var/log/gunicorn/mysite.log and got this:

Traceback (most recent call last):
  File "/home/ubuntu/.venv/mysite/bin/gunicorn_django", line 8, in <module>
    load_entry_point('gunicorn==0.14.6', 'console_scripts', 'gunicorn_django')()
  File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/app/djangoapp.py", line 132, in run
    DjangoApplication("%prog [OPTIONS] [SETTINGS_PATH]").run()
  File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 124, in run
    Arbiter(self).run()
  File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 185, in run
    self.halt(reason=inst.reason, exit_status=inst.exit_status)
  File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 280, in halt
    self.stop()
  File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 328, in stop
    self.reap_workers()
  File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 419, in reap_workers
    raise HaltServer(reason, self.WORKER_BOOT_ERROR)
gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>
2015-04-12 21:44:36 [7550] [INFO] Starting gunicorn 0.14.6
2015-04-12 21:44:36 [7550] [DEBUG] Arbiter booted
2015-04-12 21:44:36 [7550] [INFO] Listening at: http://127.0.0.1:8888 (7550)
2015-04-12 21:44:36 [7550] [INFO] Using worker: sync
2015-04-12 21:44:36 [7558] [INFO] Booting worker with pid: 7558
2015-04-12 21:44:36 [7559] [INFO] Booting worker with pid: 7559
2015-04-12 21:44:36 [7560] [INFO] Booting worker with pid: 7560
Production environment is up!
Production environment is up!
Production environment is up!

Hmm... gunicorn seems to be failing (it lives in the virtualenv), so I ran it in debug mode to check:

gunicorn mysite.wsgi:application --preload --debug --log-level debug

2015-04-12 22:32:42 [9085] [DEBUG] Current configuration:
2015-04-12 22:32:42 [9085] [DEBUG]   access_log_format: "%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"
2015-04-12 22:32:42 [9085] [DEBUG]   accesslog: None
2015-04-12 22:32:42 [9085] [DEBUG]   backlog: 2048
2015-04-12 22:32:42 [9085] [DEBUG]   bind: 127.0.0.1:8000
2015-04-12 22:32:42 [9085] [DEBUG]   check_config: False
2015-04-12 22:32:42 [9085] [DEBUG]   config: None
2015-04-12 22:32:42 [9085] [DEBUG]   daemon: False
2015-04-12 22:32:42 [9085] [DEBUG]   debug: True
2015-04-12 22:32:42 [9085] [DEBUG]   default_proc_name: mysite.wsgi:application
2015-04-12 22:32:42 [9085] [DEBUG]   django_settings: None
2015-04-12 22:32:42 [9085] [DEBUG]   errorlog: -
2015-04-12 22:32:42 [9085] [DEBUG]   graceful_timeout: 30
2015-04-12 22:32:42 [9085] [DEBUG]   group: 1000
2015-04-12 22:32:42 [9085] [DEBUG]   keepalive: 2
2015-04-12 22:32:42 [9085] [DEBUG]   limit_request_field_size: 8190
2015-04-12 22:32:42 [9085] [DEBUG]   limit_request_fields: 100
2015-04-12 22:32:42 [9085] [DEBUG]   limit_request_line: 4094
2015-04-12 22:32:42 [9085] [DEBUG]   logconfig: None
2015-04-12 22:32:42 [9085] [DEBUG]   logger_class: simple
2015-04-12 22:32:42 [9085] [DEBUG]   loglevel: debug
2015-04-12 22:32:42 [9085] [DEBUG]   max_requests: 0
2015-04-12 22:32:42 [9085] [DEBUG]   on_reload: <function on_reload at 0x7f6f421e9320>
2015-04-12 22:32:42 [9085] [DEBUG]   on_starting: <function on_starting at 0x7f6f421e91b8>
2015-04-12 22:32:42 [9085] [DEBUG]   pidfile: None
2015-04-12 22:32:42 [9085] [DEBUG]   post_fork: <function post_fork at 0x7f6f421e9758>
2015-04-12 22:32:42 [9085] [DEBUG]   post_request: <function post_request at 0x7f6f421e9b18>
2015-04-12 22:32:42 [9085] [DEBUG]   pre_exec: <function pre_exec at 0x7f6f421e98c0>
2015-04-12 22:32:42 [9085] [DEBUG]   pre_fork: <function pre_fork at 0x7f6f421e95f0>
2015-04-12 22:32:42 [9085] [DEBUG]   pre_request: <function pre_request at 0x7f6f421e9a28>
2015-04-12 22:32:42 [9085] [DEBUG]   preload_app: True
2015-04-12 22:32:42 [9085] [DEBUG]   proc_name: None
2015-04-12 22:32:42 [9085] [DEBUG]   pythonpath: None
2015-04-12 22:32:42 [9085] [DEBUG]   secure_scheme_headers: {'X-FORWARDED-PROTOCOL': 'ssl', 'X-FORWARDED-SSL': 'on'}
2015-04-12 22:32:42 [9085] [DEBUG]   spew: False
2015-04-12 22:32:42 [9085] [DEBUG]   timeout: 30
2015-04-12 22:32:42 [9085] [DEBUG]   tmp_upload_dir: None
2015-04-12 22:32:42 [9085] [DEBUG]   umask: 0
2015-04-12 22:32:42 [9085] [DEBUG]   user: 1000
2015-04-12 22:32:42 [9085] [DEBUG]   when_ready: <function when_ready at 0x7f6f421e9488>
2015-04-12 22:32:42 [9085] [DEBUG]   worker_class: sync
2015-04-12 22:32:42 [9085] [DEBUG]   worker_connections: 1000
2015-04-12 22:32:42 [9085] [DEBUG]   worker_exit: <function worker_exit at 0x7f6f421e9c80>
2015-04-12 22:32:42 [9085] [DEBUG]   workers: 1
2015-04-12 22:32:42 [9085] [DEBUG]   x_forwarded_for_header: X-FORWARDED-FOR
2015-04-12 22:32:42 [9085] [WARNING] debug mode: app isn't preloaded.
2015-04-12 22:32:42 [9085] [INFO] Starting gunicorn 0.14.6
2015-04-12 22:32:42 [9085] [DEBUG] Arbiter booted
2015-04-12 22:32:42 [9085] [INFO] Listening at: http://127.0.0.1:8000 (9085)
2015-04-12 22:32:42 [9085] [INFO] Using worker: sync
2015-04-12 22:32:42 [9088] [INFO] Booting worker with pid: 9088
^[[A^C2015-04-12 22:34:38 [9088] [INFO] Worker exiting (pid: 9088)
2015-04-12 22:34:38 [9085] [INFO] Handling signal: int
2015-04-12 22:34:38 [9085] [INFO] Shutting down: Master

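So far I know there is a problem with gunicorn: it fails, restarts, and fails again, but these messages don't show me a clear error... Any other ideas? I'm starting to feel quite confused :S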
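Since the gunicorn log only reports "Worker failed to boot." without the underlying exception, one way to surface it is to import the WSGI module directly, outside gunicorn. The sketch below assumes it is run from the project directory with the virtualenv active; import_error_for is a hypothetical helper name, and whether importing mysite.wsgi triggers the failing code depends on the Django version and on how the project's settings are loaded.

```python
import importlib
import traceback


def import_error_for(module_name):
    """Return the formatted traceback raised while importing module_name.

    Returns None when the import succeeds.
    """
    try:
        importlib.import_module(module_name)
    except Exception:
        # Capture the full traceback text so the real cause is visible.
        return traceback.format_exc()
    return None
```

For example, print(import_error_for("mysite.wsgi") or "import OK") would print either the traceback that the worker hit at boot or a confirmation that the module imports cleanly.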

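【Question comments】:

How about looking at the log files?

@Michael-sqlbot please take another look, I added an update.

So, nginx is actually fine, but the service you have listening on port 8888 is not. That's where you need to look. The "peer" refers to the service on 8888.

@geoom, I checked the logs (initially I assumed you were using uwsgi, but it seems gunicorn is the server); gunicorn's "Worker failed to boot" basically means that something went wrong. To narrow the search: does django's runserver (python manage.py runserver) start without errors?

Do one thing: try running your app with runserver, or whatever dev server you have, just to check that your app is actually able to start. My guess is that your app crashes when some environment variable is not set. But in that case something should show up in the logs. Anyway, just check it.

【Answer 1】: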

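Actually... the environment variables were the culprit (I hadn't realized it). They were not configured correctly, so when gunicorn tried to run Django, Django crashed.

I solved it by checking all the environment variables and setting them correctly for my EC2 instance... Many thanks to @Serj Zaharchenko for the simple but powerful clue.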
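The answer does not say which variables were wrong, so the following is only a minimal sketch of how such a check could be automated. The helper name missing_env_vars and the variable names in the usage note are assumptions for illustration, not the poster's actual configuration.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty in `environ`.

    `environ` defaults to the current process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

Running something like missing_env_vars(["DJANGO_SETTINGS_MODULE", "DATABASE_URL"]) under the same user and environment that supervisor uses to start gunicorn would list whatever is absent there. DATABASE_URL is a hypothetical name here; substitute the variables your settings module actually reads.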

【Comments】:

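【Answer 2】:

I found this; I'm not sure whether it solves your problem.

The first line of my gunicorn_django file was "#!/opt/django/env/mysite/bin/python", which is the path to the python in my virtualenv. Replacing it with "#!/usr/bin/env python" solved the problem.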
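Scripts installed into a virtualenv usually carry an absolute interpreter path on their first line, so a copied environment can point at a python that does not exist on the new machine. A rough way to check this without editing the file is sketched below; the function names are hypothetical and the check only covers the simple "#!/path/to/interpreter" form.

```python
import os


def shebang_interpreter(script_path):
    """Return the interpreter path from the script's first line, or None."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


def shebang_is_valid(script_path):
    """True when the shebang names an existing, executable file."""
    interpreter = shebang_interpreter(script_path)
    return (
        interpreter is not None
        and os.path.isfile(interpreter)
        and os.access(interpreter, os.X_OK)
    )
```

In the question's setup the script to inspect would be /home/ubuntu/.venv/mysite/bin/gunicorn_django, the path shown in the ps output.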

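【Comments】:

Do you mean I should replace the first line of the /bin/gunicorn_django file? ... I don't think so. The Tokyo machine doesn't have that modification, and I'd rather try a less intrusive approach... Thanks for the suggestion.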
