Airflow webserver starting - gunicorn workers shutting down

Posted: 2018-09-24 06:49:10

Question:

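I am running Airflow 1.8 on CentOS 7 in Docker, and my webserver cannot be reached from the browser. I installed Airflow via pip2.7. The Flower UI displays fine, and initdb runs and connects to the Postgres and Redis backends. I am using the CeleryExecutor, running on ECS, and I run as root. The webserver is started with airflow webserver on the default port 8080.

Based on the log shown below, does anyone know the cause of, or a solution for, the gunicorn workers exiting? Specifically, it seems to come down to this line: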

ERROR - [0 / 0] some workers seem to have died and gunicorndid not restart them as expected

The full log:

[2018-04-13 20:05:01,161] db.py:287 INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
Done.
[2018-04-13 20:05:02,358] __init__.py:57 INFO - Using executor CeleryExecutor
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
  .format(x=modname), ExtDeprecationWarning
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/

[2018-04-13 20:05:03,363] [1] models.py:167 INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:04,488] __init__.py:57 INFO - Using executor CeleryExecutor
[2018-04-13 20:05:04 +0000] [18] [INFO] Starting gunicorn 19.3.0
[2018-04-13 20:05:04 +0000] [18] [INFO] Listening at: http://0.0.0.0:8080 (18)
[2018-04-13 20:05:04 +0000] [18] [INFO] Using worker: sync
[2018-04-13 20:05:04 +0000] [24] [INFO] Booting worker with pid: 24
[2018-04-13 20:05:05 +0000] [25] [INFO] Booting worker with pid: 25
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
  .format(x=modname), ExtDeprecationWarning
[2018-04-13 20:05:05 +0000] [26] [INFO] Booting worker with pid: 26
[2018-04-13 20:05:05 +0000] [27] [INFO] Booting worker with pid: 27
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
  .format(x=modname), ExtDeprecationWarning
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
Logfiles: - -
=================================================================            
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
  .format(x=modname), ExtDeprecationWarning
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
  .format(x=modname), ExtDeprecationWarning
[2018-04-13 20:05:06,461] [24] models.py:167 INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:07,873] [1] cli.py:723 ERROR - [0 / 0] some workers seem to have died and gunicorndid not restart them as expected
[2018-04-13 20:05:08,271] [27] models.py:167 INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:08,271] [25] models.py:167 INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:08,271] [26] models.py:167 INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:09 +0000] [25] [INFO] Parent changed, shutting down: <Worker 25>
[2018-04-13 20:05:09 +0000] [25] [INFO] Worker exiting (pid: 25)
[2018-04-13 20:05:09 +0000] [26] [INFO] Parent changed, shutting down: <Worker 26>
[2018-04-13 20:05:09 +0000] [26] [INFO] Worker exiting (pid: 26)
[2018-04-13 20:05:09 +0000] [27] [INFO] Parent changed, shutting down: <Worker 27>
[2018-04-13 20:05:09 +0000] [27] [INFO] Worker exiting (pid: 27)

I swear I had this working not long ago, and I don't know what changed. Here is the list of pip packages I have installed:

airflow (1.8.0)
alembic (0.8.10)
amqp (2.2.2)
asn1crypto (0.24.0)
awscli (1.15.4)
Babel (2.5.3)
backports-abc (0.5)
billiard (3.5.0.3)
boto3 (1.7.4)
botocore (1.10.4)
celery (4.0.2)
certifi (2018.1.18)
cffi (1.11.5)
chardet (3.0.4)
click (6.7)
colorama (0.3.7)
croniter (0.3.20)
cryptography (2.2.2)
Cython (0.28.2)
dill (0.2.7.1)
docutils (0.14)
enum34 (1.1.6)
Flask (0.11.1)
Flask-Admin (1.4.1)
Flask-Cache (0.13.1)
Flask-Login (0.2.11)
flask-swagger (0.2.13)
Flask-WTF (0.12)
flower (0.9.2)
funcsigs (1.0.0)
future (0.15.2)
futures (3.2.0)
gitdb2 (2.0.3)
GitPython (2.1.9)
gunicorn (19.3.0)
idna (2.6)
ipaddress (1.0.19)
itsdangerous (0.24)
Jinja2 (2.8.1)
jmespath (0.9.3)
kombu (4.1.0)
lockfile (0.12.2)
lxml (3.8.0)
Mako (1.0.7)
Markdown (2.6.11)
MarkupSafe (1.0)
ndg-httpsclient (0.4.4)
numpy (1.14.2)
ordereddict (1.1)
pandas (0.22.0)
pip (9.0.3)
psutil (4.4.2)
psycopg2-binary (2.7.4)
pyasn1 (0.4.2)
pycparser (2.18)
Pygments (2.2.0)
pyOpenSSL (17.5.0)
python-daemon (2.1.2)
python-dateutil (2.7.2)
python-editor (1.0.3)
python-nvd3 (0.14.2)
python-slugify (1.1.4)
pytz (2018.4)
PyYAML (3.12)
redis (2.10.6)
requests (2.18.4)
rsa (3.4.2)
s3transfer (0.1.13)
setproctitle (1.1.10)
setuptools (39.0.1)
singledispatch (3.4.0.3)
six (1.11.0)
smmap2 (2.0.3)
SQLAlchemy (1.2.6)
tabulate (0.7.7)
thrift (0.9.3)
tornado (5.0.2)
Unidecode (1.0.22)
urllib3 (1.22)
vine (1.1.4)
Werkzeug (0.14.1)
wheel (0.31.0)
WTForms (2.1)
zope.deprecation (4.3.0)

Update: I installed from source, and now I get this error from the webserver:

[2018-04-14 00:20:48,594] cli.py:718 ERROR - [0 / 0] some workers seem to have died and gunicorndid not restart them as expected
[2018-04-14 00:20:50,396] models.py:197 INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-14 00:20:50,396] models.py:197 INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-14 00:20:50,396] models.py:197 INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-14 00:24:18,135] cli.py:725 ERROR - No response from gunicorn master within 120 seconds
[2018-04-14 00:24:23,032] cli.py:726 ERROR - Shutting down webserver

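I think this is the result of https://issues.apache.org/jira/browse/AIRFLOW-1235, which shuts down the webserver when gunicorn workers die. I feel...

Update: OK, this somehow fixed itself. I don't know how, because I changed a lot of things, but installing gunicorn with greenlet, eventlet, and gevent may have helped. It may also have been something in my entrypoint, perhaps the concurrency of executing airflow webserver after airflow webserver. I am leaving this question up because I also ran into this earlier with the puckel install, and I would like to know whether this is a bug other people are hitting and what the underlying problem is.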
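Since the worker class (sync versus gevent or eventlet) and the worker timeout both come up here, it can help to confirm what the webserver is actually configured with. The following is a minimal sketch, assuming Python 3 and an airflow.cfg in AIRFLOW_HOME; the option names workers, worker_class, and web_server_worker_timeout are assumptions to check against your own config file, and read_webserver_settings is a hypothetical helper, not part of Airflow.

```python
import configparser
import os

# Option names below are assumptions; check them against your own airflow.cfg.
WEBSERVER_OPTIONS = ("workers", "worker_class", "web_server_worker_timeout")


def read_webserver_settings(cfg_path):
    """Return the selected [webserver] options found in an airflow.cfg-style file."""
    # interpolation=None keeps literal '%' characters in values from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(cfg_path)  # a missing file is silently skipped
    if not parser.has_section("webserver"):
        return {}
    return {
        name: parser.get("webserver", name)
        for name in WEBSERVER_OPTIONS
        if parser.has_option("webserver", name)
    }


if __name__ == "__main__":
    airflow_home = os.environ.get("AIRFLOW_HOME", os.path.expanduser("~/airflow"))
    print(read_webserver_settings(os.path.join(airflow_home, "airflow.cfg")))
```

If the printed values match the "Workers: 4 sync" and "Timeout: 120" lines in the log, the config file is being picked up; if they differ, an environment variable or a different config file may be overriding it.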

Comments:

Yes, I ran into this problem when using puckel on AWS ECS. How can it be fixed?

Answer 1:

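So, when you installed from source, you picked up the fix for https://issues.apache.org/jira/browse/AIRFLOW-1235, which I think restarts the master and workers when workers die. I have also seen my workers die because MySQL sessions/connections went bad, e.g. exceptions from SQLAlchemy, either about a transaction failing because of a concurrent lock and needing to be retried (the Airflow models have no logic around that), or InvalidRequestError: This session is in 'prepared' state; no further SQL can be emitted within this transaction. But usually not AT startup.

The two times I hit this error at startup had different causes. Once, a security group in AWS prevented the connection to the database from being established. The other time, our 3000+ DAGs took so long to be added to the DagBag that the workers' timeout was tripped, and they shut themselves down before the setup code had finished. I would love to see whether this setup code can be improved or moved out of the workers.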
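To illustrate the kind of retry logic described as missing, here is a minimal, generic sketch. It is not Airflow code: TransientDbError is a hypothetical stand-in exception and run_with_retry is a hypothetical helper. With SQLAlchemy one would likely pass an exception such as sqlalchemy.exc.OperationalError as retry_on and roll back the session before each retry, but that part is an assumption and is not shown.

```python
import time


class TransientDbError(Exception):
    """Hypothetical stand-in for a retryable database error (e.g. a lock conflict)."""


def run_with_retry(operation, attempts=3, delay_seconds=0.5, retry_on=(TransientDbError,)):
    """Call operation(); on a retryable error, wait and try again up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay_seconds * attempt)  # linear backoff between attempts
```

Errors that are not listed in retry_on propagate immediately, so a genuinely broken session still fails fast instead of being retried.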
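Both startup causes can be checked from inside the container before starting the webserver. The sketch below assumes Python 3; can_connect and time_dag_imports are hypothetical helpers, not Airflow functions. The import timing is only a rough proxy for DagBag loading, since it simply executes each .py file's top-level code and does none of the extra work Airflow's own DagBag does.

```python
import importlib.util
import socket
import time
from pathlib import Path


def can_connect(host, port, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def time_dag_imports(dags_folder):
    """Import every .py file under dags_folder; return (path, seconds) pairs, slowest first."""
    timings = []
    for path in sorted(Path(dags_folder).rglob("*.py")):
        started = time.perf_counter()
        try:
            spec = importlib.util.spec_from_file_location(path.stem, str(path))
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)  # runs the file's top-level code
        except Exception as exc:  # a broken DAG file should not stop the survey
            print("failed to import %s: %s" % (path, exc))
        timings.append((str(path), time.perf_counter() - started))
    return sorted(timings, key=lambda item: item[1], reverse=True)
```

For example, can_connect("my-postgres-host", 5432) (the host name is a placeholder) returning False points toward a network or security-group problem, while a total from time_dag_imports("/usr/local/airflow/dags") that approaches the 120-second timeout shown in the log suggests the workers are being shut down before DAG loading finishes.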
