Celery workers failing in aws elastic beanstalk [exited: celeryd-worker (exit status 1; not expected)]

【Title】: Celery workers failing in aws elastic beanstalk [exited: celeryd-worker (exit status 1; not expected)] 【Posted】: 2019-06-19 23:48:12 【Question】:

I have been trying to follow the detailed instructions on how to deploy a django application with celery workers to aws elastic beanstalk:

How to run a celery worker with Django app scalable by AWS Elastic Beanstalk?

I had some problems installing pycurl, but solved them thanks to the comments here:

Pip Requirements.txt --global-option causing installation errors with other packages. "option not recognized"

Then I got this:

[2019-01-26T06:43:04.865Z] INFO  [12249] - [Application update app-190126_134200@28/AppDeployStage0/EbExtensionPostBuild/Infra-EmbeddedPostBuild/postbuild_1_raiseflags/Command 05_celery_tasks_run] : Activity execution failed, because: /usr/bin/env: bash
  : No such file or directory
   (ElasticBeanstalk::ExternalInvocationError)

But that got solved too: it turned out I had to convert the "celery_configuration.txt" file to UNIX EOL (I'm on Windows, and Notepad++ had automatically converted it to Windows EOL).
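The `/usr/bin/env: bash` message with `: No such file or directory` pushed onto the next line is most likely what a stray carriage return after `bash` looks like in a log. For anyone without an editor that can switch line endings, here is a minimal Python sketch of the same conversion. The function name `to_unix_eol` is made up for illustration, and the sketch assumes the file is small enough to read into memory.

```python
from pathlib import Path


def to_unix_eol(path):
    """Rewrite the file at `path` with LF line endings; return True if it changed."""
    p = Path(path)
    original = p.read_bytes()
    # Work on bytes so Python's own newline translation cannot interfere.
    converted = original.replace(b"\r\n", b"\n")
    if converted != original:
        p.write_bytes(converted)
    return converted != original
```

Running `to_unix_eol("celery_configuration.txt")` on the Windows machine before deploying should have the same effect as switching the EOL setting in Notepad++.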

With all these changes I can deploy the project successfully. The problem is that the periodic tasks are not running.

This is what I get:

2019-01-26 09:12:57,337 INFO exited: celeryd-beat (exit status 1; not expected)
2019-01-26 09:12:58,583 INFO spawned: 'celeryd-worker' with pid 25691
2019-01-26 09:12:59,453 INFO spawned: 'celeryd-beat' with pid 25695
2019-01-26 09:12:59,666 INFO exited: celeryd-worker (exit status 1; not expected)
2019-01-26 09:13:00,790 INFO spawned: 'celeryd-worker' with pid 25705
2019-01-26 09:13:00,791 INFO exited: celeryd-beat (exit status 1; not expected)
2019-01-26 09:13:01,915 INFO exited: celeryd-worker (exit status 1; not expected)
2019-01-26 09:13:03,919 INFO spawned: 'celeryd-worker' with pid 25728
2019-01-26 09:13:03,920 INFO spawned: 'celeryd-beat' with pid 25729
2019-01-26 09:13:05,985 INFO exited: celeryd-worker (exit status 1; not expected)
2019-01-26 09:13:06,091 INFO exited: celeryd-beat (exit status 1; not expected)
2019-01-26 09:13:07,092 INFO gave up: celeryd-beat entered FATAL state, too many start retries too quickly
2019-01-26 09:13:09,096 INFO spawned: 'celeryd-worker' with pid 25737
2019-01-26 09:13:10,084 INFO exited: celeryd-worker (exit status 1; not expected)
2019-01-26 09:13:11,085 INFO gave up: celeryd-worker entered FATAL state, too many start retries too quickly

I also have this part of the log:

[2019-01-26T09:13:00.583Z] INFO  [25247] - [Application update app-190126_161213@43/AppDeployStage1/AppDeployPostHook/run_supervised_celeryd.sh] : Completed activity. Result:
  [program:celeryd-worker]
  ; Set full path to celery program if using virtualenv
  command=/opt/python/run/venv/bin/celery worker -A raiseflags --loglevel=INFO

  directory=/opt/python/current/app
  user=nobody
  numprocs=1
  stdout_logfile=/var/log/celery-worker.log
  stderr_logfile=/var/log/celery-worker.log
  autostart=true
  autorestart=true
  startsecs=10

  ; Need to wait for currently executing tasks to finish at shutdown.
  ; Increase this if you have very long running tasks.
  stopwaitsecs = 600

  ; When resorting to send SIGKILL to the program to terminate it
  ; send SIGKILL to its whole process group instead,
  ; taking care of its children as well.
  killasgroup=true

  ; if rabbitmq is supervised, set its priority higher
  ; so it starts first
  priority=998

  environment=PYTHONPATH="/opt/python/current/app/:",PATH="/opt/python/run/venv/bin/:%%(ENV_PATH)s",RDS_PORT="5432",RDS_DB_NAME="ebdb",RDS_USERNAME="foobar",PYCURL_SSL_LIBRARY="nss",DJANGO_SETTINGS_MODULE="raiseflags.settings",RDS_PASSWORD="foobar",RDS_HOSTNAME="something.something.eu-west-1.rds.amazonaws.com"

  [program:celeryd-beat]
  ; Set full path to celery program if using virtualenv
  command=/opt/python/run/venv/bin/celery beat -A raiseflags --loglevel=INFO --workdir=/tmp -S django --pidfile /tmp/celerybeat.pid

  directory=/opt/python/current/app
  user=nobody
  numprocs=1
  stdout_logfile=/var/log/celery-beat.log
  stderr_logfile=/var/log/celery-beat.log
  autostart=true
  autorestart=true
  startsecs=10

  ; Need to wait for currently executing tasks to finish at shutdown.
  ; Increase this if you have very long running tasks.
  stopwaitsecs = 600

  ; When resorting to send SIGKILL to the program to terminate it
  ; send SIGKILL to its whole process group instead,
  ; taking care of its children as well.
  killasgroup=true

  ; if rabbitmq is supervised, set its priority higher
  ; so it starts first
  priority=998

  environment=PYTHONPATH="/opt/python/current/app/:",PATH="/opt/python/run/venv/bin/:%%(ENV_PATH)s",RDS_PORT="5432",RDS_DB_NAME="ebdb",RDS_USERNAME="foobar",PYCURL_SSL_LIBRARY="nss",DJANGO_SETTINGS_MODULE="raiseflags.settings",RDS_PASSWORD="foobar",RDS_HOSTNAME="something.something.eu-west-1.rds.amazonaws.com"
  No config updates to processes
  celeryd-beat: ERROR (not running)
  celeryd-beat: ERROR (abnormal termination)
  celeryd-worker: ERROR (not running)
  celeryd-worker: ERROR (abnormal termination)
[2019-01-26T09:13:00.583Z] INFO  [25247] - [Application update app-190126_161213@43/AppDeployStage1/AppDeployPostHook] : Completed activity. Result:
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/appdeploy/post.
[2019-01-26T09:13:00.583Z] INFO  [25247] - [Application update app-190126_161213@43/AppDeployStage1] : Completed activity. Result:
  Application version switch - Command CMD-AppDeploy stage 1 completed
[2019-01-26T09:13:00.583Z] INFO  [25247] - [Application update app-190126_161213@43/AddonsAfter] : Starting activity...
[2019-01-26T09:13:00.583Z] INFO  [25247] - [Application update app-190126_161213@43/AddonsAfter/ConfigLogRotation] : Starting activity...
[2019-01-26T09:13:00.583Z] INFO  [25247] - [Application update app-190126_161213@43/AddonsAfter/ConfigLogRotation/10-config.sh] : Starting activity...
[2019-01-26T09:13:00.756Z] INFO  [25247] - [Application update app-190126_161213@43/AddonsAfter/ConfigLogRotation/10-config.sh] : Completed activity. Result:
  Disabled forced hourly log rotation.
[2019-01-26T09:13:00.756Z] INFO  [25247] - [Application update app-190126_161213@43/AddonsAfter/ConfigLogRotation] : Completed activity. Result:
  Successfully execute hooks in directory /opt/elasticbeanstalk/addons/logpublish/hooks/config.

I don't know whether it is related to the error, but note the [[ PATH="/opt/python/run/venv/bin/:%%(ENV_PATH)s" ]] part in the output above. Shouldn't ENV_PATH be something else?

environment=PYTHONPATH="/opt/python/current/app/:",PATH="/opt/python/run/venv/bin/:%%(ENV_PATH)s",RDS_PORT="5432",RDS_DB_NAME="ebdb",RDS_USERNAME="foobar",PYCURL_SSL_LIBRARY="nss",DJANGO_SETTINGS_MODULE="raiseflags.settings",RDS_PASSWORD="foobar",RDS_HOSTNAME="something.something.eu-west-1.rds.amazonaws.com"

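This is my first time deploying an application with celery, and to be honest I'm really lost. I put a lot of effort into solving the first two errors (I'm really an amateur), and now I get this one and don't even know where to start.

Also, I'm not sure I'm using "celery_configuration.txt" the right way. The only thing I edited was the two places where it said "django_app", which I changed to "raiseflags" (the name of my django project). Is that correct?

Does anyone know how to solve this? I can paste my files if needed, but they are just like the ones provided in the first link. I'm using Windows.

Thank you very much!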

【Comments】:

【Answer 1】:

OK, the problem had nothing to do with the PATH line I was pointing at. I just needed to add "django_celery_beat" and "django_celery_results" to INSTALLED_APPS in my settings.py.
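For reference, a sketch of the relevant part of settings.py. Only the two Celery entries come from the answer; the other entries are Django's usual defaults and will differ per project. Both packages ship database migrations, so `python manage.py migrate` has to run after adding them.

```python
# settings.py (fragment): the django.contrib entries are placeholders
# for whatever the project already lists.
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    # Database-backed periodic task scheduler; presumably what
    # `-S django` in the celeryd-beat command relies on.
    "django_celery_beat",
    # Stores task results through Django's database or cache.
    "django_celery_results",
]
```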

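The connection error I later mentioned while talking with Fran happened because I needed to set BROKER_URL instead of CELERY_BROKER_URL in my settings.py file. I guess this has to do with my not specifying "CELERY" as the namespace in app.autodiscover_tasks() in the celery.py file (they do so in the linked question, but I didn't because I'm using a different version of celery).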
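A note on the namespace: in the Django example from the Celery documentation, it is passed as `app.config_from_object("django.conf:settings", namespace="CELERY")` rather than to `app.autodiscover_tasks()`. With that argument Celery looks for settings prefixed with `CELERY_`; without it, the unprefixed names such as BROKER_URL are used. The sketch below is not Celery's code, only a plain-Python illustration of that naming rule. `lookup_broker_url` and the broker URL are invented for the example.

```python
from typing import Optional


def lookup_broker_url(settings: dict, namespace: Optional[str] = None) -> Optional[str]:
    """Return the broker URL stored under the key that the namespace implies."""
    key = f"{namespace}_BROKER_URL" if namespace else "BROKER_URL"
    return settings.get(key)


# Hypothetical settings mirroring the situation described in the answer.
django_settings = {"CELERY_BROKER_URL": "amqp://guest:guest@localhost:5672//"}

# No namespace configured: the prefixed key is never consulted.
print(lookup_broker_url(django_settings))
# Namespace "CELERY": the prefixed key is found.
print(lookup_broker_url(django_settings, namespace="CELERY"))
```

The first call finds nothing, which would match Celery falling back to its default broker and failing to connect. So either rename the setting to BROKER_URL, as was done here, or keep CELERY_BROKER_URL and pass the namespace.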

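Thanks to Fran for everything, especially for pointing out that I should look at the celery error logs. I didn't know how to do that. If any other amateur is struggling too: you have to "eb ssh" into your instance and then run "tail -n 40 /var/log/celery-worker.log" and "tail -n 40 /var/log/celery-beat.log" (where "40" is the number of lines you want to read). I know this sounds obvious to many people but, silly me, I had no idea.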
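If the check needs to be scripted, the same "last N lines" read can be sketched in Python. `tail_lines` is a made-up helper, not part of any tool mentioned here; the log paths are the ones from the supervisor config in the question.

```python
from collections import deque


def tail_lines(path, n=40):
    """Return the last `n` lines of a text file, similar to `tail -n`."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the newest n lines while streaming.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    for log in ("/var/log/celery-worker.log", "/var/log/celery-beat.log"):
        try:
            print("\n".join(tail_lines(log)))
        except OSError as exc:
            # Missing file or insufficient permissions on this machine.
            print(f"could not read {log}: {exc}")
```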

(By the way, I'm still struggling with the celery worker not finding the pycurl module, but that is unrelated to this question.)

【Comments】:

【Answer 2】:

Regarding the line you pointed out, environment=PYTHONPATH="/opt/python/current/app/:",PATH="/opt/python/run/venv/bin/:%%(ENV_PATH)s",RDS_PORT="5432",RDS_DB_NAME="ebdb",RDS_USERNAME="foobar",PYCURL_SSL_LIBRARY="nss",DJANGO_SETTINGS_MODULE="raiseflags.settings",RDS_PASSWORD="foobar",RDS_HOSTNAME="something.something.eu-west-1.rds.amazonaws.com", did you copy this line from somewhere? I don't see it in the link you posted. In the linked answer it is environment=$celeryenv, where $celeryenv is defined as

celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
celeryenv=${celeryenv%?}
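That pipeline also accounts for the `%%(ENV_PATH)s` seen in the log: the string is generated by the script, not typed by hand. Below is a rough Python equivalent of each stage, as a sketch only. The sample env file content is an assumption reconstructed from the logged `environment=` line, not the real /opt/python/current/env, and `build_celeryenv` is a made-up name.

```python
def build_celeryenv(env_text: str) -> str:
    """Mimic the tr/sed pipeline that turns an env file into supervisor syntax."""
    s = env_text.replace("\n", ",")            # tr '\n' ','
    s = s.replace("export ", "")               # sed 's/export //g'
    s = s.replace("$PATH", "%(ENV_PATH)s")     # sed 's/$PATH/%(ENV_PATH)s/g'
    s = s.replace("$PYTHONPATH", "")           # sed 's/$PYTHONPATH//g'
    s = s.replace("$LD_LIBRARY_PATH", "")      # sed 's/$LD_LIBRARY_PATH//g'
    s = s.replace("%", "%%")                   # sed 's/%/%%/g'
    return s[:-1]                              # drop the last character (trailing comma)


# Assumed sample input, shaped like the output in the question.
sample_env = (
    'export PYTHONPATH="/opt/python/current/app/:$PYTHONPATH"\n'
    'export PATH="/opt/python/run/venv/bin/:$PATH"\n'
    'export RDS_PORT="5432"\n'
)

print(build_celeryenv(sample_env))
```

In supervisor's config syntax `%(ENV_PATH)s` refers to the PATH variable of the supervisord process and `%%` stands for a literal percent sign, so the doubled form in the log is a product of the last sed stage rather than a sign that ENV_PATH was misnamed.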

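【Comments】:

Sorry, there's a misunderstanding here. That line comes from the error log, which shows (I guess) what the environment is. But yes, I defined it as in the linked answer, the way you show.

If you connect to the remote machine over ssh and run the command cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g', what is the output? Also, maybe you could check the celery logs for more details.

If I echo $ENV_PATH I get nothing. If I echo $PATH I get this: prntscr.com/mcfmjl (Edit: I entered the command several times because I got no output and didn't know what to do; only then did I run the echo. Clarifying in case this affects the result.)

I redeployed, connected over ssh again, and entered your command. This is what I got: prntscr.com/mcfr2o

What about the celery logs? Have you opened them? If you try to run celery without the daemon, does it start? I have deployed many applications with celery, but never with beanstalk, so I don't know whether you have to customize the paths in the configuration or whether these are the standard locations. Running celery directly can sometimes reveal whether the problem lies with the daemon or with the tasks themselves. And checking all the logs you have (such as the celery logs and the supervisor logs) is mandatory.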
