How to run a celery worker on AWS Elastic Beanstalk?


Posted: 2016-11-28 17:15:33

Question:

Versions:

Django 1.9.8, Celery 3.1.23, django-celery 3.1.17, Python 2.7

I am trying to run my celery worker on AWS Elastic Beanstalk, using Amazon SQS as the celery broker.

Here is my settings.py:

INSTALLED_APPS += ('djcelery',)
import djcelery
djcelery.setup_loader()
BROKER_URL = "sqs://%s:%s@" % (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY.replace('/', '%2F'))
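The `replace('/', '%2F')` call matters because an AWS secret key may contain `/`, which a URL parser would treat as a path separator inside the broker URL. A minimal sketch of that encoding step (the credentials below are fake placeholders, not real keys):

```python
# Sketch: percent-encode '/' in the AWS secret key before embedding it
# in the SQS broker URL. The credential values here are made up.
def sqs_broker_url(access_key_id, secret_access_key):
    return "sqs://%s:%s@" % (access_key_id,
                             secret_access_key.replace('/', '%2F'))

url = sqs_broker_url("AKIAEXAMPLE", "abc/def+ghi")
# the '/' in the secret becomes %2F, so the URL parses cleanly
```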

When I run the following command in a terminal, it starts the worker on my local machine. I have also created some tasks, and they execute correctly. How can I do this on AWS EB?

python manage.py celery worker --loglevel=INFO
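For reference, a task that such a worker would pick up can be as small as the sketch below. The names are hypothetical, and the try/except fallback is only there so the sketch stays importable without Celery installed:

```python
# Hypothetical tasks.py sketch. With Celery installed, shared_task
# registers the function with the app loaded by djcelery; calling the
# task directly still runs it synchronously, handy for local testing.
try:
    from celery import shared_task
except ImportError:
    # Fallback no-op decorator so the sketch runs stand-alone.
    def shared_task(func):
        return func

@shared_task
def add(x, y):
    return x + y
```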

I found this question on ***. It says I should add a celery configuration file to the .ebextensions folder that runs a script after deployment, but it does not work. After installing supervisor I did nothing further; maybe that is what I am missing. Here is the script. I would appreciate any help.

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}  # strip the trailing comma

      # Create celery configuration script
      celeryconf="[program:celeryd]
      command=/opt/python/run/venv/bin/celery worker --loglevel=INFO

      directory=/opt/python/current/app
      user=nobody
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 600

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      ; priority=998

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
          then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
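The `[include]` append in the script above is idempotent, which is worth knowing since the hook runs on every deploy. A quick sketch against a throwaway config file (the paths here are scratch placeholders, not the real EB ones):

```shell
# Sketch: grep -Fxq matches a whole line equal to "[include]", so the
# include block is appended at most once even if the hook runs repeatedly.
conf=$(mktemp)
printf '[supervisord]\nlogfile=/tmp/supervisord.log\n' > "$conf"
for run in 1 2 3; do
  if ! grep -Fxq "[include]" "$conf"; then
    echo "[include]" >> "$conf"
    echo "files: celery.conf" >> "$conf"
  fi
done
includes=$(grep -cFx "[include]" "$conf")
```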

Logs from EB: it looks like the worker starts, but it still does not execute my tasks.

-------------------------------------
/opt/python/log/supervisord.log
-------------------------------------
2016-08-02 10:45:27,713 CRIT Supervisor running as root (no user in config file)
2016-08-02 10:45:27,733 INFO RPC interface 'supervisor' initialized
2016-08-02 10:45:27,733 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2016-08-02 10:45:27,733 INFO supervisord started with pid 2726
2016-08-02 10:45:28,735 INFO spawned: 'httpd' with pid 2812
2016-08-02 10:45:29,737 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2016-08-02 10:47:14,684 INFO stopped: httpd (exit status 0)
2016-08-02 10:47:15,689 INFO spawned: 'httpd' with pid 4092
2016-08-02 10:47:16,727 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2016-08-02 10:47:23,701 INFO spawned: 'celeryd' with pid 4208
2016-08-02 10:47:23,854 INFO stopped: celeryd (terminated by SIGTERM)
2016-08-02 10:47:24,858 INFO spawned: 'celeryd' with pid 4214
2016-08-02 10:47:35,067 INFO success: celeryd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2016-08-02 10:52:36,240 INFO stopped: httpd (exit status 0)
2016-08-02 10:52:37,245 INFO spawned: 'httpd' with pid 4460
2016-08-02 10:52:38,278 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2016-08-02 10:52:45,677 INFO stopped: celeryd (exit status 0)
2016-08-02 10:52:46,682 INFO spawned: 'celeryd' with pid 4514
2016-08-02 10:52:46,860 INFO stopped: celeryd (terminated by SIGTERM)
2016-08-02 10:52:47,865 INFO spawned: 'celeryd' with pid 4521
2016-08-02 10:52:58,054 INFO success: celeryd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2016-08-02 10:55:03,135 INFO stopped: httpd (exit status 0)
2016-08-02 10:55:04,139 INFO spawned: 'httpd' with pid 4745
2016-08-02 10:55:05,173 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2016-08-02 10:55:13,143 INFO stopped: celeryd (exit status 0)
2016-08-02 10:55:14,147 INFO spawned: 'celeryd' with pid 4857
2016-08-02 10:55:14,316 INFO stopped: celeryd (terminated by SIGTERM)
2016-08-02 10:55:15,321 INFO spawned: 'celeryd' with pid 4863
2016-08-02 10:55:25,518 INFO success: celeryd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)

Comments:

Have you tried looking at eb-tools.log (see ***.com/questions/12836834/…) to check your deployment? Also, these hooks do not appear to be official, so you may need to do more — as described in junkheap.net/blog/2013/05/20/…

Yes, it is not official, but as I said, someone got it working. I will look at what you sent and get back to you. Thanks.

I have checked the link you sent; I had already looked at it before posting here. When I check the logs after deployment, it says there is no command 'supervisorctl'. I will add more detail from the logs later.

You can see my settings above. I also use Amazon SQS. As far as I know, there is no difference between the client and the worker code; the difference comes from running the worker. If you do not run the worker on an instance, it acts as a client and can create tasks. The worker fetches tasks from SQS and runs them.

Answer 1:

I forgot to add the answer after solving this. Here is how I fixed it: I created a new file, "99-celery.config", in the .ebextensions folder and added the code below, which works fine. (Don't forget to change the project name on line 16 — mine is molocate_eb.)

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}  # strip the trailing comma

      # Create celery configuration script
      celeryconf="[program:celeryd]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/current/app/molocate_eb/manage.py celery worker --loglevel=INFO

      directory=/opt/python/current/app
      user=nobody
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 600

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
          then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
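The environment-variable munging at the top of the script can be exercised in isolation. This sketch also applies the `%`-doubling fix mentioned in the comments below, since supervisord treats `%` as a format character; the fake env file stands in for `/opt/python/current/env`:

```shell
# Sketch: flatten an "export KEY=VALUE" file into supervisord's
# environment= format, escaping '%' and stripping the trailing comma.
envfile=$(mktemp)
printf 'export PATH=/usr/bin\nexport SECRET=ab%%cd\n' > "$envfile"
celeryenv=$(cat "$envfile" | tr '\n' ',' | sed 's/export //g' | sed 's/%/%%/g')
celeryenv=${celeryenv%?}   # drop the trailing comma left by tr
```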

Edit: If you get a supervisor error on AWS, make sure that:

- You are using Python 2, not Python 3 — at the time, Supervisor did not support Python 3.
- You have added supervisor to your requirements.txt.
- If it still errors (this happened to me once), just "Rebuild Environment" and it may work.

Comments:

What does the `celeryenv=${celeryenv%?}` line do?

The script is not mine; if you have questions about it, you can ask its author.

This is a good way to solve the problem, but the approach has one important issue: the file stays in the post-deploy hook directory and is not removed if you roll back to a previous application version. That is, V1 of my app is deployed, and I add celery in V2. After deploying, I realize V2 has a major bug and I need to roll the server back to V1. Deploying V1 to the Beanstalk environment then fails, because this hook script still exists in the post-deploy folder but the celery code is no longer in the environment.

Are you sure the script survives a rollback?

One nasty problem I ran into is the celeryenv variable. When I tried to run supervisor, I got a "malformed string" error. The problem is that some environment variables contain a '%' character and they are not escaped, which throws Python off track. To fix this, append `| sed 's/%/%%/g'` to the `celeryenv= ...` line.

Answer 2:

You can use supervisor to run celery; it will run celery as a daemon process.

[program:celery]
; directory where the Django project lies
directory=/path/to/django/project
; command to run celery
command=python manage.py celery worker --loglevel=INFO
stderr_logfile=/var/log/supervisord/celery-stderr.log
stdout_logfile=/var/log/supervisord/celery-stdout.log

Comments:
