How to run a celery worker on AWS Elastic Beanstalk?


Posted: 2016-11-28 17:15:33

Question:

Versions:

Django 1.9.8, Celery 3.1.23, django-celery 3.1.17, Python 2.7

I am trying to run my celery worker on AWS Elastic Beanstalk, using Amazon SQS as the celery broker.

Here is my settings.py:

INSTALLED_APPS += ('djcelery',)
import djcelery
djcelery.setup_loader()
BROKER_URL = "sqs://%s:%s@" % (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY.replace('/', '%2F'))
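The `replace('/', '%2F')` call matters because an AWS secret key may contain `/`, which a URL parser would treat as a path separator inside the broker URL. A minimal sketch of that encoding step (the credentials below are fake placeholders, not real keys):

```python
# Sketch: percent-encode '/' in the AWS secret key before embedding it
# in the SQS broker URL. The credential values here are made up.
def sqs_broker_url(access_key_id, secret_access_key):
    return "sqs://%s:%s@" % (access_key_id,
                             secret_access_key.replace('/', '%2F'))

url = sqs_broker_url("AKIAEXAMPLE", "abc/def+ghi")
# the '/' in the secret becomes %2F, so the URL parses cleanly
```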

When I run the following command in a terminal, it starts the worker on my local machine. I have also created some tasks, and they execute correctly. How can I do this on AWS EB?

python manage.py celery worker --loglevel=INFO
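For reference, a task that such a worker would pick up can be as small as the sketch below. The names are hypothetical, and the try/except fallback is only there so the sketch stays importable without Celery installed:

```python
# Hypothetical tasks.py sketch. With Celery installed, shared_task
# registers the function with the app loaded by djcelery; calling the
# task directly still runs it synchronously, handy for local testing.
try:
    from celery import shared_task
except ImportError:
    # Fallback no-op decorator so the sketch runs stand-alone.
    def shared_task(func):
        return func

@shared_task
def add(x, y):
    return x + y
```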

I found this question on ***. It says I should add a celery configuration file to the .ebextensions folder that runs a script after deployment, but it does not work. After installing supervisor I did nothing further; maybe that is what I am missing. Here is the script. I would appreciate any help.

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}  # strip the trailing comma

      # Create celery configuration script
      celeryconf="[program:celeryd]
      command=/opt/python/run/venv/bin/celery worker --loglevel=INFO

      directory=/opt/python/current/app
      user=nobody
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 600

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      ; priority=998

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
          then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
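The `[include]` append in the script above is idempotent, which is worth knowing since the hook runs on every deploy. A quick sketch against a throwaway config file (the paths here are scratch placeholders, not the real EB ones):

```shell
# Sketch: grep -Fxq matches a whole line equal to "[include]", so the
# include block is appended at most once even if the hook runs repeatedly.
conf=$(mktemp)
printf '[supervisord]\nlogfile=/tmp/supervisord.log\n' > "$conf"
for run in 1 2 3; do
  if ! grep -Fxq "[include]" "$conf"; then
    echo "[include]" >> "$conf"
    echo "files: celery.conf" >> "$conf"
  fi
done
includes=$(grep -cFx "[include]" "$conf")
```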

Logs from EB: it looks like the worker starts, but it still does not execute my tasks.

-------------------------------------
/opt/python/log/supervisord.log
-------------------------------------
2016-08-02 10:45:27,713 CRIT Supervisor running as root (no user in config file)
2016-08-02 10:45:27,733 INFO RPC interface 'supervisor' initialized
2016-08-02 10:45:27,733 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2016-08-02 10:45:27,733 INFO supervisord started with pid 2726
2016-08-02 10:45:28,735 INFO spawned: 'httpd' with pid 2812
2016-08-02 10:45:29,737 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2016-08-02 10:47:14,684 INFO stopped: httpd (exit status 0)
2016-08-02 10:47:15,689 INFO spawned: 'httpd' with pid 4092
2016-08-02 10:47:16,727 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2016-08-02 10:47:23,701 INFO spawned: 'celeryd' with pid 4208
2016-08-02 10:47:23,854 INFO stopped: celeryd (terminated by SIGTERM)
2016-08-02 10:47:24,858 INFO spawned: 'celeryd' with pid 4214
2016-08-02 10:47:35,067 INFO success: celeryd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2016-08-02 10:52:36,240 INFO stopped: httpd (exit status 0)
2016-08-02 10:52:37,245 INFO spawned: 'httpd' with pid 4460
2016-08-02 10:52:38,278 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2016-08-02 10:52:45,677 INFO stopped: celeryd (exit status 0)
2016-08-02 10:52:46,682 INFO spawned: 'celeryd' with pid 4514
2016-08-02 10:52:46,860 INFO stopped: celeryd (terminated by SIGTERM)
2016-08-02 10:52:47,865 INFO spawned: 'celeryd' with pid 4521
2016-08-02 10:52:58,054 INFO success: celeryd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2016-08-02 10:55:03,135 INFO stopped: httpd (exit status 0)
2016-08-02 10:55:04,139 INFO spawned: 'httpd' with pid 4745
2016-08-02 10:55:05,173 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2016-08-02 10:55:13,143 INFO stopped: celeryd (exit status 0)
2016-08-02 10:55:14,147 INFO spawned: 'celeryd' with pid 4857
2016-08-02 10:55:14,316 INFO stopped: celeryd (terminated by SIGTERM)
2016-08-02 10:55:15,321 INFO spawned: 'celeryd' with pid 4863
2016-08-02 10:55:25,518 INFO success: celeryd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)

Comments:

Have you tried looking at eb-tools.log (see ***.com/questions/12836834/…) to check your deployment? Also, these hooks do not appear to be official, so you may need to do more — as described in junkheap.net/blog/2013/05/20/…

Yes, it is not official, but as I said, someone got it working. I will look at what you sent and get back to you. Thanks.

I have checked the link you sent; I had already looked at it before posting here. When I check the logs after deployment, it says there is no command 'supervisorctl'. I will add more detail from the logs later.

You can see my settings above. I also use Amazon SQS. As far as I know, there is no difference between the client and the worker code; the difference comes from running the worker. If you do not run the worker on an instance, it acts as a client and can create tasks. The worker fetches tasks from SQS and runs them.

Answer 1:

I forgot to add the answer after solving this. Here is how I fixed it: I created a new file, "99-celery.config", in the .ebextensions folder and added the code below, which works fine. (Don't forget to change the project name on line 16 — mine is molocate_eb.)

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}  # strip the trailing comma

      # Create celery configuration script
      celeryconf="[program:celeryd]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/current/app/molocate_eb/manage.py celery worker --loglevel=INFO

      directory=/opt/python/current/app
      user=nobody
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 600

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
          then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
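The environment-variable munging at the top of the script can be exercised in isolation. This sketch also applies the `%`-doubling fix mentioned in the comments below, since supervisord treats `%` as a format character; the fake env file stands in for `/opt/python/current/env`:

```shell
# Sketch: flatten an "export KEY=VALUE" file into supervisord's
# environment= format, escaping '%' and stripping the trailing comma.
envfile=$(mktemp)
printf 'export PATH=/usr/bin\nexport SECRET=ab%%cd\n' > "$envfile"
celeryenv=$(cat "$envfile" | tr '\n' ',' | sed 's/export //g' | sed 's/%/%%/g')
celeryenv=${celeryenv%?}   # drop the trailing comma left by tr
```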

Edit: If you get a supervisor error on AWS, make sure that:

- You are using Python 2, not Python 3 — at the time, Supervisor did not support Python 3.
- You have added supervisor to your requirements.txt.
- If it still errors (this happened to me once), just "Rebuild Environment" and it may work.

Comments:

What does the `celeryenv=${celeryenv%?}` line do?

The script is not mine; if you have questions about it, you can ask its author.

This is a good way to solve the problem, but the approach has one important issue: the file stays in the post-deploy hook directory and is not removed if you roll back to a previous application version. That is, V1 of my app is deployed, and I add celery in V2. After deploying, I realize V2 has a major bug and I need to roll the server back to V1. Deploying V1 to the Beanstalk environment then fails, because this hook script still exists in the post-deploy folder but the celery code is no longer in the environment.

Are you sure the script survives a rollback?

One nasty problem I ran into is the celeryenv variable. When I tried to run supervisor, I got a "malformed string" error. The problem is that some environment variables contain a '%' character and they are not escaped, which throws Python off track. To fix this, append `| sed 's/%/%%/g'` to the `celeryenv= ...` line.

Answer 2:

You can use supervisor to run celery; it will run celery as a daemon process.

[program:celery]
; directory where the Django project lies
directory=/path/to/django/project
; command to run celery
command=python manage.py celery worker --loglevel=INFO
stderr_logfile=/var/log/supervisord/celery-stderr.log
stdout_logfile=/var/log/supervisord/celery-stdout.log

Comments:
