Start SQS celery worker on Elastic Beanstalk
Posted: 2018-12-13 15:27:54
Question: I am trying to start a celery worker on EB, but I get an error that I cannot explain.
The command in the config file in the .ebextensions directory:
  03_celery_worker:
    command: "celery worker --app=config --loglevel=info -E --workdir=/opt/python/current/app/my_project/"
The command works fine on my local machine (with only the workdir parameter changed).
The error from EB:
Activity execution failed, because: /opt/python/run/venv/local/lib/python3.6/site-packages/celery/platforms.py:796: RuntimeWarning: You're running the worker with superuser privileges: this is absolutely not recommended!
and
Starting new HTTPS connection (1): eu-west-1.queue.amazonaws.com (ElasticBeanstalk::ExternalInvocationError)
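I added the --uid=2 argument to the celery worker command and the superuser warning disappeared, but the command still fails with ExternalInvocationError.
Any suggestions as to what I am doing wrong?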
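Answer 1:
ExternalInvocationError
As far as I understand, this means that the listed command cannot be run from EB container commands. You need to create a script on the server and run celery from that script. This post describes how to do that.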
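Update:
You need to create a config file in the .ebextensions directory. I called it celery.config. The post linked above provides a script that almost works; a few small additions are needed to make it work 100% correctly. I had problems with scheduling periodic tasks (celery beat). These are the steps that fixed them:
1. Install django-celery-beat (pip install django-celery-beat) and add it to your requirements, add it to the installed apps, and use the --scheduler argument when starting celery beat. The instructions are here.
2. In the script, specify the user that runs each program. For the celery worker it is the celery user, which is created earlier in the script if it does not exist. When I tried to start celery beat I got a PermissionDenied error, which means that the celery user does not have all the necessary permissions. I logged in to the EB instance over ssh, looked at the list of all users (cat /etc/passwd) and decided to use the daemon user.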
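When picking an existing account for the user= setting, it can help to see each account's UID and login shell rather than the raw file. The following is only a minimal sketch: the function name list_accounts and the sample file contents are made up for illustration, and the accounts on a real EB instance may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "name uid shell" for every account
# in a passwd-format file (defaults to /etc/passwd).
list_accounts() {
    awk -F: '{print $1, $3, $7}' "${1:-/etc/passwd}"
}

# Made-up sample in /etc/passwd format, for illustration only
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
celery:x:501:501::/home/celery:/bin/bash
EOF

list_accounts "$sample"
rm -f "$sample"
```

On the instance itself, calling list_accounts with no argument reads /etc/passwd directly.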
These steps resolved the celery beat errors. The updated config file with the script (celery.config) follows:
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      # Create required directories
      sudo mkdir -p /var/log/celery/
      sudo mkdir -p /var/run/celery/
      # Create group called 'celery'
      sudo groupadd -f celery
      # add the user 'celery' if it doesn't exist and add it to the group with same name
      id -u celery &>/dev/null || sudo useradd -g celery celery
      # add permissions to the celery user for r+w to the folders just created
      sudo chown -R celery:celery /var/log/celery/
      sudo chown -R celery:celery /var/run/celery/
      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/%/%%/g' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      # Strip the trailing comma left over from the last newline
      celeryenv=${celeryenv%?}
      # Create CELERY configuration script
      celeryconf="[program:celeryd]
      directory=/opt/python/current/app
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery worker -A config.celery:app --loglevel=INFO --logfile=\"/var/log/celery/%%n%%I.log\" --pidfile=\"/var/run/celery/%%n.pid\"
      user=celery
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10
      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60
      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true
      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998
      environment=$celeryenv"
      # Create CELERY BEAT configuration script
      celerybeatconf="[program:celerybeat]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery beat -A config.celery:app --loglevel=INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler --logfile=\"/var/log/celery/celery-beat.log\" --pidfile=\"/var/run/celery/celery-beat.pid\"
      directory=/opt/python/current/app
      user=daemon
      numprocs=1
      stdout_logfile=/var/log/celerybeat.log
      stderr_logfile=/var/log/celerybeat.log
      autostart=true
      autorestart=true
      startsecs=10
      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60
      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true
      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=999
      environment=$celeryenv"
      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf
      echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf
      # Add configuration script to supervisord conf (if not there already).
      # No -x here: a whole-line match on "celery.conf" would never succeed,
      # so the include would be appended again on every deployment.
      if ! grep -Fq "celery.conf" /opt/python/etc/supervisord.conf
      then
        echo "[include]" | tee -a /opt/python/etc/supervisord.conf
        echo "files: uwsgi.conf celery.conf celerybeat.conf" | tee -a /opt/python/etc/supervisord.conf
      fi
      # Enable supervisor to listen for HTTP/XML-RPC requests.
      # supervisorctl will use XML-RPC to communicate with supervisord over port 9001.
      # Source: https://askubuntu.com/questions/911994/supervisorctl-3-3-1-http-localhost9001-refused-connection
      if ! grep -Fxq "[inet_http_server]" /opt/python/etc/supervisord.conf
      then
        echo "[inet_http_server]" | tee -a /opt/python/etc/supervisord.conf
        echo "port = 127.0.0.1:9001" | tee -a /opt/python/etc/supervisord.conf
      fi
      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread
      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update
      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
      supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat
commands:
  01_killotherbeats:
    command: "ps auxww | grep 'celery beat' | awk '{print $2}' | sudo xargs kill -9 || true"
    ignoreErrors: true
  02_restartbeat:
    command: "supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat"
    leader_only: true
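The 01_killotherbeats command depends on awk printing the second column of ps output, which is the PID, and awk needs braces around the action ('{print $2}'). A minimal sketch of that extraction step follows. The function name beat_pids and the sample ps lines are made up for illustration, and the '[c]elery beat' bracket pattern is a substitution of mine (it keeps grep from matching its own command line), not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ps auxww`-style lines on stdin and
# print the PIDs of celery beat processes, one per line.
beat_pids() {
    # The regex [c]elery matches the text "celery", but a grep process whose
    # command line shows "[c]elery beat" does not contain "celery beat".
    grep '[c]elery beat' | awk '{print $2}'
}

# Made-up sample of ps output, for illustration only
sample_ps='daemon  4321  0.1  2.0 100000 20000 ?     S  10:00 0:01 /opt/python/run/venv/bin/celery beat -A config.celery:app
celery  4400  0.3  3.1 200000 30000 ?     S  10:00 0:02 /opt/python/run/venv/bin/celery worker -A config.celery:app
root    4500  0.0  0.0   9000   900 pts/0 S+ 10:01 0:00 grep [c]elery beat'

printf '%s\n' "$sample_ps" | beat_pids   # prints: 4321
```

On an instance, the input would come from the process table instead, for example ps auxww | beat_pids.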
One thing to note: in my project the celery.py file is in the config directory, which is why I pass -A config.celery:app when starting the celery worker and celery beat.
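The celeryenv lines in the script turn the EB env file into the comma-separated KEY="value" list that supervisord expects in environment=. Percent signs are doubled because supervisord treats % as the start of an expansion, $PATH is rewritten to supervisord's %(ENV_PATH)s expansion, and the trailing comma is stripped with the ${celeryenv%?} parameter expansion. Here is a small sketch of the same pipeline run against a made-up env file. The function name build_supervisor_env and the sample variables are hypothetical; the real input is /opt/python/current/env.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a file of `export KEY="value"` lines
# into a single comma-separated list for supervisord's environment= setting.
build_supervisor_env() {
    local env_file="$1" result
    result=$(tr '\n' ',' < "$env_file" \
        | sed 's/%/%%/g' \
        | sed 's/export //g' \
        | sed 's/$PATH/%(ENV_PATH)s/g' \
        | sed 's/$PYTHONPATH//g' \
        | sed 's/$LD_LIBRARY_PATH//g')
    # Drop the trailing comma produced by the final newline
    printf '%s\n' "${result%?}"
}

# Made-up sample input, for illustration only
sample=$(mktemp)
cat > "$sample" <<'EOF'
export DJANGO_SETTINGS_MODULE="config.settings"
export PATH="/opt/python/run/venv/bin/:$PATH"
export RATE="50%"
EOF

build_supervisor_env "$sample"
# prints: DJANGO_SETTINGS_MODULE="config.settings",PATH="/opt/python/run/venv/bin/:%(ENV_PATH)s",RATE="50%%"
rm -f "$sample"
```

Running this locally against a copy of the env file is a quick way to check the generated environment= value before deploying.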