Celery did not put task back in RabbitMQ queue after timeout

Posted 2023-03-10

【Title】Celery did not put task back in RabbitMQ queue after timeout 【Asked】2015-11-29 19:44:46 【Question】:

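I'm running Celery workers on Heroku, and one of the tasks hit the time limit. When I retried it manually everything worked fine, so it was probably a connection issue. I'm using RabbitMQ as the broker, and Celery is configured to acknowledge tasks late (CELERY_ACKS_LATE=True). I expected the task to be returned to the RabbitMQ queue and processed again by another worker, but that didn't happen. Do I need to configure anything else for a task to return to the RabbitMQ queue when a worker times out?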

Here are the logs:

Traceback (most recent call last): 
  File "/app/.heroku/python/lib/python3.4/site-packages/billiard/pool.py", line 639, in on_hard_timeout 
    raise TimeLimitExceeded(job._timeout) 
billiard.exceptions.TimeLimitExceeded: TimeLimitExceeded(60,) 
[2015-09-02 06:22:14,504: ERROR/MainProcess] Hard time limit (60s) exceeded for simulator.tasks.run_simulations[4e269d24-87a5-4038-b5b5-bc4252c17cbb] 
[2015-09-02 06:22:18,877: INFO/MainProcess] missed heartbeat from celery@420cc07b-f5ba-4226-91c9-84a949974daa 
[2015-09-02 06:22:18,922: ERROR/MainProcess] Process 'Worker-1' pid:9 exited with 'signal 9 (SIGKILL)'

【Answer 1】:

It looks like you are hitting Celery's time limits: http://docs.celeryproject.org/en/latest/userguide/workers.html#time-limits
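As a rough sketch of how those limits appear in configuration, a settings module might look like the one below. It assumes the Celery 3.x-style uppercase setting names, matching the `CELERY_ACKS_LATE` setting in the question (newer releases use lowercase names such as `task_time_limit`). The 60-second hard limit is taken from the log above; the 50-second soft limit is purely an illustrative assumption.

```python
# Sketch of a Celery 3.x-style settings module (values are illustrative).

# Acknowledge the message after the task finishes rather than before it starts.
CELERY_ACKS_LATE = True

# Hard limit in seconds: the pool process running the task is killed and replaced.
CELERYD_TASK_TIME_LIMIT = 60

# Soft limit in seconds (assumed value): SoftTimeLimitExceeded is raised inside
# the task first, leaving it some time before the hard limit is reached.
CELERYD_TASK_SOFT_TIME_LIMIT = 50
```

Keeping the soft limit below the hard limit gives the task a window in which it can catch `celery.exceptions.SoftTimeLimitExceeded` and clean up.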

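By default, Celery does not implement retry logic for tasks, because it cannot know whether retrying is safe for your task. In other words, your task needs to be idempotent for a retry to be safe.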
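To make "idempotent" concrete, here is a small plain-Python sketch (no Celery involved) that contrasts a write that can safely run twice with one that cannot. The function names and the `"task-1"` key are hypothetical and only serve the illustration.

```python
def record_result(store, task_id, value):
    """Idempotent: store a result keyed by task_id; repeating the call changes nothing."""
    # Writing by key overwrites instead of appending, so running the same
    # task body twice leaves exactly one entry behind.
    store[task_id] = value
    return store


def append_result(log, value):
    """Not idempotent: every call adds another entry."""
    log.append(value)
    return log


store, log = {}, []
for _ in range(2):  # simulate the task body running twice because of a retry
    record_result(store, "task-1", 42)
    append_result(log, 42)

print(len(store), len(log))  # prints "1 2"
```

A task shaped like `record_result` can be retried without side effects piling up, while one shaped like `append_result` would duplicate its work on every retry.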

So any retry triggered by a task failure should be done inside the task itself. See the example here: http://docs.celeryproject.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.retry
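One possible shape for such in-task retry logic is sketched below. The runnable part is only a helper that computes an exponential backoff delay. The Celery wiring is shown as a comment because it needs a configured app; `app`, `fetch_data` and `download` are hypothetical names, and the backoff policy itself is an assumption rather than something the original answer prescribes.

```python
def backoff_delay(retries, base=2.0, cap=60.0):
    """Seconds to wait before the next attempt: base * 2**retries, capped at `cap`."""
    return min(cap, base * (2 ** retries))


# Inside a Celery task the helper could be used roughly like this
# (not executed here; `app`, `fetch_data` and `download` are hypothetical):
#
#   @app.task(bind=True, max_retries=3)
#   def fetch_data(self, url):
#       try:
#           return download(url)
#       except ConnectionError as exc:
#           # self.request.retries is the number of retries attempted so far
#           raise self.retry(exc=exc, countdown=backoff_delay(self.request.retries))

print([backoff_delay(n) for n in range(4)])  # prints [2.0, 4.0, 8.0, 16.0]
```

Declaring the task with `bind=True` is what makes `self.retry` and `self.request` available inside the task body.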

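There are several reasons why your task might have timed out, and you are in the best position to know which applies. It may have taken too long to process the data, or too long to fetch it.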

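If you think the task failed while trying to connect to some service, I suggest reducing the connection timeout and adding retry logic to your task. If the task takes too long to process the data, try splitting the data into chunks and processing it that way. Celery has good support for this: http://docs.celeryproject.org/en/latest/userguide/canvas.html#chunks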
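The chunking idea can be sketched in plain Python as follows. This is not Celery's own `chunks` primitive, only an illustration of splitting the input so that each piece could be handed to a separate task invocation that finishes within the time limit. The helper name `split_into_chunks` is hypothetical.

```python
from itertools import islice


def split_into_chunks(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


chunks = list(split_into_chunks(range(10), 4))
print(chunks)  # prints [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Each resulting chunk is small enough to be processed independently, so a single slow chunk no longer pushes the whole job past the limit.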
