Rerunning Failed Tasks in Datax-Web

Posted by Demonson


Rerunning failed tasks on a single-node datax-web

vim /data/datax/py/rerun_datax.py

# -*- coding: utf-8 -*-
# Reschedule failed datax-web jobs so that they fire again in 5 minutes.
import pymysql
from datetime import datetime

run_datetime = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
# Appending '000' to the epoch seconds gives the millisecond value stored in trigger_next_time.
# Only jobs whose last run returned 500 and that are still scheduled are updated.
sql = "UPDATE job_info SET trigger_next_time=CONCAT(UNIX_TIMESTAMP(NOW() + INTERVAL 5 MINUTE),'000') WHERE last_handle_code=500 AND trigger_next_time <>0 AND trigger_status<>0"
conn = pymysql.connect(host="127.0.0.1",port=3306,user="***",password="***",db="dataxweb")
cur = conn.cursor()
effect_row = cur.execute(sql)  # number of rows updated
if effect_row > 0:
    print(run_datetime + "   datax-web rerun: " + str(effect_row) + " failed task(s) will rerun in 5 minutes")
else:
    print(run_datetime + "   no failed tasks")
conn.commit()
cur.close()
conn.close()

Check once an hour:

0 * * * * /usr/bin/python /data/datax/py/rerun_datax.py >> /data/datax/py/rerun_datax.log 2>&1
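The value that the UPDATE writes into trigger_next_time can be reproduced outside MySQL, which helps when checking a row by hand. The sketch below is only an illustration: next_trigger_millis is a hypothetical helper, not part of datax-web, and it assumes Python 3. It mirrors CONCAT(UNIX_TIMESTAMP(NOW() + INTERVAL 5 MINUTE),'000'), that is, whole epoch seconds with '000' appended.

```python
from datetime import datetime, timedelta, timezone


def next_trigger_millis(now, delay_minutes=5):
    """Return the epoch-millisecond string for `now + delay_minutes`.

    Hypothetical helper that mirrors the SQL expression
    CONCAT(UNIX_TIMESTAMP(NOW() + INTERVAL 5 MINUTE), '000').
    """
    target = now + timedelta(minutes=delay_minutes)
    epoch_seconds = int(target.timestamp())  # whole seconds, like UNIX_TIMESTAMP()
    return str(epoch_seconds) + "000"        # seconds -> milliseconds as text


if __name__ == "__main__":
    # A timezone-aware "now" avoids depending on the local time zone of the host.
    print(next_trigger_millis(datetime.now(timezone.utc)))
```

Note that a naive datetime is interpreted in the local time zone by timestamp(), while MySQL evaluates NOW() in the session time zone, so the two only agree when both use the same zone.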

Rerunning failed tasks in a cluster deployment

vim /data4hadoop/scripts/datax_web/rerun_datax.py

# -*- coding: utf-8 -*-
import pymysql
from datetime import datetime

run_datetime = datetime.strftime(datetime.now(), '%Y-%m-%d %H:%M:%S')
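# Jobs run on multiple nodes, so job_log is used to keep only the jobs that really failed:
# enabled jobs whose last result is not 0/200 and that have no log entry with
# handle_code 0 or 200 in the past day.
# The inner SELECT is wrapped in a derived table (t) because MySQL does not allow
# selecting directly from the table being updated.
sql = """UPDATE job_info SET trigger_next_time=CONCAT(UNIX_TIMESTAMP(NOW() + INTERVAL 5 MINUTE),'000')
       WHERE id IN (SELECT id FROM (SELECT f.id FROM job_info f
       LEFT JOIN (SELECT trigger_time,handle_time,handle_code,job_id FROM job_log
                   WHERE trigger_time > (NOW() + INTERVAL -1 DAY) AND handle_code IN (0,200)) g ON g.job_id=f.id
       WHERE f.last_handle_code NOT IN (0,200) AND f.trigger_status=1 AND g.handle_code IS NULL) t)"""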

conn = pymysql.connect(host="127.0.0.1",port=3306,user="datax_user",password="******",db="dataxweb")
cur = conn.cursor()
effect_row = cur.execute(sql)
if effect_row > 0:
    print(run_datetime + "   datax-web rerun: " + str(effect_row) + " failed task(s) will rerun in 5 minutes")
else:
    print(run_datetime + "   no failed tasks")
conn.commit()
cur.close()
conn.close()

0 * * * * /usr/bin/python3 /data4hadoop/scripts/datax_web/rerun_datax.py >> /data4hadoop/scripts/datax_web/rerun_datax.log 2>&1
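The selection rule in the cluster query is easier to follow when written out over in-memory rows. The sketch below is an illustration only, not the author's code: jobs_to_rerun and NON_FAILURE_CODES are hypothetical names, and the dict keys simply reuse the column names from the SQL above. It keeps a job when its last result is outside (0, 200), it is enabled (trigger_status = 1), and job_log holds no entry with handle_code 0 or 200 in the past day.

```python
from datetime import datetime, timedelta

# Codes that the query treats as "not failed" (taken from the SQL above).
NON_FAILURE_CODES = {0, 200}


def jobs_to_rerun(jobs, logs, now):
    """Return the ids of jobs that the cluster UPDATE would reschedule.

    jobs: dicts with keys id, last_handle_code, trigger_status
    logs: dicts with keys job_id, trigger_time, handle_code
    """
    cutoff = now - timedelta(days=1)
    # Jobs with at least one non-failure log entry in the past day
    # (the LEFT JOIN ... IS NULL part of the query excludes these).
    recently_ok = {
        log["job_id"]
        for log in logs
        if log["trigger_time"] > cutoff and log["handle_code"] in NON_FAILURE_CODES
    }
    return [
        job["id"]
        for job in jobs
        # SQL NOT IN never matches a NULL value, so None is skipped explicitly.
        if job["last_handle_code"] is not None
        and job["last_handle_code"] not in NON_FAILURE_CODES
        and job["trigger_status"] == 1
        and job["id"] not in recently_ok
    ]
```

Running the same rule as a plain SELECT against a copy of the tables before enabling the cron entry is a cautious way to confirm which jobs would be touched.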
