DataX mysql2hdfs: configuring the job for an HDFS high-availability (HA) file system
Posted by 闭关苦炼内功
Key configuration parameters:
- The values come from the cluster's hdfs-site.xml and core-site.xml
"hadoopConfig": {
    "dfs.client.failover.proxy.provider.bcluster": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "dfs.ha.namenodes.bcluster": "nn1,nn2",
    "dfs.namenode.rpc-address.bcluster.nn1": "bigdata02:8020",
    "dfs.namenode.rpc-address.bcluster.nn2": "bigdata03:8020",
    "dfs.nameservices": "bcluster"
},
"defaultFS": "hdfs://bcluster",
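Since these values are copied from hdfs-site.xml, the fragment can also be generated rather than typed by hand. The following is a minimal sketch, assuming a single nameservice; the embedded XML excerpt and the helper names read_properties and build_hadoop_config are hypothetical and only mirror the values used in this post.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml excerpt; the values mirror the ones used in this post.
HDFS_SITE_XML = """<configuration>
  <property><name>dfs.nameservices</name><value>bcluster</value></property>
  <property><name>dfs.ha.namenodes.bcluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.bcluster.nn1</name><value>bigdata02:8020</value></property>
  <property><name>dfs.namenode.rpc-address.bcluster.nn2</name><value>bigdata03:8020</value></property>
  <property>
    <name>dfs.client.failover.proxy.provider.bcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property><name>dfs.replication</name><value>3</value></property>
</configuration>"""


def read_properties(xml_text):
    """Collect <property> name/value pairs from a Hadoop *-site.xml document."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def build_hadoop_config(props):
    """Keep only the client-side HA keys (assumes exactly one nameservice)."""
    ns = props["dfs.nameservices"]
    keys = [
        f"dfs.client.failover.proxy.provider.{ns}",
        f"dfs.ha.namenodes.{ns}",
    ]
    for nn in props[f"dfs.ha.namenodes.{ns}"].split(","):
        keys.append(f"dfs.namenode.rpc-address.{ns}.{nn.strip()}")
    keys.append("dfs.nameservices")
    return {key: props[key] for key in keys}


hadoop_config = build_hadoop_config(read_properties(HDFS_SITE_XML))
fragment = {
    "hadoopConfig": hadoop_config,
    "defaultFS": "hdfs://" + hadoop_config["dfs.nameservices"],
}
print(json.dumps(fragment, indent=4))
```

Unrelated properties such as dfs.replication are dropped, so only the keys the HDFS client needs to locate the active NameNode end up in the job file.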
The full job configuration is shown below:
[hdfs@demo01 ~]$ cat mysql2hive_ods_demo.json
{
    "job": {
        "setting": {
            "speed": {
                "channel": 2
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "connection": [
                            {
                                "querySql": [
                                    "select id, shop_name, platform_name, admin_name, admin_phone, version, create_time, update_time from tb_demo;"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://10.0.0.1:3306/db_demo"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "hadoopConfig": {
                            "dfs.client.failover.proxy.provider.bcluster": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
                            "dfs.ha.namenodes.bcluster": "nn1,nn2",
                            "dfs.namenode.rpc-address.bcluster.nn1": "bigdata02:8020",
                            "dfs.namenode.rpc-address.bcluster.nn2": "bigdata03:8020",
                            "dfs.nameservices": "bcluster"
                        },
                        "defaultFS": "hdfs://bcluster",
                        "fileType": "orc",
                        "path": "/warehouse/tablespace/managed/hive/demo.db/ods_tb_demo_df/bizdate=$bizdate",
                        "fileName": "ods_tb_demo_df",
                        "column": [
                            {"name": "id", "type": "string"},
                            {"name": "shop_name", "type": "string"},
                            {"name": "platform_name", "type": "string"},
                            {"name": "admin_name", "type": "string"},
                            {"name": "admin_phone", "type": "string"},
                            {"name": "version", "type": "string"},
                            {"name": "create_time", "type": "string"},
                            {"name": "update_time", "type": "string"}
                        ],
                        "writeMode": "append",
                        "fieldDelimiter": "\\t",
                        "compress": "SNAPPY"
                    }
                }
            }
        ]
    }
}
[hdfs@demo01 ~]$
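The HA keys in the writer have to agree with each other: the authority in defaultFS must be a declared nameservice, and every NameNode ID listed for it needs its own rpc-address entry. A small pre-flight check along these lines can catch a mismatch before the job is submitted. This is only a sketch; check_ha_writer is a hypothetical helper, not part of DataX, and the rules it applies are the ones visible in the configuration above.

```python
from urllib.parse import urlparse


def check_ha_writer(parameter):
    """Return a list of problems found in an hdfswriter "parameter" dict."""
    problems = []
    conf = parameter.get("hadoopConfig", {})

    # The authority of defaultFS (hdfs://bcluster -> bcluster) must be a nameservice.
    nameservice = urlparse(parameter.get("defaultFS", "")).netloc
    declared = [ns.strip() for ns in conf.get("dfs.nameservices", "").split(",") if ns.strip()]
    if nameservice not in declared:
        problems.append(f"defaultFS authority {nameservice!r} is not listed in dfs.nameservices")
        return problems

    provider_key = f"dfs.client.failover.proxy.provider.{nameservice}"
    if provider_key not in conf:
        problems.append(f"missing {provider_key}")

    # Every NameNode ID needs a matching rpc-address entry.
    raw_ids = conf.get(f"dfs.ha.namenodes.{nameservice}", "")
    namenodes = [nn.strip() for nn in raw_ids.split(",") if nn.strip()]
    if len(namenodes) < 2:
        problems.append(f"dfs.ha.namenodes.{nameservice} should list at least two NameNodes")
    for nn in namenodes:
        key = f"dfs.namenode.rpc-address.{nameservice}.{nn}"
        if key not in conf:
            problems.append(f"missing {key}")
    return problems


# The HA-related part of the writer parameter from this post.
writer_parameter = {
    "hadoopConfig": {
        "dfs.client.failover.proxy.provider.bcluster": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
        "dfs.ha.namenodes.bcluster": "nn1,nn2",
        "dfs.namenode.rpc-address.bcluster.nn1": "bigdata02:8020",
        "dfs.namenode.rpc-address.bcluster.nn2": "bigdata03:8020",
        "dfs.nameservices": "bcluster",
    },
    "defaultFS": "hdfs://bcluster",
}

# A deliberately broken copy: the rpc-address for nn2 is removed.
broken_parameter = dict(
    writer_parameter,
    hadoopConfig={k: v for k, v in writer_parameter["hadoopConfig"].items() if not k.endswith(".nn2")},
)

print(check_ha_writer(writer_parameter))
print(check_ha_writer(broken_parameter))
```

In practice the dict would come from json.load on the job file, reading job["content"][0]["writer"]["parameter"].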
Run a DataX sync job to test the configuration
python /usr/local/datax/bin/datax.py -p"-Dbizdate='20221206'" mysql2hive_ods_demo.json
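The -p option passes -D parameters to the job, which is how $bizdate in the writer path gets its value. When the job is launched from a scheduler script, the same command can be assembled programmatically. The sketch below only builds the argument list and does not run it. build_datax_command is a hypothetical helper, and it assumes that a plain value such as 20221206 does not need the extra single quotes used in the command above.

```python
import shlex


def build_datax_command(datax_py, job_file, params):
    """Build the argument list for datax.py with -D job parameters.

    All parameters go into one "-p..." argument, mirroring the
    -p"-Dbizdate=..." form used in this post.
    """
    for key, value in params.items():
        # Assumption: keep values simple; whitespace would need extra shell quoting.
        if any(ch.isspace() for ch in f"{key}{value}"):
            raise ValueError(f"whitespace is not supported in parameter {key!r}")
    joined = " ".join(f"-D{key}={value}" for key, value in params.items())
    return ["python", datax_py, "-p" + joined, job_file]


cmd = build_datax_command(
    "/usr/local/datax/bin/datax.py",
    "mysql2hive_ods_demo.json",
    {"bizdate": "20221206"},
)
print(shlex.join(cmd))
# To actually run the job: subprocess.run(cmd, check=True)
```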
[hdfs@demo01 ~]$ python /usr/local/datax/bin/datax.py -p"-Dbizdate='20221206'" mysql2hive_ods_demo.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2022-12-07 04:52:44.523 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2022-12-07 04:52:44.529 [main] INFO Engine - the machine info =>
2022-12-07 04:52:44.546 [main] INFO Engine -
...
2022-12-07 04:52:44.559 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2022-12-07 04:52:44.560 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2022-12-07 04:52:44.560 [main] INFO JobContainer - DataX jobContainer starts job.
2022-12-07 04:52:44.562 [main] INFO JobContainer - Set jobId = 0
2022-12-07 04:52:44.805 [job-0] INFO OriginalConfPretreatmentUtil - Available
...
2022-12-07 04:52:55.852 [job-0] INFO JobContainer - DataX Reader.Job [mysqlreader] do post work.
2022-12-07 04:52:55.852 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2022-12-07 04:52:55.852 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /usr/hdp/datax/hook
2022-12-07 04:52:55.953 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 1 | 1 | 1 | 0.023s | 0.023s | 0.023s
PS Scavenge | 1 | 1 | 1 | 0.012s | 0.012s | 0.012s
2022-12-07 04:52:55.954 [job-0] INFO JobContainer - PerfTrace not enable!
2022-12-07 04:52:55.954 [job-0] INFO StandAloneJobContainerCommunicator - Total 78 records, 33832 bytes | Speed 3.30KB/s, 7 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2022-12-07 04:52:55.954 [job-0] INFO JobContainer -
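任务启动时刻 (job start time) : 2022-12-07 04:52:44
任务结束时刻 (job end time) : 2022-12-07 04:52:55
任务总计耗时 (total elapsed time) : 11s
任务平均流量 (average throughput) : 3.30KB/s
记录写入速度 (record write speed) : 7rec/s
读出记录总数 (total records read) : 78
读写失败总数 (total read/write failures) : 0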
[hdfs@demo01 ~]$
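For an automated run, the "Total ... records" line in the log above is the convenient place to confirm that all rows were transferred without errors. The following sketch extracts the counters with a regular expression. The pattern is derived from the single log line shown in this post, so it is an assumption that other DataX versions print the same layout; parse_summary is a hypothetical helper.

```python
import re

# Pattern derived from the StandAloneJobContainerCommunicator line shown above.
SUMMARY_RE = re.compile(
    r"Total (?P<records>\d+) records, (?P<bytes>\d+) bytes"
    r".*?Error (?P<error_records>\d+) records, (?P<error_bytes>\d+) bytes"
)


def parse_summary(log_text):
    """Return the counters from the last summary line, or None if none is found."""
    last_match = None
    for match in SUMMARY_RE.finditer(log_text):
        last_match = match
    if last_match is None:
        return None
    return {name: int(value) for name, value in last_match.groupdict().items()}


sample_log = (
    "2022-12-07 04:52:55.954 [job-0] INFO StandAloneJobContainerCommunicator - "
    "Total 78 records, 33832 bytes | Speed 3.30KB/s, 7 records/s | "
    "Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | "
    "All Task WaitReaderTime 0.000s | Percentage 100.00%"
)

summary = parse_summary(sample_log)
print(summary)
if summary is None or summary["error_records"] > 0:
    raise SystemExit("DataX job reported errors or printed no summary line")
```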
That completes the DataX mysql2hdfs configuration for an HDFS high-availability file system.