Oozie bundle: calling multiple coordinators


A bundle job can bind multiple coordinators.

Syntax:

<bundle-app name=[NAME] xmlns='uri:oozie:bundle:0.1'>
  <controls>
    <kick-off-time>[DATETIME]</kick-off-time>  <!-- time at which the bundle starts -->
  </controls>
  <coordinator name=[NAME]>
    <app-path>[COORD-APPLICATION-PATH]</app-path>  <!-- directory containing coordinator.xml -->
    <configuration>  <!-- parameters passed to the coordinator application -->
      <property>
        <name>[PROPERTY-NAME]</name>
        <value>[PROPERTY-VALUE]</value>
      </property>
      ...
    </configuration>
  </coordinator>
  ...
</bundle-app>

The example from the official documentation (binding two coordinators):

<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.1'>
  <controls>
    <kick-off-time>${kickOffTime}</kick-off-time>
  </controls>
  <coordinator name='coordJobFromBundle1'>
    <app-path>${appPath}</app-path>
    <configuration>
      <property>
        <name>startTime1</name>
        <value>${START_TIME}</value>
      </property>
      <property>
        <name>endTime1</name>
        <value>${END_TIME}</value>
      </property>
    </configuration>
  </coordinator>
  <coordinator name='coordJobFromBundle2'>
    <app-path>${appPath2}</app-path>
    <configuration>
      <property>
        <name>startTime2</name>
        <value>${START_TIME2}</value>
      </property>
      <property>
        <name>endTime2</name>
        <value>${END_TIME2}</value>
      </property>
    </configuration>
  </coordinator>
</bundle-app>

A simplified version of the bundle.xml we use at work:

<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
            xmlns='uri:oozie:bundle:0.2'>
  <coordinator name='coordJobFromBundle1'>
    <app-path>${appPath}</app-path>
  </coordinator>
  <coordinator name='coordJobFromBundle2'>
    <app-path>${appPath2}</app-path>
  </coordinator>
</bundle-app>
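
The simplified bundle.xml has no <configuration> blocks, so nothing in it tells the second coordinator to use the start2, end2 and workflowAppUri2 values defined in job.properties. The source does not show how that is handled. One possible approach, offered only as a sketch, is to remap those values inside the bundle. It assumes (hypothetically) that the second coordinator.xml reads ${start}, ${end} and ${workflowAppUri}, the same names as the coordinator.xml shown in this post:

```xml
<!-- Sketch only. Assumption: the coordinator under ${appPath2} reads
     ${start}, ${end} and ${workflowAppUri}; adjust the names to the real file. -->
<bundle-app name='APPNAME' xmlns='uri:oozie:bundle:0.2'>
  <coordinator name='coordJobFromBundle1'>
    <!-- No overrides: this coordinator sees the values from job.properties -->
    <app-path>${appPath}</app-path>
  </coordinator>
  <coordinator name='coordJobFromBundle2'>
    <app-path>${appPath2}</app-path>
    <configuration>
      <!-- Remap the *2 values onto the names the coordinator expects -->
      <property>
        <name>start</name>
        <value>${start2}</value>
      </property>
      <property>
        <name>end</name>
        <value>${end2}</value>
      </property>
      <property>
        <name>workflowAppUri</name>
        <value>${workflowAppUri2}</value>
      </property>
    </configuration>
  </coordinator>
</bundle-app>
```

With this layout a single coordinator.xml template could be reused for both jobs, at the cost of keeping the mapping in the bundle. Whether that suits your setup depends on how the coordinators are actually written.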

coordinator.xml:

<coordinator-app name="cron-coord" frequency="${coord:minutes(6)}" start="${start}" 
end="${end}" timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mainClass</name>
                    <value>com.ocn.itv.rinse.ErrorCollectRinse</value>
                </property>
                <property>
                    <name>mainClass2</name>
                    <value>com.ocn.itv.rinse.UserCollectRinse</value>
                </property>
                <property>
                    <name>jarName</name>
                    <value>ocn-itv-spark-3.0.3-rc1.jar</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
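
The coordinator.xml for the second coordinator (the one under appPath2) is not shown in this post. As a minimal sketch, assuming it reads the start2, end2 and workflowAppUri2 values from job.properties directly and schedules a workflow with a single Spark action, it might look like the following. The name cron-coord-2, the 6-minute frequency and the choice of properties are illustrative guesses, not the author's file:

```xml
<!-- Hypothetical second coordinator; names and values marked below are assumptions. -->
<coordinator-app name="cron-coord-2" frequency="${coord:minutes(6)}" start="${start2}"
                 end="${end2}" timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <!-- Assumed to point at the single-action workflow directory -->
            <app-path>${workflowAppUri2}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <!-- Assumption: the single workflow runs one of the two classes -->
                    <name>mainClass</name>
                    <value>com.ocn.itv.rinse.UserCollectRinse</value>
                </property>
                <property>
                    <name>jarName</name>
                    <value>ocn-itv-spark-3.0.3-rc1.jar</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
```

Reading the *2 properties directly keeps bundle.xml free of <configuration> blocks, which matches the simplified bundle shown earlier in this post.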

workflow.xml:

<workflow-app  name="spark-example1" xmlns="uri:oozie:workflow:0.5">  
    <start to="forking"/> 
    <fork name="forking">
        <path start="firstparalleljob"/>
        <path start="secondparalleljob"/>
    </fork>    
    <action name="firstparalleljob">
        <spark xmlns="uri:oozie:spark-action:0.2">  
            <job-tracker>${jobTracker}</job-tracker>  
            <name-node>${nameNode}</name-node>
            <configuration>  
                <property>  
                    <name>mapred.job.queue.name</name>  
                    <value>${queueName}</value>  
                </property>                  
            </configuration>            
            <master>yarn-cluster</master>
            <mode>cluster</mode>
            <name>Spark Example</name>
            <class>${mainClass}</class>            
            <jar>${jarName}</jar> 
            <spark-opts>${sparkopts}</spark-opts> 
            <arg>${input}</arg>            
        </spark>
        <ok to="joining"/>
        <error to="fail"/>    
    </action> 
    <action name="secondparalleljob">
         <spark xmlns="uri:oozie:spark-action:0.2">  
            <job-tracker>${jobTracker}</job-tracker>  
            <name-node>${nameNode}</name-node>
            <configuration>  
                <property>  
                    <name>mapred.job.queue.name</name>  
                    <value>${queueName}</value>  
                </property>                  
            </configuration>            
            <master>yarn-cluster</master>
            <mode>cluster</mode>
            <name>Spark Example2</name>
            <class>${mainClass2}</class>            
            <jar>${jarName}</jar> 
            <spark-opts>${sparkopts}</spark-opts> 
            <arg>${input}</arg>            
        </spark>
        <ok to="joining"/>
        <error to="fail"/>    
    </action>   
    <join name="joining" to="end"/>
    <kill name="fail">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app> 

job.properties:

# Comments must sit on their own line: in a .properties file, a '#' after a value becomes part of the value.
nameNode=hdfs://hgdp-001:8020
jobTracker=hgdp-001:8032
queueName=default
input=2017-05-09
hdfspath=user/root
# Custom root directory for this application
examplesRoot=ocn-itv-oozie
# Whether to use the Oozie system lib path
oozie.use.system.libpath=true
sparkopts=--executor-memory 1G
# Coordinator start and end times
start=2017-09-04T00:05+0800
end=2017-09-04T00:36+0800
start2=2017-09-01T00:06+0800
end2=2017-09-04T00:36+0800
# User-defined lib directory (holds the jar files)
oozie.libpath=${nameNode}/${hdfspath}/${examplesRoot}/lib/
# Directories containing the workflow.xml files that the coordinators schedule
workflowAppUri=${nameNode}/${hdfspath}/${examplesRoot}/wf/spark/fork/
workflowAppUri2=${nameNode}/${hdfspath}/${examplesRoot}/wf/spark/single/
# Directories containing the coordinator.xml files that the bundle invokes
appPath=${nameNode}/${hdfspath}/${examplesRoot}/cd/single/
appPath2=${nameNode}/${hdfspath}/${examplesRoot}/cd/single1/
# Directory containing bundle.xml
oozie.bundle.application.path=${nameNode}/${hdfspath}/${examplesRoot}/bd/bd1/
# One bundle invokes multiple coordinators

 
