Bundle Overview
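Bundle is a higher-level layer in the way Oozie organizes jobs. Technically, it is a collection of coordinator applications; in business terms, it lets users combine several coordinator applications into a data pipeline. The coordinators inside a Bundle have no explicit upstream or downstream dependencies on one another. Instead, users can chain them into a pipeline through the data dependencies that each coordinator declares.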
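Definitions
- Kick-off-time:
The time at which the Bundle starts running and submits its coordinator applications;
- Bundle Application:
A Bundle application defines the coordinator applications it contains and when they start. It is written in XML and can be parameterized;
- Bundle Job:
An instantiated Bundle application: on submission, its parameters are bound to values and the job is executed;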
- Bundle Definition Language:
The XML-based language used to describe Bundle applications.
Bundle Job
Bundle job statuses:
PREP, RUNNING, RUNNINGWITHERROR, SUSPENDED, PREPSUSPENDED, SUSPENDEDWITHERROR, PAUSED, PAUSEDWITHERROR, PREPPAUSED, SUCCEEDED, DONEWITHERROR, KILLED, FAILED.
Bundle job state machine:
PREP --> PREPSUSPENDED | PREPPAUSED | RUNNING | KILLED
RUNNING --> RUNNINGWITHERROR | SUSPENDED | PAUSED | SUCCEEDED | KILLED
RUNNINGWITHERROR --> RUNNING | SUSPENDEDWITHERROR | PAUSEDWITHERROR | DONEWITHERROR | FAILED | KILLED
PREPSUSPENDED --> PREP | KILLED
SUSPENDED --> RUNNING | KILLED
SUSPENDEDWITHERROR --> RUNNINGWITHERROR | KILLED
PREPPAUSED --> PREP | KILLED
PAUSED --> SUSPENDED | RUNNING | KILLED
PAUSEDWITHERROR --> SUSPENDEDWITHERROR | RUNNINGWITHERROR | KILLED
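The transitions listed above can be encoded as a lookup table to check whether a given status change is allowed. The following is a minimal Python sketch, not Oozie code: the names `TRANSITIONS` and `can_transition` are hypothetical, and the table simply copies the list above.

```python
# Allowed Bundle job status transitions, copied from the list above.
# Statuses with no entry here (SUCCEEDED, DONEWITHERROR, KILLED, FAILED)
# have no outgoing transitions in that list.
TRANSITIONS = {
    "PREP": {"PREPSUSPENDED", "PREPPAUSED", "RUNNING", "KILLED"},
    "RUNNING": {"RUNNINGWITHERROR", "SUSPENDED", "PAUSED", "SUCCEEDED", "KILLED"},
    "RUNNINGWITHERROR": {
        "RUNNING", "SUSPENDEDWITHERROR", "PAUSEDWITHERROR",
        "DONEWITHERROR", "FAILED", "KILLED",
    },
    "PREPSUSPENDED": {"PREP", "KILLED"},
    "SUSPENDED": {"RUNNING", "KILLED"},
    "SUSPENDEDWITHERROR": {"RUNNINGWITHERROR", "KILLED"},
    "PREPPAUSED": {"PREP", "KILLED"},
    "PAUSED": {"SUSPENDED", "RUNNING", "KILLED"},
    "PAUSEDWITHERROR": {"SUSPENDEDWITHERROR", "RUNNINGWITHERROR", "KILLED"},
}


def can_transition(current, target):
    """Return True if the table lists a transition from current to target."""
    return target in TRANSITIONS.get(current, set())


if __name__ == "__main__":
    print(can_transition("PREP", "RUNNING"))      # listed in the table
    print(can_transition("SUSPENDED", "PAUSED"))  # not listed in the table
```

A table keyed by the current status keeps the check to a single set lookup and makes it easy to compare against the list line by line.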
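A Bundle job starts when either of the following happens:
- The kick-off-time is reached (this value is usually left unset);
- The user issues a command from the client to start the Bundle job;
Bundle application XML configuration: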
Demo1:
<bundle-app name='bundle-app' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.1'>
  <coordinator name='coord-1'>
    <app-path>${nameNode}/user/${userName}/${examplesRoot}/apps/aggregator/coordinator.xml</app-path>
    <configuration>
      <property>
        <name>start</name>
        <value>${start}</value>
      </property>
      <property>
        <name>end</name>
        <value>${end}</value>
      </property>
    </configuration>
  </coordinator>
</bundle-app>
Demo2:
<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.1'>
  <controls>
    <kick-off-time>${kickOffTime}</kick-off-time>
  </controls>
  <coordinator name='coordJobFromBundle1'>
    <app-path>${appPath}</app-path>
    <configuration>
      <property>
        <name>startTime1</name>
        <value>${START_TIME}</value>
      </property>
      <property>
        <name>endTime1</name>
        <value>${END_TIME}</value>
      </property>
    </configuration>
  </coordinator>
  <coordinator name='coordJobFromBundle2'>
    <app-path>${appPath2}</app-path>
    <configuration>
      <property>
        <name>startTime2</name>
        <value>${START_TIME2}</value>
      </property>
      <property>
        <name>endTime2</name>
        <value>${END_TIME2}</value>
      </property>
    </configuration>
  </coordinator>
</bundle-app>
Demo3:
<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.2'>
  <parameters>
    <property>
      <name>appPath</name>
    </property>
    <property>
      <name>appPath2</name>
      <value>hdfs://foo:8020/user/joe/job/job.properties</value>
    </property>
  </parameters>
  <controls>
    <kick-off-time>${kickOffTime}</kick-off-time>
  </controls>
  <coordinator name='coordJobFromBundle1'>
    <app-path>${appPath}</app-path>
    <configuration>
      <property>
        <name>startTime1</name>
        <value>${START_TIME}</value>
      </property>
      <property>
        <name>endTime1</name>
        <value>${END_TIME}</value>
      </property>
    </configuration>
  </coordinator>
  <coordinator name='coordJobFromBundle2'>
    <app-path>${appPath2}</app-path>
    <configuration>
      <property>
        <name>startTime2</name>
        <value>${START_TIME2}</value>
      </property>
      <property>
        <name>endTime2</name>
        <value>${END_TIME2}</value>
      </property>
    </configuration>
  </coordinator>
</bundle-app>
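- **name:** the name of the Bundle application;
- **parameters:** declares the parameters of the Bundle application, whose values are passed on to the coordinator applications (in Demo3, `appPath` has no default value and `appPath2` has one);
- **controls:** sets the kick-off-time, i.e. when the job starts;
- **coordinator:** a coordinator application; a Bundle application must contain at least one;
  - **name:** the name of the coordinator job. When the user kills, reruns, or suspends the Bundle job, this name is used to propagate the command down to the next level of job organization;
  - **app-path:** the location of the coordinator application's definition file; required;
  - **configuration:** the parameters passed to the coordinator application;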
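To make the element structure described above concrete, the following Python sketch reads a Bundle definition and lists each coordinator's name, app-path, and configuration properties. It uses only the standard-library `xml.etree.ElementTree` module; the function name `list_coordinators` is hypothetical, the sample XML is Demo1, and this is an illustration rather than how Oozie itself parses Bundles.

```python
import xml.etree.ElementTree as ET

# Demo1 from above, embedded as a string for the example.
BUNDLE_XML = """
<bundle-app name='bundle-app' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.1'>
  <coordinator name='coord-1'>
    <app-path>${nameNode}/user/${userName}/${examplesRoot}/apps/aggregator/coordinator.xml</app-path>
    <configuration>
      <property><name>start</name><value>${start}</value></property>
      <property><name>end</name><value>${end}</value></property>
    </configuration>
  </coordinator>
</bundle-app>
"""


def list_coordinators(bundle_xml):
    """Return one dict per <coordinator> element in the Bundle definition."""
    root = ET.fromstring(bundle_xml)
    # ElementTree prefixes tags with "{namespace}"; reuse the root's prefix
    # so the same code works for bundle:0.1 and bundle:0.2 documents.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    coordinators = []
    for coord in root.findall(f"{ns}coordinator"):
        props = {
            prop.findtext(f"{ns}name"): prop.findtext(f"{ns}value")
            for prop in coord.findall(f"{ns}configuration/{ns}property")
        }
        coordinators.append({
            "name": coord.get("name"),
            "app_path": coord.findtext(f"{ns}app-path"),
            "configuration": props,
        })
    return coordinators


if __name__ == "__main__":
    for coord in list_coordinators(BUNDLE_XML):
        print(coord["name"], coord["app_path"], coord["configuration"])
```

The `${...}` placeholders are returned unresolved here, since their values come from the job properties at submission time.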
Global variables can be set in a properties file:
oozie.bundle.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/bundle
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
outputDir=bundle
examplesRoot=examples
start=2010-01-01T01:00Z
end=2010-01-01T03:00Z
userName=${user.name}
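The properties above refer to one another through `${...}` placeholders. The Python sketch below shows, in simplified form, how such references expand into concrete values. It is an illustration only and does not reproduce Oozie's own variable resolution; `parse_properties` and `resolve` are hypothetical helpers, and the value `joe` for `user.name` is an assumed example supplied from outside the file.

```python
import re

# A subset of the properties shown above.
PROPERTIES = """\
oozie.bundle.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/bundle
nameNode=hdfs://localhost:8020
examplesRoot=examples
userName=${user.name}
"""

_VAR = re.compile(r"\$\{([^}]+)\}")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve(value, props, max_depth=10):
    """Expand ${name} references repeatedly; unknown names are left as-is."""
    for _ in range(max_depth):
        expanded = _VAR.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if expanded == value:
            break
        value = expanded
    return value


if __name__ == "__main__":
    props = parse_properties(PROPERTIES)
    props["user.name"] = "joe"  # assumed example value, not part of the file
    print(resolve(props["oozie.bundle.application.path"], props))
    # prints: hdfs://localhost:8020/user/joe/examples/apps/bundle
```

The `max_depth` limit is a design choice of this sketch so that a circular reference cannot loop forever.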