Oozie, a Big Data Development Staple (Part 3): Hands-On Examples
1. Workflow Recap
How to define a Workflow
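- job.properties
  - Key point: it points to the HDFS location of the workflow.xml file
- workflow.xml
  - The workflow definition file
  - An XML file
  - Contains the following nodes:
    - start
    - action (MapReduce, Hive, Sqoop, Shell)
      - ok
      - error (failure transition)
    - kill
    - end
- lib directory
  - The JAR files the workflow depends on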
Writing workflow.xml
2. Hive Action
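The node types listed in the recap can be put together in one small file. The following is a minimal sketch, not a production workflow: the names `demo-wf`, `shell-node`, `fail`, and `end` are made up for illustration, and `${resourceManager}` and `${nameNode}` are assumed to be supplied by job.properties, alongside `oozie.wf.application.path`, which is the property that points at the HDFS directory holding workflow.xml.

```xml
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:1.0">
    <!-- start: names the first node to run -->
    <start to="shell-node"/>

    <!-- action: one unit of work; a Shell action is used here,
         and MapReduce, Hive, and Sqoop actions follow the same pattern -->
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:1.0">
            <!-- assumed to be defined in job.properties -->
            <resource-manager>${resourceManager}</resource-manager>
            <name-node>${nameNode}</name-node>
            <exec>echo</exec>
            <argument>hello-oozie</argument>
        </shell>
        <!-- ok: where to go on success; error: where to go on failure -->
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <!-- kill: ends the workflow as KILLED and records a message -->
    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <!-- end: the workflow finished successfully -->
    <end name="end"/>
</workflow-app>
```

Every action declares both an `ok` and an `error` transition, so a failure path always has to be spelled out, typically to a `kill` node as shown here.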
<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0">
    ...
    <action name="[NODE-NAME]">
        <hive xmlns="uri:oozie:hive-action:1.0">
            <resource-manager>[RESOURCE-MANAGER]</resource-manager>
            <name-node>[NAME-NODE]</name-node>
            <prepare>
                <delete path="[PATH]"/>
                ...
                <mkdir path="[PATH]"/>
                ...
            </prepare>
            <job-xml>[HIVE SETTINGS FILE]</job-xml>
            <configuration>
                <property>
                    <name>[PROPERTY-NAME]</name>
                    <value>[PROPERTY-VALUE]</value>
                </property>
                ...
            </configuration>
            <script>[HIVE-SCRIPT]</script>
            <param>[PARAM-VALUE]</param>
            ...
            <param>[PARAM-VALUE]</param>
            <file>[FILE-PATH]</file>
            ...
            <archive>[FILE-PATH]</archive>
            ...
        </hive>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...
</workflow-app>
Example:
<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:1.0">
    ...
    <action name="myfirsthivejob">
        <hive xmlns="uri:oozie:hive-action:1.0">
            <resource-manager>foo:8032</resource-manager>
            <name-node>bar:8020</name-node>
            <prepare>
                <delete path="${jobOutput}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.compress.map.output</name>
                    <value>true</value>
                </property>
            </configuration>
            <script>myscript.q</script>
            <param>InputDir=/home/tucu/input-data</param>
            <param>OutputDir=${jobOutput}</param>
        </hive>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...
</workflow-app>
3. Coordinator
Oozie Coordinator Specification (official documentation)
Triggering Mechanism
As of now, the Oozie coordinator supports two of the most common triggering mechanisms: time and data availability.
4. Oozie vs. DolphinScheduler
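Both triggering mechanisms can appear in the same coordinator definition. The following is a rough sketch under stated assumptions: the names `demo-coord`, `logs`, and `input`, the dates, the `/data/logs/...` path layout, and the `${nameNode}` and `${workflowAppUri}` parameters are all hypothetical and would come from your own job.properties and cluster layout.

```xml
<!-- Time trigger: frequency, start, and end define when actions materialize -->
<coordinator-app name="demo-coord" frequency="${coord:days(1)}"
                 start="2024-01-01T00:00Z" end="2024-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
    <!-- Data availability trigger: a dataset describes where each daily
         instance lives and which flag file marks it as complete -->
    <datasets>
        <dataset name="logs" frequency="${coord:days(1)}"
                 initial-instance="2024-01-01T00:00Z" timezone="UTC">
            <!-- hypothetical directory layout -->
            <uri-template>${nameNode}/data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
            <done-flag>_SUCCESS</done-flag>
        </dataset>
    </datasets>
    <!-- The action waits until the current day's instance is available -->
    <input-events>
        <data-in name="input" dataset="logs">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>
    <action>
        <workflow>
            <!-- HDFS path of the workflow application to launch -->
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <!-- passes the resolved input path to the workflow -->
                    <name>inputDir</name>
                    <value>${coord:dataIn('input')}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
```

Leaving out the `datasets` and `input-events` blocks gives a purely time-based coordinator. With them in place, each daily action is created on schedule but stays in a waiting state until its input instance exists.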
Free to repost: non-commercial, no derivatives, attribution required (Creative Commons 3.0 license)