Apache Oozie - CLI and Extensions
By this time, you have a good understanding of Oozie workflows, coordinators and bundles. In the last part of this tutorial, let’s touch on some of the other important concepts in Oozie.
Command Line Tools
We have seen a few commands earlier for running workflow, coordinator and bundle jobs. Oozie provides a command line utility, oozie, to perform job and admin tasks.
oozie version : show client version
Following are some of the other job operations −
oozie job <OPTIONS> :
  -action <arg>         coordinator rerun on action ids (requires -rerun); coordinator log retrieval on action ids (requires -log)
  -auth <arg>           select authentication type [SIMPLE|KERBEROS]
  -change <arg>         change a coordinator/bundle job
  -config <arg>         job configuration file '.xml' or '.properties'
  -D <property=value>   set/override value for given property
  -date <arg>           coordinator/bundle rerun on action dates (requires -rerun)
  -definition <arg>     job definition
  -doas <arg>           doAs user, impersonates as the specified user
  -dryrun               Supported in Oozie-2.0 or later versions ONLY - dryrun or test run a coordinator job, job is not queued
  -info <arg>           info of a job
  -kill <arg>           kill a job
  -len <arg>            number of actions (default TOTAL ACTIONS, requires -info)
  -localtime            use local time (default GMT)
  -log <arg>            job log
  -nocleanup            do not clean up output-events of the coordinator rerun actions (requires -rerun)
  -offset <arg>         job info offset of actions (default '1', requires -info)
  -oozie <arg>          Oozie URL
  -refresh              re-materialize the coordinator rerun actions (requires -rerun)
  -rerun <arg>          rerun a job (coordinator requires -action or -date; bundle requires -coordinator or -date)
  -resume <arg>         resume a job
  -run                  run a job
  -start <arg>          start a job
  -submit               submit a job
  -suspend <arg>        suspend a job
  -value <arg>          new endtime/concurrency/pausetime value for changing a coordinator job; new pausetime value for changing a bundle job
  -verbose              verbose mode
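The options above can be combined into a few small helper functions. This is only a sketch: the function names are made up for illustration, the URL is the example URL used in this tutorial, and the properties file and job ID you pass in are your own.

```shell
# Sketch only: replace the example URL with the URL of your Oozie server.
OOZIE_URL="http://localhost:8080/oozie"

# Run every "oozie job" sub-command against the same server.
oozie_job() {
  oozie job -oozie "$OOZIE_URL" "$@"
}

# Submit and start a job from a configuration file (-config ... -run).
run_job() {
  oozie_job -config "$1" -run
}

# Show job info, limited to N actions (-len requires -info; 10 is an arbitrary default).
job_info() {
  oozie_job -info "$1" -len "${2:-10}"
}

# Suspend, resume or kill a job by its ID.
suspend_job() { oozie_job -suspend "$1"; }
resume_job()  { oozie_job -resume "$1"; }
kill_job()    { oozie_job -kill "$1"; }
```

For instance, `run_job job.properties` (assuming a properties file of that name) submits and starts the job and prints its ID, which can then be passed to `job_info`, `suspend_job`, `resume_job` or `kill_job`.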
To check the status of jobs, the following command can be used.
oozie jobs <OPTIONS> :
  -auth <arg>           select authentication type [SIMPLE|KERBEROS]
  -doas <arg>           doAs user, impersonates as the specified user
  -filter <arg>         user=<U>;name=<N>;group=<G>;status=<S>;...
  -jobtype <arg>        job type: 'wf' (default) or 'coordinator' (coordinator is supported in Oozie-2.0 or later versions ONLY)
  -len <arg>            number of jobs (default '100')
  -localtime            use local time (default GMT)
  -offset <arg>         jobs offset (default '1')
  -oozie <arg>          Oozie URL
  -verbose              verbose mode
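As a rough illustration of these options, the sketch below wraps the job listing in a helper function. The function name is hypothetical and the URL is the example URL used in this tutorial; the filter is passed in quotes because the shell would otherwise treat `;` as the end of the command.

```shell
# Sketch only: replace the example URL with the URL of your Oozie server.
OOZIE_URL="http://localhost:8080/oozie"

# List jobs of one type ('wf' or 'coordinator') that match a filter.
#   $1 = job type, $2 = filter such as 'user=<U>;status=<S>', $3 = max jobs (default 100)
list_jobs() {
  oozie jobs -oozie "$OOZIE_URL" -jobtype "$1" -filter "$2" -len "${3:-100}"
}
```

For instance, `list_jobs wf 'status=RUNNING' 10` would list up to ten running workflow jobs, and `list_jobs coordinator 'user=alice;status=KILLED'` would list the killed coordinator jobs of a (hypothetical) user alice.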
For example, to check the status of the Oozie system, you can run the following command −
oozie admin -oozie http://localhost:8080/oozie -status
Validating a Workflow XML −
oozie validate myApp/workflow.xml
It performs an XML Schema validation on the specified workflow XML file.
Action Extensions
We have already seen the Hive action extension. Oozie provides several more action extensions; a few of them are described below −
Email Action
The email action allows sending emails from an Oozie workflow application. An email action must provide "to" addresses, "cc" addresses (optional), a subject and a body. Multiple recipients can be provided as comma-separated addresses.
All the values specified in the email action can be parameterized (templated) using EL expressions.
Example
<workflow-app name = "sample-wf" xmlns = "uri:oozie:workflow:0.1">
   ...
   <action name = "an-email">
      <email xmlns = "uri:oozie:email-action:0.1">
         <to>julie@xyz.com,max@abc.com</to>
         <cc>jax@xyz.com</cc>
         <subject>Email notifications for ${wf:id()}</subject>
         <body>The wf ${wf:id()} successfully completed.</body>
      </email>
      <ok to = "main_job"/>
      <error to = "kill_job"/>
   </action>
   ...
</workflow-app>
Shell Action
The shell action runs a Shell command. The workflow job will wait until the Shell command completes before continuing to the next action.
To run the Shell job, you have to configure the shell action with the job-tracker, name-node and exec elements, as well as the necessary arguments and configuration. A shell action can be configured to create or delete HDFS directories before starting the Shell job.
The shell launcher configuration can be specified with a file, using the job-xml element, and inline, using the configuration elements.
Example
How to run any shell script?
<workflow-app xmlns = 'uri:oozie:workflow:0.3' name = 'shell-wf'>
   <start to = 'shell1' />
   <action name = 'shell1'>
      <shell xmlns = "uri:oozie:shell-action:0.1">
         <job-tracker>${jobTracker}</job-tracker>
         <name-node>${nameNode}</name-node>
         <!-- exec is required by the shell action schema; script_name is a placeholder for the command or script to run -->
         <exec>script_name</exec>
         <!-- file ships the script to the working directory of the action -->
         <file>path_of_file_name</file>
      </shell>
      <ok to = "end" />
      <error to = "fail" />
   </action>
   <kill name = "fail">
      <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
   </kill>
   <end name = 'end' />
</workflow-app>
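The prepare, job-xml and configuration elements described above can be sketched as follows. This is a minimal illustration, not a definitive configuration: the HDFS paths, the shell-site.xml and script.sh file names, the argument value and the queueName property are all assumptions made for the example.

```xml
<!-- Sketch only: paths, file names, the argument and the queueName property are placeholders. -->
<action name = "shell1">
   <shell xmlns = "uri:oozie:shell-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- Delete and recreate an HDFS directory before the Shell job starts -->
      <prepare>
         <delete path = "${nameNode}/user/${wf:user()}/shell-output"/>
         <mkdir path = "${nameNode}/user/${wf:user()}/shell-output"/>
      </prepare>
      <!-- Launcher configuration loaded from a file -->
      <job-xml>shell-site.xml</job-xml>
      <!-- Launcher configuration supplied inline -->
      <configuration>
         <property>
            <name>mapred.job.queue.name</name>
            <value>${queueName}</value>
         </property>
      </configuration>
      <!-- Command to run, followed by its arguments -->
      <exec>script.sh</exec>
      <argument>first-argument</argument>
      <!-- Ship the script with the action so that exec can find it -->
      <file>script.sh</file>
   </shell>
   <ok to = "end"/>
   <error to = "fail"/>
</action>
```

The elements appear in the order the shell action schema expects: prepare, job-xml and configuration come before exec, and file comes after the arguments.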
Similarly, Oozie supports many more actions, such as ssh, sqoop and java.
Additional Resources
The official Oozie documentation website is the best resource for understanding Oozie in detail.