Apache Oozie - Property File
Oozie workflows can be parameterized. The parameters come from a configuration file called a property file. We can run multiple jobs from the same workflow by using multiple .properties files (one properties file for each job).
Suppose we want to change the jobtracker URL, the script name, or the value of a param.
We can specify a config file (.properties) and pass it when running the workflow.
Property File
Variables like ${nameNode} can be used within the workflow definition. At run time, each variable is replaced with the value defined for it in the .properties file.
Following is an example of a property file we will use in our workflow example.
File name -- job1.properties
# properties
nameNode = hdfs://rootname
jobTracker = xyz.com:8088
script_name_external = hdfs_path_of_script/external.hive
script_name_orc=hdfs_path_of_script/orc.hive
script_name_copy=hdfs_path_of_script/Copydata.hive
database = database_name
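As a rough illustration of how a file like this maps to key/value pairs, the following Python sketch parses the text of job1.properties into a dictionary. It is not Oozie's own loader; the function name `parse_properties` is hypothetical, and the sketch assumes simple `key = value` lines only (it ignores other Java properties features such as `:` separators and line continuations).

```python
def parse_properties(text):
    """Parse simple 'key = value' lines into a dict, skipping blanks and comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines ('#' or '!' start a comment).
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # this sketch ignores lines that have no '='
        props[key.strip()] = value.strip()
    return props


# Contents of job1.properties, copied from the example above.
JOB1_PROPERTIES = """\
# properties
nameNode = hdfs://rootname
jobTracker = xyz.com:8088
script_name_external = hdfs_path_of_script/external.hive
script_name_orc=hdfs_path_of_script/orc.hive
script_name_copy=hdfs_path_of_script/Copydata.hive
database = database_name
"""

props = parse_properties(JOB1_PROPERTIES)
print(props["nameNode"])
print(props["script_name_orc"])
```

Whitespace around `=` is stripped, so `database = database_name` and `script_name_orc=hdfs_path_of_script/orc.hive` are read the same way.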
To use this property file, we have to update the workflow so that it refers to the parameters, as shown in the following workflow definition.
<!-- This is a comment -->
<workflow-app xmlns = "uri:oozie:workflow:0.4" name = "simple-Workflow">
   <start to = "Create_External_Table" />

   <action name = "Create_External_Table">
      <hive xmlns = "uri:oozie:hive-action:0.4">
         <job-tracker>${jobTracker}</job-tracker>
         <name-node>${nameNode}</name-node>
         <script>${script_name_external}</script>
      </hive>
      <ok to = "Create_orc_Table" />
      <error to = "kill_job" />
   </action>

   <action name = "Create_orc_Table">
      <hive xmlns = "uri:oozie:hive-action:0.4">
         <job-tracker>${jobTracker}</job-tracker>
         <name-node>${nameNode}</name-node>
         <script>${script_name_orc}</script>
      </hive>
      <ok to = "Insert_into_Table" />
      <error to = "kill_job" />
   </action>

   <action name = "Insert_into_Table">
      <hive xmlns = "uri:oozie:hive-action:0.4">
         <job-tracker>${jobTracker}</job-tracker>
         <name-node>${nameNode}</name-node>
         <script>${script_name_copy}</script>
         <param>${database}</param>
      </hive>
      <ok to = "end" />
      <error to = "kill_job" />
   </action>

   <kill name = "kill_job">
      <message>Job failed</message>
   </kill>

   <end name = "end" />
</workflow-app>
To use the property file with this workflow, we have to pass it with the -config option when running the workflow.
oozie job -oozie http://host_name:8080/oozie -config edgenode_path/job1.properties -D oozie.wf.application.path=hdfs://Namenodepath/pathof_workflow_xml/workflow.xml -run
Note: The property file should be on the edge node (not in HDFS), whereas the workflow and Hive scripts will be in HDFS.
At run time, each parameter written as ${...} is replaced by its corresponding value from the .properties file.
A single property file can also define more parameters than a given workflow requires, and no error is thrown. This makes it possible to run more than one workflow with the same properties file. However, if the property file lacks a parameter that a workflow requires, an error occurs.
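The substitution behavior described here can be sketched in a few lines of Python. This is only an illustration of the rule (extra parameters are ignored, missing ones cause an error), not Oozie's actual mechanism: Oozie evaluates ${...} with an expression language that also supports functions, while this sketch handles plain parameter names only. The names `resolve` and `PLACEHOLDER` are hypothetical.

```python
import re

# Matches plain ${name} placeholders; expressions and functions are out of scope.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve(template, props):
    """Replace each ${name} with props[name]; fail if any name is undefined."""
    missing = sorted(
        {name for name in PLACEHOLDER.findall(template) if name not in props}
    )
    if missing:
        raise ValueError("undefined parameter(s): " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: props[match.group(1)], template)


props = {
    "nameNode": "hdfs://rootname",
    "jobTracker": "xyz.com:8088",
    "database": "database_name",  # unused by the snippet below; extras are fine
}

# A fragment of the workflow that needs only two of the three parameters.
snippet = "<job-tracker>${jobTracker}</job-tracker><name-node>${nameNode}</name-node>"
print(resolve(snippet, props))

# A fragment that needs a parameter the properties do not define.
try:
    resolve("<script>${script_name_orc}</script>", props)
except ValueError as err:
    print("error:", err)
```

The first call succeeds even though `database` is never referenced, while the second fails because `script_name_orc` is not defined.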