Sqoop - Codegen

This chapter describes the importance of ‘codegen’ tool. From the viewpoint of object-oriented application, every database table has one DAO class that contains ‘getter’ and ‘setter’ methods to initialize objects. This tool (-codegen) generates the DAO class automatically.

It generates DAO class in Java, based on the Table Schema structure. The Java definition is instantiated as a part of the import process. The main usage of this tool is to check if Java lost the Java code. If so, it will create a new version of Java with the default delimiter between fields.


The following is the syntax for Sqoop codegen command.

$ sqoop codegen (generic-args) (codegen-args) 
$ sqoop-codegen (generic-args) (codegen-args)


Let us take an example that generates Java code for the emp table in the userdb database.

The following command is used to execute the given example.

$ sqoop codegen \
--connect jdbc:mysql://localhost/userdb \
--username root \ 
--table emp

If the command executes successfully, then it will produce the following output on the terminal.

14/12/23 02:34:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5
14/12/23 02:34:41 INFO tool.CodeGenTool: Beginning code generation
14/12/23 02:34:42 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/9a300a1f94899df4a9b10f9935ed9f91/emp.java uses or 
   overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

14/12/23 02:34:47 INFO orm.CompilationManager: Writing jar file: 


Let us take a look at the output. The path, which is in bold, is the location that the Java code of the emp table generates and stores. Let us verify the files in that location using the following commands.

$ cd /tmp/sqoop-hadoop/compile/9a300a1f94899df4a9b10f9935ed9f91/
$ ls

If you want to verify in depth, compare the emp table in the userdb database and emp.java in the following directory