This section presents a set of mock tests on Sqoop. You can download these sample mock tests to your local machine and solve them offline at your convenience. Every mock test comes with an answer key so you can verify your final score and grade yourself.
Q 1 - During import to hive using sqoop the data is
The data is first staged into a temporary location as a HDFS file and then loaded into the hive table.
Q 2 - While importing data to hive using sqoop, if data already exists in hive table then the default behaviour is
The default behavior is to append the data to the existing hive table.
Q 3 - To overwrite data present in hive table while importing data using sqoop, the sqoop parameter is
The --hive-overwrite parameter truncates the hive table before loading the data.
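A minimal sketch of such an import, assuming a MySQL source; the connection string and table names are placeholders:

```shell
# --hive-overwrite truncates the existing hive table before
# loading the freshly imported data into it.
$ sqoop import --connect jdbc:mysql://example.com/db --username dbuser -P \
    --table mytable --hive-import --hive-overwrite
```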
Q 4 - The temporary location to which sqoop moves the data before loading into hive is specified by the parameter
The --target-dir parameter specifies the directory used to temporarily stage the data before loading it into the hive table.
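The staging step can be sketched as follows; the HDFS path and connection details are illustrative:

```shell
# The data is first written to /user/sqoop/staging on HDFS, then
# moved into the hive warehouse by the --hive-import step.
$ sqoop import --connect jdbc:mysql://example.com/db --table mytable \
    --hive-import --target-dir /user/sqoop/staging
```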
Q 5 - If the target hive table is partitioned then sqoop behavior is which of the following?
Sqoop supports loading into hive partitions using additional parameters in the sqoop command.
Q 6 - The parameter(s) used to load data using sqoop into the hive partitions is/are
Both the partition key and the partition value are passed in, via the --hive-partition-key and --hive-partition-value parameters, to load data into a hive partitioned table.
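A sketch of a partitioned import, assuming a hypothetical sales table partitioned by a day column:

```shell
# Loads the imported rows into the partition day='2023-01-01'
# of the hive table; the key and value here are examples.
$ sqoop import --connect jdbc:mysql://example.com/db --table sales \
    --hive-import --hive-partition-key day --hive-partition-value "2023-01-01"
```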
Q 7 - The data type of the column used for the partition name while importing data using sqoop into hive can be
Sqoop can only take strings as partition column values while loading data into hive.
Q 8 - The parameter --hive-drop-import-delims does which of the following?
The parameter --hive-drop-import-delims removes the delimiter characters \n, \r, and \01 from imported string fields.
Q 9 - The purpose of --hive-delims-replacement parameter in sqoop is to
As the characters \n, \r, and \01 may interfere with the data, giving incorrect results, they can be replaced with a suitable string using this parameter.
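A sketch of an import that substitutes the problem characters; the replacement string and table name are examples:

```shell
# Replaces any \n, \r, or \01 found in string fields with a single
# space so rows are not split when hive reads the files.
$ sqoop import --connect jdbc:mysql://example.com/db --table notes \
    --hive-import --hive-delims-replacement " "
```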
Q 10 - Hive shows a higher row count than imported by sqoop. What can be the reason?
The newline characters present in the data increase the number of rows seen by Hive.
Q 11 - The parameter --hive-import can be used with
This parameter can be used with both hive and text files.
Q 12 - To import data to HBase using sqoop the parameter(s) required is/are
Sqoop requires both the HBase table name (--hbase-table) and the column family (--column-family) to do the import.
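A sketch of an HBase import, assuming a hypothetical customers table with an id column:

```shell
# Both the target HBase table and the column family must be named;
# --hbase-row-key picks the source column used as the row key.
$ sqoop import --connect jdbc:mysql://example.com/db --table customers \
    --hbase-table customers --column-family cf --hbase-row-key id
```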
Q 13 - If the hbase table to which sqoop is importing data does not exist then
Unlike Hive, where sqoop creates the table if it does not exist, in HBase the job fails.
Q 14 - The parameter used to identify the individual row in HBase while importing data to it using sqoop is
The parameter --hbase-row-key is used in sqoop to identify each row in the HBase table.
Q 15 - The parameter that can create a hbase table using sqoop when importing data to hbase is
If --hbase-create-table is specified during the import, sqoop creates the HBase table if it does not already exist.
Q 16 - After importing a table into HBase you find that the number of rows inserted is fewer than in the source. The possible reason is −
As HBase does not store rows in which all column values are NULL, those rows were skipped during the import, causing the lower row count.
Q 17 - The property in sqoop that allows rows with all NULL values to be inserted into HBase tables is −
The property sqoop.hbase.add.row.key instructs Sqoop to insert the row key column twice, once as a row identifier and then again in the data itself. Even if all other columns contain NULL, at least the column used for the row key won’t be null, which will allow the insertion of the row into HBase.
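A sketch of setting this property; note that -D generic options must come before the tool-specific arguments, and the connection details are placeholders:

```shell
# Storing the row key column in the data as well means at least one
# column is always non-NULL, so all-NULL rows can still be inserted.
$ sqoop import -D sqoop.hbase.add.row.key=true \
    --connect jdbc:mysql://example.com/db --table customers \
    --hbase-table customers --column-family cf --hbase-row-key id
```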
Q 18 - When inserting data using sqoop into an HBase table on one physical node, the parallel tasks of sqoop import create a bottleneck. This can be solved by
By creating more regions, the HBase table gets split across many nodes in the HBase cluster, which helps the parallel sqoop tasks load data faster.
Q 19 - The parameters in sqoop command can be passed in to Oozie by using which tags?
The <arg> tags can carry the parameters of a sqoop command when scheduling with Oozie, one token per tag.
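A sketch of an Oozie sqoop action; the workflow name, transition targets, and connection string are placeholders:

```xml
<action name="sqoop-import">
  <sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- Each token of the sqoop command line goes in its own <arg> -->
    <arg>import</arg>
    <arg>--connect</arg>
    <arg>jdbc:mysql://example.com/db</arg>
    <arg>--table</arg>
    <arg>mytable</arg>
  </sqoop>
  <ok to="end"/>
  <error to="fail"/>
</action>
```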
Q 20 - In both import and export scenarios, the role of ValidationThreshold is to determine if
ValidationThreshold determines whether the error margin between the source and target is acceptable (absolute, percentage tolerant, etc.). The default implementation is AbsoluteValidationThreshold, which ensures the row counts from source and target are the same.
Q 21 - The comparison of row counts between the source system and the target database while loading the data using sqoop is done using the parameter
The --validate parameter is used to show the result of the row-count comparison between source and target.
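A minimal sketch of a validated import; the connection string and table name are placeholders:

```shell
# --validate compares the row count of the source table with the
# number of rows copied, and fails the job if they differ.
$ sqoop import --connect jdbc:mysql://example.com/db --table mytable --validate
```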
Q 22 - The sqoop export/import jobs can be stored and used again and again by using
Running a sqoop job via the sqoop-job command saves the job definition into the metastore, from which it can be retrieved later and run again and again.
$ sqoop-job --create jobname -- import --connect jdbc:mysql://example.com/db \
    --table mytable
Q 23 - What is achieved by the command: sqoop job --exec myjob
This command executes a sqoop job already saved in the metastore.
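Saved jobs can be listed and inspected before being run; the job name myjob is a placeholder:

```shell
# List the jobs stored in the metastore, inspect one, then run it.
$ sqoop job --list
$ sqoop job --show myjob
$ sqoop job --exec myjob
```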
Q 24 - The tool in sqoop which combines two data sets and preserves only the latest values using a primary key is
The sqoop-merge tool combines two datasets, preserving the latest records. The primary-key column is indicated by the --merge-key parameter.
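A sketch of a merge, assuming hypothetical HDFS paths and a record class generated by an earlier sqoop codegen run:

```shell
# Merges the newer dataset onto the older one, keeping the latest row
# for each value of the key column 'id'; the --jar-file/--class-name
# pair comes from a prior sqoop codegen run (names are placeholders).
$ sqoop merge --new-data /data/new --onto /data/old --target-dir /data/merged \
    --jar-file mytable.jar --class-name mytable --merge-key id
```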
Q 25 - The tool that populates a Hive metastore with a definition for a table based on a database table previously imported to HDFS is
Define in Hive a table named emps with a definition based on a database table named employees −
$ sqoop create-hive-table --connect jdbc:mysql://db.example.com/corp \
    --table employees --hive-table emps