Sqoop Mock Test
This section presents various sets of mock tests related to Sqoop. You can download these sample mock tests to your local machine and solve them offline at your convenience. Every mock test comes with a mock test key so you can verify your final score and grade yourself.
Sqoop Mock Test IV
Q 1 - During import to hive using sqoop the data is
A - directly loaded to existing hive table
B - first moved into a hive directory as a hdfs file
Answer : B
Explanation
The data is first staged into a temporary location as an HDFS file and then loaded into the hive table.
Q 2 - While importing data to hive using sqoop, if data already exists in hive table then the default behaviour is
A - The incoming data is appended to hive table
B - the incoming data replaces data in hive table
C - The data only gets updated using the primary key of the hive table
Answer : A
Explanation
The default behavior is to append the incoming data to the existing hive table.
Q 3 - To overwrite data present in hive table while importing data using sqoop, the sqoop parameter is
Answer : B
Explanation
The --hive-overwrite parameter truncates the hive table before loading the data.
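As an illustration, a hypothetical import using --hive-overwrite might look like the following; the JDBC URL and the table names are placeholders, not values from this test:

```shell
# Hypothetical example: re-import the 'orders' table and overwrite
# whatever is already in the target hive table.
$ sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table orders \
  --hive-import \
  --hive-table orders \
  --hive-overwrite
```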
Q 4 - The temporary location to which sqoop moves the data before loading into hive is specified by the parameter
Answer : A
Explanation
The --target-dir parameter specifies the directory used for temporarily staging the data before it is loaded into the hive table.
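A minimal sketch of an import with an explicit staging directory; the HDFS path and table names are placeholders:

```shell
# Hypothetical example: stage the import under a chosen HDFS directory
# before the data is moved into the hive warehouse.
$ sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table orders \
  --hive-import \
  --target-dir /user/hadoop/staging/orders
```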
Q 5 - If the target hive table is partitioned then sqoop behavior is which of the following?
A - not load data into hive partitions
C - sqoop command will halt for user input for partition names
D - load data into hive partitions by using additional parameters
Answer : D
Explanation
Sqoop supports loading into hive partitions using additional parameters in the sqoop command.
Q 6 - The parameter(s) used to load data using sqoop into the hive partitions is/are
A - --hive-partition-key and --hive-partition-value
Answer : A
Explanation
Both the partition key and the partition value are passed in to load data into a hive partitioned table.
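For example, loading a single day's data into one hive partition could be sketched as follows; the partition column sale_date and its value are placeholders:

```shell
# Hypothetical example: load one day of data into a hive partition.
$ sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table orders \
  --hive-import \
  --hive-table orders \
  --hive-partition-key sale_date \
  --hive-partition-value "2015-01-01"
```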
Q 7 - The data type of the column used for partition name while importing data using sqoop into hive can be
Answer : A
Explanation
Sqoop can only take strings as partition column names while loading data to hive.
Q 8 - The parameter --hive-drop-import-delims does which of the following?
A - replaces the hive delimiters with sqoop delimiters
B - drops the rows which do not have the \n,\t,\01 delimiters
C - removes all the \n,\t and \01 characters
D - drops the columns which do not have the \n,\t,\01 delimiters
Answer : C
Explanation
The parameter --hive-drop-import-delims removes the mentioned characters (\n, \t, and \01) from string fields during the import.
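A hypothetical invocation stripping hive delimiter characters during import; the database and table names are placeholders:

```shell
# Hypothetical example: drop \n, \t, and \01 from string columns
# so they cannot break hive rows.
$ sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table comments \
  --hive-import \
  --hive-drop-import-delims
```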
Q 9 - The purpose of --hive-delims-replacement parameter in sqoop is to
A - Replace any hive delimiters with special string
B - Replace all the hive delimiters with null
C - replace \n, \t, and \01 characters with any other string
Answer : C
Explanation
As the characters \n, \t, and \01 may interfere with the data giving incorrect result, these can be replaced with a suitable string using this parameter.
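A hypothetical invocation replacing the delimiter characters instead of dropping them; the replacement string and table names are placeholders:

```shell
# Hypothetical example: replace any \n, \t, or \01 found in string
# columns with a single space.
$ sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table comments \
  --hive-import \
  --hive-delims-replacement " "
```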
Q 10 - Hive shows a higher row count than the number of rows imported by sqoop. What can be the reason?
A - the \n character present in the data
B - Error with java classes used in sqoop
Answer : A
Explanation
The new line characters present in data will increase the number of rows.
Q 11 - The parameter --hive-import can be used with
B - importing to hive as well as text file
Answer : B
Explanation
This parameter can be used with both hive and text files.
Q 12 - To import data to HBase using sqoop the parameter(s) required is/are
Answer : C
Explanation
Sqoop requires both the HBase table name and the column family to be specified to do the import.
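A sketch of an HBase import naming the target table, column family, and row key; all names are placeholders:

```shell
# Hypothetical example: import into an HBase table, specifying the
# table name, the column family, and the column to use as row key.
$ sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table customers \
  --hbase-table customers \
  --column-family cf \
  --hbase-row-key customer_id
```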
Q 13 - If the hbase table to which sqoop is importing data does not exist then
C - sqoop waits for user input for hbase table details to proceed with import
D - sqoop imports the data to a temporary location under Hbase
Answer : B
Explanation
Unlike hive where sqoop creates the table if it does not exist, in HBase the job fails.
Q 14 - The parameter used to identify the individual row in HBase while importing data to it using sqoop is
Answer : A
Explanation
The parameter --hbase-row-key is used in sqoop to identify each row in the HBase table.
Q 15 - The parameter that can create a hbase table using sqoop when importing data to hbase is
Answer : B
Explanation
If --hbase-create-table is specified during the import, sqoop creates the HBase table when it does not already exist.
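A minimal sketch of such an import, with placeholder database and table names (the flag appears as --hbase-create-table in the Sqoop user guide):

```shell
# Hypothetical example: let sqoop create the missing HBase table
# and column family before importing.
$ sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table customers \
  --hbase-table customers \
  --column-family cf \
  --hbase-create-table
```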
Q 16 - After importing a table into HBase you find that the number of rows inserted is fewer than in the source. The possible reason is −
A - Sqoop is yet to have mature code for HBase
B - Sqoop version and Hbase version conflict
C - HBase does not allow rows with all NULL values to be inserted
D - Hbase has very limited capabilities to handle numeric data types so some rows got rejected.
Answer : C
Explanation
As HBase does not allow rows with all NULL values, those rows were skipped during the import, resulting in a lower row count.
Q 17 - The property in sqoop that allows rows with all NULL values to be inserted into HBase tables is −
B - sqoop.hbase.allow.row.nulls
D - It is not possible as HBase will never allow rows with all null columns to be inserted
Answer : A
Explanation
The property sqoop.hbase.add.row.key instructs Sqoop to insert the row key column twice, once as a row identifier and then again in the data itself. Even if all other columns contain NULL, at least the column used for the row key won’t be null, which will allow the insertion of the row into HBase.
Q 18 - When inserting data using sqoop into an HBase table on one physical node, the different parallel tasks of sqoop import create a bottleneck. This can be solved by
A - Configuring sqoop not to run parallel tasks
B - Configuring HBase to accept rows in parallel
Answer : C
Explanation
By creating more regions, the HBase table gets split across many nodes in the HBase cluster, which helps load data faster from the sqoop parallel load tasks.
Q 19 - The parameters in sqoop command can be passed in to Oozie by using which tags?
Answer : B
Explanation
The <args> tag can contain the parameters of a sqoop command when scheduling with Oozie.
Q 20 - In both import and export scenario, the role of ValidationThreshold is to determine if
A - the error margin between the source and target is within a range
B - the Sqoop command can handle the entire number of rows
C - the number of rows rejected by sqoop while reading the data
D - the number of rows rejected by the target database while loading the data
Answer : A
Explanation
The ValidationThreshold determines whether the error margin between the source and target is acceptable: Absolute, Percentage Tolerant, etc. The default implementation is AbsoluteValidationThreshold, which ensures the row counts from source and target are the same.
Q 21 - The comparison of row counts between the source system and the target database while loading the data using sqoop is done using the parameter
Answer : A
Explanation
The --validate parameter is used to show the result of the row-count comparison between source and target.
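A hypothetical import with validation enabled; the connection string and paths are placeholders:

```shell
# Hypothetical example: check that source and target row counts
# match after the import completes.
$ sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table orders \
  --target-dir /user/hadoop/orders \
  --validate
```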
Q 22 - The sqoop export/import jobs can be stored and used again and again by using
Answer : D
Explanation
Creating a sqoop job with the sqoop-job statement saves the job into the metastore, from which it can be retrieved and reused later.
Example −
$ sqoop-job --create jobname -- import --connect jdbc:mysql://example.com/db \
    --table mytable
Q 23 - What is achieved by the command sqoop job --exec myjob?
A - Sqoop job named myjob is saved to sqoop metastore
B - Sqoop job named myjob starts running
Answer : B
Explanation
This is the command to execute a sqoop job already saved in the metastore.
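As an illustration, the saved jobs can be listed and then one of them run by name; myjob is a placeholder job name:

```shell
# Hypothetical example: list the jobs saved in the metastore,
# then execute one of them.
$ sqoop job --list
$ sqoop job --exec myjob
```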
Q 24 - The tool in sqoop which combines two data sets and preserves only the latest values using a primary key is
Answer : A
Explanation
The sqoop-merge tool combines two datasets and preserves only the latest records. The column acting as the primary key is indicated by the --merge-key parameter.
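A sketch of a merge run; the HDFS paths, jar, and class name are placeholders:

```shell
# Hypothetical example: merge a newer dataset onto an older one,
# keeping the latest record for each value of the 'id' merge key.
$ sqoop merge \
  --new-data /user/hadoop/orders_new \
  --onto /user/hadoop/orders_old \
  --target-dir /user/hadoop/orders_merged \
  --jar-file orders.jar \
  --class-name orders \
  --merge-key id
```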
Q 25 - The tool that populates a Hive metastore with a definition for a table based on a database table previously imported to HDFS is
Answer : B
Explanation
Define in Hive a table named emps with a definition based on a database table named employees −
$ sqoop create-hive-table --connect jdbc:mysql://db.example.com/corp \
    --table employees --hive-table emps
Answer Sheet
| Question Number | Answer Key |
| --- | --- |
| 1 | B |
| 2 | A |
| 3 | B |
| 4 | A |
| 5 | D |
| 6 | A |
| 7 | A |
| 8 | C |
| 9 | C |
| 10 | A |
| 11 | B |
| 12 | C |
| 13 | B |
| 14 | A |
| 15 | B |
| 16 | C |
| 17 | A |
| 18 | C |
| 19 | B |
| 20 | A |
| 21 | A |
| 22 | D |
| 23 | B |
| 24 | A |
| 25 | B |