
Sqoop Online Quiz
The following quiz provides multiple-choice questions (MCQs) related to Sqoop. Read all the given options and choose the correct one. If you are not sure about an answer, check the answer and explanation given after each question.

Q 1 - To keep the password out of the sqoop import command, we can use the additional parameters
Answer : C
Explanation
The -P option prompts for the password on the console without echoing it, and the --password-file option reads the password from a separate file.
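As a rough illustration of the two options, the sketch below assembles a sqoop import argument list that never contains the password itself. The helper name `build_import_command`, the JDBC URL, the user name, the table, and the password file path are all hypothetical; only the `-P` and `--password-file` options come from the explanation above.

```python
import shlex


def build_import_command(connect_url, username, table, password_file=None):
    """Build a sqoop import argument list without embedding the password.

    Hypothetical helper: it only assembles arguments and does not run sqoop.
    """
    args = [
        "sqoop", "import",
        "--connect", connect_url,
        "--username", username,
        "--table", table,
    ]
    if password_file is not None:
        # Read the password from a file that only the job owner can read.
        args += ["--password-file", password_file]
    else:
        # Prompt for the password on the console instead.
        args.append("-P")
    return args


# All values below are placeholders chosen for illustration.
cmd = build_import_command(
    "jdbc:mysql://db.example.com/sales",
    "etl_user",
    "orders",
    password_file="/user/etl_user/.sqoop-password",
)
print(shlex.join(cmd))
```

Either way, the password does not appear in the shell history or in the process list.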
Answer : C
Explanation
You can do both full and partial data import from tables but not a subset of columns from a table.
Q 3 - What is the disadvantage of storing the password in the metastore as compared to storing it in a password file?
Answer : D
Explanation
The password file can be encrypted and protected from being read with proper file permissions, but the metastore is unencrypted and cannot be protected from being read.
Q 4 - The --boundary-query parameter is used to
A - Select the maximum number of rows to be retrieved by the query
B - Select the maximum and minimum values of the column specified in the --split-by parameter
C - Select the number of splits the query can run
D - Select the maximum and minimum number of MapReduce tasks that will be used in the query
Answer : B
Explanation
Sqoop needs to find the minimum and maximum values of the column specified in the --split-by parameter so that it can partition the data into multiple independent slices that are transferred in parallel.
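To make the slicing idea concrete, here is a minimal sketch that divides an integer range into contiguous slices, one per mapper. It is a simplified illustration, not Sqoop's actual splitter; the function name `split_ranges` and the sample values are assumptions. The boundary query's job is to supply the `lo` and `hi` values used here.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the inclusive integer range [lo, hi] into contiguous slices.

    Simplified illustration only: returns at most num_mappers (start, end)
    pairs whose sizes differ by no more than one.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")

    total = hi - lo + 1
    slices = min(num_mappers, total)      # never create empty slices
    base, extra = divmod(total, slices)   # the first `extra` slices get one more value

    ranges = []
    start = lo
    for i in range(slices):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Suppose the boundary query returned MIN(id) = 1 and MAX(id) = 100.
for start, end in split_ranges(1, 100, 4):
    print(f"WHERE id >= {start} AND id <= {end}")
```

Each slice can then be fetched by a separate task, which is why a cheap boundary query matters when the default MIN/MAX scan over the split column is expensive.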
Q 5 - Using a higher value for the parameter sqoop.export.statements.per.transaction will
A - Always increase the export performance
B - May or may not increase the export performance
Answer : C
Explanation
When the database requires a table-level write lock, a higher value of sqoop.export.statements.per.transaction locks the table for a longer time and decreases performance.
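The trade-off can be seen with a little arithmetic. The sketch below assumes that each insert statement carries a fixed number of rows (the companion property sqoop.export.records.per.statement) and that a commit happens after a fixed number of statements; the function name and the sample figures are illustrative only.

```python
import math


def export_batching(total_rows, records_per_statement, statements_per_transaction):
    """Return (rows held per transaction, number of commits) for an export.

    Illustrative arithmetic only; it does not model database behaviour.
    """
    rows_per_transaction = records_per_statement * statements_per_transaction
    transactions = math.ceil(total_rows / rows_per_transaction)
    return rows_per_transaction, transactions


# Raising statements-per-transaction from 10 to 100 for one million rows:
print(export_batching(1_000_000, 100, 10))
print(export_batching(1_000_000, 100, 100))
```

Fewer commits usually means less overhead, but each transaction also holds more rows, so any lock it takes is held for longer.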
Q 6 - With MySQL, the feature used by sqoop to update or insert data into an exported table is
Answer : A
Explanation
The ON DUPLICATE KEY UPDATE feature of MySQL is used by sqoop to update a row if it exists and insert it otherwise.
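For readers unfamiliar with the MySQL feature, the sketch below builds an upsert statement of this general shape. It is not the exact statement Sqoop generates; the function name `mysql_upsert_sql`, the table, and the column names are hypothetical.

```python
def mysql_upsert_sql(table, columns, key_columns):
    """Build a parameterised MySQL upsert statement.

    Illustrative only: non-key columns are overwritten when a row with the
    same key already exists; otherwise a new row is inserted.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    updates = ", ".join(
        f"{col} = VALUES({col})" for col in columns if col not in key_columns
    )
    return (
        f"INSERT INTO {table} ({column_list}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


# Hypothetical table with `id` as its primary key.
print(mysql_upsert_sql("orders", ["id", "status", "total"], ["id"]))
```

On the Sqoop side this behaviour is requested on export with --update-key together with --update-mode allowinsert.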
Q 7 - The parameter to specify only a selected number of columns to be exported to a table is
Answer : A
Explanation
The --columns parameter takes a comma-separated list of column names that will be part of the export.
Q 8 - When a column value has a different data type in the HDFS system than expected in the relational table to which data will be exported −
C - Sqoop loads the remaining rows by halting and asking whether to continue the load
D - Sqoop automatically changes the data type to a compatible data type and loads the data.
Answer : B
Explanation
The job fails and sqoop writes a log showing the reason for the failure.
Q 9 - The parameter(s) used to load data using sqoop into Hive partitions is/are
A - --hive-partition-key and --hive-partition-value
Answer : A
Explanation
Both the partition key and the partition value are passed in to load data into a partitioned Hive table.
Q 10 - In both import and export scenarios, the role of ValidationThreshold is to determine if
A - the error margin between the source and target is within a range
B - the Sqoop command can handle the entire number of rows
C - the number of rows rejected by sqoop while reading the data
D - the number of rows rejected by the target database while loading the data
Answer : A
Explanation
The ValidationThreshold determines whether the error margin between the source and the target is acceptable: Absolute, Percentage Tolerant, etc. The default implementation is AbsoluteValidationThreshold, which ensures that the row counts of the source and the target are the same.
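The difference between an absolute and a percentage-tolerant threshold can be sketched in a few lines. These functions are illustrations written for this quiz, not Sqoop classes; the names and the tolerance value are assumptions.

```python
def absolute_threshold_ok(source_rows, target_rows):
    """Absolute check: the row counts must match exactly."""
    return source_rows == target_rows


def percentage_threshold_ok(source_rows, target_rows, tolerance_pct):
    """Percentage-tolerant check (hypothetical): allow a bounded relative gap."""
    if source_rows == 0:
        return target_rows == 0
    diff_pct = abs(source_rows - target_rows) / source_rows * 100
    return diff_pct <= tolerance_pct


# 1,000 rows read from the source but only 995 written to the target.
print(absolute_threshold_ok(1000, 995))
print(percentage_threshold_ok(1000, 995, tolerance_pct=1.0))
```

With the absolute check any mismatch fails validation, while a tolerant threshold accepts small differences up to the configured margin.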