The following quiz provides Multiple Choice Questions (MCQs) related to Sqoop.
Q 1 - While SequenceFile stores each record as a key-value pair, the Avro format stores records as
Sqoop generates the schema automatically when reading the data and stores the schema details along with the data in each Avro file generated.
Q 2 - The compression mechanism used by Sqoop is
Sqoop does not have any inbuilt code to carry out file compression. It relies on Hadoop's compression settings.
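Q 3 - The clause 'WHERE $CONDITIONS' in the SQL query specified to import data serves the purpose of
The $CONDITIONS placeholder in the WHERE clause is used to split the result of the SQL query into multiple chunks. Sqoop replaces it with a different range condition for each parallel task, so that each task imports only its own slice of the result.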
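The splitting idea can be illustrated with a minimal Python sketch. It is not Sqoop's own splitter: the table `cities`, the split column `id`, the helper names, and the even integer split are all assumptions made for illustration, and Sqoop's real boundary handling differs in detail.

```python
def split_ranges(lo, hi, num_tasks):
    """Divide the inclusive integer range [lo, hi] into contiguous chunks."""
    size = hi - lo + 1
    base, extra = divmod(size, num_tasks)
    ranges = []
    start = lo
    for i in range(num_tasks):
        # The first `extra` chunks take one additional value each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more tasks than values: skip empty chunks
        end = start + length - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def expand_conditions(query, split_col, lo, hi, num_tasks):
    """Return one query per task, with $CONDITIONS replaced by a range predicate."""
    return [
        query.replace("$CONDITIONS", f"{split_col} >= {a} AND {split_col} <= {b}")
        for a, b in split_ranges(lo, hi, num_tasks)
    ]


# Hypothetical free-form import query, split four ways on the id column.
queries = expand_conditions(
    "SELECT id, name FROM cities WHERE $CONDITIONS", "id", 1, 100, 4
)
for q in queries:
    print(q)
```

Each generated query covers a disjoint range of `id` values, which is why the placeholder has to appear in the query text when the import runs in parallel.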
Q 4 - Sqoop’s default behavior while inserting rows into relational tables is
The default behavior is to insert one row at a time, although it can be configured for bulk load.
Q 5 - Which of the following is a disadvantage of using the --staging-table parameter?
All the listed options are disadvantages of using the --staging-table option.
Q 6 - The --update-key parameter can take
A comma-separated list of column names which together identify a unique record can be used in the --update-key parameter.
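As a rough illustration of how a multi-column key might translate into an update statement, here is a minimal Python sketch. The function `build_update`, the table `orders`, and its columns are hypothetical, and the SQL shown is an assumption about the general shape of such a statement, not Sqoop's generated code.

```python
def build_update(table, columns, update_key):
    """Build a parameterized UPDATE keyed on a comma-separated column list."""
    keys = [k.strip() for k in update_key.split(",") if k.strip()]
    missing = [k for k in keys if k not in columns]
    if missing:
        raise ValueError(f"update key columns not in table: {missing}")
    # Key columns go into the WHERE clause; all other columns are updated.
    set_clause = ", ".join(f"{c} = ?" for c in columns if c not in keys)
    where_clause = " AND ".join(f"{k} = ?" for k in keys)
    return f"UPDATE {table} SET {set_clause} WHERE {where_clause}"


# Hypothetical table whose rows are identified by (order_id, line_no).
print(build_update("orders", ["order_id", "line_no", "qty"], "order_id,line_no"))
```

The point of the sketch is that the listed columns are combined with AND, so together they must identify exactly one row.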
Q 7 - How do we decide the order of columns in which data is loaded to the target table?
We can use the --columns parameter and specify the required columns in the required order.
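The effect of naming the columns in the order the data arrives can be sketched in Python with an in-memory SQLite table. This is only an analogy under stated assumptions: the helper `export_lines`, the table `cities`, and the sample records are hypothetical, and SQLite stands in for the target relational database.

```python
import sqlite3


def export_lines(conn, table, columns_arg, lines):
    """Insert comma-delimited records, binding fields to columns in the given order."""
    cols = [c.strip() for c in columns_arg.split(",")]
    placeholders = ", ".join("?" for _ in cols)
    sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    conn.executemany(sql, [line.split(",") for line in lines])
    return sql


conn = sqlite3.connect(":memory:")
# The table declares its columns as id, country, city ...
conn.execute("CREATE TABLE cities (id INTEGER, country TEXT, city TEXT)")
# ... but the records store their fields as city, country, id.
sql = export_lines(
    conn, "cities", "city,country,id",
    ["Palo Alto,USA,1", "Brno,Czech Republic,2"],
)
rows = conn.execute("SELECT id, country, city FROM cities ORDER BY id").fetchall()
print(sql)
print(rows)
```

Because the column list matches the field order of the records rather than the table definition, each value lands in the intended column.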
Q 8 - When a column value has a different data type in the HDFS system than expected in the relational table to which data will be exported −
The job fails and Sqoop gives a log showing the reason for the failure.
Q 9 - The data type of the column used for the partition name while importing data using Sqoop into Hive can be
Sqoop can only take string columns as partition columns while loading data into Hive.
Q 10 - In both import and export scenarios, the role of ValidationThreshold is to determine if
ValidationThreshold determines whether the error margin between the source and the target is acceptable: Absolute, Percentage Tolerant, etc. The default implementation is AbsoluteValidationThreshold, which ensures that the row counts from the source and the target are the same.
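The two threshold styles named above can be sketched in Python as simple row-count comparisons. The function names are hypothetical, and the percentage formula (difference relative to the source count) is an assumption for illustration, not a statement of how Sqoop computes it.

```python
def absolute_threshold(source_count, target_count):
    """Pass only when source and target row counts are identical."""
    return source_count == target_count


def percentage_tolerant_threshold(source_count, target_count, tolerance_pct):
    """Pass when the row-count difference is within tolerance_pct of the source."""
    if source_count == 0:
        # Avoid division by zero: an empty source must give an empty target.
        return target_count == 0
    diff_pct = abs(source_count - target_count) * 100.0 / source_count
    return diff_pct <= tolerance_pct


print(absolute_threshold(1000, 1000))
print(absolute_threshold(1000, 999))
print(percentage_tolerant_threshold(1000, 995, 1.0))
```

The absolute check corresponds to the default behavior described above, where any difference in row counts counts as a validation failure.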