This section presents various Mock Tests related to Sqoop. You can download these sample mock tests to your local machine and solve them offline at your convenience. Every mock test comes with an answer key so you can verify your final score and grade yourself.
Q 1 - Which of the following is used by sqoop to establish a connection with enterprise data warehouses?
The JDBC driver is a Java program that has traditionally provided database connectivity to a variety of databases.
Q 2 - Besides the JDBC driver, sqoop also needs which of the following to connect to remote databases?
Sqoop needs both the JDBC driver and a database connector to import data.
Q 3 - To run sqoop from multiple nodes, it has to be installed in
Once installed on one node, it automatically gets replicated to the other nodes in the cluster.
Q 4 - By default the records from databases imported to HDFS by sqoop are
The default field delimiter is the comma; records are separated by newlines.
Q 5 - To import data to Hadoop cluster from relational database sqoop create a mapreduce job. In this job
A MapReduce job executes multiple mappers, and each mapper retrieves a slice of the table's data.
Q 6 - The parameter in sqoop which specifies the output directories when importing data is
The --target-dir and --warehouse-dir parameters are used to specify the HDFS directory into which the data is imported.
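As a sketch (the JDBC URL, database name `corp`, and table `employees` are hypothetical), the two parameters are used as follows:

```shell
# Import one table into an explicit HDFS directory.
sqoop import \
  --connect jdbc:mysql://dbserver/corp \
  --username sqoopuser -P \
  --table employees \
  --target-dir /data/employees

# Alternatively, give a parent directory; Sqoop creates one
# subdirectory per table under it (here /data/warehouse/employees).
sqoop import \
  --connect jdbc:mysql://dbserver/corp \
  --username sqoopuser -P \
  --table employees \
  --warehouse-dir /data/warehouse
```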
Q 7 - If there is already a target directory with the same name as the table being imported then
To prevent accidental loss of data, the job fails if the target directory already exists.
Q 8 - To prevent the password from being mentioned in the sqoop import clause we can use the additional parameters
The -P option asks for the password from standard input without echoing it, and the --password-file option reads the password from a file.
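For illustration (connection details and file paths are hypothetical), the two alternatives look like this:

```shell
# -P: Sqoop prompts for the password on standard input; nothing is echoed.
sqoop import \
  --connect jdbc:mysql://dbserver/corp \
  --username sqoopuser -P \
  --table employees

# --password-file: read the password from a file protected by permissions.
# Create the file without a trailing newline, then lock it down.
echo -n "secret" > /home/sqoopuser/.password
chmod 400 /home/sqoopuser/.password
sqoop import \
  --connect jdbc:mysql://dbserver/corp \
  --username sqoopuser \
  --password-file file:///home/sqoopuser/.password \
  --table employees
```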
Q 9 - What are the two binary file formats supported by sqoop?
SequenceFile and Avro are the two binary file formats supported by Sqoop.
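The corresponding import flags are --as-sequencefile and --as-avrodatafile; a sketch with a hypothetical connection:

```shell
# Store imported records as Hadoop SequenceFiles (binary key-value pairs).
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --as-sequencefile

# Store imported records as Avro data files (the schema travels with the data).
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --as-avrodatafile
```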
Q 10 - While SequenceFile stores each record as a key-value pair, the Avro system stores records as
Sqoop generates the schema automatically when reading the data and stores the schema details along with the data in each Avro file generated.
Q 11 - The compression mechanism used by sqoop is
Sqoop does not have any inbuilt code to carry out file compression. It relies on Hadoop's compression settings.
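The relevant flags (shown against a hypothetical connection) simply switch on Hadoop's codecs:

```shell
# --compress enables compression using Hadoop's default codec (gzip).
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --compress

# --compression-codec selects any codec available in the Hadoop installation.
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec
```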
Q 12 - For some databases, sqoop can do faster data transfer by using the parameter
The direct mode delegates the data transfer to the native utilities provided by the database.
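For example, with a MySQL source (connection details hypothetical), --direct makes Sqoop use the database's native dump utility instead of plain JDBC reads:

```shell
# Direct mode: Sqoop shells out to mysqldump for MySQL sources,
# which is usually faster than reading rows through JDBC.
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --direct
```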
Q 13 - The data type mapping between the database column and sqoop column can be overridden by using the parameter
As Sqoop uses Java data types internally, the mapping of the database data types has to be done to Java data types.
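A sketch with hypothetical column names; --map-column-java takes a comma-separated list of column=JavaType pairs:

```shell
# Force the id column to Java Long and the salary column to Java Float,
# overriding whatever mapping Sqoop would infer from the database types.
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --map-column-java id=Long,salary=Float
```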
Q 14 - What does the num-mappers parameter serve?
The default number of map tasks Sqoop uses is 4.
This can be altered using num-mappers parameter.
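For example (connection hypothetical), to split the import across 8 parallel map tasks instead of the default 4:

```shell
# --num-mappers (short form: -m) controls the degree of parallelism.
# The table needs a suitable split column (by default the primary key).
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --num-mappers 8
```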
Q 15 - What is the default value used by sqoop when it encounters a missing value while importing from a CSV file?
Unlike databases, there are no NULL values in CSV files. Sqoop handles missing values by writing the string "null".
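The substitution string can be overridden; a sketch (connection hypothetical) using Hive's \N convention instead of the default:

```shell
# --null-string covers text columns, --null-non-string all other columns.
# Here database NULLs are written as \N instead of the default "null".
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --null-string '\\N' \
  --null-non-string '\\N'
```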
Q 16 - What option can be used to import the entire database from a relational system using sqoop?
The --import-all-tables option is used to import all the tables from the database. The structure as well as the data of each table is imported, one table at a time, by this command.
Q 17 - What option can be used to import only some of the tables from a database while using the --import-all-tables parameter?
You can list table names with the --exclude-tables clause to skip the given tables while importing an entire database.
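A sketch (database and table names hypothetical):

```shell
# Import every table in the corp database except the two listed ones.
sqoop import-all-tables \
  --connect jdbc:mysql://dbserver/corp \
  --username sqoopuser -P \
  --warehouse-dir /data/corp \
  --exclude-tables audit_log,tmp_staging
```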
Q 18 - Sqoop supports
You can do both full and partial data import from tables but not a subset of columns from a table.
Q 19 - What are the two different incremental modes of importing data into sqoop?
The --incremental parameter is used to fetch only the new data (rows that do not already exist in Hadoop). In append mode, a check column is specified and only rows with a value greater than the last imported one are fetched. It can also use the lastmodified mode, which uses a timestamp column such as last_updated_date in the source table to identify new and updated rows.
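Sketches of both modes (connection and column names hypothetical):

```shell
# Append mode: import rows whose id is greater than the last imported value.
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --incremental append \
  --check-column id \
  --last-value 1000

# Lastmodified mode: import rows updated after the given timestamp.
sqoop import --connect jdbc:mysql://dbserver/corp --table employees \
  --incremental lastmodified \
  --check-column last_updated_date \
  --last-value "2019-01-01 00:00:00"
```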
Q 20 - What does the --last-value parameter in sqoop incremental import signify?
Sqoop uses the --last-value parameter in both the append mode and the lastmodified mode to import only the incremental data from the source.
Q 21 - The --options-file parameter is used to
The command-line options (the names and values of the parameters) that do not change from run to run can be saved into a file and reused. This file is called an options file.
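For example, the shared connection options can be kept in a file (file name and contents hypothetical; each option and each value goes on its own line):

```shell
# Contents of conn.txt:
#   import
#   --connect
#   jdbc:mysql://dbserver/corp
#   --username
#   sqoopuser

# The saved options are expanded in place; the rest is given on the command line.
sqoop --options-file conn.txt --table employees -P
```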
Q 22 - While specifying the connect string in the sqoop import command for a Hadoop cluster, if we specify localhost in place of a server address (hostname or IP address) in the URI, then
Specifying localhost does not invalidate the command, as a local database may be running on each node, in which case the node will be able to connect. So each node may end up connecting to a different database, if one is available locally.
Q 23 - What is the disadvantage of storing password in the metastore as compared to storing in a password file?
The password file can be encrypted and protected from reading with proper permissions, but the metastore is unencrypted and its contents cannot be protected from reading.
Q 24 - What is the advantage of storing password in a metastore as compared to storing in password in a file?
The main advantage of using the metastore is that the saved job can be used by any user having access to the environment without knowing the password.
Q 25 - The argument in a saved sqoop job can be altered at run time by using the option
For a saved job named 'job1', the --table parameter can be altered at run time using the command below.
sqoop job --exec job1 -- --table newtable