This section presents a set of mock tests related to Hive. You can download these sample mock tests to your local machine and solve them offline at your convenience. Every mock test is supplied with a mock test key so that you can verify the final score and grade yourself.
Q 1 - In case of one large table and 2 small tables, for an optimized query performance
When the small tables are cached in memory, each row from the large table can be compared efficiently with the rows of the small tables, without a separate reduce step for the join.
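A minimal sketch of such a join, assuming hypothetical tables `orders` (large), `customers` and `regions` (both small); all table and column names are illustrative only. With `hive.auto.convert.join` enabled, Hive may load the small tables into memory and perform the join on the map side.

```sql
-- Hypothetical tables: orders is large; customers and regions are small.
-- Allow Hive to convert the join into a map-side join when the
-- small tables fit in memory.
SET hive.auto.convert.join=true;

SELECT o.order_id, c.name, r.region_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN regions r ON c.region_id = r.region_id;
```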
Q 2 - The DISTRIBUTE BY clause in Hive
DISTRIBUTE BY comes before SORT BY. Sorting as the last clause is efficient because it is also the last step in the reduce job that produces the output.
Q 3 - The DISTRIBUTE BY clause is used to ensure that
The DISTRIBUTE BY clause sends all rows that share the same value of the distribution column to the same reducer.
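The clause order and the routing of rows can be sketched as follows, assuming a hypothetical table `stocks(symbol STRING, ymd STRING, price_close FLOAT)`; the names are illustrative only.

```sql
-- Hypothetical table: stocks(symbol STRING, ymd STRING, price_close FLOAT).
-- DISTRIBUTE BY sends every row with the same symbol to the same reducer;
-- SORT BY, written after it, then orders the rows within each reducer.
SELECT s.ymd, s.symbol, s.price_close
FROM stocks s
DISTRIBUTE BY s.symbol
SORT BY s.symbol ASC, s.ymd ASC;
```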
Q 4 - A view in Hive can be seen by using
Views are listed together with tables by SHOW TABLES. Hive 2.2.0 and later also provide a separate SHOW VIEWS command.
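A short sketch of how a view appears in the listing, assuming a hypothetical view `purchases_v` defined over the `history` table used elsewhere in this test:

```sql
-- purchases_v is a hypothetical view name, for illustration only.
CREATE VIEW purchases_v AS
SELECT * FROM history WHERE action = 'purchased';

-- The view is listed alongside ordinary tables.
SHOW TABLES;

-- DESCRIBE FORMATTED reports the table type as a virtual view.
DESCRIBE FORMATTED purchases_v;
```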
Q 5 - A view in Hive can be dropped by using
DROP VIEW drops the view.
Q 6 - The name of a view in Hive
Views and tables are treated alike in the Hive metadata, so a view name must be unique among the tables and views of the same database.
Q 7 - The query
CREATE TABLE table_name LIKE view_name
A table can be created from a view. The new table takes its column definitions from the view and contains no data.
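As an illustration, assuming a view named `purchases_v` already exists (both names below are hypothetical):

```sql
-- Assumes a view named purchases_v already exists (hypothetical name).
-- Creates an empty table whose columns match the view's columns.
CREATE TABLE purchases_copy LIKE purchases_v;

-- Dropping the view afterwards does not affect the new table.
DROP VIEW IF EXISTS purchases_v;
```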
Q 8 - What can be altered about a view
TBLPROPERTIES stores descriptive metadata about the view, such as its creation date and time.
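A minimal sketch, assuming a hypothetical view `purchases_v` and an illustrative property key and value:

```sql
-- purchases_v, the key and the value are hypothetical.
ALTER VIEW purchases_v SET TBLPROPERTIES ('created_at' = 'some_timestamp');

-- List the properties currently stored for the view.
SHOW TBLPROPERTIES purchases_v;
```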
Q 9 - Which kinds of keys (constraints) can Hive have?
Hive is schema-on-read and, unlike an RDBMS, has no way to enforce the existence of keys.
Q 10 - An index in Hive can be seen by
Similar to SHOW TABLES, the indexes defined on a table can be listed with SHOW INDEX ON table_name.
Q 11 - If an index is dropped then
Dropping an index with DROP INDEX removes the index and its index table; the table on which the index was created is left intact. Dropping that table also drops its indexes.
Q 12 - Indexes can be created
Because external table data is managed by other applications, Hive does not create indexes on it.
Q 13 - The clause "WITH DEFERRED REBUILD" while creating an index
It creates the index empty. The index is populated later with ALTER INDEX ... REBUILD.
Q 14 - If the data in the table on which an index is defined changes then,
Unlike an RDBMS, Hive does not maintain the index automatically, so it has to be rebuilt manually.
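The index life cycle covered by the questions above can be sketched as follows. This assumes a Hive release that still supports indexes (they were removed in Hive 3.0) and a hypothetical table `employees` with a `country` column; the index name is also illustrative.

```sql
-- Hypothetical table employees(..., country STRING) and index name.
-- WITH DEFERRED REBUILD creates the index empty.
CREATE INDEX employees_country_idx
ON TABLE employees (country)
AS 'COMPACT'
WITH DEFERRED REBUILD;

-- Populate the index now, and run this again whenever the table data changes.
ALTER INDEX employees_country_idx ON employees REBUILD;

-- List the indexes defined on the table.
SHOW FORMATTED INDEX ON employees;

-- Drop the index; the employees table itself is not dropped.
DROP INDEX IF EXISTS employees_country_idx ON employees;
```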
Q 15 - The identifiers in HiveQL are
HiveQL identifiers are case-insensitive.
Q 16 - What is the disadvantage of using too many partitions in Hive tables?
Too many partitions create too many files and too much metadata for the NameNode to store.
Q 17 - When importing data using a SerDe, if a row is found to have more columns than expected then
Hive is schema-on-read and does not throw an error for a mismatch between the schema and the actual data; the extra columns are ignored.
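A small sketch of this behavior with the default delimited text format. The table name, file path and file content are assumptions made for illustration.

```sql
-- Hypothetical table that declares only two columns.
CREATE TABLE people (name STRING, city STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Assumed content of the hypothetical file, one line with three fields:
--   alice,paris,extra_field
LOAD DATA LOCAL INPATH '/tmp/people.csv' INTO TABLE people;

-- The load succeeds without a schema check; when the row is read,
-- only the first two fields are mapped to name and city.
SELECT name, city FROM people;
```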
Q 18 - Consider the two sets of queries below.
Query A:
hive> INSERT OVERWRITE TABLE sales SELECT * FROM history WHERE action = 'purchased';
hive> INSERT OVERWRITE TABLE credits SELECT * FROM history WHERE action = 'returned';
Query B:
hive> FROM history
    > INSERT OVERWRITE TABLE sales SELECT * WHERE action = 'purchased'
    > INSERT OVERWRITE TABLE credits SELECT * WHERE action = 'returned';
Which of them will make a single pass through the history table?
In Query B, the history table is scanned only once, and both INSERT clauses are fed from that single scan.
Q 19 - Which of the following features is used to analyze the query execution plan
EXPLAIN is used to analyze the query execution plan.
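For example, using the `history` table from the earlier question (its `action` column is taken from that question; the aggregate query itself is only an illustration):

```sql
-- Print the execution plan (stages and operators) without running the query.
EXPLAIN
SELECT action, COUNT(*)
FROM history
GROUP BY action;

-- EXPLAIN EXTENDED adds further detail to the plan output.
EXPLAIN EXTENDED
SELECT action, COUNT(*)
FROM history
GROUP BY action;
```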
Q 20 - The LIMIT clause applied to a select query
The query is run on the complete data set, and the LIMIT clause then restricts the results that are returned.
Q 21 - The default limit on the number of rows returned by a query can be set using which of the following parameters?
This parameter is configured to change the default number of rows returned.
Q 22 - The property that decides the maximum number of files that can be sampled during the use of the LIMIT clause is
This property decides the number of files to be looked into for the sample result.
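Q 23 - Which of the following hints is used to optimize join queries
The STREAMTABLE hint names the table that Hive streams through the join while the other tables are buffered in memory. Streaming the largest table keeps the buffered tables small, which makes the query faster.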
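A minimal sketch of the streaming hint, assuming hypothetical tables `orders` (large) and `customers` (small); the names are illustrative only.

```sql
-- Hypothetical tables: orders is large, customers is small.
-- The hint asks Hive to stream orders (alias o) through the join
-- and to buffer the rows of customers instead.
SELECT /*+ STREAMTABLE(o) */ o.order_id, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;
```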
Q 24 - Setting the local mode execution to true causes
Local mode runs the whole job on a single machine instead of submitting it to the cluster, which avoids the job start-up overhead for small inputs.
Q 25 - Hive can automatically decide to run in local mode by setting which of the following parameters in hive-site.xml?
This parameter lets Hive choose local mode automatically.
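As an illustration only, and assuming the setting meant here is the documented Hive property `hive.exec.mode.local.auto`, it can be tried for a single session before being placed in hive-site.xml:

```sql
-- Assumption: hive.exec.mode.local.auto is the property in question.
-- When true, Hive decides per query whether the input is small enough
-- to run locally instead of on the cluster.
SET hive.exec.mode.local.auto=true;

-- Print the current value to confirm the setting.
SET hive.exec.mode.local.auto;
```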