
Apache Drill - Querying Data using Hive
Hive is a data warehouse infrastructure tool for processing structured data in Hadoop. It sits on top of Hadoop to summarize Big Data and makes querying and analysis easy. Hive stores table schemas in a metastore database and the processed data in HDFS.
How to Query Hive Data in Apache Drill?
Follow these steps to query Hive data in Apache Drill.
Step 1: Prerequisites
Install the following components first −
- Java 1.7 or later
- Hadoop
- Hive
- ZooKeeper
Step 2: Start Hadoop, ZooKeeper and Hive
After installation, start each service (Hadoop, ZooKeeper, and Hive) in its own terminal.
Step 3: Start Hive metastore
You can start the Hive metastore using the following command −
Command
hive --service metastore
Apache Drill uses the Hive metastore service to retrieve Hive table details.
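Drill can read Hive table metadata only once the metastore is accepting connections, so it can help to confirm that before continuing. The following is a minimal sketch, assuming the metastore listens on port 9083 of localhost (the address used in the storage plugin configuration in Step 5); the helper name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the metastore runs locally on its usual Thrift port, 9083.
    if is_port_open("localhost", 9083):
        print("Hive metastore is reachable")
    else:
        print("Hive metastore is not reachable yet")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the listener is actually a Hive metastore.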
Step 4: Start Apache Drill in Distributed Mode
To run Drill in distributed mode, start the Drillbit daemon with the following command −
Command
bin/drillbit.sh start
Step 5: Enable Storage Plugin
As with HBase, open the Apache Drill web console, enable the Hive storage plugin, and then use the plugin's "update" option to set its configuration as follows.
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://localhost:9083",
    "hive.metastore.sasl.enabled": "false",
    "fs.default.name": "hdfs://localhost/"
  }
}
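The same configuration can also be submitted without the web console. The sketch below builds the plugin definition and, optionally, posts it to Drill's REST API. It assumes the Drill web server listens on its default port 8047 and accepts storage plugin definitions at `/storage/{name}.json`; the function names are hypothetical, and the plugin is marked enabled so that it can be queried.

```python
import json
import urllib.request

# Assumption: Drill's web console and REST API use the default port 8047.
DRILL_URL = "http://localhost:8047"


def build_hive_plugin(metastore_uri="thrift://localhost:9083",
                      default_fs="hdfs://localhost/"):
    """Return the request body that describes the Hive storage plugin."""
    return {
        "name": "hive",
        "config": {
            "type": "hive",
            "enabled": True,  # the plugin must be enabled to be queryable
            "configProps": {
                "hive.metastore.uris": metastore_uri,
                "hive.metastore.sasl.enabled": "false",
                "fs.default.name": default_fs,
            },
        },
    }


def register_plugin(plugin, base_url=DRILL_URL):
    """POST the plugin definition to Drill; requires a running Drillbit."""
    request = urllib.request.Request(
        f"{base_url}/storage/{plugin['name']}.json",
        data=json.dumps(plugin).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only print the payload here; call register_plugin(build_hive_plugin())
    # once the Drillbit from Step 4 is running.
    print(json.dumps(build_hive_plugin(), indent=2))
```

Keeping the payload builder separate from the network call makes it easy to inspect the configuration before sending it.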
Step 6: Create a Table
Create a table in the Hive shell using the following command.
Query
create table customers (Name string, address string) row format delimited fields terminated by ',' stored as textfile;
Step 7: Load Data
Load data from a local file into the table using the following command in the Hive shell.
Query
load data local inpath '/path/to/file/customers.csv' overwrite into table customers;
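The load command expects a comma-delimited text file with one row per line and two fields per row. As an illustration, the sketch below writes such a file; the rows mirror the names and addresses shown in the Step 8 result, while the helper name and the output location are hypothetical. Because a table declared with `fields terminated by ','` splits on every comma and does not interpret CSV quoting, the helper rejects fields that contain a comma.

```python
from pathlib import Path

# Sample rows for the two string columns (Name, address).
CUSTOMERS = [
    ("Alice", "123 Ballmer Av"),
    ("Bob", "1 Infinite Loop"),
    ("Frank", "435 Walker Ct"),
    ("Mary", "56 Southern Pkwy"),
]


def write_customers_csv(path, rows=CUSTOMERS):
    """Write rows as plain comma-delimited lines and return the row count."""
    lines = []
    for row in rows:
        for field in row:
            # A comma or newline inside a field would shift or split columns.
            if "," in field or "\n" in field:
                raise ValueError(f"field not safe for delimited text: {field!r}")
        lines.append(",".join(row))
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


if __name__ == "__main__":
    target = Path("customers.csv")  # hypothetical location
    count = write_customers_csv(target)
    print(f"wrote {count} rows to {target.resolve()}")
```

Pass the resulting file's absolute path to `load data local inpath`.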
Step 8: Query Data in Drill
You can now query the Hive table from the Drill shell using the following command.
Query
select * from hive.`customers`;
Result
'Alice','123 Ballmer Av'
'Bob','1 Infinite Loop'
'Frank','435 Walker Ct'
'Mary','56 Southern Pkwy'
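The same query can be issued from a program instead of the Drill shell. The following is a minimal sketch that uses Drill's REST API, assuming the web server is on its default port 8047 and that the response is a JSON object with `columns` and `rows` keys; the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: Drill's REST API is reachable on the default port 8047.
DRILL_URL = "http://localhost:8047"


def build_query(sql):
    """Return the JSON body for a SQL query request."""
    return {"queryType": "SQL", "query": sql}


def run_query(sql, base_url=DRILL_URL):
    """Submit the query to Drill and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/query.json",
        data=json.dumps(build_query(sql)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def format_rows(result):
    """Return the rows as lists of values, ordered by the response's columns."""
    columns = result.get("columns", [])
    return [[row.get(col) for col in columns] for row in result.get("rows", [])]


if __name__ == "__main__":
    # Only show the request body here; with Drill running, use
    # format_rows(run_query("select * from hive.`customers`")) instead.
    print(json.dumps(build_query("select * from hive.`customers`"), indent=2))
```

Ordering values by the `columns` list keeps the output stable even though each row arrives as a JSON object.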