
- Apache Drill Tutorial
- Apache Drill - Home
- Apache Drill - Introduction
- Apache Drill - Fundamentals
- Apache Drill - Architecture
- Apache Drill - Installation
- Apache Drill - SQL Operations
- Apache Drill - Query using JSON
- Window Functions using JSON
- Querying Complex Data
- Data Definition Statements
- Apache Drill - Querying Data
- Querying Data using HBase
- Querying Data using Hive
- Apache Drill - Querying Parquet Files
- Apache Drill - JDBC Interface
- Apache Drill - Custom Function
- Apache Drill - Contributors
- Apache Drill Useful Resources
- Apache Drill - Quick Guide
- Apache Drill - Useful Resources
- Apache Drill - Discussion
Apache Drill - SQL Operations
Apache Drill is an open-source SQL-On-Everything engine. It allows SQL queries to be executed on any kind of data source, ranging from a simple CSV file to an advanced SQL and NoSQL database servers.
To execute a query in a Drill shell, open your terminal move to the Drill installed directory and then type the following command.
$ bin/drill-embedded
Then you will see the response as shown in the following program.
0: jdbc:drill:zk = local>
Now you can execute your queries. Otherwise you can run your queries through web console application to the url of http://localhost:8047. If you need any help info type the following command.
$ 0: jdbc:drill:zk = local> !help
Primitive Data Types
Apache Drill supports the following list of data types.
Sr.No | Datatype & Description |
---|---|
1 | BIGINT 8-byte signed integer in the range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
2 | BINARY Variable-length byte string |
3 | BOOLEAN True or false |
4 | DATE Years, months, and days in YYYY-MM-DD format since 4713 BC |
5 | DECIMAL(p,s), DEC(p,s), or NUMERIC(p,s)* 38-digit precision number, precision is p, and scale is s |
6 | INTEGER or INT 4-byte signed integer in the range -2,147,483,648 to 2,147,483,647 |
7 | INTERVAL A day-time or year-month interval |
8 | SMALLINT 2-byte signed integer in the range -32,768 to 32,767 |
9 | FLOAT 4-byte floating point number |
10 | DOUBLE 8-byte floating point number |
11 | TIME 24-hour based time in hours, minutes, seconds format: HH:mm:ss |
12 | TIMESTAMP JDBC timestamp in year, month, date hour, minute, second, and optional milliseconds format: yyyy-MM-dd HH:mm:ss.SSS |
13 | CHARACTER VARYING, CHAR or VARCHAR variable-length string. The default limit is 1 character. The maximum character limit is 2,147,483,647. |
Let us continue with simple examples on the data types.
Date, Time and Timestamp
Apache Drill supports time functions in the range from 1971 to 2037. The processing logic of data types can be easily tested by “VALUES()” statement. The following query returns date, time and timestamp for the given values.
Query
0: jdbc:drill:zk = local> select DATE '2016-04-07', TIME '12:12:23',TIMESTAMP '2016-04-07 12:12:23' from (values(1));
Result
+-------------+-----------+------------------------+ | EXPR$0 | EXPR$1 | EXPR$2 | +-------------+-----------+------------------------+ | 2016-04-07 | 12:12:23 | 2016-04-07 12:12:23.0 | +-------------+-----------+------------------------+
Interval
- The INTERVALYEAR and INTERVALDAY internal types represent a period of time.
- The INTERVALYEAR type specifies values from a year to a month.
- The INTERVALDAY type specifies values from a day to seconds.
INTERVALYEAR Query
0: jdbc:drill:zk = local> select timestamp '2016-04-07 12:45:50' + interval '10' year(2) from (values(1));
Result
+------------------------+ | EXPR$0 | +------------------------+ | 2026-04-07 12:45:50.0 | +———————————————————————-+ 1 row selected (0.251 seconds)
In the above query, INTERVAL keyword followed by 10 adds 10 years to the timestamp. The 2 in parentheses in YEAR(2) specifies the precision of the year interval, 2 digits in this case to support the ten interval.
INTERVALDAY Query
0: jdbc:drill:zk = local> select timestamp '2016-04-07 12:45:52' + interval '1' day(1) from (values(1));
Result
+------------------------+ | EXPR$0 | +------------------------+ | 2016-04-08 12:45:52.0 | +———————————————————————-+
Here INTERVAL ‘1’ indicates that two days will be added from that specified day.
Operators
The following operators are used in Apache Drill to perform the desired operations.
Sr.No | Operators & Description |
---|---|
1 | Logical Operators AND, BETWEEN, IN , LIKE , NOT , OR |
2 | Comparison Operators <, > , <= , >= , = , <> , IS NULL , IS NOT NULL , IS FALSE , IS NOT FALSE , IS TRUE , IS NOT TRUE, Pattern Matching Operator - LIKE |
3 | Math Operators +,-,*,/ |
4 | Subquery Operators EXISTS, IN |
Drill Scalar Functions
Apache Drill scalar functions supports Math and Trig functions. Most scalar functions use data types such as INTEGER, BIGINT, FLOAT and DOUBLE.
Math Functions
The following table describes the list of “Math functions” in Apache Drill.
Sr.No | Function & Description |
---|---|
1 |
Returns the absolute value of the input argument x. |
2 |
Returns the cubic root of x. |
3 |
Returns the smallest integer not less than x. |
4 |
Same as CEIL. |
5 |
Converts x radians to degrees. |
6 |
Returns e (Euler's number) to the power of x. |
7 |
Returns the largest integer not greater than x. |
8 |
Returns log base x to the y power. |
Now let’s run queries for the scalar functions. The Drill scalar functions can be easily tested by the values() statement, otherwise you can also use the select statement.
Trig Functions
Apache Drill supports the following trig functions and these functions’ return type is a floating point value.
Sr.No | Function & Description |
---|---|
1 |
Sine of angle x in radians |
2 |
Inverse cosine of angle x in radians |
3 |
Hyperbolic cosine of hyperbolic angle x in radians |
Data Type Conversion
In Apache Drill, you can cast or convert data to the required type for moving data from one data source to another. Drill also supports the following functions for casting and converting data types −
Function | Return type | Description |
---|---|---|
Data type of y | Converts the data type of x to y | |
Data type of y | Converts binary data (x) to Drill internal types (y) based on the little or big endian encoding of the data. | |
Data type of y | Converts binary data (x) from Drill internal types (y) based on the little or big endian encoding of the data. |
Date - Time Functions
Apache Drill supports time functions based on the Gregorian calendar and in the range from 1971 to 2037. The following table describes the list of Date/Time functions.
Function | Return Type | Description |
---|---|---|
INTERVALDAY or INTERVALYEAR |
Returns interval between two timestamps or subtracts a timestamp from midnight of the current date. | |
DATE | Returns current date | |
TIME | Returns current time | |
TIMESTAMP | Returns current timestamp | |
DATE, TIMESTAMP | Subtracts an interval (y) from a date or timestamp expression (x). | |
DOUBLE | Extracts a time unit from a date or timestamp expression (y). This must be one of the following values: SECOND, MINUTE, HOUR, DAY, MONTH, and YEAR. |
|
TIME | Returns the local current time. | |
TIMESTAMP | Returns the local current timestamp. | |
TIMESTAMP | Returns current timestamp | |
VARCHAR | Returns current timestamp for UTC time zone. |
String Manipulation Function
Apache Drill supports the following list of string functions.
Function | Return type | Description |
---|---|---|
BINARY or VARCHAR | Returns in binary format a substring y of the string x. | |
INTEGER | Returns the length of the alphanumeric argument x. | |
VARCHAR | Combines the two alphanumeric values x and y. Has the same effect as the || operator. | |
VARCHAR | Returns x in which the first character is capitalized. | |
INTEGER | Returns the length in bytes of the alphanumeric value x. | |
VARCHAR | Converts all upper-case letters of x to lower-case letters. | |
VARCHAR | The value of x is filled in the front (the left-hand side) with the value of z until the total length of the value is equal y’s length. If no z value then blanks are used to fill the position. |
|
VARCHAR | Removes all blanks that appear at the beginning of x. | |
INTEGER | Returns the start position of the string x in the string y. | |
VARCHAR | Substitutes new text for substrings that match Java regular expression patterns. In the string x, y is replaced by z. Y is the regular expression. | |
VARCHAR | The value of x is filled in the front (the right-hand side) with the value of z just until the total length of the value is equal to that of y. | |
VARCHAR | Removes all blanks from the end of the value of x. | |
INTEGER | Returns the start position of the string y in the string x. | |
VARCHAR | Extracts characters from position 1 - x of x an optional y times. | |
VARCHAR | Converts all lower-case letters of x to upper-case letters. |
Null Handling Function
Apache Drill supports the following list of null handling functions.
Function | Return type | Description |
---|---|---|
Data type of y | Returns the first non-null argument in the list of y’s. | |
Data type of y | Returns the value of the x if x and y are not equal, and returns a null value if x and y are equal. |
The following table describes the list of “Math functions” in Apache Drill.
Sr.No | Function & Description |
---|---|
1 | E() Returns 2.718281828459045. |
2 | LOG(x) Returns the natural log (base e) of x. |
3 | LOG10(x) Returns the common log of x. |
4 | LSHIFT(x, y) Shifts the binary x by y times to the left. |
5 | MOD(x, y) Returns the remainder of x divided by y. |
6 | NEGATIVE(x) Returns x as a negative number. |
7 | PI Returns pi. |
8 | POW(x, y) Returns the value of x to the y power. |
9 | RADIANS Converts x degrees to radians. |
10 | RAND Returns a random number from 0-1. |
11 | ROUND(x) Rounds to the nearest integer. |
12 | ROUND(x, y) Rounds x to y decimal places. Return type is decimal. |
13 | RSHIFT(x, y) Shifts the binary x by y times to the right. |
14 | SIGN(x) Returns the sign of x. |
15 | SQRT(x) Returns the square root of x. |
16 | TRUNC(x) |
Apache Drill supports the following trig functions and these functions’ return type is a floating point value.
Sr.No. | Function & Description |
---|---|
1 | COS(x) Cosine of angle x in radians |
2 | TAN(x) Tangent of angle x in radians |
3 | ASIN(x) Inverse sine of angle x in radians |
4 | ATAN(x) Inverse tangent of angle x in radians |
5 | SINH() Hyperbolic sine of hyperbolic angle x in radians |
6 | TANH() Hyperbolic tangent of hyperbolic angle x in radians |
The following table describes the list of Date/Time functions.
Function | Return type | Description |
---|---|---|
DATE_ADD(x,y) |
DATE, TIMESTAMP | Returns the sum of the sum of a date/time and a number of days/hours, or of a date/time and date/time interval. Where, x - date, time or timestamp y - integer or an interval expression. |
DATE_PART(x,y) |
DOUBLE | Returns a field of a date, time, timestamp, or interval. where, x - year, month, day, hour, minute, or second y - date, time, timestamp, or interval literal |
UNIX_TIMESTAMP ( [ x] ) |
BIGINT | If x is specified as timestamp then the number of seconds since the UNIX epoch and the timestamp x is returned. If x is not specified then it returns the number of seconds since the UNIX epoch (January 1, 1970 at 00:00:00). |