There are four mechanisms for interfacing with a particular database system −
The first method is to access the database through a set of routines in an API (Application Programming Interface). Here the DBMS is bundled as a set of query and maintenance utilities. These utilities communicate with the running database through a shared library, which in turn is exposed to the user as a set of API routines.
The second method goes through an intermediate abstraction layer. This layer communicates with the database API via a driver. Examples of such drivers are ODBC, JDBC, and Database Interface (DBI).
The third approach is to use a Python module written for a specific database system. The PyCall package is used to call routines in the Python module, and it also handles the conversion of data types between Python and Julia.
The fourth method is to send messages to the database. REST over HTTP is the most common style of messaging interface.
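As a minimal sketch of the third mechanism, the snippet below drives Python's built-in sqlite3 module from Julia through PyCall. It assumes that PyCall.jl is installed and linked to a working Python; the table name demo and its columns are made up for illustration.

```julia
using PyCall

# Import Python's standard-library SQLite driver through PyCall.
sqlite3 = pyimport("sqlite3")

# ":memory:" asks SQLite for a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, name TEXT)")   # hypothetical table
conn.execute("INSERT INTO demo VALUES (?, ?)", (1, "Ada"))

# fetchall() returns a Python list of tuples; PyCall converts it to Julia values.
rows = conn.execute("SELECT id, name FROM demo").fetchall()
println(rows)

conn.close()
```

The same pattern should carry over to other Python database modules, since PyCall converts arguments and return values in both directions.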
Julia provides several APIs to communicate with various database providers.
MySQL.jl is the package used to access MySQL from the Julia programming language.
Use the following code to install the master version of MySQL API −
To access the MySQL API, we first need to connect to the MySQL server, which can be done with the following code −
using MySQL
con = mysql_connect(HOST, USER, PASSWD, DBNAME)
To work with the database, use the following code snippet to create a table −
command = """CREATE TABLE Employee
             (
                 ID INT NOT NULL AUTO_INCREMENT,
                 Name VARCHAR(255),
                 Salary FLOAT,
                 JoinDate DATE,
                 LastLogin DATETIME,
                 LunchTime TIME,
                 PRIMARY KEY (ID)
             );"""
response = mysql_query(con, command)
if (response == 0)
    println("Create table succeeded.")
else
    println("Create table failed.")
end
We can use the following command to obtain the SELECT query result as a DataFrame −
command = """SELECT * FROM Employee;"""
dframe = mysql_execute_query(con, command)
We can use the following command to obtain the SELECT query result as a Julia Array −
command = """SELECT * FROM Employee;"""
retarr = mysql_execute_query(con, command, opformat=MYSQL_ARRAY)
We can use the following command to obtain the SELECT query result as a Julia Array with each row as a tuple −
command = """SELECT * FROM Employee;"""
retarr = mysql_execute_query(con, command, opformat=MYSQL_TUPLES)
We can execute a multi-statement query as follows −
command = """INSERT INTO Employee (Name) VALUES ('');
             UPDATE Employee SET LunchTime = '15:00:00' WHERE LENGTH(Name) > 5;"""
data = mysql_execute_query(con, command)
We can get DataFrames by using prepared statements as follows −
command = """SELECT * FROM Employee;"""

stmt = mysql_stmt_init(con)
if (stmt == C_NULL)
    error("Error in initialization of statement.")
end

response = mysql_stmt_prepare(stmt, command)
mysql_display_error(con, response != 0,
                    "Error occurred while preparing statement for query \"$command\"")

dframe = mysql_stmt_result_to_dataframe(stmt)
mysql_stmt_close(stmt)
Use the following command to close the connection −
mysql_disconnect(con)
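The mysql_* function names used in this section come from an early MySQL.jl release. To our understanding, newer releases expose the generic DBInterface.jl API instead, so the following is a hedged sketch of the same connect, query, and close cycle in that style. The helper name fetch_employees and all connection values are hypothetical, and MySQL.jl, DBInterface.jl, and Tables.jl are assumed to be installed.

```julia
using MySQL        # provides MySQL.Connection for DBInterface
using DBInterface
using Tables

# Hypothetical helper: run a SELECT and return the result as a column table
# (a NamedTuple of vectors). The connection details are placeholders.
function fetch_employees(host::String, user::String, password::String, dbname::String)
    conn = DBInterface.connect(MySQL.Connection, host, user, password; db = dbname)
    try
        cursor = DBInterface.execute(conn, "SELECT * FROM Employee")
        # Materialize the rows before the connection is closed.
        return Tables.columntable(cursor)
    finally
        DBInterface.close!(conn)   # always release the connection
    end
end

# Example call (requires a reachable MySQL server):
# employees = fetch_employees("127.0.0.1", "root", "secret", "test")
```

Wrapping the work in try/finally means the connection is released even when the query throws.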
JDBC.jl is a Julia interface to Java database drivers. The package lets us use Java JDBC drivers to access databases from within the Julia programming language.
To start working with it, we first need to add the database driver jar file to the classpath and then initialize the JVM as follows −
using JDBC
JavaCall.addClassPath("path of .jar file") # add the path of your .jar file
JDBC.init()
The JDBC API in Julia is similar to the Java JDBC API. To connect to a database, we follow similar steps, as shown below −
conn = DriverManager.getConnection("jdbc:derby:test/juliatest")
stmt = createStatement(conn)
rs = executeQuery(stmt, "select * from mytable")
for r in rs
    println(getInt(r, 1), getString(r,"NAME"))
end
If you want to get each row as a Julia tuple, use JDBCRowIterator to iterate over the result set. Note that if the values are declared as nullable in the database, they will be nullable in the tuples as well.
for r in JDBCRowIterator(rs)
    println(r)
end
Use PreparedStatement to perform inserts and updates. It has setter functions for different types, corresponding to the getter functions −
ppstmt = prepareStatement(conn, "insert into mytable values (?, ?)")
setInt(ppstmt, 1, 10)
setString(ppstmt, 2, "TEN")
executeUpdate(ppstmt)
Use CallableStatement to run a stored procedure −
cstmt = JDBC.prepareCall(conn, "CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(?, ?)")
setString(cstmt, 1, "derby.locks.deadlockTimeout")
setString(cstmt, 2, "10")
execute(cstmt)
To get an array of (column_name, column_type) tuples, we need to pass the JResultSet object returned by executeQuery to getTableMetaData as follows −
conn = DriverManager.getConnection("jdbc:derby:test/juliatest")
stmt = createStatement(conn)
rs = executeQuery(stmt, "select * from mytable")
metadata = getTableMetaData(rs)
Use the following command to close the connection −
close(conn)
To execute a query, we first need a cursor. Once a cursor has been obtained, we can run execute! on it as follows −
csr = cursor(conn)
execute!(csr, "insert into ptable (pvalue) values (3.14);")
execute!(csr, "select * from gltable;")
We need to call rows on the cursor to iterate over the rows −
rs = rows(csr)
for row in rs
    # process each row here
end
Use the following command to close the cursor −
close(csr)
ODBC.jl is a package that provides a Julia interface to the ODBC API, which is implemented by various ODBC driver managers. We can install it as follows −
using Pkg
Pkg.add("ODBC")
Use the command below to install an ODBC driver −
ODBC.adddriver("name of driver", "full, absolute path to driver shared library"; kw...)
We need to pass −
The name of the driver
The full and absolute path to the driver shared library
And any additional keyword arguments which will be included as KEY=VALUE pairs in the .ini config files.
After installing the drivers, we can enable connections in either of the following ways −
Set up a DSN via ODBC.adddsn("dsn name", "driver name"; kw...)
Connect directly by using a full connection string, like ODBC.Connection(connection_string)
There are two ways to execute queries −
DBInterface.execute(conn, sql, params) − It executes a SQL query directly and returns a Cursor for any result set.
stmt = DBInterface.prepare(conn, sql); DBInterface.execute(stmt, params) − It first prepares a SQL statement and then executes it. The prepared statement can be executed multiple times with different parameters.
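The two execution paths above can be sketched as follows. This is only an illustration under stated assumptions: ODBC.jl and DBInterface.jl are installed, a driver is already configured, and the helper name insert_values, the table ptable, and the connection string are placeholders. The SQL type name may also need adjusting for a particular DBMS.

```julia
using ODBC
using DBInterface

# Hypothetical helper: insert several rows through one prepared statement.
# `connection_string` is whatever the installed driver expects,
# e.g. "DSN=mydsn;UID=user;PWD=secret" (placeholder values).
function insert_values(connection_string::String, vals::Vector{Float64})
    conn = ODBC.Connection(connection_string)
    try
        # Path 1: execute a statement directly.
        DBInterface.execute(conn, "CREATE TABLE ptable (pvalue DOUBLE PRECISION)")

        # Path 2: prepare once, then execute once per parameter set.
        stmt = DBInterface.prepare(conn, "INSERT INTO ptable (pvalue) VALUES (?)")
        for v in vals
            DBInterface.execute(stmt, (v,))
        end
        DBInterface.close!(stmt)
    finally
        DBInterface.close!(conn)   # release the connection even on error
    end
end

# Example call (requires a configured ODBC data source):
# insert_values("DSN=mydsn", [3.14, 2.71])
```

Preparing once and executing repeatedly avoids re-parsing the SQL for every row, which is the main reason to prefer the second path for bulk inserts.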
SQLite.jl is a Julia interface to the SQLite database library. This package is registered in METADATA.jl, hence it can be installed by using the following command −
Pkg.add("SQLite")
We will discuss two important and useful functions of SQLite.jl, each with an example −
SQLite.DB(file::AbstractString) − This function takes the file string argument as the name of an existing SQLite database to be opened. If the file does not exist, a new database is created.
julia> using SQLite
julia> db = SQLite.DB("Chinook_Sqlite.sqlite")
Here we are using a sample database ‘Chinook’ available for SQLite, SQL Server, MySQL, etc.
SQLite.query(db::SQLite.DB, sql::String, values=) − This function returns the result, if any, after executing the prepared sql statement in the context of db.
julia> SQLite.query(db, "SELECT * FROM Genre WHERE regexp('e[trs]', Name)")
6x2 ResultSet
| Row | "GenreId" | "Name"               |
|-----|-----------|----------------------|
| 1   | 3         | "Metal"              |
| 2   | 4         | "Alternative & Punk" |
| 3   | 6         | "Blues"              |
| 4   | 13        | "Heavy Metal"        |
| 5   | 23        | "Alternative"        |
| 6   | 25        | "Opera"              |
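SQLite.query belongs to older releases of the package. As far as we know, recent SQLite.jl versions route queries through DBInterface.jl instead, so here is a small sketch in that style. It assumes SQLite.jl and DBInterface.jl are installed, and it builds a tiny in-memory Genre table rather than opening the Chinook file.

```julia
using SQLite
using DBInterface

# SQLite.DB() with no file name opens a temporary in-memory database.
db = SQLite.DB()

DBInterface.execute(db, "CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT)")

# Prepare once and bind a different parameter tuple on each execution.
stmt = DBInterface.prepare(db, "INSERT INTO Genre (GenreId, Name) VALUES (?, ?)")
for (id, name) in [(3, "Metal"), (6, "Blues"), (25, "Opera")]
    DBInterface.execute(stmt, (id, name))
end

# Copy the values out while iterating; rows are only valid during iteration.
genre_names = String[]
for row in DBInterface.execute(db, "SELECT Name FROM Genre WHERE GenreId > ? ORDER BY GenreId", (3,))
    push!(genre_names, row.Name)
end
println(genre_names)
```

Binding values through ? placeholders, rather than interpolating them into the SQL string, keeps the query safe from injection.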
PostgreSQL.jl is the PostgreSQL DBI driver. It is an interface to PostgreSQL from the Julia programming language. It follows the DBI.jl protocol and uses the C PostgreSQL API (libpq).
Let’s understand its usage with the help of the following code −
using DBI
using PostgreSQL

conn = connect(Postgres, "localhost", "username", "password", "dbname", 5432)

stmt = prepare(conn, "SELECT 1::bigint, 2.0::double precision, 'foo'::character varying, " *
                     "'foo'::character(10);")
result = execute(stmt)
for row in result
    # process each row here
end

finish(stmt)
disconnect(conn)
To use PostgreSQL.jl, we need to fulfill the following requirements −
DataFrames.jl >= v0.5.7
DataArrays.jl >= v0.1.2
libpq shared library (comes with a standard PostgreSQL client installation)
julia 0.3 or higher
Hive.jl is a client for distributed SQL engines that provide a HiveServer2 interface, for example Hive, Spark SQL, and Impala.
To connect to the server, we need to create an instance of the HiveSession as follows −
session = HiveSession()
It can also be connected by specifying the hostname and the port number as follows −
session = HiveSession("localhost", 10000)
The default implementation shown above authenticates with the same user ID as that of the shell. We can override it as follows −
session = HiveSession("localhost", 10000, HiveAuthSASLPlain("uid", "pwd", "zid"))
We can execute DML, DDL, SET, etc., statements as we can see in the example below −
crs = execute(session, "select * from mytable where formid < 1001"; async=true, config=Dict())
while !isready(crs)
    println("waiting...")
    sleep(10)
end
crs = result(crs)
DBAPI is a new database interface proposal, inspired by Python’s DB API 2.0, that defines an abstract interface for database drivers in Julia. This module contains the following −
Abstract required functions which throw a NotImplementedError by default
Abstract optional functions which throw a NotSupportedError by default
To use this API, the database drivers must import this module, subtype its types, and create methods for its functions.
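The pattern described above can be illustrated with a self-contained miniature. This is only a sketch of the idea: every type and function name below (DatabaseInterface, open_connection, ToyDB, and so on) is hypothetical and is not taken from the DBAPI package itself.

```julia
# Hypothetical miniature of a DB API 2.0 style interface. None of these names
# are taken from the DBAPI package itself.
abstract type DatabaseInterface end
abstract type DatabaseConnection end

struct NotImplementedError <: Exception
    msg::String
end

struct NotSupportedError <: Exception
    msg::String
end

# Required function: the fallback throws until a driver adds its own method.
open_connection(::Type{T}, args...) where {T<:DatabaseInterface} =
    throw(NotImplementedError("open_connection is not implemented for $T"))

# Optional function: a driver may legitimately leave this unsupported.
commit(conn::DatabaseConnection) =
    throw(NotSupportedError("commit is not supported by $(typeof(conn))"))

# A toy driver subtypes the abstract types and adds methods for them.
struct ToyDB <: DatabaseInterface end
struct ToyConnection <: DatabaseConnection
    name::String
end
open_connection(::Type{ToyDB}, name::String) = ToyConnection(name)

# A driver that implements nothing still gets the default behaviour.
struct EmptyDB <: DatabaseInterface end

conn = open_connection(ToyDB, "demo")   # dispatches to the ToyDB method
println(conn.name)

try
    open_connection(EmptyDB, "demo")    # falls back to the default method
catch err
    println(typeof(err), ": ", err.msg)
end
```

Multiple dispatch does the work here: the generic fallback methods act as the abstract interface, and a driver overrides them simply by defining more specific methods for its own types.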
DBPerf is a performance testing suite for Julia database drivers, maintained by JuliaDB. Its usage is shown below −
The user can provide input in two ways −
$ julia DBPerf.jl <Database_Driver_1.jl> <Database_Driver_2.jl> ....... <Database_Driver_N.jl> <DBMS>
Here, Database_Driver.jl can be of the following types −
The DBMS field is applicable only if we are using JDBC.jl.
The database can be either Oracle or MySQL.
$ julia DBPerf.jl ODBC.jl JDBC.jl MySql
julia> include("DBPerf.jl")
julia> DBPerf(<Database_Driver_1.jl>, <Database_Driver_2.jl>, ....... <Database_Driver_N.jl>, <DBMS>)
DBPerf("ODBC.jl", "JDBC.jl", "MySql")