ArangoDB - Quick Guide


Advertisements


ArangoDB - A Multi-Model First Database

ArangoDB is hailed as a native multi-model database by its developers. This is unlike other NoSQL databases. In this database, the data can be stored as documents, key/value pairs or graphs. And with a single declarative query language, any or all of your data can be accessed. Moreover, different models can be combined in a single query. And, owing to its multi-model style, one can make lean applications, which will be scalable horizontally with any or all of the three data models.

Layered vs. Native Multi-Model Databases

In this section, we will highlight a crucial difference between native and layered multimodel databases.

Many database vendors call their product “multi-model,” but adding a graph layer to a key/value or document store does not qualify as native multi-model.

With ArangoDB, the same core with the same query language, one can club together different data models and features in a single query, as we have already stated in previous section. In ArangoDB, there is no “switching” between data models, and there is no shifting of data from A to B to execute queries. It leads to performance advantages to ArangoDB in comparison to the “layered” approaches.

The Need for Multimodal Database

Interpreting the [Fowler’s] basic idea leads us to realize the benefits of using a variety of appropriate data models for different parts of the persistence layer, the layer being part of the larger software architecture.

According to this, one might, for example, use a relational database to persist structured, tabular data; a document store for unstructured, object-like data; a key/value store for a hash table; and a graph database for highly linked referential data.

However, traditional implementation of this approach will lead one to use multiple databases in the same project. It can lead to some operational friction (more complicated deployment, more frequent upgrades) as well as data consistency and duplication issues.

The next challenge after unifying the data for the three data models, is to devise and implement a common query language that can allow data administrators to express a variety of queries, such as document queries, key/value lookups, graphy queries, and arbitrary combinations of these.

By graphy queries, we mean queries involving graph-theoretic considerations. In particular, these may involve the particular connectivity features coming from the edges. For example, ShortestPath, GraphTraversal, and Neighbors.

Graphs are a perfect fit as data model for relations. In many real-world cases such as social network, recommendor system, etc., a very natural data model is a graph. It captures relations and can hold label information with each edge and with each vertex. Further, JSON documents are a natural fit to store this type of vertex and edge data.

ArangoDB ─ Features

There are various notable features of ArangoDB. We will highlight the prominent features below −

  • Multi-model Paradigm
  • ACID Properties
  • HTTP API

ArangoDB supports all popular database models. Following are a few models supported by ArangoDB −

  • Document model
  • Key/Value model
  • Graph model

A single query language is enough to retrieve data out of the database

The four properties Atomicity, Consistency, Isolation, and Durability (ACID) describe the guarantees of database transactions. ArangoDB supports ACID-compliant transactions.

ArangoDB allows clients, such as browsers, to interact with the database with HTTP API, the API being resource-oriented and extendable with JavaScript.

ArangoDB - Advantages

Following are the advantages of using ArangoDB −

Consolidation

As a native multi-model database, ArangoDB eliminates the need to deploy multiple databases, and thus decreases the number of components and their maintenance. Consequently, it reduces the technology-stack complexity for the application. In addition to consolidating your overall technical needs, this simplification leads to lower total cost of ownership and increasing flexibility.

Simplified Performance Scaling

With applications growing over time, ArangoDB can tackle growing performance and storage needs, by independently scaling with different data models. As ArangoDB can scale both vertically and horizontally, so in case when your performance demands a decrease (a deliberate, desired slow-down), your back-end system can be easily scaled down to save on hardware as well as operational costs.

Reduced Operational Complexity

The decree of Polyglot Persistence is to employ the best tools for every job you undertake. Certain tasks need a document database, while others may need a graph database. As a result of working with single-model databases, it can lead to multiple operational challenges. Integrating single-model databases is a difficult job in itself. But the biggest challenge is building a large cohesive structure with data consistency and fault tolerance between separate, unrelated database systems. It may prove nearly impossible.

Polyglot Persistence can be handled with a native multi-model database, as it allows to have polyglot data easily, but at the same time with data consistency on a fault tolerant system. With ArangoDB, we can use the correct data model for the complex job.

Strong Data Consistency

If one uses multiple single-model databases, data consistency can become an issue. These databases aren’t designed to communicate with each other, therefore some form of transaction functionality needs to be implemented to keep your data consistent between different models.

Supporting ACID transactions, ArangoDB manages your different data models with a single back-end, providing strong consistency on a single instance, and atomic operations when operating in cluster mode.

Fault Tolerance

It is a challenge to build fault tolerant systems with many unrelated components. This challenge becomes more complex when working with clusters. Expertise is required to deploy and maintain such systems, using different technologies and/or technology stacks. Moreover, integrating multiple subsystems, designed to run independently, inflict large engineering and operational costs.

As a consolidated technology stack, multi-model database presents an elegant solution. Designed to enable modern, modular architectures with different data models, ArangoDB works for cluster usage as well.

Lower Total Cost of Ownership

Each database technology requires ongoing maintenance, bug fixing patches, and other code changes which are provided by the vendor. Embracing a multi-model database significantly reduces the related maintenance costs simply by eliminating the number of database technologies in designing an application.

Transactions

Providing transactional guarantees throughout multiple machines is a real challenge, and few NoSQL databases give these guarantees. Being native multi-model, ArangoDB imposes transactions to guarantee data consistency.

Basic Concepts and Terminologies

In this chapter, we will discuss the basic concepts and terminologies for ArangoDB. It is very important to have a knowhow of the underlying basic terminologies related to the technical topic we are dealing with.

The terminologies for ArangoDB are listed below −

  • Document
  • Collection
  • Collection Identifier
  • Collection Name
  • Database
  • Database Name
  • Database Organization

From the perspective of data model, ArangoDB may be considered a document-oriented database, as the notion of a document is the mathematical idea of the latter. Document-oriented databases are one of the main categories of NoSQL databases.

The hierarchy goes like this: Documents are grouped into collections, and Collections exist inside databases

It should be obvious that Identifier and Name are two attributes for the collection and database.

Usually, two documents (vertices) stored in document collections are linked by a document (edge) stored in an edge collection. This is ArangoDB's graph data model. It follows the mathematical concept of a directed, labeled graph, except that edges don't just have labels, but are full-blown documents.

Having become familiar with the core terms for this database, we begin to understand ArangoDB's graph data model. In this model, there exist two types of collections: document collections and edge collections. Edge collections store documents and also include two special attributes: first is the _from attribute, and the second is the _to attribute. These attributes are used to create edges (relations) between documents essential for graph database. Document collections are also called vertex collections in the context of graphs (see any graph theory book).

Let us now see how important databases are. They are important because collections exist inside databases. In one instance of ArangoDB, there can be one or many databases. Different databases are usually used for multi-tenant setups, as the different sets of data inside them (collections, documents, etc.) are isolated from one another. The default database _system is special, because it cannot be removed. Users are managed in this database, and their credentials are valid for all the databases of a server instance.

ArangoDB - System Requirements

In this chapter, we will discuss the system requirements for ArangoDB.

The system requirements for ArangoDB are as follows −

  • A VPS Server with Ubuntu Installation
  • RAM: 1 GB; CPU : 2.2 GHz

For all the commands in this tutorial, we have used an instance of Ubuntu 16.04 (xenial) of RAM 1GB with one cpu having a processing power 2.2 GHz. And all the arangosh commands in this tutorial were tested for the ArangoDB version 3.1.27.

How to Install ArangoDB?

In this section, we will see how to install ArangoDB. ArangoDB comes pre-built for many operating systems and distributions. For more details, please refer to the ArangoDB documentation. As already mentioned, for this tutorial we will use Ubuntu 16.04x64.

The first step is to download the public key for its repositories −

# wget https://www.arangodb.com/repositories/arangodb31/
xUbuntu_16.04/Release.key

Output

--2017-09-03 12:13:24-- https://www.arangodb.com/repositories/arangodb31/xUbuntu_16.04/Release.key 
Resolving https://www.arangodb.com/
(www.arangodb.com)... 104.25.1 64.21, 104.25.165.21,
2400:cb00:2048:1::6819:a415, ... 
Connecting to https://www.arangodb.com/
(www.arangodb.com)|104.25. 164.21|:443... connected. 
HTTP request sent, awaiting response... 200 OK 
Length: 3924 (3.8K) [application/pgpkeys]
Saving to: ‘Release.key’
Release.key 100%[===================>] 3.83K - .-KB/s in 0.001s
2017-09-03 12:13:25 (2.61 MB/s) - ‘Release.key’ saved [39 24/3924]

The important point is that you should see the Release.key saved at the end of the output.

Let us install the saved key using the following line of code −

# sudo apt-key add Release.key

Output

OK

Run the following commands to add the apt repository and update the index −

# sudo apt-add-repository 'deb
https://www.arangodb.com/repositories/arangodb31/xUbuntu_16.04/ /'
# sudo apt-get update

As a final step, we can install ArangoDB −

# sudo apt-get install arangodb3

Output

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
grub-pc-bin
Use 'sudo apt autoremove' to remove it.
The following NEW packages will be installed:
arangodb3
0 upgraded, 1 newly installed, 0 to remove and 17 not upgraded.
Need to get 55.6 MB of archives.
After this operation, 343 MB of additional disk space will be used.

Press Enter. Now the process of installing ArangoDB will start −

Get:1 https://www.arangodb.com/repositories/arangodb31/xUbuntu_16.04
arangodb3 3.1.27 [55.6 MB]
Fetched 55.6 MB in 59s (942 kB/s)
Preconfiguring packages ...
Selecting previously unselected package arangodb3.
(Reading database ... 54209 files and directories currently installed.)
Preparing to unpack .../arangodb3_3.1.27_amd64.deb ...

Unpacking arangodb3 (3.1.27) ...
Processing triggers for systemd (229-4ubuntu19) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up arangodb3 (3.1.27) ...
Database files are up-to-date.

When the installation of ArangoDB is about to complete, the following screen appears −

Installation of ArangoDB

Here, you will be asked to provide a password for the ArangoDB root user. Note it down carefully.

Select the yes option when the following dialog box appears −

Configuration Dalog Box

When you click Yes as in the above dialog box, the following dialog box appears. Click Yes here.

Configuration Dalog Box 2

You can also check the status of ArangoDB with the following command −

# sudo systemctl status arangodb3

Output

arangodb3.service - LSB: arangodb
Loaded: loaded (/etc/init.d/arangodb3; bad; vendor pre set: enabled)
Active: active (running) since Mon 2017-09-04 05:42:35 UTC;
4min 46s ago
Docs: man:systemd-sysv-generator(8)
Process: 2642 ExecStart=/etc/init.d/arangodb3 start (code = exited,
status = 0/SUC
Tasks: 22
Memory: 158.6M
CPU: 3.117s
CGroup: /system.slice/arangodb3.service
├─2689 /usr/sbin/arangod --uid arangodb
--gid arangodb --pid-file /va
└─2690 /usr/sbin/arangod --uid arangodb
--gid arangodb --pid-file /va
Sep   04 05:42:33   ubuntu-512  systemd[1]:        Starting LSB:  arangodb...
Sep   04 05:42:33   ubuntu-512  arangodb3[2642]:  * Starting arango database server a
Sep   04 05:42:35   ubuntu-512  arangodb3[2642]:   {startup} starting up in daemon mode
Sep   04 05:42:35   ubuntu-512  arangodb3[2642]:   changed working directory for child
Sep   04 05:42:35   ubuntu-512  arangodb3[2642]:   ...done. 
Sep   04 05:42:35   ubuntu-512  systemd[1]:        StartedLSB: arang odb.
Sep   04 05:46:59   ubuntu-512  systemd[1]:        Started LSB: arangodb. lines 1-19/19 (END)

ArangoDB is now ready to be used.

To invoke the arangosh terminal, type the following command in the terminal −

# arangosh

Output

Please specify a password:

Supply the root password created at the time of installation −

_
__ _ _ __ __ _ _ __ __ _ ___ | |
/ | '__/ _ | ’ \ / ` |/ _ / | ’
| (| | | | (| | | | | (| | () _ \ | | |
_,|| _,|| ||_, |_/|/| ||
|__/

arangosh (ArangoDB 3.1.27 [linux] 64bit, using VPack 0.1.30, ICU 54.1, V8
5.0.71.39, OpenSSL 1.0.2g 1 Mar 2016)

Copyright (c) ArangoDB GmbH

Pretty printing values.
Connected to ArangoDB 'http+tcp://127.0.0.1:8529' version: 3.1.27 [server],
database: '_system', username: 'root'

Please note that a new minor version '3.2.2' is available
Type 'tutorial' for a tutorial or 'help' to see common examples
127.0.0.1:8529@_system> exit

Output Window Root Password

To log out from ArangoDB, type the following command −

127.0.0.1:8529@_system> exit

Output

Uf wiederluege! Na shledanou! Auf Wiedersehen! Bye Bye! Adiau! ¡Hasta luego!
Εις το επανιδείν!

להתראות ! Arrivederci! Tot ziens! Adjö! Au revoir! さようなら До свидания! Até
Breve! !خداحافظ

ArangoDB - Command Line

In this chapter, we will discuss how Arangosh works as the Command Line for ArangoDB. We will start by learning how to add a Database user.

Note − Remember numeric keypad might not work on Arangosh.

Let us assume that the user is “harry” and password is "hpwdb".

127.0.0.1:8529@_system> require("org/arangodb/users").save("harry", "hpwdb");

Output

{
   "user" : "harry",
   "active" : true,
   "extra" : {},
   "changePassword" : false,
   "code" : 201
}

ArangoDB - Web Interface

In this chapter, we will learn how to enable/disable the Authentication, and how to bind the ArangoDB to the Public Network Interface.

# arangosh --server.endpoint tcp://127.0.0.1:8529 --server.database "_system"

It will prompt you for the password saved earlier −

Please specify a password:

Use the password you created for root, at the configuration.

You can also use curl to check that you are actually getting HTTP 401 (Unauthorized) server responses for requests that require authentication −

# curl --dump - http://127.0.0.1:8529/_api/version

Output

HTTP/1.1 401 Unauthorized
X-Content-Type-Options: nosniff
Www-Authenticate: Bearer token_type = "JWT", realm = "ArangoDB"
Server: ArangoDB
Connection: Keep-Alive
Content-Type: text/plain; charset = utf-8
Content-Length: 0

To avoid entering the password each time during our learning process, we will disable the authentication. For that, open the configuration file −

# vim /etc/arangodb3/arangod.conf

You should change the color scheme if the code is not properly visible.

:colorscheme desert

Set authentication to false as shown in the screenshot below.

Output Window Root Password

Restart the service −

# service arangodb3 restart

On making the authentication false, you will be able to login (either with root or created user like Harry in this case) without entering any password in please specify a password.

Let us check the api version when the authentication is switched off −

# curl --dump - http://127.0.0.1:8529/_api/version

Output

HTTP/1.1 200 OK
X-Content-Type-Options: nosniff
Server: ArangoDB
Connection: Keep-Alive
Content-Type: application/json; charset=utf-8
Content-Length: 60
{"server":"arango","version":"3.1.27","license":"community"}

ArangoDB - Example Case Scenarios

In this chapter, we will consider two example scenarios. These examples are easier to comprehend and will help us understand the way the ArangoDB functionality works.

To demonstrate the APIs, ArangoDB comes preloaded with a set of easily understandable graphs. There are two methods to create instances of these graphs in your ArangoDB −

  • Add Example tab in the create graph window in the web interface,
  • or load the module @arangodb/graph-examples/example-graph in Arangosh.

To start with, let us load a graph with the help of web interface. For that, launch the web interface and click on the graphs tab.

Graph Web Interface

The Create Graph dialog box appears. The Wizard contains two tabs – Examples and Graph. The Graph tab is open by default; supposing we want to create a new graph, it will ask for the name and other definitions for the graph.

Graph Create Graph

Now, we will upload the already created graph. For this, we will select the Examples tab.

Upload Created Graph

We can see the three example graphs. Select the Knows_Graph and click on the green button Create.

Once you have created them, you can inspect them in the web interface - which was used to create the pictures below.

Graph Create The Pictures

The Knows_Graph

Let us now see how the Knows_Graph works. Select the Knows_Graph, and it will fetch the graph data.

The Knows_Graph consists of one vertex collection persons connected via one edge collection knows. It will contain five persons Alice, Bob, Charlie, Dave and Eve as vertices. We will have the following directed relations

Alice knows Bob
Bob knows Charlie
Bob knows Dave
Eve knows Alice
Eve knows Bob

Knows_Graph

If you click a node (vertex), say ‘bob’, it will show the ID (persons/bob) attribute name.

Knows_Graph Vertex

And on clicking any of the edge, it will show the ID (knows/4590) attributes.

Click Any Edge Shows ID

This is how we create it, inspect its vertices and edges.

Let us add another graph, this time using Arangosh. For that, we need to include another endpoint in the ArangoDB configuration file.

How to Add Multiple Endpoints

Open the configuration file −

# vim /etc/arangodb3/arangod.conf

Add another endpoint as shown in the terminal screenshot below.

Endpoint Terminal Screenshot

Restart the ArangoDB −

# service arangodb3 restart

Launch the Arangosh −

# arangosh
Please specify a password:
_
__ _ _ __ __ _ _ __ __ _ ___ ___| |__
/ _` | '__/ _` | '_ \ / _` |/ _ \/ __| '_ \
| (_| | | | (_| | | | | (_| | (_) \__ \ | | |
\__,_|_| \__,_|_| |_|\__, |\___/|___/_| |_|
|___/
arangosh (ArangoDB 3.1.27 [linux] 64bit, using VPack 0.1.30, ICU 54.1, V8
5.0.71.39, OpenSSL 1.0.2g 1 Mar 2016)
Copyright (c) ArangoDB GmbH
Pretty printing values.
Connected to ArangoDB 'http+tcp://127.0.0.1:8529' version: 3.1.27
[server], database: '_system', username: 'root'
Please note that a new minor version '3.2.2' is available
Type 'tutorial' for a tutorial or 'help' to see common examples
127.0.0.1:8529@_system>

The Social_Graph

Let us now understand what a Social_Graph is and how it works. The graph shows a set of persons and their relations −

This example has female and male persons as vertices in two vertex collections - female and male. The edges are their connections in the relation edge collection. We have described how to create this graph using Arangosh. The reader can work around it and explore its attributes, as we did with the Knows_Graph.

ArangoDB - Data Models and Modeling

In this chapter, we will focus on the following topics −

  • Database Interaction
  • Data Model
  • Data Retrieval

ArangoDB supports document based data model as well as graph based data model. Let us first describe the document based data model.

ArangoDB's documents closely resemble the JSON format. Zero or more attributes are contained in a document, and a value attached with each attribute. A value is either of an atomic type, such as a number, Boolean or null, literal string, or of a compound data type, such as embedded document/object or an array. Arrays or sub-objects may consist of these data types, which implies that a single document can represent non-trivial data structures.

Further in hierarchy, documents are arranged into collections, which may contain no documents (in theory) or more than one document. One can compare documents to rows and collections to tables (Here tables and rows refer to those of relational database management systems - RDBMS).

But, in RDBMS, defining columns is a prerequisite to store records into a table, calling these definitions schemas. However, as a novel feature, ArangoDB is schema-less – there is no a priori reason to specify what attributes the document will have.

And unlike RDBMS, each document can be structured in a completely different way from another document. These documents can be saved together in one single collection. Practically, common characteristics may exist among documents in the collection, however the database system, i.e., ArangoDB itself, does not bind you to a particular data structure.

Now we will try to understand ArangoDB's [graph data model], which requires two kinds of collections — the first is the document collections (known as vertices collections in group-theoretic language), the second is the edge collections. There is a subtle difference between these two types. Edge collections also store documents, but they are characterized by including two unique attributes, _from and _to for creating relations between documents. In practice, a document (read edge) links two documents (read vertices), both stored in their respective collections. This architecture is derived from the graph-theoretic concept of a labeled, directed graph, excluding edges that can have not only labels, but can be a complete JSON like document in itself.

To compute fresh data, delete documents or to manipulate them, queries are used, which select or filter documents as per the given criteria. Either being simple as an “example query” or being as complex as “joins”, queries are coded in AQL - ArangoDB Query Language.

ArangoDB - Database Methods

In this chapter, we will discuss the different Database Methods in ArangoDB.

To start with, let us get the properties of the Database −

  • Name
  • ID
  • Path

First, we invoke the Arangosh. Once, Arangosh is invoked, we will list the databases we created so far −

We will use the following line of code to invoke Arangosh −

127.0.0.1:8529@_system> db._databases()

Output

[
   "_system",
   "song_collection"
]

We see two databases, one _system created by default, and the second song_collection that we have created.

Let us now shift to song_collection database with the following line of code −

127.0.0.1:8529@_system> db._useDatabase("song_collection")

Output

true
127.0.0.1:8529@song_collection>

We will explore the properties of our song_collection database.

To find the name

We will use the following line of code to find the name.

127.0.0.1:8529@song_collection> db._name()

Output

song_collection

To find the id −

We will use the following line of code to find the id.

song_collection

Output

4838

To find the path −

We will use the following line of code to find the path.

127.0.0.1:8529@song_collection> db._path()

Output

/var/lib/arangodb3/databases/database-4838

Let us now check if we are in the system database or not by using the following line of code −

127.0.0.1:8529@song_collection&t; db._isSystem()

Output

false

It means we are not in the system database (as we have created and shifted to the song_collection). The following screenshot will help you understand this.

Created Shifted Songs Output Screenshot

To get a particular collection, say songs −

We will use the following line of code the get a particular collection.

127.0.0.1:8529@song_collection> db._collection("songs")

Output

[ArangoCollection 4890, "songs" (type document, status loaded)]

The line of code returns a single collection.

Let us move to the essentials of the database operations with our subsequent chapters.

ArangoDB - Crud Operations

In this chapter, we will learn the different operations with Arangosh.

The following are the possible operations with Arangosh −

  • Creating a Document Collection
  • Creating Documents
  • Reading Documents
  • Updating Documents

Let us start by creating a new database. We will use the following line of code to create a new database −

127.0.0.1:8529@_system> db._createDatabase("song_collection")
true

The following line of code will help you shift to the new database −

127.0.0.1:8529@_system> db._useDatabase("song_collection")
true

Prompt will shift to "@@song_collection"

127.0.0.1:8529@song_collection>

Promt Shift Song Collection

From here we will study CRUD Operations. Let us create a collection into the new database −

127.0.0.1:8529@song_collection> db._createDocumentCollection('songs')

Output

[ArangoCollection 4890, "songs" (type document, status loaded)]
127.0.0.1:8529@song_collection>

Let us add a few documents (JSON objects) to our 'songs' collection.

We add the first document in the following way −

127.0.0.1:8529@song_collection> db.songs.save({title: "A Man's Best Friend",
lyricist: "Johnny Mercer", composer: "Johnny Mercer", Year: 1950, _key:
"A_Man"})

Output

{
   "_id" : "songs/A_Man",
   "_key" : "A_Man",
   "_rev" : "_VjVClbW---"
}

Let us add other documents to the database. This will help us learn the process of querying the data. You can copy these codes and paste the same in Arangosh to emulate the process −

127.0.0.1:8529@song_collection> db.songs.save(
   {
      title: "Accentchuate The Politics", 
      lyricist: "Johnny Mercer", 
      composer: "Harold Arlen", Year: 1944,
      _key: "Accentchuate_The"
   }
)

{
   "_id" : "songs/Accentchuate_The",
   "_key" : "Accentchuate_The",
   "_rev" : "_VjVDnzO---"
}

127.0.0.1:8529@song_collection> db.songs.save(
   {
      title: "Affable Balding Me", 
      lyricist: "Johnny Mercer", 
      composer: "Robert Emmett Dolan", 
      Year: 1950,
      _key: "Affable_Balding"
   }
)
{
   "_id" : "songs/Affable_Balding",
   "_key" : "Affable_Balding",
   "_rev" : "_VjVEFMm---"
}

How to Read Documents

The _key or the document handle can be used to retrieve a document. Use document handle if there is no need to traverse the collection itself. If you have a collection, the document function is easy to use −

127.0.0.1:8529@song_collection> db.songs.document("A_Man");
{
   "_key" : "A_Man",
   "_id" : "songs/A_Man",
   "_rev" : "_VjVClbW---",
   "title" : "A Man's Best Friend",
   "lyricist" : "Johnny Mercer",
   "composer" : "Johnny Mercer",
   "Year" : 1950
}

How to Update Documents

Two options are available to update the saved data − replace and update.

The update function patches a document, merging it with the given attributes. On the other hand, the replace function will replace the previous document with a new one. The replacement will still occur even if completely different attributes are provided. We will first observe a non-destructive update, updating the attribute Production` in a song −

127.0.0.1:8529@song_collection> db.songs.update("songs/A_Man",{production:
"Top Banana"});

Output

{
   "_id" : "songs/A_Man",
   "_key" : "A_Man",
   "_rev" : "_VjVOcqe---",
   "_oldRev" : "_VjVClbW---"
}

Let us now read the updated song's attributes −

127.0.0.1:8529@song_collection> db.songs.document('A_Man');

Output

{
   "_key" : "A_Man",
   "_id" : "songs/A_Man",
   "_rev" : "_VjVOcqe---",
   "title" : "A Man's Best Friend",
   "lyricist" : "Johnny Mercer",
   "composer" : "Johnny Mercer",
   "Year" : 1950,
   "production" : "Top Banana"
}

A large document can be easily updated with the update function, especially when the attributes are very few.

In contrast, the replace function will abolish your data on using it with the same document.

127.0.0.1:8529@song_collection> db.songs.replace("songs/A_Man",{production:
"Top Banana"});

Let us now check the song we have just updated with the following line of code −

127.0.0.1:8529@song_collection> db.songs.document('A_Man');

Output

{
   "_key" : "A_Man",
   "_id" : "songs/A_Man",
   "_rev" : "_VjVRhOq---",
   "production" : "Top Banana"
}

Now, you can observe that the document no longer has the original data.

How to Remove Documents

The remove function is used in combination with the document handle to remove a document from a collection −

127.0.0.1:8529@song_collection> db.songs.remove('A_Man');

Let us now check the song's attributes we just removed by using the following line of code −

127.0.0.1:8529@song_collection> db.songs.document('A_Man');

We will get an exception error like the following as an output −

JavaScript exception in file
'/usr/share/arangodb3/js/client/modules/@arangodb/arangosh.js' at 97,7:
ArangoError 1202: document not found
! throw error;
! ^
stacktrace: ArangoError: document not found

at Object.exports.checkRequestResult
(/usr/share/arangodb3/js/client/modules/@arangodb/arangosh.js:95:21)

at ArangoCollection.document
(/usr/share/arangodb3/js/client/modules/@arangodb/arango-collection.js:667:12)
at <shell command>:1:10

Exception Error Output Screen

Crud Operations using Web Interface

In our previous chapter, we learned how to perform various operations on documents with Arangosh, the command line. We will now learn how to perform the same operations using the web interface. To start with, put the following address - http://your_server_ip:8529/_db/song_collection/_admin/aardvark/index.html#login in the address bar of your browser. You will be directed to the following login page.

Login Page

Now, enter the username and password.

Login Username Password

If it is successful, the following screen appears. We need to make a choice for the database to work on, the _system database being the default one. Let us choose the song_collection database, and click on the green tab −

Song Collection

Creating a Collection

In this section, we will learn how to create a collection. Press the Collections tab in the navigation bar at the top.

Our command line added songs collection are visible. Clicking on that will show the entries. We will now add an artists’ collection using the web interface. Collection songs which we created with Arangosh is already there. In the Name field, write artists in the New Collection dialog box that appears. Advanced options can safely be ignored and the default collection type, i.e. Document, is fine.

Creating Collection

Clicking on the Save button will finally create the collection, and now the two collections will be visible on this page.

Filling Up the Newly Created Collection with Documents

You will be presented with an empty collection on clicking the artists collection −

Filling Up Collection with Documents

To add a document, you need to click the + sign placed in the upper right corner. When you are prompted for a _key, enter Affable_Balding as the key.

Now, a form will appear to add and edit the attributes of the document. There are two ways of adding attributes: Graphical and Tree. The graphical way is intuitive but slow, therefore, we will switch to the Code view, using the Tree dropdown menu to select it −

Tree Dropdown menu

To make the process easier, we have created a sample data in the JSON format, which you can copy and then paste into the query editor area −

{"artist": "Johnny Mercer", "title":"Affable Balding Me", "composer": "Robert Emmett Dolan", "Year": 1950}

(Note: Only one pair of curly braces should be used; see the screenshot below)

Created Sample Data In JSON Format

You can observe that we have quoted the keys and also the values in the code view mode. Now, click Save. Upon successful completion, a green flash appears on the page momentarily.

How to Read Documents

To read documents, go back to the Collections page.

When one clicks on the artist collection, a new entry appears.

How to Update Documents

It is simple to edit the entries in a document; you just need to click on the row you wish to edit in the document overview. Here again the same query editor will be presented as when creating new documents.

Removing Documents

You can delete the documents by pressing the ‘-’ icon. Every document row has this sign at the end. It will prompt you to confirm to avoid unsafe deletion.

Moreover, for a particular collection, other operations like filtering the documents, managing indexes, and importing data also exist on the Collections Overview page.

In our subsequent chapter, we will discuss an important feature of the Web Interface, i.e., the AQL query Editor.

Querying the Data with AQL

In this chapter, we will discuss how to query the data with AQL. We have already discussed in our previous chapters that ArangoDB has developed its own query language and that it goes by the name AQL.

Let us now start interacting with AQL. As shown in the image below, in the web interface, press the AQL Editor tab placed at the top of the navigation bar. A blank query editor will appear.

When need, you can switch to the editor from the result view and vice-versa, by clicking the Query or the Result tabs in the top right corner as shown in the image below −

Switch To the Editor From The Result View

Among other things, the editor has syntax highlighting, undo/redo functionality, and query saving. For a detailed reference, one can see the official documentation. We will highlight few basic and commonly-used features of the AQL query editor.

AQL Fundamentals

In AQL, a query represents the end result to be achieved, but not the process through which the end result is to be achieved. This feature is commonly known as a declarative property of the language. Moreover, AQL can query as well modify the data, and thus complex queries can be created by combining both the processes.

Please note that AQL is entirely ACID-compliant. Reading or modifying queries will either conclude in whole or not at all. Even reading a document's data will finish with a consistent unit of the data.

We add two new songs to the songs collection we have already created. Instead of typing, you can copy the following query, and paste it in the AQL editor −

FOR song IN [
   {
      title: "Air-Minded Executive", lyricist: "Johnny Mercer",
      composer: "Bernie Hanighen", Year: 1940, _key: "Air-Minded"
   },
   
   {
      title: "All Mucked Up", lyricist: "Johnny Mercer", composer:
      "Andre Previn", Year: 1974, _key: "All_Mucked"
   }
]
INSERT song IN songs

Press the Execute button at the lower left.

It will write two new documents in the songs collection.

This query describes how the FOR loop works in AQL; it iterates over the list of JSON encoded documents, performing the coded operations on each one of the documents in the collection. The different operations can be creating new structures, filtering, selecting documents, modifying, or inserting documents into the database (refer the instantaneous example). In essence, AQL can perform the CRUD operations efficiently.

To find all the songs in our database, let us once again run the following query, equivalent to a SELECT * FROM songs of an SQL-type database (because the editor memorizes the last query, press the *New* button to clean the editor) −

FOR song IN songs
RETURN song

The result set will show the list of songs so far saved in the songs collection as shown in the screenshot below.

List of Songs

Operations like FILTER, SORT and LIMIT can be added to the For loop body to narrow and order the result.

FOR song IN songs
FILTER song.Year > 1940
RETURN song

The above query will give songs created after the year 1940 in the Result tab (see the image below).

Query Songs Created After Year_1940

The document key is used in this example, but any other attribute can also be used as an equivalent for filtering. Since the document key is guaranteed to be unique, no more than a single document will match this filter. For other attributes this may not be the case. To return a subset of active users (determined by an attribute called status), sorted by name in ascending order, we use the following syntax −

FOR song IN songs
FILTER song.Year > 1940
SORT song.composer
RETURN song
LIMIT 2

We have deliberately included this example. Here, we observe a query syntax error message highlighted in red by AQL. This syntax highlights the errors and is helpful in debugging your queries as shown in the screenshot below.

Syntax Highlights The Errors

Let us now run the correct query (note the correction) −

FOR song IN songs
FILTER song.Year > 1940
SORT song.composer
LIMIT 2
RETURN song

Run The Correct Query

Complex Query in AQL

AQL is equipped with multiple functions for all supported data types. Variable assignment within a query allows to build very complex nested constructs. This way data-intensive operations move closer to the data at the backend than on to the client (such as browser). To understand this, let us first add the arbitrary durations (length) to songs.

Let us start with the first function, i.e., the Update function −

UPDATE { _key: "All_Mucked" }
WITH { length: 180 }
IN songs

Complex Query in AQL

We can see one document has been written as shown in the above screenshot.

Let us now update other documents (songs) too.

UPDATE { _key: "Affable_Balding" }
WITH { length: 200 }
IN songs

We can now check that all our songs have a new attribute length

FOR song IN songs
RETURN song

Output

[
   {
      "_key": "Air-Minded",
      "_id": "songs/Air-Minded",
      "_rev": "_VkC5lbS---",
      "title": "Air-Minded Executive",
      "lyricist": "Johnny Mercer",
      "composer": "Bernie Hanighen",
      "Year": 1940,
      "length": 210
   },
   
   {
      "_key": "Affable_Balding",
      "_id": "songs/Affable_Balding",
      "_rev": "_VkC4eM2---",
      "title": "Affable Balding Me",
      "lyricist": "Johnny Mercer",
      "composer": "Robert Emmett Dolan",
      "Year": 1950,
      "length": 200
   },
   
   {
      "_key": "All_Mucked",
      "_id": "songs/All_Mucked",
      "_rev": "_Vjah9Pu---",
      "title": "All Mucked Up",
      "lyricist": "Johnny Mercer",
      "composer": "Andre Previn",
      "Year": 1974,
      "length": 180
   },
   
   {
      "_key": "Accentchuate_The",
      "_id": "songs/Accentchuate_The",
      "_rev": "_VkC3WzW---",
      "title": "Accentchuate The Politics",
      "lyricist": "Johnny Mercer",
      "composer": "Harold Arlen",
      "Year": 1944,
      "length": 190
   }
]

To illustrate the use of other keywords of AQL such as LET, FILTER, SORT, etc., we now format the song's durations in the mm:ss format.

Query

FOR song IN songs
FILTER song.length > 150
LET seconds = song.length % 60
LET minutes = FLOOR(song.length / 60)
SORT song.composer
RETURN
{
   Title: song.title, 
   Composer: song.composer, 
   Duration: CONCAT_SEPARATOR(':',minutes, seconds) 
}

Complex Query in AQL 2

This time we will return the song title together with the duration. The Return function lets you create a new JSON object to return for each input document.

We will now talk about the ‘Joins’ feature of AQL database.

Let us begin by creating a collection composer_dob. Further, we will create the four documents with the hypothetical date of births of the composers by running the following query in the query box −

FOR dob IN [
   {composer: "Bernie Hanighen", Year: 1909}
   ,
   {composer: "Robert Emmett Dolan", Year: 1922}
   ,
   {composer: "Andre Previn", Year: 1943}
   ,
   {composer: "Harold Arlen", Year: 1910}
]
INSERT dob in composer_dob

Composer DOB

To highlight the similarity with SQL, we present a nested FOR-loop query in AQL, leading to the REPLACE operation, iterating first in the inner loop, over all the composers’ dob and then on all the associated songs, creating a new document containing attribute song_with_composer_key instead of the song attribute.

Here goes the query −

FOR s IN songs
FOR c IN composer_dob
FILTER s.composer == c.composer

LET song_with_composer_key = MERGE(
   UNSET(s, 'composer'),
   {composer_key:c._key}
)
REPLACE s with song_with_composer_key IN songs

Song Wth Composer Key

Let us now run the query FOR song IN songs RETURN song again to see how the song collection has changed.

Output

[
   {
      "_key": "Air-Minded",
      "_id": "songs/Air-Minded",
      "_rev": "_Vk8kFoK---",
      "Year": 1940,
      "composer_key": "5501",
      "length": 210,
      "lyricist": "Johnny Mercer",
      "title": "Air-Minded Executive"
   },
   
   {
      "_key": "Affable_Balding",
      "_id": "songs/Affable_Balding",
      "_rev": "_Vk8kFoK--_",
      "Year": 1950,
      "composer_key": "5505",
      "length": 200,
      "lyricist": "Johnny Mercer",
      "title": "Affable Balding Me"
   },
   
   {
      "_key": "All_Mucked",
      "_id": "songs/All_Mucked",
      "_rev": "_Vk8kFoK--A",
      "Year": 1974,
      "composer_key": "5507",
      "length": 180,
      "lyricist": "Johnny Mercer",
      "title": "All Mucked Up"
   },
   
   {
      "_key": "Accentchuate_The",
      "_id": "songs/Accentchuate_The",
      "_rev": "_Vk8kFoK--B",
      "Year": 1944,
      "composer_key": "5509",
      "length": 190,
      "lyricist": "Johnny Mercer",
      "title": "Accentchuate The Politics"
   }
]

The above query completes the data migration process, adding the composer_key to each song.

Now the next query is again a nested FOR-loop query, but this time leading to the Join operation, adding the associated composer's name (picking with the help of `composer_key`) to each song −

FOR s IN songs
FOR c IN composer_dob
FILTER c._key == s.composer_key
RETURN MERGE(s,
{ composer: c.composer }
)

Output

[
   {
      "Year": 1940,
      "_id": "songs/Air-Minded",
      "_key": "Air-Minded",
      "_rev": "_Vk8kFoK---",
      "composer_key": "5501",
      "length": 210,
      "lyricist": "Johnny Mercer",
      "title": "Air-Minded Executive",
      "composer": "Bernie Hanighen"
   },
   
   {
      "Year": 1950,
      "_id": "songs/Affable_Balding",
      "_key": "Affable_Balding",
      "_rev": "_Vk8kFoK--_",
      "composer_key": "5505",
      "length": 200,
      "lyricist": "Johnny Mercer",
      "title": "Affable Balding Me",
      "composer": "Robert Emmett Dolan"
   },

   {
      "Year": 1974,
      "_id": "songs/All_Mucked",
      "_key": "All_Mucked",
      "_rev": "_Vk8kFoK--A",
      "composer_key": "5507",
      "length": 180,
      "lyricist": "Johnny Mercer",
      "title": "All Mucked Up",
      "composer": "Andre Previn"
   },

   {
      "Year": 1944,
      "_id": "songs/Accentchuate_The",
      "_key": "Accentchuate_The",
      "_rev": "_Vk8kFoK--B",
      "composer_key": "5509",
      "length": 190,
      "lyricist": "Johnny Mercer",
      "title": "Accentchuate The Politics",
      "composer": "Harold Arlen"
   }
]

Adding Composer Key To Each Song

ArangoDB - AQL Example Queries

In this chapter, we will consider a few AQL Example Queries on an Actors and Movies Database. These queries are based on graphs.

Problem

Given a collection of actors and a collection of movies, and an actIn edges collection (with a year property) to connect the vertex as indicated below −

[Actor] <- act in -> [Movie]

How do we get −

  • All actors who acted in "movie1" OR "movie2"?
  • All actors who acted in both "movie1" AND "movie2”?
  • All common movies between "actor1" and "actor2”?
  • All actors who acted in 3 or more movies?
  • All movies where exactly 6 actors acted in?
  • The number of actors by movie?
  • The number of movies by actor?
  • The number of movies acted in between 2005 and 2010 by actor?

Solution

During the process of solving and obtaining the answers to the above queries, we will use Arangosh to create the dataset and run queries on that. All the AQL queries are strings and can simply be copied over to your favorite driver instead of Arangosh.

Let us start by creating a Test Dataset in Arangosh. First, download this file

# wget -O dataset.js
https://drive.google.com/file/d/0B4WLtBDZu_QWMWZYZ3pYMEdqajA/view?usp=sharing

Output

...
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘dataset.js’
dataset.js [ <=> ] 115.14K --.-KB/s in 0.01s
2017-09-17 14:19:12 (11.1 MB/s) - ‘dataset.js’ saved [117907]

You can see in the output above that we have downloaded a JavaScript file dataset.js. This file contains the Arangosh commands to create the dataset in the database. Instead of copying and pasting the commands one by one, we will use the --javascript.execute option on Arangosh to execute the multiple commands non-interactively. Consider it the life saver command!

Now execute the following command on the shell −

$ arangosh --javascript.execute dataset.js

Command On The Shell

Supply the password when prompted as you can see in the above screenshot. Now we have saved the data, so we will construct the AQL queries to answer the specific questions raised in the beginning of this chapter.

First Question

Let us take the first question: All actors who acted in "movie1" OR "movie2". Suppose, we want to find the names of all the actors who acted in "TheMatrix" OR "TheDevilsAdvocate" −

We will start with one movie at a time to get the names of the actors −

127.0.0.1:8529@_system> db._query("FOR x IN ANY 'movies/TheMatrix' actsIn
OPTIONS {bfs: true, uniqueVertices: 'global'} RETURN x._id").toArray();

Output

We will receive the following output −

[
   "actors/Hugo",
   "actors/Emil",
   "actors/Carrie",
   "actors/Keanu",
   "actors/Laurence"
]

First Question

Now we continue to form a UNION_DISTINCT of two NEIGHBORS queries which will be the solution −

127.0.0.1:8529@_system> db._query("FOR x IN UNION_DISTINCT ((FOR y IN ANY
'movies/TheMatrix' actsIn OPTIONS {bfs: true, uniqueVertices: 'global'} RETURN
y._id), (FOR y IN ANY 'movies/TheDevilsAdvocate' actsIn OPTIONS {bfs: true,
uniqueVertices: 'global'} RETURN y._id)) RETURN x").toArray();

Output

[
   "actors/Charlize",
   "actors/Al",
   "actors/Laurence",
   "actors/Keanu",
   "actors/Carrie",
   "actors/Emil",
   "actors/Hugo"
]

First Question 2

Second Question

Let us now consider the second question: All actors who acted in both "movie1" AND "movie2". This is almost identical to the question above. But this time we are not interested in a UNION but in an INTERSECTION −

127.0.0.1:8529@_system> db._query("FOR x IN INTERSECTION ((FOR y IN ANY
'movies/TheMatrix' actsIn OPTIONS {bfs: true, uniqueVertices: 'global'} RETURN
y._id), (FOR y IN ANY 'movies/TheDevilsAdvocate' actsIn OPTIONS {bfs: true,
uniqueVertices: 'global'} RETURN y._id)) RETURN x").toArray();

Output

We will receive the following output −

[
   "actors/Keanu"
]

Second Question

Third Question

Let us now consider the third question: All common movies between "actor1" and "actor2". This is actually identical to the question about common actors in movie1 and movie2. We just have to change the starting vertices. As an example, let us find all the movies where Hugo Weaving ("Hugo") and Keanu Reeves are co-starring −

127.0.0.1:8529@_system> db._query(
   "FOR x IN INTERSECTION (
      (
         FOR y IN ANY 'actors/Hugo' actsIn OPTIONS 
         {bfs: true, uniqueVertices: 'global'}
          RETURN y._id
      ),
      
      (
         FOR y IN ANY 'actors/Keanu' actsIn OPTIONS 
         {bfs: true, uniqueVertices:'global'} RETURN y._id
      )
   ) 
   RETURN x").toArray();

Output

We will receive the following output −

[
   "movies/TheMatrixReloaded",
   "movies/TheMatrixRevolutions",
   "movies/TheMatrix"
]

Third Question

Fourth Question

Let us now consider the fourth question. All actors who acted in 3 or more movies. This question is different; we cannot make use of the neighbors function here. Instead we will make use of the edge-index and the COLLECT statement of AQL for grouping. The basic idea is to group all edges by their startVertex (which in this dataset is always the actor). Then we remove all actors with less than 3 movies from the result as here we have included the number of movies an actor has acted in −

127.0.0.1:8529@_system> db._query("FOR x IN actsIn COLLECT actor = x._from WITH
COUNT INTO counter FILTER counter >= 3 RETURN {actor: actor, movies:
counter}"). toArray()

Output

[
   {
      "actor" : "actors/Carrie",
      "movies" : 3
   },
   
   {
      "actor" : "actors/CubaG",
      "movies" : 4
   },

   {
      "actor" : "actors/Hugo",
      "movies" : 3
   },

   {
      "actor" : "actors/Keanu",
      "movies" : 4
   },

   {
      "actor" : "actors/Laurence",
      "movies" : 3
   },

   {
      "actor" : "actors/MegR",
      "movies" : 5
   },

   {
      "actor" : "actors/TomC",
      "movies" : 3
   },
   
   {
      "actor" : "actors/TomH",
      "movies" : 3
   }
]

Fourth Question

For the remaining questions, we will discuss the query formation, and provide the queries only. The reader should run the query themselves on the Arangosh terminal.

Fifth Question

Let us now consider the fifth question: All movies where exactly 6 actors acted in. The same idea as in the query before, but with the equality filter. However, now we need the movie instead of the actor, so we return the _to attribute

db._query("FOR x IN actsIn COLLECT movie = x._to WITH COUNT INTO counter FILTER
counter == 6 RETURN movie").toArray()

The number of actors by movie?

We remember in our dataset _to on the edge corresponds to the movie, so we count how often the same _to appears. This is the number of actors. The query is almost identical to the ones before but without the FILTER after COLLECT

db._query("FOR x IN actsIn COLLECT movie = x._to WITH COUNT INTO counter RETURN
{movie: movie, actors: counter}").toArray()

Sixth Question

Let us now consider the sixth question: The number of movies by an actor.

The way we found solutions to our above queries will help you find the solution to this query as well.

db._query("FOR x IN actsIn COLLECT actor = x._from WITH COUNT INTO counter
RETURN {actor: actor, movies: counter}").toArray()

ArangoDB - How to Deploy

In this chapter, we will describe various possibilities to deploy ArangoDB.

Deployment: Single Instance

We have already learned how to deploy the single instance of the Linux (Ubuntu) in one of our previous chapters. Let us now see how to make the deployment using Docker.

Deployment: Docker

For deployment using docker, we will install Docker on our machine. For more information on Docker, please refer our tutorial on Docker.

Once Docker is installed, you can use the following command −

docker run -e ARANGO_RANDOM_ROOT_PASSWORD = 1 -d --name agdb-foo -d
arangodb/arangodb

It will create and launch the Docker instance of ArangoDB with the identifying name agdbfoo as a Docker background process.

Also terminal will print the process identifier.

By default, port 8529 is reserved for ArangoDB to listen for requests. Also this port is automatically available to all Docker application containers which you may have linked.



Advertisements