Hazelcast - Introduction

Distributed In-memory Data Grid

A data grid is a superset to distributed cache. Distributed cache is typically used only for storing and retrieving key-value pairs which are spread across caching servers. However, a data grid, apart from supporting storage of key-value pairs, also supports other features, for example,

It supports other data structures like locks, semaphores, sets, list, and queues.
It provides a way to query the stored data by rich querying languages, for example, SQL.
It provides a distributed execution engine which helps to operate on the data in parallel.

Benefits of Hazelcast

Support multiple data structures − Hazelcast supports the usage of multiple data structures along with Map. Some of the examples are Lock, Semaphore, Queue, List, etc.
Fast R/W access − Given that all the data is in-memory, Hazelcast offers very high-speed data read/write access.
High availability − Hazelcast supports the distribution of data across machines along with additional support for backup. This means that the data is not stored on a single machine. So, even if a machine goes down, which occurs frequently in a distributed environment, the data is not lost.
High Performance − Hazelcast provides constructs which can be used to distribute the workload/computation/query among multiple worker machines. This means a computation/query uses resources from multiple machines which reduces the execution time drastically.
Easy to use − Hazelcast implements and extends a lot of java.util.concurrent constructs which make it very easy to use and integrate with the code. Configuration to start using Hazelcast on a machine just involves adding the Hazelcast jar to our classpath.

Hazelcast vs Other Caches & Key-Value stores

Comparing Hazelcast with other caches like Ehcache, Guava, and Caffeine may not be very useful. It is because, unlike other caches, Hazelcast is a distributed cache, that is, it spreads the data across machines/JVM. Although Hazelcast can work very well on single JVM as well, however, it is more useful is a distributed environment.

Similarly comparing it with Databases like MongoDB is also of not much use. This is because, Hazelcast mostly stores data in memory (although it also supports writing to disk). So, it offers high R/W speed with the limitation that data needs to be stored in memory.

Hazelcast also supports caching/storing complex data types and provides an interface to query them, unlike other data stores.

A comparison, however, can be made with Redis which also offers similar features.

Hazelcast vs Redis

In terms of features, both Redis and Hazelcast are very similar. However, following are the points where Hazelcast scores over Redis −

Built for Distributed Environment from ground-up − Unlike Redis, which started as single machine cache, Hazelcast, from the very beginning, has been built for distributed environment.
Simple cluster scale in/out − Maintaining a cluster where nodes are added or removed is very simple in case of Hazelcast, for example, adding a node is a matter of launching the node with the required configuration. Removing a node requires simple shutting down of the node. Hazelcast automatically handles partitioning of data, etc. Having the same setup for Redis and performing the above operation requires more precaution and manual efforts.
Less resources needs to support failover − Redis follows master-slave approach. For failover, Redis requires additional resources to setup Redis Sentinel. These Sentinel nodes are responsible to elevate a slave to master if the original master node goes down. In Hazelcast, all nodes are treated equal, failure of a node is detected by other nodes. So, the case of a node going down is handled pretty transparently and that too without any additional set of monitoring servers.
Simple Distributed Compute − Hazelcast, with its EntryProcessor, provides a simple interface to send the code to the data for parallel processing. This reduces data transfer over the wire. Redis also supports this, however, achieving this requires one to be aware of Lua scripting which adds additional learning curve.