 
Hazelcast - Map Reduce & Aggregations
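MapReduce is a computation model for processing data sets that are too large for one machine and must be spread across several machines, i.e., a distributed environment. It has two phases: 'map', which turns each input record into one or more key-value pairs, and 'reduce', which groups those pairs by key and combines the values of each group into a single result.
Because Hazelcast already partitions data across the members of a cluster, MapReduce fits it naturally: each member can map the entries it owns before the intermediate results are reduced. The API used below lives in the com.hazelcast.mapreduce package of Hazelcast 3.x; it was deprecated in version 3.8 and removed in 4.0.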
Let's see how to do it with an example.
Suppose we have records that pair a car (brand and car number) with the owner of that car.
Honda-9235, John
Hyundai-235, Alice
Honda-935, Bob
Mercedes-235, Janice
Honda-925, Catnis
Hyundai-1925, Jane
And now, we have to figure out the number of cars for each brand, i.e., Hyundai, Honda, etc.
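Before distributing the work, it may help to see the same computation on a single JVM. The following is a minimal sketch in plain Java, not Hazelcast code: the class name `BrandCountSketch` and the method `countByBrand` are hypothetical names chosen for illustration. The stream's `map` call plays the role of the map phase (extract the brand from the key) and `groupingBy` with `counting` plays the role of the reduce phase.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical single-JVM baseline; nothing here touches Hazelcast.
public class BrandCountSketch {

    // Map step: derive the brand from each "brand-number" key.
    // Reduce step: count how many keys fall under each brand.
    static Map<String, Long> countByBrand(Map<String, String> vehicleOwners) {
        return vehicleOwners.keySet().stream()
            .map(key -> key.split("-")[0])
            .collect(Collectors.groupingBy(
                brand -> brand,
                TreeMap::new,            // sorted map, so printing is deterministic
                Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, String> vehicleOwners = new LinkedHashMap<>();
        vehicleOwners.put("Honda-9235", "John");
        vehicleOwners.put("Hyundai-235", "Alice");
        vehicleOwners.put("Honda-935", "Bob");
        vehicleOwners.put("Mercedes-235", "Janice");
        vehicleOwners.put("Honda-925", "Catnis");
        vehicleOwners.put("Hyundai-1925", "Jane");

        // prints {Honda=3, Hyundai=2, Mercedes=1}
        System.out.println(countByBrand(vehicleOwners));
    }
}
```

This works only while all the data fits in one process; the Hazelcast version that follows performs the same two steps across the members of a cluster.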
Example
Let's try to find that out using MapReduce −
package com.example.demo;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicInteger;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.ICompletableFuture;
import com.hazelcast.core.IMap;
import com.hazelcast.mapreduce.Context;
import com.hazelcast.mapreduce.Job;
import com.hazelcast.mapreduce.JobTracker;
import com.hazelcast.mapreduce.KeyValueSource;
import com.hazelcast.mapreduce.Mapper;
import com.hazelcast.mapreduce.Reducer;
import com.hazelcast.mapreduce.ReducerFactory;
public class MapReduce {
   public static void main(String[] args) throws ExecutionException,
   InterruptedException {
      try {
         // create two Hazelcast instances
         HazelcastInstance hzMember = Hazelcast.newHazelcastInstance();
         Hazelcast.newHazelcastInstance();
         // populate a distributed map with sample data: "brand-number" -> owner
         IMap<String, String> vehicleOwnerMap = hzMember.getMap("vehicleOwnerMap");
         vehicleOwnerMap.put("Honda-9235", "John");
         vehicleOwnerMap.put("Hyundai-235", "Alice");
         vehicleOwnerMap.put("Honda-935", "Bob");
         vehicleOwnerMap.put("Mercedes-235", "Janice");
         vehicleOwnerMap.put("Honda-925", "Catnis");
         vehicleOwnerMap.put("Hyundai-1925", "Jane");
         // wrap the map as the job's input source and create a job for it
         KeyValueSource<String, String> kvs = KeyValueSource.fromMap(vehicleOwnerMap);
         JobTracker tracker = hzMember.getJobTracker("vehicleBrandJob");
         Job<String, String> job = tracker.newJob(kvs);
         // submit the job to the cluster; the future completes with brand -> count
         ICompletableFuture<Map<String, Integer>> myMapReduceFuture =
            job.mapper(new BrandMapper())
            .reducer(new BrandReducerFactory()).submit();
         // block until the job has finished
         Map<String, Integer> result = myMapReduceFuture.get();
         System.out.println("Final output: " + result);
      } finally {
         Hazelcast.shutdownAll();
      }
   }
   private static class BrandMapper implements Mapper<String, String, String, Integer> {
      @Override
      public void map(String key, String value, Context<String, Integer>
      context) {
         context.emit(key.split("-", 0)[0], 1);
      }
   }
   private static class BrandReducerFactory implements ReducerFactory<String, Integer, Integer> {
      @Override
      public Reducer<Integer, Integer> newReducer(String key) {
         return new BrandReducer();
      }
   }
   private static class BrandReducer extends Reducer<Integer, Integer> {
      private AtomicInteger count = new AtomicInteger(0);
      @Override
      public void reduce(Integer value) {
         count.addAndGet(value);
      }
      @Override
      public Integer finalizeReduce() {
         return count.get();
      }
   }
}
Let's try to understand this code −
- We create Hazelcast members. In the example, we start two members in the same JVM, but the members could just as well run on separate machines.
- We create a map with dummy data and wrap it in a KeyValueSource, which the job reads its input from.
- We obtain a JobTracker, create a Map-Reduce job from it, and give the job the KeyValueSource as its data.
- We attach the mapper and the reducer factory, submit the job to the cluster, and wait for the returned future to complete.
- The mapper extracts the brand from the original key (the part before the hyphen) and emits the pair (brand, 1) for every entry.
- The emitted pairs are grouped by key, i.e., brand name. The reducer factory creates one reducer per brand, and that reducer sums the values it receives to produce the count for the brand.
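The steps above can be traced without a cluster. The sketch below is plain Java and does not use the Hazelcast API: `MapReducePhasesSketch`, `SumReducer`, and `run` are hypothetical names, and the in-memory list of emitted pairs merely stands in for what Hazelcast would move between members. It is meant only to show the emit, group-by-key, and per-key reduce sequence that `BrandMapper`, `BrandReducerFactory`, and `BrandReducer` take part in.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local simulation of the map, group, and reduce phases.
public class MapReducePhasesSketch {

    // Stand-in for a per-key reducer: accumulates values, then reports the total.
    static final class SumReducer {
        private int count;
        void reduce(int value) { count += value; }
        int finalizeReduce() { return count; }
    }

    static Map<String, Integer> run(List<String> keys) {
        // Map phase: emit (brand, 1) for every "brand-number" key.
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String key : keys) {
            emitted.add(new SimpleEntry<>(key.split("-")[0], 1));
        }

        // Grouping: one reducer per distinct brand, created on first use,
        // much like a reducer factory being asked for a reducer per key.
        Map<String, SumReducer> reducers = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : emitted) {
            reducers.computeIfAbsent(pair.getKey(), brand -> new SumReducer())
                    .reduce(pair.getValue());
        }

        // Finalize: collect each reducer's total into the result map.
        Map<String, Integer> result = new TreeMap<>();
        reducers.forEach((brand, reducer) -> result.put(brand, reducer.finalizeReduce()));
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
            "Honda-9235", "Hyundai-235", "Honda-935",
            "Mercedes-235", "Honda-925", "Hyundai-1925");
        // prints {Honda=3, Hyundai=2, Mercedes=1}
        System.out.println(run(keys));
    }
}
```

A `TreeMap` is used here only so that the printed order is predictable; the simulation says nothing about how Hazelcast orders or transports the intermediate pairs.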
Output
The output of the code −
Final output: {Mercedes=1, Hyundai=2, Honda=3}