Apache Flink - Flink vs Spark vs Hadoop



The following table compares three of the most popular big data frameworks: Apache Hadoop, Apache Spark, and Apache Flink.

| Feature | Apache Hadoop | Apache Spark | Apache Flink |
| --- | --- | --- | --- |
| Year of Origin | 2005 | 2009 | 2009 |
| Place of Origin | MapReduce (Google), Hadoop (Yahoo) | University of California, Berkeley | Technical University of Berlin |
| Data Processing Engine | Batch | Batch | Stream |
| Processing Speed | Slower than Spark and Flink | Up to 100x faster than Hadoop | Faster than Spark |
| Programming Languages | Java, C, C++, Ruby, Groovy, Perl, Python | Java, Scala, Python, and R | Java and Scala |
| Programming Model | MapReduce | Resilient Distributed Datasets (RDD) | Cyclic dataflows |
| Data Transfer | Batch | Batch | Pipelined and batch |
| Memory Management | Disk-based | JVM managed | Actively managed |
| Latency | High | Medium | Low |
| Throughput | Medium | High | High |
| Optimization | Manual | Manual | Automatic |
| API | Low-level | High-level | High-level |
| Streaming Support | N/A | Spark Streaming | Flink Streaming |
| SQL Support | Hive, Impala | Spark SQL | Table API and SQL |
| Graph Support | N/A | GraphX | Gelly |
| Machine Learning Support | N/A | MLlib (Spark ML) | FlinkML |
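
The difference between batch and pipelined data transfer in the comparison can be illustrated with a toy, single-process sketch in plain Python. It does not use the Hadoop, Spark, or Flink APIs, and every function name in it is hypothetical, chosen only for illustration. The batch stage builds its complete output before the next stage starts, while the pipelined stage hands over each record as soon as it is produced.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def tokenize_batch(lines: List[str]) -> List[str]:
    """Batch-style stage: read all input, then return all output at once."""
    return [word.lower() for line in lines for word in line.split()]


def tokenize_pipelined(lines: Iterable[str]) -> Iterator[str]:
    """Pipelined-style stage: emit each word as soon as it is available."""
    for line in lines:
        for word in line.split():
            yield word.lower()


def tracked_source(lines: Iterable[str], log: List[str]) -> Iterator[str]:
    """Hypothetical source that records which lines have been read so far."""
    for line in lines:
        log.append(line)
        yield line


lines = ["Flink Spark Hadoop", "flink spark", "flink"]

# Both styles compute the same word counts.
batch_counts = Counter(tokenize_batch(lines))
pipelined_counts = Counter(tokenize_pipelined(lines))

# The pipelined stage produces its first record after reading one line only.
read_log: List[str] = []
stream = tokenize_pipelined(tracked_source(lines, read_log))
first_word = next(stream)
lines_read_before_first_output = len(read_log)
```

Both variants yield identical counts; what differs is when downstream work can begin. In this sketch the pipelined version emits its first word after reading a single input line, which is the property that tends to lower latency in a streaming engine.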