Apache Flink - Batch vs Real-time Processing

In terms of Big Data, there are two types of processing −

Batch Processing
Real-time Processing

Processing data that has been collected over a period of time is called Batch Processing. For example, a bank manager wants to process the past month's data (collected over time) to find out how many cheques were cancelled during that month.

Processing data as soon as it arrives to produce an instant result is called Real-time Processing. For example, a bank manager receives a fraud alert immediately after a fraudulent transaction has occurred (instant result).
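The two bank examples above can be sketched in plain Python. This is only an illustration and does not use the Flink API; the `Transaction` type, the function names and the 10,000 alert threshold are assumptions made up for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Transaction:
    # Hypothetical record shape, invented for this illustration.
    account: str
    amount: float
    cheque_cancelled: bool = False


def count_cancelled_cheques(history: List[Transaction]) -> int:
    """Batch style: runs once over data that has already been collected."""
    return sum(1 for t in history if t.cheque_cancelled)


def fraud_alerts(stream: Iterable[Transaction],
                 limit: float = 10_000.0) -> Iterator[str]:
    """Real-time style: checks each event as it arrives and alerts at once."""
    for t in stream:
        # Assumed rule: any amount above the limit is treated as suspicious.
        if t.amount > limit:
            yield f"ALERT: {t.account} moved {t.amount:.2f}"
```

The batch function needs the whole month's list before it can answer, whereas the generator produces an alert as soon as the offending event is read, without waiting for the rest of the input.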

The table given below lists the differences between Batch and Real-Time Processing −

Batch Processing	Real-Time Processing
Static Files	Event Streams
Processed periodically (every minute, hour, day, etc.)	Processed immediately, typically within milliseconds
Past data on disk storage	In Memory Storage
Example − Bill Generation	Example − ATM Transaction Alert
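The rows of the table can be illustrated with another small Python sketch, again without the Flink API. The record shape `(customer_id, amount)`, the function names and the daily limit of 500 are hypothetical choices made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record shape: (customer_id, amount)
Record = Tuple[str, float]


def generate_bills(stored_records: List[Record]) -> Dict[str, float]:
    """Batch style: a periodic job totals records already saved to storage."""
    bills: Dict[str, float] = defaultdict(float)
    for customer, amount in stored_records:
        bills[customer] += amount
    return dict(bills)


def atm_alerts(events: Iterable[Record],
               daily_limit: float = 500.0) -> Iterator[str]:
    """Stream style: keeps a running total in memory and reacts per event."""
    withdrawn: Dict[str, float] = defaultdict(float)
    for customer, amount in events:
        withdrawn[customer] += amount
        # Alert the moment the running total crosses the assumed limit.
        if withdrawn[customer] > daily_limit:
            yield f"{customer} exceeded limit: {withdrawn[customer]:.2f}"
```

Bill generation reads a finished set of records in one pass, while the ATM alert keeps only a small running total per customer in memory and updates it on every event.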

These days, real-time processing is widely used across organizations. Use cases like fraud detection, real-time alerts in healthcare and network attack alerts require data to be processed the moment it arrives; a delay of even a few milliseconds can have a huge impact.

An ideal tool for such real-time use cases is one that takes its input as a stream rather than in batches. Apache Flink is such a real-time processing tool.
