Tutorialspoint

Real Time Spark Project for Beginners: Hadoop, Spark, Docker

Building Real Time Data Pipeline Using Apache Kafka, Apache Spark, Hadoop, PostgreSQL, Django and Flexmonster on Docker

  Pari Margu

   Development, Development Tools, Docker

Language - English Published on 12/2020

Description

  • In many data centers, different type of servers generate large amount of data(events, Event in this case is status of the server in the data center) in real-time.
  • There is always a need to process these data in real-time and generate insights which will be used by the server/data center monitoring people and they have to track these server's status regularly and find the resolution in case of issues occurring, for better server stability.
  • Since the data is huge and coming in real-time, we need to choose the right architecture with scalable storage and computation frameworks/technologies.
  • Hence we want to build the Real Time Data Pipeline Using Apache Kafka, Apache Spark, Hadoop, PostgreSQL, Django and Flexmonster on Docker to generate insights out of this data.
  • The Spark Project/Data Pipeline is built using Apache Spark with Scala and PySpark on Apache Hadoop Cluster which is on top of Docker.
  • Data Visualization is built using Django Web Framework and Flexmonster.

Who this course is for:

  • Beginners who want to learn Apache Spark/Big Data Project Development Process and Architecture
  • Beginners who want to learn Real Time Streaming Data Pipeline Development Process and Architecture
  • Entry/Intermediate level Data Engineers and Data Scientist
  • Data Engineering and Data Science Aspirants
  • Data Enthusiast who want to learn, how to develop and run Spark Application on Docker
  • Anyone who is really willingness to become Big Data/Spark Developer

What Will I Get ?

  • Complete Development of Real Time Streaming Data Pipeline using Hadoop and Spark Cluster on Docker
  • Setting up Single Node Hadoop and Spark Cluster on Docker
  • Features of Spark Structured Streaming using Spark with Scala
  • Features of Spark Structured Streaming using Spark with Python(PySpark)
  • How to use PostgreSQL with Spark Structured Streaming
  • Basic understanding of Apache Kafka
  • How to build Data Visualisation using Django Web Framework and Flexmonster
  • Fundamentals of Docker and Containerization

Requirements

  • Basic understanding of Programming Language
  • Basic understanding of Apache Hadoop
  • Basic understanding of Apache Spark

    Feedbacks (0)

  • No Feedbacks Posted Yet..!

We make use of cookies to improve our user experience. By using this website, you agree with our Cookies Policy.