Tutorialspoint

Delta Lake with Apache Spark using Scala

Bigdata Engineer

Delta Lake with Apache Spark using Scala on Databricks platform

Updated on Mar, 2024

Language - English

Category - Development, Apache Spark, Scala

Lectures - 53

Resources - 2

Duration - 2 hours

Course Description

You will learn Delta Lake with Apache Spark using Scala on the Databricks platform.

Learn the latest Big Data Technology - Spark! And learn to use it with one of the most popular programming languages, Scala!

One of the most valuable technology skills is the ability to analyze huge data sets, and this course is specifically designed to bring you up to speed on one of the best technologies for this task, Apache Spark! The top technology companies like Google, Facebook, Netflix, Airbnb, Amazon, NASA, and more are all using Spark to solve their big data problems!

Spark can run some workloads up to 100x faster than Hadoop MapReduce, which has caused an explosion in demand for this skill! Because the Spark 3.0 DataFrame framework is so new, you can quickly become one of the most knowledgeable people in the job market!

Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs.
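
As a rough illustration of the points above, the sketch below writes a small Delta table, appends to it, and then reads the same table as both a batch and a streaming source. It is a minimal sketch, not the course's own code: the application name and the path `/tmp/delta/events_sketch` are hypothetical, and it assumes the Delta Lake library is available on the classpath (on Databricks it is preinstalled). It is written notebook-style, as you would paste it into a Databricks Scala cell or `spark-shell`.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: the Delta Lake library is on the classpath. On Databricks,
// getOrCreate() simply returns the notebook's existing session.
val spark = SparkSession.builder()
  .appName("DeltaLakeSketch") // hypothetical name
  .master("local[*]")         // local mode for trying this outside Databricks
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog",
    "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Hypothetical location for the example table.
val path = "/tmp/delta/events_sketch"

// Each write is committed as a single transaction in the table's log:
// first replace whatever is at the path, then append more rows.
spark.range(0, 5).write.format("delta").mode("overwrite").save(path)
spark.range(5, 10).write.format("delta").mode("append").save(path)

// The same table can be read as a batch DataFrame ...
val batchDf = spark.read.format("delta").load(path)

// ... or as a streaming source, using the same format and path.
val streamDf = spark.readStream.format("delta").load(path)

batchDf.orderBy("id").show()
println(s"Is streaming source: ${streamDf.isStreaming}")
```

The only difference between the batch and streaming reads is `read` versus `readStream`, which is one way to see what "unifies streaming and batch data processing" means in practice.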

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.
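
To make the high-level APIs a little more concrete, here is a small sketch that computes the same aggregate twice, once with the DataFrame API and once with Spark SQL. The data, column names, and view name are made up for illustration, and the code assumes a local Spark installation (on Databricks the existing session is reused).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.avg

val spark = SparkSession.builder()
  .appName("SparkSqlSketch") // hypothetical name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Made-up sample data: (name, dept, age).
val people = Seq(
  ("Ann", "eng", 34),
  ("Bob", "eng", 45),
  ("Cid", "ops", 29)
).toDF("name", "dept", "age")

// DataFrame API: average age per department.
val byDeptDf = people.groupBy("dept").agg(avg("age").as("avg_age"))

// Spark SQL: the same query against a temporary view.
people.createOrReplaceTempView("people")
val byDeptSql = spark.sql(
  "SELECT dept, AVG(age) AS avg_age FROM people GROUP BY dept")

byDeptDf.show()
byDeptSql.show()
```

Both forms go through the same optimized engine, so the choice between them is mostly a matter of style.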

Topics Included in the Course

  • Introduction to Delta Lake

  • Introduction to Data Lake

  • Key Features of Delta Lake

  • Introduction to Spark

  • Free Account creation in Databricks

  • Provisioning a Spark Cluster

  • Basics about notebooks

  • Dataframes

  • Create a table

  • Write a table

  • Read a table

  • Schema validation

  • Update table schema

  • Table Metadata

  • Delete from a table

  • Update a Table

  • Vacuum

  • History

  • Concurrency Control

  • Optimistic concurrency control

  • Migrate Workloads to Delta Lake

  • Optimize Performance with File Management

  • Auto Optimize

  • Optimize Performance with Caching

  • Delta and Apache Spark caching

  • Cache a subset of the data

  • Isolation Levels

  • Best Practices

  • Frequently Asked Interview Questions

About Databricks: 

Databricks lets you start writing Spark code instantly so you can focus on your data problems.

Goals

What will you learn in this course:

  • You will be able to learn Delta Lake with Apache Spark in a few hours
  • Basic to advanced knowledge of Delta Lake
  • Hands-on practice with Delta Lake
  • You will learn Delta Lake with Apache Spark using Scala on the Databricks platform
  • Learn how to leverage the power of Delta Lake in a Spark environment!
  • Learn about the Databricks platform!

Prerequisites

What are the prerequisites for this course?

  • Basic knowledge of Apache Spark, Scala, and SQL is required for this course

Curriculum

Check out the detailed breakdown of what’s inside the course

Introduction
52 Lectures
  • Course Introduction 03:21
  • Introduction to Delta Lake 01:30
  • Introduction to Data Lake 01:09
  • Key Features of Delta Lake 04:57
  • Elements of Delta Lake 03:18
  • Introduction to Spark 04:04
  • (Old) Free Account creation in Databricks 01:51
  • (New) Free Account creation in Databricks 01:50
  • Provisioning a Spark Cluster 02:14
  • Basics about notebooks 07:29
  • Dataframes 04:47
  • Download Code and Files
  • (Hands On) Create a table 06:38
  • (Hands On) Write a table 14:12
  • (Hands On) Read a table 06:52
  • Schema validation 02:49
  • (Hands On) Update table schema 03:01
  • Table Metadata 01:53
  • Delete from a table 01:44
  • Update a Table 02:10
  • Vacuum 01:59
  • History 01:34
  • Concurrency Control 01:08
  • Optimistic concurrency control 02:33
  • Migrate Workloads to Delta Lake 05:23
  • Optimize Performance with File Management 01:13
  • Auto Optimize 02:45
  • Optimize Performance with Caching 01:11
  • Delta and Apache Spark caching 03:26
  • Cache a subset of the data 01:37
  • Isolation Levels 01:06
  • Best Practices 02:56
  • FAQ (Interview Question on Optimization) 1 01:47
  • FAQ (Interview Question on Optimization) 2 01:50
  • FAQ (Interview Question on Optimization) 3 00:51
  • FAQ (Interview Question on Auto Optimize) 4 00:50
  • FAQ (Interview Question on Auto Optimize) 5 01:06
  • FAQ (Interview Question) 6 01:06
  • FAQ (Interview Question) 7 00:37
  • FAQ (Interview Question) 8 00:42
  • FAQ (Interview Question) 9 00:20
  • FAQ (Interview Question) 10 00:25
  • FAQ (Interview Question) 11 00:28
  • FAQ (Interview Question) 12 00:27
  • FAQ (Interview Question) 13 00:43
  • FAQ (Interview Question) 14 00:55
  • FAQ (Interview Question) 15 01:39
  • FAQ (Interview Question) 16 00:31
  • FAQ (Interview Question) 17 00:32
  • FAQ (Interview Question) 18 01:00
  • FAQ (Interview Question) 19 01:25
  • Thank you 00:20

Instructor Details

Bigdata Engineer

I am a Solution Architect with 12+ years of experience in the Banking, Telecommunication, and Financial Services industries, across a diverse range of roles in Credit Card, Payments, Data Warehouse, and Data Center programmes.

In my role as a Bigdata and Cloud Architect, I work as part of the Bigdata team to provide software solutions.

Responsibilities include:

- Support all Hadoop related issues
- Benchmark existing systems, analyse their challenges and bottlenecks, and propose the right solutions to eliminate them based on various Big Data technologies
- Analyse and define pros and cons of various technologies and platforms
- Define use cases, solutions and recommendations
- Define Big Data strategy
- Perform detailed analysis of business problems and technical environments
- Define pragmatic Big Data solution based on customer requirements analysis
- Define pragmatic Big Data Cluster recommendations
- Educate customers on various Big Data technologies to help them understand pros and cons of Big Data
- Data Governance
- Build Tools to improve developer productivity and implement standard practices

I am sure the knowledge in these courses can give you extra power to win in life.

All the best!!

Course Certificate

Use your certification to make a career change or to advance in your current career. Salaries are among the highest in the world.
