Machine Learning with Apache Spark 3.0 using Scala
Created by Bigdata Engineer, Last Updated 15-Jan-2021, Language: English
Machine Learning with Apache Spark 3.0 using Scala with Examples and 4 Projects
What Will I Get?
- Fundamental knowledge of Machine Learning with Apache Spark using Scala
- Learn Machine Learning through hands-on projects, then run them on Databricks cloud computing services (free service).
- Build four Apache Spark Machine Learning projects
- Explore Apache Spark and Machine Learning on the Databricks platform.
- Launch a Spark cluster
- Create a Data Pipeline
- Process that data using a Machine Learning model (Spark ML Library)
- Hands-on learning
- Real-time Use Case
Requirements
- Some programming experience and fundamental knowledge of Scala are required.
- Fundamental knowledge of Spark is mandatory.
Description
"Big data" analysis is a highly valued skill, and this course teaches one of the most widely used technologies in big data: Apache Spark. Employers including Amazon, eBay, NASA, Yahoo, and many more use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster. You'll learn those same techniques on your own computer, right at home.
So, what are we going to cover in this course?
Learn and master the art of Machine Learning through hands-on projects, and then run them on Databricks cloud computing services (free service). The course covers the following topics:
1) Overview
2) What is Spark ML
3) Types of Machine Learning
4) Steps Involved in a Machine Learning Program
5) Basic Statistics
6) Data Sources
7) Pipelines
8) Extracting, transforming and selecting features
9) Classification and Regression
10) Clustering
Projects:
1) Will it Rain Tomorrow in Australia
2) Railway train arrival delay prediction
3) Predict the class of the Iris flower based on available attributes
4) Mall Customer Segmentation (K-means Clustering)
To get started with the course, you first need to set up your environment.
The first thing you need is a web browser (the latest version of Google Chrome, Firefox, Safari, or Microsoft Edge) on a Windows, Linux, or macOS desktop.
The course is entirely hands-on learning in the Databricks environment.
Course Content
Introduction (4 Lectures, 00:19:54)
- Introduction (00:07:14)
- Overview (00:00:59)
- What is Spark ML? (00:03:13)
- Introduction to Machine Learning (00:08:28)
Apache Spark Basics (Optional) (11 Lectures, 01:31:42)
- Introduction to Spark (00:07:21)
- Free Account Creation in Databricks (00:01:51)
- Provisioning a Spark Cluster (00:02:14)
- Basics about Notebooks (00:07:29)
- Why Should We Learn Apache Spark? (00:03:08)
- Spark RDD (Create and Display Practical) (00:18:21)
- Spark DataFrame (Create and Display Practical) (00:12:06)
- Anonymous Functions in Scala (00:04:38)
- Extra (Optional on Spark DataFrame) (00:04:47)
- Extra (Optional on Spark DataFrame) in Detail (00:12:46)
- Spark Datasets (Create and Display Practical) (00:17:01)
Apache Spark Machine Learning (52 Lectures, 05:49:39)
- Types of Machine Learning (00:01:56)
- Steps Involved in a Machine Learning Program (00:02:50)
- Spark MLlib (00:02:07)
- Importing Notebook and Data Upload (00:01:42)
- Basic Statistics: Correlation (00:02:40)
- Data Sources (00:00:38)
- Data Source CSV File (00:08:53)
- Data Source JSON File (00:06:21)
- Data Source LIBSVM File (00:03:52)
- Data Source Image File (00:04:44)
- Data Source Avro File (00:02:22)
- Data Source Parquet File (00:02:50)
- Machine Learning Data Pipeline Overview (00:10:54)
- Machine Learning Project as an Example (Just for Basic Idea) (00:01:22)
- Machine Learning Pipeline Example Project (Will it Rain Tomorrow in Australia) 1 (00:09:16)
- Machine Learning Pipeline Example Project (Will it Rain Tomorrow in Australia) 2 (00:13:09)
- Machine Learning Pipeline Example Project (Will it Rain Tomorrow in Australia) 3 (00:07:46)
- Components of a Machine Learning Pipeline (00:05:17)
- Extracting, Transforming and Selecting Features (00:01:00)
- TF-IDF (Feature Extractor) (00:09:47)
- Word2Vec (Feature Extractor) (00:04:48)
- CountVectorizer (Feature Extractor) (00:04:09)
- FeatureHasher (Feature Extractor) (00:05:15)
- Tokenizer (Feature Transformer) (00:06:25)
- StopWordsRemover (Feature Transformer) (00:04:01)
- n-gram (Feature Transformer) (00:03:41)
- Binarizer (Feature Transformer) (00:03:53)
- PCA (Feature Transformer) (00:03:36)
- Polynomial Expansion (Feature Transformer) (00:03:31)
- Discrete Cosine Transform (DCT) (Feature Transformer) (00:03:05)
- StringIndexer (Feature Transformer) (00:03:07)
- IndexToString (Feature Transformer) (00:02:51)
- OneHotEncoder (Feature Transformer) (00:02:38)
- SQLTransformer (Feature Transformer) (00:03:12)
- VectorAssembler (Feature Transformer) (00:03:40)
- RFormula (Feature Selector) (00:04:02)
- ChiSqSelector (Feature Selector) (00:04:38)
- Classification Model (00:02:43)
- Decision Tree Classifier Project (00:23:03)
- Logistic Regression Model (a classification model, despite "regression" in the name) (00:13:48)
- Naive Bayes Project (Iris flower class prediction) (00:19:43)
- Random Forest Classifier Project (00:09:01)
- Gradient-Boosted Tree Classifier Project (00:13:33)
- Linear Support Vector Machine Project (00:11:18)
- One-vs-Rest Classifier (a.k.a. One-vs-All) Project (00:12:01)
- Regression Model (00:01:42)
- Linear Regression Model Project (00:12:36)
- Decision Tree Regression Model Project (00:14:47)
- Random Forest Regression Model Project (00:12:59)
- Gradient-Boosted Tree Regression Model Project (00:13:25)
- Clustering KMeans Project (Mall Customer Segmentation) (00:20:28)
- Explanation of a Few Terms Used in the Model (00:02:34)
Download Resources (2 Lectures, 00:00:20)
- Download Resources
- Thank you (00:00:20)

Bigdata Engineer
I am a Solution Architect with 12+ years of experience in the Banking, Telecommunication and Financial Services industries, across a diverse range of roles in Credit Card, Payments, Data Warehouse and Data Center programmes.
In my role as Bigdata and Cloud Architect, I work as part of the Bigdata team to provide software solutions.
Responsibilities include:
- Support all Hadoop-related issues
- Benchmark existing systems, analyse their challenges and bottlenecks, and propose the right solutions to eliminate them based on various Big Data technologies
- Analyse and define the pros and cons of various technologies and platforms
- Define use cases, solutions and recommendations
- Define Big Data strategy
- Perform detailed analysis of business problems and technical environments
- Define pragmatic Big Data solutions based on customer requirements analysis
- Define pragmatic Big Data Cluster recommendations
- Educate customers on various Big Data technologies to help them understand pros and cons of Big Data
- Data Governance
- Build Tools to improve developer productivity and implement standard practices
I am sure the knowledge in these courses can give you extra power to win in life.
All the best!!