Tutorialspoint
PySpark Tutorial

Simply Easy Learning

Formats - PDF

Pages - 24

ISBN - TP00434

Categories - Development, Programming Languages, Python

Language - English

Published on 01/2015

Description

Apache Spark is written in the Scala programming language. To support Python, the Apache Spark community released PySpark, a tool that lets you work with RDDs (Resilient Distributed Datasets) from Python as well. This is made possible by a library called Py4J.

This is an introductory tutorial that covers the basics of PySpark and explains how to work with its various components and sub-components.
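To give a feel for what working with RDDs from Python looks like, here is a minimal sketch. It is not taken from the tutorial itself: the function name `sum_of_even_squares`, the app name `rdd-sketch`, and the `local[*]` master setting are illustrative assumptions, and running `main()` assumes that PySpark and a Java runtime are installed locally.

```python
def sum_of_even_squares(sc, numbers):
    """Distribute `numbers` as an RDD and sum the squares of the even values.

    `sc` is expected to be a pyspark.SparkContext (hypothetical helper,
    not part of the PySpark API).
    """
    rdd = sc.parallelize(numbers)              # build an RDD from a local collection
    evens = rdd.filter(lambda n: n % 2 == 0)   # transformation: evaluated lazily
    squares = evens.map(lambda n: n * n)       # transformation: evaluated lazily
    return squares.sum()                       # action: triggers the computation


def main():
    # Imported here so the helper above can be read without PySpark installed.
    from pyspark import SparkContext

    # "local[*]" runs Spark in-process using all available cores (assumed setup).
    sc = SparkContext("local[*]", "rdd-sketch")
    try:
        print(sum_of_even_squares(sc, range(1, 11)))
    finally:
        sc.stop()  # always release the context, even if the job fails
```

Calling `main()` starts a local Spark context, runs the small pipeline, and shuts the context down again. The filter and map steps only describe the computation; nothing runs until the `sum()` action is called.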

Audience

This tutorial is intended for professionals who aspire to build a career in programming languages and real-time processing frameworks. It aims to make readers comfortable getting started with PySpark and its various modules and submodules.

Prerequisites

Before proceeding with the concepts covered in this tutorial, readers are assumed to understand what a programming language and a framework are. In addition, a sound knowledge of Apache Spark, Apache Hadoop, the Scala programming language, the Hadoop Distributed File System (HDFS), and Python will be very helpful.
