Getting Started with Python

In the first chapter, we learned what web scraping is. In this chapter, let us see how to implement web scraping using Python.

Why Python for Web Scraping?

Python is a popular tool for implementing web scraping. The Python programming language is also used for other useful projects related to cyber security, penetration testing and digital forensics. Basic web scraping can be performed with Python's standard library alone, without using any third-party tool.
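
As a rough illustration of scraping with nothing beyond the standard library, the sketch below pulls link targets out of a small HTML string using the html.parser module. The class name LinkCollector and the sample page are assumptions made for this example; a real project would first download the page, for instance with urllib.request.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical sample page, used here instead of a live download
SAMPLE_HTML = """
<html><body>
  <a href="/about">About</a>
  <a href="https://example.com/news">News</a>
  <a name="top">No link here</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
print(collector.links)  # the two href values, in document order
```

The anchor without an href attribute is skipped, so only the two real links are collected.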

The Python programming language is gaining huge popularity, and the reasons that make it a good fit for web scraping projects are as follows −

Syntax Simplicity

Python has a simple, readable structure compared with many other programming languages. This makes testing easier and lets a developer focus more on programming.

Inbuilt Modules

Another reason for using Python for web scraping is its useful inbuilt and external libraries. With Python as the base, we can implement many web scraping tasks, such as downloading pages and parsing HTML.
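
One inbuilt module that often helps in scraping work is urllib.parse. The sketch below, which assumes a hypothetical page address and a made-up list of links, turns relative links into absolute ones and keeps only those on the same site.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical page address and links, chosen only for illustration
BASE_URL = "https://example.com/blog/index.html"
raw_links = ["/about", "post-1.html", "https://example.org/feed"]

# Resolve every link against the address of the page it was found on
absolute_links = [urljoin(BASE_URL, link) for link in raw_links]

# Keep only the links that point to the same host as the page itself
base_host = urlparse(BASE_URL).netloc
same_site = [url for url in absolute_links if urlparse(url).netloc == base_host]

print(absolute_links)
print(same_site)
```

Resolving links this way lets a scraper follow them regardless of how they were written in the page.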

Open Source Programming Language

Python has huge support from the community because it is an open source programming language.

Wide Range of Applications

Python can be used for various programming tasks ranging from small shell scripts to enterprise web applications.

Installation of Python

Python distributions are available for platforms like Windows, macOS and Unix/Linux. We need to download only the binary distribution applicable to our platform to install Python. If a binary distribution for our platform is not available, we need a C compiler so that the source code can be compiled manually.

We can install Python on various platforms as follows −

Installing Python on Unix and Linux

Follow the steps given below to install Python on Unix/Linux machines −

Step 1 − Go to the link

Step 2 − Download the zipped source code available for Unix/Linux from the above link.

Step 3 − Extract the files onto your computer.

Step 4 − Use the following commands to complete the installation −

./configure
make
make install

You can find installed Python at the standard location /usr/local/bin and its libraries at /usr/local/lib/pythonXX, where XX is the version of Python.
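
To confirm where an installation actually ended up, we can ask the interpreter itself. The following is a minimal sketch that relies only on the standard sys and sysconfig modules; the variable names are chosen for this example, and the printed paths depend on the machine.

```python
import sys
import sysconfig

# Full path of the interpreter that is running this code
interpreter_path = sys.executable

# Major.minor version, the "XX" part of the pythonXX library directory
version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)

# Directory that holds the standard library for this interpreter
stdlib_dir = sysconfig.get_paths()["stdlib"]

print("Interpreter:", interpreter_path)
print("Version:", version)
print("Standard library:", stdlib_dir)
```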

Installing Python on Windows

Follow the steps given below to install Python on Windows machines −

Step 1 − Go to the link

Step 2 − Download the Windows installer python-XYZ.msi file, where XYZ is the version we need to install.

Step 3 − Save the installer file to your local machine.

Step 4 − Run the downloaded MSI file to bring up the Python install wizard.

Installing Python on Macintosh

We can use Homebrew to install Python 3 on macOS. Homebrew is a great package installer and is easy to install.

Homebrew can be installed by using the following command −

$ ruby -e "$(curl -fsSL"

For updating the package manager, we can use the following command −

$ brew update

With the help of the following command, we can install Python 3 on our Mac machine −

$ brew install python3

Setting Up the PATH

You can use the following instructions to set up the path on various environments −

Setting Up the Path on Unix/Linux

Use the following commands for setting up paths using various command shells −

For csh shell

setenv PATH "$PATH:/usr/local/bin"

For bash shell (Linux)

export PATH="$PATH:/usr/local/bin"

For sh or ksh shell

PATH="$PATH:/usr/local/bin"; export PATH

Setting Up the Path on Windows

To set the path on Windows, type path %path%;C:\Python at the command prompt and press Enter. Here, C:\Python is assumed to be the directory in which Python is installed.
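
Whether the path has been set up correctly can also be checked from Python. The sketch below is one possible check, assuming the interpreter is named python or python3; it lists the directories on PATH and asks shutil.which which executable the shell would pick.

```python
import os
import shutil

# Split the PATH variable into its directories (":" on Unix, ";" on Windows)
path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]

# Look up the interpreter the same way a command shell would
python_on_path = shutil.which("python") or shutil.which("python3")

for directory in path_dirs:
    print(directory)

if python_on_path is None:
    print("No python executable was found on PATH")
else:
    print("python resolves to:", python_on_path)
```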

Running Python

We can start Python using any of the following three ways −

Interactive Interpreter

Any operating system that provides a command-line interpreter or shell, such as UNIX or DOS, can be used for starting Python.

We can start coding in the interactive interpreter as follows −

Step 1 − Enter python at the command line.

Step 2 − Then, we can start coding right away in the interactive interpreter.

$ python # Unix/Linux
% python # Unix/Linux
C:\> python # Windows/DOS
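
Once the interpreter prompt appears, statements run as soon as they are entered. The lines below are a small sample of what might be typed, one line at a time; the variable names are only illustrative.

```python
# Lines to type at the interpreter prompt, one at a time
import sys

greeting = "Hello from Python {}".format(sys.version_info.major)
print(greeting)

# Results can be reused immediately in later statements
words = greeting.split()
print(len(words))
```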

Script from the Command-line

We can execute a Python script at the command line by invoking the interpreter on it, as follows. Here, script.py is a placeholder for the name of our script file −

$ python script.py # Unix/Linux
% python script.py # Unix/Linux
C:\> python script.py # Windows/DOS
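
A script run this way can read whatever follows its name on the command line. The following is a minimal sketch of such a script; the function name describe_args and the usage text are assumptions made for this example.

```python
import sys


def describe_args(argv):
    """Return one line per command-line argument, or a usage hint if none."""
    if not argv:
        return ["usage: python script.py ARG [ARG ...]"]
    return ["argument {}: {}".format(i, arg) for i, arg in enumerate(argv, start=1)]


if __name__ == "__main__":
    # sys.argv[0] is the script name, so the real arguments start at index 1
    for line in describe_args(sys.argv[1:]):
        print(line)
```

Saved under a name of our choice and started with two arguments, the script would print one line for each of them.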

Integrated Development Environment

We can also run Python from a GUI environment if the system has a GUI application that supports Python. Some IDEs that support Python on various platforms are given below −

IDE for UNIX − IDLE is the Python IDE for UNIX.

IDE for Windows − Windows has the PythonWin IDE, which also provides a GUI.

IDE for Macintosh − Macintosh has the IDLE IDE, which is downloadable as either MacBinary or BinHex'd files from the main website.