Python Program to Calculate Standard Deviation


In this article, we will learn how to write a Python program that calculates the standard deviation of a dataset.

Consider a set of values plotted on coordinate axes. The standard deviation of this set of values measures how much the values vary around their mean. If the standard deviation is low, the values lie close to the mean. If the standard deviation is high, the values are dispersed farther from the mean.

It is defined as the square root of the variance of a dataset. There are two types of standard deviation, population and sample −

The population standard deviation is calculated from every data value of a population. Hence, it is a fixed value. The mathematical formula is defined as −

$$\mathrm{SD\:=\:\sqrt{\frac{\sum(X_i\:-\:X_m)^2}{n}}}$$

Where,

  • Xm is the mean of a dataset.

  • Xi is each element of the dataset.

  • n is the number of elements in the dataset.

The sample standard deviation, however, is a statistic calculated from only some of the data values of a population (a sample), so its value depends on the sample chosen. The mathematical formula is defined as −

$$\mathrm{SD\:=\:\sqrt{\frac{\sum(X_i\:-\:X_m)^2}{n\:-\:1}}}$$

Where,

  • Xm is the mean of the sample.

  • Xi is each element of the sample.

  • n is the number of elements in the sample.
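As a worked illustration of both formulas, take the dataset [2, 3, 4, 1, 2, 5], which is also used in the examples in this article. The figures below are rounded to four decimal places and are meant only as a hand check, treating the same six values first as a whole population and then as a sample −

$$\mathrm{X_m\:=\:\frac{2+3+4+1+2+5}{6}\:=\:\frac{17}{6}\:\approx\:2.8333}$$

$$\mathrm{\sum(X_i\:-\:X_m)^2\:=\:\frac{65}{6}\:\approx\:10.8333}$$

$$\mathrm{SD_{population}\:=\:\sqrt{\frac{65/6}{6}}\:\approx\:1.3437}$$

$$\mathrm{SD_{sample}\:=\:\sqrt{\frac{65/6}{6\:-\:1}}\:\approx\:1.4720}$$

Because the sample formula divides by n - 1 instead of n, the sample standard deviation of the same values is always the larger of the two.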

Input Output Scenarios

Let us now look at some input output scenarios for various sets of data −

Assume the dataset only contains positive integers −

Input: [2, 3, 4, 1, 2, 5]
Result: Population Standard Deviation: 1.3437096247164249
Sample Standard Deviation: 1.4719601443879744

Assume the dataset only contains negative integers −

Input: [-2, -3, -4, -1, -2, -5]
Result: Population Standard Deviation: 1.3437096247164249
Sample Standard Deviation: 1.4719601443879744

Assume the dataset contains both positive and negative integers −

Input: [-2, -3, -4, 1, 2, 5]
Result: Population Standard Deviation: 3.131382371342656
Sample Standard Deviation: 3.4303 (approximately)
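The first two scenarios give identical results because negating every value mirrors the data around zero without changing how far the values lie from their mean. A quick way to check this is sketched below using the standard library's statistics module, which is covered later in this article; the variable names are illustrative only.

```python
import math
import statistics

# Datasets from the first two scenarios
positives = [2, 3, 4, 1, 2, 5]
negatives = [-x for x in positives]

# Negating every value leaves the spread around the mean unchanged
same_population = math.isclose(statistics.pstdev(positives),
                               statistics.pstdev(negatives))
same_sample = math.isclose(statistics.stdev(positives),
                           statistics.stdev(negatives))

print(same_population, same_sample)
```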

Using Mathematical Formula

We have seen the formulas for standard deviation above; now let us look at a Python program that implements them on a dataset.

Example

In the following example, we import the math library and calculate the standard deviation of the dataset by applying the math.sqrt() function to its variance. Note that the sample version divides the sum of squared deviations by (n - 1) before taking the square root.

import math

# declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]

# find the mean of the dataset
sm = 0
for i in range(len(dataset)):
    sm += dataset[i]
mean = sm / len(dataset)

# sum of squared deviations from the mean
deviation_sum = 0
for i in range(len(dataset)):
    deviation_sum += (dataset[i] - mean) ** 2

# population standard deviation: divide by n
psd = math.sqrt(deviation_sum / len(dataset))

# sample standard deviation: divide by (n - 1)
ssd = math.sqrt(deviation_sum / (len(dataset) - 1))

# display output
print("Population standard deviation of the dataset is", psd)
print("Sample standard deviation of the dataset is", ssd)

Output

The output standard deviation obtained is as follows −

Population standard deviation of the dataset is 1.3437096247164249
Sample standard deviation of the dataset is 1.4719601443879744
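The same calculation can also be wrapped in reusable functions. The following is a minimal sketch rather than part of the original program: the names squared_deviation_sum, population_sd and sample_sd are hypothetical, and raising ValueError for too few values is an assumption about how such input should be handled.

```python
import math

def squared_deviation_sum(values):
    """Return the sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

def population_sd(values):
    """Population standard deviation: divide by n."""
    if len(values) < 1:
        raise ValueError("population_sd needs at least one value")
    return math.sqrt(squared_deviation_sum(values) / len(values))

def sample_sd(values):
    """Sample standard deviation: divide by n - 1."""
    if len(values) < 2:
        raise ValueError("sample_sd needs at least two values")
    return math.sqrt(squared_deviation_sum(values) / (len(values) - 1))

dataset = [2, 3, 4, 1, 2, 5]
print(round(population_sd(dataset), 4))  # 1.3437
print(round(sample_sd(dataset), 4))      # 1.472
```

Keeping the divisor as the only difference between the two functions makes it harder to misplace the "- 1", which must be subtracted from n before the division.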

Using std() function in numpy module

In this approach, we import the numpy module and calculate the population standard deviation by calling the numpy.std() function on the elements of a numpy array. By default, numpy.std() divides by n, which corresponds to the population formula.

Example

The following python program is implemented to calculate the standard deviation on the elements of a numpy array −

import numpy as np

# declare the dataset as a numpy array
dataset = np.array([2, 3, 4, 1, 2, 5])

# calculating standard deviation of the dataset
sd = np.std(dataset)

# display output
print("Population standard deviation of the dataset is", sd)

Output

The standard deviation is displayed as the following output −

Population standard deviation of the dataset is 1.3437096247164249
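numpy.std() also accepts a ddof ("delta degrees of freedom") argument. The divisor it uses is n - ddof, so passing ddof=1 should give the sample standard deviation. A minimal sketch, reusing the dataset from the example above (the variable names are illustrative) −

```python
import numpy as np

dataset = np.array([2, 3, 4, 1, 2, 5])

# ddof defaults to 0, so the divisor is n (population formula)
population_sd = np.std(dataset)

# ddof=1 changes the divisor to n - 1 (sample formula)
sample_sd = np.std(dataset, ddof=1)

print("Population standard deviation of the dataset is", population_sd)
print("Sample standard deviation of the dataset is", sample_sd)
```

For this dataset the two values come out at roughly 1.3437 and 1.4720 respectively.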

Using stdev() and pstdev() Functions in statistics module

The statistics module in Python provides two functions, stdev() and pstdev(), to calculate the standard deviation of a dataset. The stdev() function calculates the sample standard deviation, whereas the pstdev() function calculates the population standard deviation.

The parameters and return type of both functions are the same.

Example 1: Using stdev() Function

The python program to demonstrate the usage of stdev() function to find the sample standard deviation of a dataset is as follows −

import statistics as st

# declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]

# calculating sample standard deviation of the dataset
sd = st.stdev(dataset)

# display output
print("Standard Deviation of the dataset is", sd)

Output

The sample standard deviation of the dataset obtained as an output is as follows −

Standard Deviation of the dataset is 1.4719601443879744

Example 2: Using pstdev() Function

The python program to demonstrate the usage of pstdev() function to find the population standard deviation of a dataset is as follows −

import statistics as st

# declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]

# calculating population standard deviation of the dataset
sd = st.pstdev(dataset)

# display output
print("Standard Deviation of the dataset is", sd)

Output

The population standard deviation of the dataset obtained as an output is as follows −

Standard Deviation of the dataset is 1.3437096247164249
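One practical difference between the two functions is the minimum input size: because stdev() divides by n - 1, it needs at least two data points and raises statistics.StatisticsError otherwise, while pstdev() accepts a single value. The sketch below illustrates this; the single_value list and the stdev_failed flag are made up for the demonstration.

```python
import statistics as st

single_value = [5]

# pstdev() works on one data point; a single value has no spread
population_sd = st.pstdev(single_value)
print("Population standard deviation:", population_sd)

# stdev() needs at least two data points
stdev_failed = False
try:
    st.stdev(single_value)
except st.StatisticsError as error:
    stdev_failed = True
    print("stdev() could not be computed:", error)
```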

Updated on: 26-Oct-2022
