List a few statistical methods available for a NumPy array


In this article, we will look at a few of the statistical functions that the NumPy library provides in Python.

Statistics deals with collecting and analyzing data. It covers methods for collecting samples, describing data, and drawing conclusions from data. NumPy is the core Python package for scientific computing, so it includes a set of statistical functions that operate directly on arrays.

NumPy has a number of statistical functions that can be used for statistical data analysis. Let us discuss a few of them here.

numpy.amin() and numpy.amax()

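These functions return the minimum and the maximum of the elements in the given array, optionally along a specified axis. For a two-dimensional array, axis 0 runs down the rows, so reducing along it gives one result per column, while axis 1 runs across the columns, so reducing along it gives one result per row.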

Example

import numpy as np

# input array
inputArray = np.array([[2, 6, 3], [1, 5, 4], [8, 12, 9]])
print('Input Array is:')
print(inputArray)
# Printing new line
print()
print("Minimum element in an array:", np.amin(inputArray))
print()
print("Maximum element in an array:", np.amax(inputArray))
print()
print('Minimum element in an array among axis 0(rows):')
print(np.amin(inputArray, 0))
print('Minimum element in an array among axis 1(columns):')
print(np.amin(inputArray, 1))
print()
print('Maximum element in an array among axis 0(rows):')
print(np.amax(inputArray, 0))
print()
print('Maximum element in an array among axis 1(columns):')
print(np.amax(inputArray, axis=1))
print()

Output

On executing, the above program will generate the following output −

Input Array is:
[[ 2  6  3]
 [ 1  5  4]
 [ 8 12  9]]

Minimum element in an array: 1

Maximum element in an array: 12

Minimum element in an array among axis 0(rows):
[1 5 3]
Minimum element in an array among axis 1(columns):
[2 1 8]

Maximum element in an array among axis 0(rows):
[ 8 12  9]

Maximum element in an array among axis 1(columns):
[ 6  5 12]
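
The axis argument is easiest to read as the axis that gets collapsed. The following is a minimal sketch that reuses the input array from the example above and rebuilds the per-column and per-row minimums by hand; the names colMin, rowMin, manualColMin and manualRowMin are illustrative and not part of NumPy.

```python
import numpy as np

inputArray = np.array([[2, 6, 3], [1, 5, 4], [8, 12, 9]])

# axis=0 collapses the rows, leaving one minimum per column
colMin = np.amin(inputArray, axis=0)
# axis=1 collapses the columns, leaving one minimum per row
rowMin = np.amin(inputArray, axis=1)

# Rebuild the same results from explicit columns and rows
manualColMin = [int(inputArray[:, j].min()) for j in range(inputArray.shape[1])]
manualRowMin = [int(row.min()) for row in inputArray]

print(colMin.tolist() == manualColMin)
print(rowMin.tolist() == manualRowMin)
```

Both comparisons should print True, which confirms that axis=0 produces column-wise results and axis=1 produces row-wise results.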

numpy.ptp()

The numpy.ptp() function returns the range (maximum - minimum) of values along an axis. The name ptp is an abbreviation for peak-to-peak.

Example

import numpy as np

# input array
inputArray = np.array([[2, 6, 3], [1, 5, 4], [8, 12, 9]])
print('Input Array is:')
print(inputArray)
print()
print('The peak to peak(ptp) values of an array')
print(np.ptp(inputArray))
print()
print('Range (maximum-minimum) of values along axis 1(columns):')
print(np.ptp(inputArray, axis=1))
print()
print('Range (maximum-minimum) of values along axis 0(rows):')
print(np.ptp(inputArray, axis=0))

Output

On executing, the above program will generate the following output −

Input Array is:
[[ 2  6  3]
 [ 1  5  4]
 [ 8 12  9]]

The peak to peak(ptp) values of an array
11

Range (maximum-minimum) of values along axis 1(columns):
[4 4 4]

Range (maximum-minimum) of values along axis 0(rows):
[7 7 6]
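
Since the peak-to-peak value is defined as maximum minus minimum, it can be cross-checked against numpy.amax() and numpy.amin(). The sketch below does this for axis 1 of the same input array; the variable names are illustrative.

```python
import numpy as np

inputArray = np.array([[2, 6, 3], [1, 5, 4], [8, 12, 9]])

# Range of each row, computed directly with ptp()
ptpRows = np.ptp(inputArray, axis=1)
# The same range, computed as maximum minus minimum
manualRows = np.amax(inputArray, axis=1) - np.amin(inputArray, axis=1)

print(ptpRows)
print(manualRows)
print(np.array_equal(ptpRows, manualRows))
```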

numpy.percentile()

A percentile (or centile) is a measure used in statistics that indicates the value below which a given percentage of the observations in a group falls.

The numpy.percentile() function computes the q-th percentile of the data along the given axis.

Syntax

numpy.percentile(a, q, axis)

Parameters

a: The input array
q: The percentile to compute, which must be between 0 and 100 inclusive
axis: The axis along which the percentile is calculated; if it is omitted, the percentile is computed over the flattened array

Example

import numpy as np

# input array
inputArray = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])
print('Input Array is:')
print(inputArray)
print()
print('Applying percentile() function to print 10th percentile:')
print(np.percentile(inputArray, 10))
print()
print('10th percentile of array along the axis 1(columns):')
print(np.percentile(inputArray, 10, axis=1))
print()
print('10th percentile of array along the axis 0(rows):')
print(np.percentile(inputArray, 10, axis=0))

Output

On executing, the above program will generate the following output −

Input Array is:
[[20 45 70]
 [30 25 50]
 [10 80 90]]

Applying percentile() function to print 10th percentile:
18.0

10th percentile of array along the axis 1(columns):
[25. 26. 24.]

10th percentile of array along the axis 0(rows):
[12. 29. 54.]
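
The q argument can also be a sequence, in which case one value is returned per requested percentile. As a small sketch on the same input array, the quartiles can be requested in a single call, and the 50th percentile can be compared with the median; the name quartiles is illustrative.

```python
import numpy as np

inputArray = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])

# Request the 25th, 50th and 75th percentiles of the flattened array at once
quartiles = np.percentile(inputArray, [25, 50, 75])
print(quartiles)

# The 50th percentile is the median of the data
print(np.isclose(np.percentile(inputArray, 50), np.median(inputArray)))
```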

numpy.median()

Median is defined as the value separating the higher half of a data sample from the lower half.

The numpy.median() function calculates the median of a one-dimensional or multi-dimensional array, either over all elements or along a given axis.

Example

import numpy as np

# input array
inputArray = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])
print('Input Array is:')
print(inputArray)
print()
# printing the median of an array
print('Median of an array:')
print(np.median(inputArray))
print()
print('Median of array along the axis 0(rows):')
print(np.median(inputArray, axis=0))
print()
print('Median of array along the axis 1(columns):')
print(np.median(inputArray, axis=1))

Output

On executing, the above program will generate the following output −

Input Array is:
[[20 45 70]
 [30 25 50]
 [10 80 90]]

Median of an array:
45.0

Median of array along the axis 0(rows):
[20. 45. 70.]

Median of array along the axis 1(columns):
[45. 30. 80.]
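
The examples above all have an odd number of values, so a single middle element exists. When the number of values is even, numpy.median() returns the average of the two middle values. The sketch below illustrates both cases with small made-up arrays; evenData and oddData are illustrative names.

```python
import numpy as np

# Even number of values: sorted order is 1, 2, 3, 4, so the middle pair is (2, 3)
evenData = np.array([4, 1, 3, 2])
evenMedian = np.median(evenData)
print(evenMedian)

# Odd number of values: sorted order is 1, 3, 4, so the middle value is 3
oddData = np.array([4, 1, 3])
oddMedian = np.median(oddData)
print(oddMedian)
```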

numpy.mean()

Arithmetic mean is the sum of elements along an axis divided by the number of elements.

The numpy.mean() function returns the arithmetic mean of the elements in the array. If an axis is specified, the mean is calculated along that axis.

Example

import numpy as np

# input array
inputArray = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])
print('Input Array is:')
print(inputArray)
print()
# printing the mean of an array
print('Mean of an array:')
print(np.mean(inputArray))
print()
print('Mean of an array along the axis 0(rows):')
print(np.mean(inputArray, axis=0))
print()
print('Mean of an array along the axis 1(columns):')
print(np.mean(inputArray, axis=1))

Output

On executing, the above program will generate the following output −

Input Array is:
[[20 45 70]
 [30 25 50]
 [10 80 90]]

Mean of an array:
46.666666666666664

Mean of an array along the axis 0(rows):
[20. 50. 70.]

Mean of an array along the axis 1(columns):
[45. 35. 60.]
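
The definition of the mean as a sum divided by a count can be checked directly. The following sketch recomputes the overall mean and the axis 0 means of the same input array by hand; overallMean and columnMeans are illustrative names.

```python
import numpy as np

inputArray = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])

# Overall mean: sum of all elements divided by the number of elements
overallMean = inputArray.sum() / inputArray.size
print(np.isclose(overallMean, np.mean(inputArray)))

# Mean along axis 0: sum down each column divided by the number of rows
columnMeans = inputArray.sum(axis=0) / inputArray.shape[0]
print(np.allclose(columnMeans, np.mean(inputArray, axis=0)))
```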

numpy.average()

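The numpy.average() function computes the weighted average of the elements of an array, with the weights given in a second array. If no weights are supplied, all elements are weighted equally and the result is the same as the arithmetic mean.

The function also accepts an axis parameter. If the axis is not specified, the average is taken over the flattened array.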

Example

import numpy as np

# input array
inputArray = np.array([1, 2, 3, 4])
print('Input Array is:')
print(inputArray)
print()
# printing the average of all elements in an array
print('Average of all elements in an array:')
print(np.average(inputArray))
print()

Output

On executing, the above program will generate the following output −

Input Array is:
[1 2 3 4]

Average of all elements in an array:
2.5
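
The example above does not pass any weights, so it behaves like a plain mean. As a rough sketch of the weighted case, the weights array below is a made-up assumption chosen only for illustration, and the result is compared with the formula sum(x * w) / sum(w).

```python
import numpy as np

inputArray = np.array([1, 2, 3, 4])
# Hypothetical weights, chosen only for illustration
weights = np.array([4, 3, 2, 1])

# Weighted average computed by NumPy
weightedAvg = np.average(inputArray, weights=weights)
# The same value from the definition: sum(x * w) / sum(w) = 20 / 10
manualAvg = (inputArray * weights).sum() / weights.sum()

print(weightedAvg)
print(manualAvg)
```

Because the larger weights sit on the smaller values, the weighted average (2.0) is lower than the unweighted average (2.5).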

Standard Deviation & Variance

Standard deviation

Standard deviation is the square root of the average of the squared deviations from the mean. The formula for standard deviation is as follows −

std = sqrt(mean(abs(x - x.mean())**2))

If the array is [1, 2, 3, 4], then its mean is 2.5. Hence the squared deviations are [2.25, 0.25, 0.25, 2.25]. Their sum is 5, so their mean is 5/4 = 1.25, and the standard deviation is sqrt(5/4), which is approximately 1.118033988749895.

Variance

Variance is the average of squared deviations, i.e., mean(abs(x - x.mean())**2). In other words, the standard deviation is the square root of variance.

Example

import numpy as np

# input data (a plain Python list; np.std and np.var accept array-like input)
inputArray = [1, 2, 3, 4]
# printing the standard deviation of array
print("Input Array =", inputArray)
print("Standard deviation of array = ", np.std(inputArray))
# printing the variance of array
print("Variance of array = ", np.var(inputArray))

Output

On executing, the above program will generate the following output −

Input Array = [1, 2, 3, 4]
Standard deviation of array =  1.118033988749895
Variance of array =  1.25
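
The formulas given above can be evaluated step by step and compared with np.var() and np.std(). This is a minimal sketch of that check for the array [1, 2, 3, 4]; the intermediate variable names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# Squared deviations from the mean: abs(x - x.mean())**2
squaredDeviations = np.abs(x - x.mean()) ** 2
print(squaredDeviations)

# Variance is the mean of the squared deviations
manualVar = squaredDeviations.mean()
# Standard deviation is the square root of the variance
manualStd = np.sqrt(manualVar)

print(np.isclose(manualVar, np.var(x)))
print(np.isclose(manualStd, np.std(x)))
```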

Conclusion

In this article, we used examples to study a few of the statistical functions available for a NumPy array: amin(), amax(), ptp(), percentile(), median(), mean(), average(), std(), and var().

Updated on: 20-Oct-2022
