Absolute Deviation and Absolute Mean Deviation using NumPy

In statistics, variability measures how dispersed the values in a sample are. Two related measures are the Absolute Deviation (the absolute difference between each value and the mean) and the Mean Absolute Deviation (MAD), which is the average of all absolute deviations. NumPy's mean and absolute functions make both straightforward to compute.

Formulas

$$\text{Absolute Deviation}_i = |x_i - \bar{x}|$$

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}|x_i - \bar{x}|$$

Where xᵢ is each data point, x̄ is the sample mean, and n is the sample size.

Absolute Deviation

The following example calculates the absolute deviation of each element in a data sample:

from numpy import mean, absolute

data = [12, 42, 53, 13, 112]

# Find mean value
M = mean(data)
print("Sample Mean Value =", M)

# Calculate absolute deviation for each element
print("\nData - Mean = Deviation")
for i in range(len(data)):
    dev = absolute(data[i] - M)
    print(f"  {data[i]:3d} - {M} = {round(dev, 2)}")
Sample Mean Value = 46.4

Data - Mean = Deviation
   12 - 46.4 = 34.4
   42 - 46.4 = 4.4
   53 - 46.4 = 6.6
   13 - 46.4 = 33.4
  112 - 46.4 = 65.6

Values above the mean (53, 112) and below the mean (12, 42, 13) all produce positive deviations after taking the absolute value. Without absolute values, these deviations would sum to zero.
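The zero-sum claim is easy to check. The following is a minimal sketch that reuses the sample data above and replaces the loop with array subtraction; the names signed_dev and abs_dev are illustrative and do not appear in the original example.

```python
import numpy as np

data = np.array([12, 42, 53, 13, 112])

# Signed deviations keep their sign, so positive and negative values cancel
signed_dev = data - data.mean()

# Absolute deviations discard the sign, so nothing cancels
abs_dev = np.absolute(signed_dev)

print("Signed deviations:  ", np.round(signed_dev, 2))
print("Sum is close to 0:  ", np.isclose(signed_dev.sum(), 0.0))
print("Absolute deviations:", np.round(abs_dev, 2))
```

The comparison uses np.isclose rather than an exact equality test because floating-point subtraction can leave a tiny nonzero remainder.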

Mean Absolute Deviation (MAD)

MAD is the average of all absolute deviations, a single number that summarizes the spread of the data. The following example computes it with an explicit loop:

from numpy import mean, absolute

data = [12, 42, 53, 13, 112]

# Find mean value
M = mean(data)
print("Sample Mean Value =", M)

# Calculate MAD manually
total = 0
for i in range(len(data)):
    dev = absolute(data[i] - M)
    total += dev  # accumulate unrounded values; round only for display

mad = total / len(data)
print("Mean Absolute Deviation:", round(mad, 2))
Sample Mean Value = 46.4
Mean Absolute Deviation: 28.88
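
As a check by hand, the same figure follows from the formula when the five sample values and the absolute deviations listed earlier are substituted in:

$$\bar{x} = \frac{12 + 42 + 53 + 13 + 112}{5} = \frac{232}{5} = 46.4$$

$$\mathrm{MAD} = \frac{34.4 + 4.4 + 6.6 + 33.4 + 65.6}{5} = \frac{144.4}{5} = 28.88$$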

Shorter NumPy Approach

NumPy can calculate MAD in a single line using vectorized operations:

from numpy import mean, absolute, array

data = array([12, 42, 53, 13, 112])

# One-line MAD calculation
mad = mean(absolute(data - mean(data)))
print("Mean Absolute Deviation:", round(mad, 2))
Mean Absolute Deviation: 28.88

data - mean(data) computes all deviations at once, absolute() makes them positive, and mean() averages them.
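
If the calculation is needed in more than one place, the one-liner can be wrapped in a small function. The sketch below is one possible way to do that; mean_abs_deviation is a hypothetical helper name, not a NumPy function, and the axis argument and the two-row table are assumptions added for illustration.

```python
import numpy as np

def mean_abs_deviation(values, axis=None):
    """Mean absolute deviation around the mean (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    # keepdims=True keeps the reduced axis so the mean broadcasts against arr
    center = arr.mean(axis=axis, keepdims=True)
    return np.absolute(arr - center).mean(axis=axis)

sample = [12, 42, 53, 13, 112]
print(round(float(mean_abs_deviation(sample)), 2))  # 28.88

# Two rows of made-up data to show a per-row MAD (axis=1)
table = np.array([[1, 2, 3], [10, 20, 60]])
print(mean_abs_deviation(table, axis=1))
```

With axis=None the function reduces over the whole array, matching the one-line version; with axis=1 it returns one MAD per row.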

Comparison: MAD vs Standard Deviation

| Measure | Formula | Sensitivity to Outliers |
|---|---|---|
| MAD | Mean of \|xᵢ − x̄\| | Less sensitive (linear) |
| Standard Deviation | Square root of mean of (xᵢ − x̄)² | More sensitive (squared) |
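
The difference in sensitivity can be seen by computing both measures with and without the largest value in the sample. This is a rough sketch, assuming that 112 is treated as the outlier purely for illustration; mad is a hypothetical helper, and np.std is used with its default population form (ddof=0) so that it matches the formula in the table.

```python
import numpy as np

def mad(values):
    """Mean absolute deviation around the mean (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    return np.absolute(arr - arr.mean()).mean()

with_outlier = np.array([12, 42, 53, 13, 112])
# Drop the largest value, treated here as the outlier for illustration
without_outlier = with_outlier[with_outlier != 112]

for label, sample in [("with 112", with_outlier), ("without 112", without_outlier)]:
    # np.std defaults to ddof=0 (population standard deviation)
    print(f"{label:12s} MAD = {mad(sample):6.2f}   std = {np.std(sample):6.2f}")
```

On this sample, removing the single large value shrinks the standard deviation by a larger factor than it shrinks MAD, which is consistent with the squared versus linear weighting in the table.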

Conclusion

Absolute Deviation measures each data point's distance from the mean, while MAD averages these distances into a single dispersion metric. NumPy's vectorized form, mean(absolute(data - mean(data))), computes MAD in one line. Because MAD weights deviations linearly rather than squaring them, it is less affected by outliers than standard deviation and is often the better choice when the data contains them.

Updated on: 2026-03-15T16:38:02+05:30
