How to Calculate Covariance in MATLAB


In this article, we will explore how to calculate covariance in MATLAB. Before that, let us look at the basic theory of covariance and why it matters.

What is Covariance?

Covariance is a statistical measure that describes how two random variables vary together. In other words, it gives information about the direction of the linear relationship between the variables.

Covariance is primarily used to quantify how changes in one variable are associated with changes in another variable.

The covariance between two random variables ‘A’ and ‘B’ is specified as cov(A, B), and it can be calculated as follows:

$$\mathrm{cov(A,B)\:=\:\frac{1}{k}\displaystyle\sum\limits_{i=1}^k (a_i-E(A))(b_i-E(B))}$$

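Here, k is the number of observations, and E(A) and E(B) are the expected values (means) of the random variables A and B respectively. This form divides by k. MATLAB's ‘cov’ function divides by k - 1 by default, a choice that is covered in the section on normalization later in this article.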
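To make the formula concrete, here is a minimal sketch that evaluates it by hand and compares the result with the built-in function. The vectors A and B are made-up sample data, and the third argument 1 asks ‘cov’ to divide by k so that it matches the formula as written.

```matlab
% Two small sample vectors (hypothetical data for illustration)
A = [2 4 6 8];
B = [1 3 2 5];
k = numel(A);

% Covariance computed directly from the formula (divide by k)
covManual = sum((A - mean(A)) .* (B - mean(B))) / k;

% cov returns a 2-by-2 matrix; the off-diagonal entry is cov(A, B).
% The third argument 1 requests normalization by k instead of k - 1.
C = cov(A, B, 1);
covBuiltin = C(1, 2);

fprintf('Manual: %.4f, built-in: %.4f\n', covManual, covBuiltin);
```

Both values should agree up to floating-point rounding.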

The value of covariance can be positive, negative, or zero, and each case indicates a different type of relationship between the random variables.

The following points describe the nature of the relationship for each case:

  • Positive Covariance: A positive covariance indicates that when one variable increases, the other also tends to increase. Hence, the variables have a positive linear relationship.

  • Negative Covariance: A negative covariance indicates that when one variable increases, the other tends to decrease. Hence, the variables have a negative linear relationship.

  • Zero Covariance: A covariance of zero indicates that there is no linear relationship between the variables. However, this does not mean that the variables are unrelated. They could still have a nonlinear relationship.
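The three cases can be illustrated with a short sketch. The data below is invented for the purpose: one series rises with x, one falls, and one depends on x only through its square, so its covariance with x comes out as zero even though the two are clearly related.

```matlab
x = -3:3;                 % symmetric around zero (hypothetical data)

yUp   = 2*x + 1;          % increases with x
yDown = -2*x + 1;         % decreases as x increases
ySq   = x.^2;             % depends on x, but not linearly

cUp   = cov(x, yUp);      % each call returns a 2-by-2 matrix
cDown = cov(x, yDown);
cSq   = cov(x, ySq);

% The off-diagonal entry holds the covariance between the two inputs
fprintf('Increasing: %g\n', cUp(1, 2));
fprintf('Decreasing: %g\n', cDown(1, 2));
fprintf('Quadratic:  %g\n', cSq(1, 2));
```

The first value should be positive, the second negative, and the third zero.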

Covariance is widely used in statistics, data analysis, and finance, where it helps to analyze the relationship between random variables, measure their dependence, and estimate risk and diversification. It is also used in machine learning to analyze data and build data models.

With this brief overview of covariance in place, let us now discuss how to compute it in MATLAB. MATLAB provides a built-in function, ‘cov()’, that calculates the covariance between random variables.

The following sections describe different syntaxes of the ‘cov()’ function in MATLAB and their applications in MATLAB programming to calculate covariance.

Calculate Covariance of an Array

The following syntax is used to calculate the covariance of an array and obtain a covariance matrix. When X is a matrix, each column is treated as a variable and each row as an observation:

C = cov(X);

The implementation of this syntax is demonstrated in the following program.

Example

% MATLAB code to calculate covariance of an array
% Define an array
X = [2 4 6; 8 10 12; 14 16 18];
% Calculate covariance of the array
C = cov(X);
% Display the covariance result
disp('The covariance of the array X is:');
disp(C);

Output

The covariance of the array X is:
   36   36   36
   36   36   36
   36   36   36

Code Explanation

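In this MATLAB program, we first define an array ‘X’. Then, we use the ‘cov’ function to calculate its covariance matrix and store the result in the variable ‘C’. Finally, we display the result by using the ‘disp’ function. Every entry is 36 because the three columns of ‘X’ differ only by a constant offset, so they all vary in exactly the same way.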
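One way to read a covariance matrix is to check its entries against simpler calls. The sketch below reuses the same array and assumes nothing beyond the built-in ‘var’ and ‘cov’ functions: the diagonal should hold the variance of each column, and entry (1, 2) should match the covariance of columns 1 and 2 computed on their own.

```matlab
X = [2 4 6; 8 10 12; 14 16 18];
C = cov(X);

% Diagonal of the covariance matrix vs. the variance of each column
disp(diag(C)');
disp(var(X));

% Entry (1,2) vs. the covariance of columns 1 and 2 computed separately
C12 = cov(X(:, 1), X(:, 2));
disp(C12(1, 2));
```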

Calculate Covariance of Two Arrays

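To calculate the covariance between two arrays of the same size, we use the following syntax of the ‘cov’ function. When X and Y are matrices, MATLAB treats each of them as a single vector, which is equivalent to cov(X(:), Y(:)), so the result is a 2-by-2 matrix: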

C = cov(X, Y);

Let us see how this syntax is used in a MATLAB program.

Example

% MATLAB code to calculate covariance of two arrays
% Define two input arrays
X = [2 4 6; -3 5 7];
Y = [8 10 12; 2 -7 6];
% Calculate covariance of arrays
C = cov(X, Y);
% Display the covariance result
disp('The covariance of the arrays X and Y is:');
disp(C);

Output

The covariance of the arrays X and Y is:
   13.1000    4.1000
    4.1000   47.3667
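The shape of this result is worth checking on your own installation. As far as the MathWorks documentation describes it, ‘cov’ with two matrix inputs flattens each input into one vector and returns a 2-by-2 matrix. Other environments, such as some GNU Octave versions, may instead return covariances between the individual columns. A minimal check, under that assumption about MATLAB:

```matlab
X = [2 4 6; -3 5 7];
Y = [8 10 12; 2 -7 6];

C    = cov(X, Y);          % matrix inputs
Cvec = cov(X(:), Y(:));    % same data flattened to column vectors

% In MATLAB both calls are expected to give the same 2-by-2 matrix
disp(size(C));
disp(max(abs(C(:) - Cvec(:))));
```

Here C(1, 1) is the variance of all elements of X, C(2, 2) is the variance of all elements of Y, and the off-diagonal entries hold the covariance between the two.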

Calculate Covariance of an Array by Normalization

The following syntax of the ‘cov’ function is used to calculate the covariance of an array with a chosen normalization, selected by the flag ‘W’:

C = cov(X, W);

Here, if W = 1, the covariance matrix is normalized by the number of rows (observations) in the input array, and if W = 0, which is the default, it is normalized by the number of rows minus 1.

The following MATLAB program demonstrates the implementation of this syntax.

Example

% MATLAB code to calculate covariance of an array with normalization
% Define the input array
X = [2 4 6; 3 5 7; 9 8 5];
% Calculate covariance of array with normalization by W = 1
C1 = cov(X, 1);
% Calculate covariance of array with normalization by W = 0
C2 = cov(X, 0);
% Display the covariance result
disp('The covariance of the array X with W = 1 is:');
disp(C1);
disp('The covariance of the array X with W = 0 is:');
disp(C2);

Output

The covariance of the array X with W = 1 is:
    9.5556    5.2222   -2.0000
    5.2222    2.8889   -1.0000
   -2.0000   -1.0000    0.6667

The covariance of the array X with W = 0 is:
   14.3333    7.8333   -3.0000
    7.8333    4.3333   -1.5000
   -3.0000   -1.5000    1.0000

Code Explanation

In this MATLAB code, we start by defining an array ‘X’. Then, we calculate the covariance of ‘X’ with ‘W = 1’ and with ‘W = 0’ and store the results in ‘C1’ and ‘C2’ respectively. Finally, we use the ‘disp’ function to display the covariance matrices.
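Since the two normalizations differ only in the divisor, one result can be obtained from the other by rescaling. The following sketch checks this on the same array, and also checks that calling ‘cov’ without a flag matches W = 0.

```matlab
X = [2 4 6; 3 5 7; 9 8 5];
N = size(X, 1);             % number of observations (rows)

C0 = cov(X, 0);             % divides by N - 1 (the default)
C1 = cov(X, 1);             % divides by N

% Rescaling the N - 1 result should reproduce the N result
C1check = C0 * (N - 1) / N;
disp(max(abs(C1(:) - C1check(:))));

% With no flag, cov should behave like W = 0
Cdefault = cov(X);
disp(max(abs(Cdefault(:) - C0(:))));
```

Both displayed differences should be zero or negligibly small.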

Calculate Covariance of an Array with NaN Values

When the given array contains NaN values, the following syntax of the ‘cov’ function is used to calculate the covariance of the array:

C = cov(X, nanflag);

Here, the ‘nanflag’ option specifies how NaN values in the array are handled in the covariance calculation.

If nanflag = ‘includenan’, the ‘cov’ function includes the NaN values in the calculation, so every entry of the result that involves a column containing NaN is itself NaN.

If nanflag = ‘omitrows’, the ‘cov’ function omits every row that contains at least one NaN value before calculating the covariance.

The following MATLAB program demonstrates the use of the ‘cov’ function with the ‘nanflag’ option.

Example

% MATLAB code to calculate covariance of an array with nanflag option
% Define the input array with NaN values
X = [2 4 5; 3 5 7; NaN 7 NaN];
% Calculate covariance of array with includenan flag
C1 = cov(X, 'includenan');
% Calculate covariance of array with omitrows flag
C2 = cov(X, 'omitrows');
% Display the covariance results
disp('The covariance of the array with includenan is:');
disp(C1);
disp('The covariance of the array with omitrows is:');
disp(C2);

Output

The covariance of the array with includenan is:
       NaN       NaN       NaN
       NaN    2.3333       NaN
       NaN       NaN       NaN

The covariance of the array with omitrows is:
    0.5000    0.5000    1.0000
    0.5000    0.5000    1.0000
    1.0000    1.0000    2.0000

Code Explanation

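In this MATLAB program, we start by defining an array ‘X’ that contains NaN values. Then, we calculate the covariance of ‘X’ once with the NaN values included and once with the rows containing NaN omitted, and store the results in ‘C1’ and ‘C2’ respectively. Finally, we use the ‘disp’ function to display the covariance matrices. In ‘C1’, only the entry for the second column is a number, because that is the only column without a NaN value.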
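To see what ‘omitrows’ does, the rows can also be filtered by hand before calling ‘cov’. This is only a sketch of the equivalent manual step, and the variable names keep, Xclean, Cmanual, and Cflag are chosen here for illustration.

```matlab
X = [2 4 5; 3 5 7; NaN 7 NaN];

% Keep only the rows in which no element is NaN
keep   = ~any(isnan(X), 2);
Xclean = X(keep, :);

Cmanual = cov(Xclean);          % covariance of the filtered rows
Cflag   = cov(X, 'omitrows');   % built-in handling of NaN rows

disp(Xclean);
disp(max(abs(Cmanual(:) - Cflag(:))));
```

The third row is dropped entirely, including its valid value 7, which is why the result differs from what the second column alone gives with ‘includenan’.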

Updated on: 07-Aug-2023
