# Compute Variance-Covariance Matrix using Python

This article was first published on **PyShark** and kindly contributed to python-bloggers.

In this article we will discuss how to compute a variance-covariance matrix using Python.

**Table of Contents**

- Introduction
- Variance-covariance matrix explained
- Create a sample DataFrame
- Compute variance-covariance matrix
- Conclusion

## Introduction

A variance-covariance matrix is a square matrix (it has the same number of rows and columns) that gives the covariance between each pair of variables in the data.

Covariance measures the extent to which two variables move together: it is positive when they tend to move in the same direction and negative when they tend to move in opposite directions.

In the variance-covariance matrix, the variances of the variables appear on the main diagonal and the covariances between pairs of variables fill all the other elements.

To continue following this tutorial we will need the following Python library: pandas.

If you don’t have it installed, please open “Command Prompt” (on Windows) and install it using the following command:

    pip install pandas

## Variance-Covariance Matrix Explained

A covariance matrix is:

- Symmetric: the square matrix is equal to its transpose, \( A = A^T \)
- Positive semi-definite
- With a main diagonal containing the variances (the covariance of each variable with itself)

$$
cov_{x,y,z} = \left[ \begin{array}{ccc}
cov_{x,x} & cov_{x,y} & cov_{x,z} \\
cov_{y,x} & cov_{y,y} & cov_{y,z} \\
cov_{z,x} & cov_{z,y} & cov_{z,z}
\end{array} \right]
= \left[ \begin{array}{ccc}
\sigma^2_{x} & \sigma_{xy} & \sigma_{xz} \\
\sigma_{yx} & \sigma^2_{y} & \sigma_{yz} \\
\sigma_{zx} & \sigma_{zy} & \sigma^2_{z}
\end{array} \right]
$$

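where each covariance is defined as an expectation and is estimated from a sample of \( N \) observations by the following formula (apply it to each pair of the variables x, y, z):

$$
cov_{x,y} = E[(X - E[X])(Y - E[Y])] \approx \frac{\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})}{N-1}
$$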
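The sample formula can be sketched in plain Python as follows. This is only an illustration: the function name `sample_covariance` and the input values are hypothetical and were chosen so that the sign of the result is easy to see.

```python
def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by N - 1)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means
    products = ((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sum(products) / (n - 1)


# Hypothetical values, chosen only to show the sign of the result
x = [1, 2, 3]
rising = [2, 4, 6]
falling = [6, 4, 2]

cov_rising = sample_covariance(x, rising)    # positive: the variables move together
cov_falling = sample_covariance(x, falling)  # negative: they move in opposite directions
print(cov_rising, cov_falling)  # prints 2.0 -2.0
```

Passing the same sequence twice, as in `sample_covariance(x, x)`, returns the sample variance, which is why the variances sit on the diagonal of the matrix.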

## Create a sample DataFrame

Let’s create a sample Pandas DataFrame with three variables (Age, Experience, and Salary) and three observations of each:

    import pandas as pd

    # Three observations of three variables
    df = pd.DataFrame(
        {'Age': [25, 32, 37], 'Experience': [2, 6, 9], 'Salary': [2000, 3000, 3500]}
    )
    print(df)

And we get:

       Age  Experience  Salary
    0   25           2    2000
    1   32           6    3000
    2   37           9    3500
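As a hand check on the covariance formula from the previous section, the Age-Experience covariance for these three observations can be worked out as follows. Here \( x \) stands for Age and \( y \) for Experience, and the intermediate values are rounded to two decimals, so this is an illustration rather than an exact derivation:

$$
\bar{x} = \frac{25 + 32 + 37}{3} \approx 31.33, \qquad \bar{y} = \frac{2 + 6 + 9}{3} \approx 5.67
$$

$$
cov_{x,y} \approx \frac{(-6.33)(-3.67) + (0.67)(0.33) + (5.67)(3.33)}{3 - 1} \approx \frac{42.33}{2} \approx 21.17
$$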

## Compute variance-covariance matrix using Python

Using the **.cov()** method of the Pandas DataFrame we are able to compute the variance-covariance matrix. By default it divides by N-1, matching the sample formula above:

    cov_matrix = df.cov()
    print(cov_matrix)

And we get:

                        Age   Experience         Salary
    Age           36.333333    21.166667    4583.333333
    Experience    21.166667    12.333333    2666.666667
    Salary      4583.333333  2666.666667  583333.333333
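To tie this result back to the formula and the properties listed earlier, here is a minimal sketch that rebuilds the matrix by hand and checks each property. It assumes NumPy is available (it is installed together with pandas); the variable names and the eigenvalue tolerance are choices made for this illustration and are not part of the original tutorial.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'Age': [25, 32, 37], 'Experience': [2, 6, 9], 'Salary': [2000, 3000, 3500]}
)
cov = df.cov().to_numpy()

# Rebuild the matrix from the formula: centre each column,
# then divide the matrix of cross-products by N - 1.
centered = (df - df.mean()).to_numpy()
manual = centered.T @ centered / (len(df) - 1)
matches_formula = np.allclose(cov, manual)

# NumPy equivalent; rowvar=False means that columns are the variables.
matches_numpy = np.allclose(cov, np.cov(df.to_numpy(), rowvar=False))

# Property checks
is_symmetric = np.allclose(cov, cov.T)
diagonal_is_variance = np.allclose(np.diag(cov), df.var().to_numpy())

# Positive semi-definite: no eigenvalue is negative beyond rounding error.
# The tolerance is an assumption of this sketch.
eigenvalues = np.linalg.eigvalsh(cov)
tolerance = 1e-8 * np.abs(eigenvalues).max()
is_positive_semidefinite = bool(np.all(eigenvalues >= -tolerance))

print(matches_formula, matches_numpy, is_symmetric,
      diagonal_is_variance, is_positive_semidefinite)
```

With only three observations the matrix has rank at most two, so one eigenvalue is zero up to floating-point error. That is why the positive semi-definite check compares against a small tolerance rather than exactly zero.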

## Conclusion

In this article we discussed how to compute a variance-covariance matrix using Python.

Feel free to leave comments below if you have any questions or have suggestions for some edits and check out more of my Statistics articles.

The post Compute Variance-Covariance Matrix using Python appeared first on PyShark.
