# Correlation heatmap

### Question :

I want to represent a correlation matrix using a heatmap. R has something called a correlogram, but I don't think there is an equivalent in Python.

How can I do this? The values go from -1 to 1, for example:

``````[[ 1.          0.00279981  0.95173379  0.02486161 -0.00324926 -0.00432099]
[ 0.00279981  1.          0.17728303  0.64425774  0.30735071  0.37379443]
[ 0.95173379  0.17728303  1.          0.27072266  0.02549031  0.03324756]
[ 0.02486161  0.64425774  0.27072266  1.          0.18336236  0.18913512]
[-0.00324926  0.30735071  0.02549031  0.18336236  1.          0.77678274]
[-0.00432099  0.37379443  0.03324756  0.18913512  0.77678274  1.        ]]
``````

Based on another question, I was able to produce a heatmap, but my values get 'cut' at 0, so values below 0 are not represented adequately. I would like a map that goes from blue (-1) to red (1), or something similar. Here's the code for that:

``````plt.imshow(correlation_matrix,cmap='hot',interpolation='nearest')
``````

### Answer :

Another alternative is to use the `heatmap` function in seaborn to plot the correlation matrix. This example uses the Auto data set from the ISLR package in R (the same as in the example you showed).

``````import pandas.rpy.common as com  # note: pandas.rpy was removed from later pandas releases
import seaborn as sns
%matplotlib inline

# load the R package ISLR
infert = com.importr("ISLR")

# load the Auto data set
auto_df = com.load_data('Auto')

# calculate the correlation matrix
corr = auto_df.corr()

# plot the heatmap
sns.heatmap(corr,
            xticklabels=corr.columns,
            yticklabels=corr.columns)
``````

If you want to be even fancier, you can use Pandas Style, for example:

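``````cmap = sns.diverging_palette(5, 250, as_cmap=True)

def magnify():
    # table styles: small text by default, enlarged on hover
    return [dict(selector="th",
                 props=[("font-size", "7pt")]),
            dict(selector="td",
                 props=[('padding', "0em 0em")]),
            dict(selector="th:hover",
                 props=[("font-size", "12pt")]),
            dict(selector="tr:hover td:hover",
                 props=[('max-width', '200px'),
                        ('font-size', '12pt')])
            ]

# `corr` is the correlation DataFrame computed above
(corr.style.background_gradient(cmap, axis=1)
     .set_properties(**{'max-width': '80px', 'font-size': '10pt'})
     .set_caption("Hover to magnify")
     .format(precision=2)  # older pandas versions: .set_precision(2)
     .set_table_styles(magnify()))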
``````

If your data is in a Pandas DataFrame, you can use Seaborn's `heatmap` function to create your desired plot.

``````import seaborn as sns

Var_Corr = df.corr()
# plot the heatmap and annotation on it
sns.heatmap(Var_Corr, xticklabels=Var_Corr.columns, yticklabels=Var_Corr.columns, annot=True)
``````

[Figure: correlation plot]

From the question, it looks like the data is in a NumPy array. If that array is named `numpy_data`, put it into a Pandas DataFrame before using the step above:

``````import pandas as pd
df = pd.DataFrame(numpy_data)
``````
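Putting the two steps together, here is a minimal sketch. It assumes `numpy_data` already holds the correlation matrix from the question, so `.corr()` is not applied a second time. The column labels (`var0` ... `var5`) are made up for illustration, and `vmin`, `vmax` and the `coolwarm` colormap are one possible way to get the blue (-1) to red (1) scale the question asks for.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Correlation matrix from the question, as a NumPy array.
numpy_data = np.array([
    [1.0, 0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099],
    [0.00279981, 1.0, 0.17728303, 0.64425774, 0.30735071, 0.37379443],
    [0.95173379, 0.17728303, 1.0, 0.27072266, 0.02549031, 0.03324756],
    [0.02486161, 0.64425774, 0.27072266, 1.0, 0.18336236, 0.18913512],
    [-0.00324926, 0.30735071, 0.02549031, 0.18336236, 1.0, 0.77678274],
    [-0.00432099, 0.37379443, 0.03324756, 0.18913512, 0.77678274, 1.0],
])

# Hypothetical labels; replace them with your real variable names.
labels = [f"var{i}" for i in range(numpy_data.shape[0])]
df = pd.DataFrame(numpy_data, index=labels, columns=labels)

# The data is already a correlation matrix, so plot it directly.
# vmin/vmax fix the color scale to the full correlation range.
fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(df, vmin=-1, vmax=1, cmap="coolwarm",
            annot=True, fmt=".2f", square=True, ax=ax)

# plt.show()  # uncomment when running interactively
```

If `numpy_data` holds raw observations instead, call `df.corr()` first and pass the result to `sns.heatmap`.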

The code below will produce a correlation heatmap for the sample data:

``````import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# A list with your data slightly edited
l = [1.0,0.00279981,0.95173379,0.02486161,-0.00324926,-0.00432099,
0.00279981,1.0,0.17728303,0.64425774,0.30735071,0.37379443,
0.95173379,0.17728303,1.0,0.27072266,0.02549031,0.03324756,
0.02486161,0.64425774,0.27072266,1.0,0.18336236,0.18913512,
-0.00324926,0.30735071,0.02549031,0.18336236,1.0,0.77678274,
-0.00432099,0.37379443,0.03324756,0.18913512,0.77678274,1.00]

# Split list
n = 6
data = [l[i:i + n] for i in range(0, len(l), n)]

# A dataframe
df = pd.DataFrame(data)

def CorrMtx(df, dropDuplicates = True):

    # If you have a dataset where you need to include the calculation
    # of a correlation matrix, just uncomment the line below:
    # df = df.corr()

    # Exclude duplicate correlations by masking upper right values
    if dropDuplicates:
        mask = np.zeros_like(df, dtype=bool)
        mask[np.triu_indices_from(mask)] = True

    # Set background color / chart style
    sns.set_style(style = 'white')

    # Set up matplotlib figure
    f, ax = plt.subplots(figsize=(11, 9))

    # Add diverging colormap from blue (negative) to red (positive)
    cmap = sns.diverging_palette(250, 10, as_cmap=True)

    # Draw correlation plot with or without duplicates
    if dropDuplicates:
        sns.heatmap(df, mask=mask, cmap=cmap,
                    square=True,
                    linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)
    else:
        sns.heatmap(df, cmap=cmap,
                    square=True,
                    linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)

CorrMtx(df, dropDuplicates = False)
``````
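The `dropDuplicates` option in the function above is meant to hide the duplicate upper-right values of the symmetric matrix. As a rough illustration of how such a mask can be built with NumPy, here is a small sketch; the 3x3 matrix is made up purely for the example.

```python
import numpy as np

# Made-up symmetric correlation matrix, for illustration only.
corr = np.array([[1.0, 0.2, -0.5],
                 [0.2, 1.0, 0.7],
                 [-0.5, 0.7, 1.0]])

# Boolean mask: True marks cells that seaborn should NOT draw.
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True  # diagonal and everything above it

# Cells that stay visible: the strict lower triangle.
visible = corr[~mask]

# Variant that keeps the diagonal visible (k=1 starts one step above it).
mask_keep_diag = np.zeros_like(corr, dtype=bool)
mask_keep_diag[np.triu_indices_from(mask_keep_diag, k=1)] = True
```

Passing such a mask as `sns.heatmap(..., mask=mask)` leaves the masked cells blank, so each pair of variables is shown only once.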

I put this together after it was announced that the outstanding `seaborn corrplot` was to be deprecated. The snippet above makes a similar correlation plot based on `seaborn heatmap`. You can also specify the color range and select whether or not to drop duplicate correlations. Notice that I've used the same numbers as you, but put them in a pandas DataFrame.

Regarding the choice of colors, have a look at the documentation for `sns.diverging_palette`. You asked for blue, but that falls outside this particular range of the color scale with your sample data. To see blue, try changing both occurrences of 0.95173379 to -0.95173379.

Another option is to use `pcolor()` with the `vmin` and `vmax` parameters, for example `vmin=-1` and `vmax=1`, so that the color scale covers the full correlation range.
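As a sketch of the `pcolor()` suggestion with `vmin` and `vmax`, using only matplotlib and NumPy: the `coolwarm` colormap is an assumption here, chosen because it runs from blue (low) to red (high), and any other diverging colormap should work the same way.

```python
import numpy as np
import matplotlib.pyplot as plt

# Correlation matrix from the question.
correlation_matrix = np.array([
    [1.0, 0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099],
    [0.00279981, 1.0, 0.17728303, 0.64425774, 0.30735071, 0.37379443],
    [0.95173379, 0.17728303, 1.0, 0.27072266, 0.02549031, 0.03324756],
    [0.02486161, 0.64425774, 0.27072266, 1.0, 0.18336236, 0.18913512],
    [-0.00324926, 0.30735071, 0.02549031, 0.18336236, 1.0, 0.77678274],
    [-0.00432099, 0.37379443, 0.03324756, 0.18913512, 0.77678274, 1.0],
])

fig, ax = plt.subplots()

# vmin/vmax pin the color scale to [-1, 1], so 0 sits in the middle
# of the diverging colormap instead of at its lower end.
mesh = ax.pcolor(correlation_matrix, cmap="coolwarm", vmin=-1, vmax=1)

ax.invert_yaxis()  # row 0 at the top, like the printed matrix
fig.colorbar(mesh, ax=ax, label="correlation")

# plt.show()  # uncomment when running interactively
```

The same `cmap`, `vmin` and `vmax` arguments can also be passed to the `plt.imshow` call from the question.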