# NumPy Broadcasting: Calculating sum of squared differences between two arrays

Posted on

### Question :

NumPy Broadcasting: Calculating sum of squared differences between two arrays

I have the following code. It is taking forever in Python. There must be a way to translate this calculation into a broadcast…

``````def euclidean_square(a,b):
squares = np.zeros((a.shape,b.shape))
for i in range(squares.shape):
for j in range(squares.shape):
diff = a[i,:] - b[j,:]
sqr = diff**2.0
squares[i,j] = np.sum(sqr)
return squares
``````

You can use `np.einsum` after calculating the differences in a `broadcasted way`, like so –

``````ab = a[:,None,:] - b
out = np.einsum('ijk,ijk->ij',ab,ab)
``````

Or use `scipy's cdist` with its optional metric argument set as `'sqeuclidean'` to give us the squared euclidean distances as needed for our problem, like so –

``````from scipy.spatial.distance import cdist
out = cdist(a,b,'sqeuclidean')
``````

I collected the different methods proposed here, and in two other questions, and measured the speed of the different methods:

``````import numpy as np
import scipy.spatial
import sklearn.metrics

def dist_direct(x, y):
d = np.expand_dims(x, -2) - y
return np.sum(np.square(d), axis=-1)

def dist_einsum(x, y):
d = np.expand_dims(x, -2) - y
return np.einsum('ijk,ijk->ij', d, d)

def dist_scipy(x, y):
return scipy.spatial.distance.cdist(x, y, "sqeuclidean")

def dist_sklearn(x, y):
return sklearn.metrics.pairwise.pairwise_distances(x, y, "sqeuclidean")

def dist_layers(x, y):
res = np.zeros((x.shape, y.shape))
for i in range(x.shape):
res += np.subtract.outer(x[:, i], y[:, i])**2
return res

# inspired by the excellent https://github.com/droyed/eucl_dist
def dist_ext1(x, y):
nx, p = x.shape
x_ext = np.empty((nx, 3*p))
x_ext[:, :p] = 1
x_ext[:, p:2*p] = x
x_ext[:, 2*p:] = np.square(x)

ny = y.shape
y_ext = np.empty((3*p, ny))
y_ext[:p] = np.square(y).T
y_ext[p:2*p] = -2*y.T
y_ext[2*p:] = 1

return x_ext.dot(y_ext)

# https://stackoverflow.com/a/47877630/648741
def dist_ext2(x, y):
return np.einsum('ij,ij->i', x, x)[:,None] + np.einsum('ij,ij->i', y, y) - 2 * x.dot(y.T)
``````

I use `timeit` to compare the speed of the different methods. For the comparison, I use vectors of length 10, with 100 vectors in the first group, and 1000 vectors in the second group.

``````import timeit

p = 10
x = np.random.standard_normal((100, p))
y = np.random.standard_normal((1000, p))

for method in dir():
if not method.startswith("dist_"):
continue
t = timeit.timeit(f"{method}(x, y)", number=1000, globals=globals())
print(f"{method:12} {t:5.2f}ms")
``````

On my laptop, the results are as follows:

``````dist_direct   5.07ms
dist_einsum   3.43ms
dist_ext1     0.20ms  <-- fastest
dist_ext2     0.35ms
dist_layers   2.82ms
dist_scipy    0.60ms
dist_sklearn  0.67ms
``````

While the two methods `dist_ext1` and `dist_ext2`, both based on the idea of writing `(x-y)**2` as `x**2 - 2*x*y + y**2`, are very fast, there is a downside: When the distance between `x` and `y` is very small, due to cancellation error the numerical result can sometimes be (very slightly) negative.

``````difference_squared = np.zeros((a.shape, b.shape))