# Using numpy to build an array of all combinations of two arrays

Posted on

Solving problem is about exposing yourself to as many situations as possible like Using numpy to build an array of all combinations of two arrays and practice these strategies over and over. With time, it becomes second nature and a natural way you approach any problems in general. Big or small, always start with a plan, use other strategies mentioned here till you are confident and ready to code the solution.
In this post, my aim is to share an overview the topic about Using numpy to build an array of all combinations of two arrays, which can be followed any time. Take easy to follow this discuss.

Using numpy to build an array of all combinations of two arrays

I’m trying to run over the parameters space of a 6 parameter function to study its numerical behavior before trying to do anything complex with it, so I’m searching for an efficient way to do this.

My function takes float values given in a 6-dim numpy array as input. What I tried to do initially was this:

First, I created a function that takes 2 arrays and generate an array with all combinations of values from the two arrays:

``````from numpy import *
def comb(a,b):
c = []
for i in a:
for j in b:
c.append(r_[i,j])
return c
``````

Then, I used `reduce()` to apply that to m copies of the same array:

``````def combs(a,m):
return reduce(comb,[a]*m)
``````

Finally, I evaluate my function like this:

``````values = combs(np.arange(0,1,0.1),6)
for val in values:
print F(val)
``````

This works but it’s way too slow. I know the space of parameters is huge, but this shouldn’t be so slow. I have only sampled 106 (a million) points in this example and it took more than 15 seconds just to create the array `values`.

Do you know any more efficient way of doing this with numpy?

I can modify the way the function `F` takes it’s arguments if it’s necessary.

In newer version of `numpy` (>1.8.x), `numpy.meshgrid()` provides a much faster implementation:

@pv’s solution

``````In [113]:
%timeit cartesian(([1, 2, 3], [4, 5], [6, 7]))
10000 loops, best of 3: 135 µs per loop
In [114]:
cartesian(([1, 2, 3], [4, 5], [6, 7]))
Out[114]:
array([[1, 4, 6],
[1, 4, 7],
[1, 5, 6],
[1, 5, 7],
[2, 4, 6],
[2, 4, 7],
[2, 5, 6],
[2, 5, 7],
[3, 4, 6],
[3, 4, 7],
[3, 5, 6],
[3, 5, 7]])
``````

`numpy.meshgrid()` use to be 2D only, now it is capable of ND. In this case, 3D:

``````In [115]:
%timeit np.array(np.meshgrid([1, 2, 3], [4, 5], [6, 7])).T.reshape(-1,3)
10000 loops, best of 3: 74.1 µs per loop
In [116]:
np.array(np.meshgrid([1, 2, 3], [4, 5], [6, 7])).T.reshape(-1,3)
Out[116]:
array([[1, 4, 6],
[1, 5, 6],
[2, 4, 6],
[2, 5, 6],
[3, 4, 6],
[3, 5, 6],
[1, 4, 7],
[1, 5, 7],
[2, 4, 7],
[2, 5, 7],
[3, 4, 7],
[3, 5, 7]])
``````

Note that the order of the final resultant is slightly different.

Here’s a pure-numpy implementation. It’s about 5× faster than using itertools.

``````
import numpy as np
def cartesian(arrays, out=None):
"""
Generate a cartesian product of input arrays.
Parameters
----------
arrays : list of array-like
1-D arrays to form the cartesian product of.
out : ndarray
Array to place the cartesian product in.
Returns
-------
out : ndarray
2-D array of shape (M, len(arrays)) containing cartesian products
formed of input arrays.
Examples
--------
>>> cartesian(([1, 2, 3], [4, 5], [6, 7]))
array([[1, 4, 6],
[1, 4, 7],
[1, 5, 6],
[1, 5, 7],
[2, 4, 6],
[2, 4, 7],
[2, 5, 6],
[2, 5, 7],
[3, 4, 6],
[3, 4, 7],
[3, 5, 6],
[3, 5, 7]])
"""
arrays = [np.asarray(x) for x in arrays]
dtype = arrays[0].dtype
n = np.prod([x.size for x in arrays])
if out is None:
out = np.zeros([n, len(arrays)], dtype=dtype)
m = n / arrays[0].size
out[:,0] = np.repeat(arrays[0], m)
if arrays[1:]:
cartesian(arrays[1:], out=out[0:m, 1:])
for j in xrange(1, arrays[0].size):
out[j*m:(j+1)*m, 1:] = out[0:m, 1:]
return out
``````

itertools.combinations is in general the fastest way to get combinations from a Python container (if you do in fact want combinations, i.e., arrangements WITHOUT repetitions and independent of order; that’s not what your code appears to be doing, but I can’t tell whether that’s because your code is buggy or because you’re using the wrong terminology).

If you want something different than combinations perhaps other iterators in itertools, `product` or `permutations`, might serve you better. For example, it looks like your code is roughly the same as:

``````for val in itertools.product(np.arange(0, 1, 0.1), repeat=6):
print F(val)
``````

All of these iterators yield tuples, not lists or numpy arrays, so if your F is picky about getting specifically a numpy array you’ll have to accept the extra overhead of constructing or clearing and re-filling one at each step.

You can do something like this

``````import numpy as np
def cartesian_coord(*arrays):
grid = np.meshgrid(*arrays)
coord_list = [entry.ravel() for entry in grid]
points = np.vstack(coord_list).T
return points
a = np.arange(4)  # fake data
print(cartesian_coord(*6*[a])
``````

which gives

``````array([[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 2],
...,
[3, 3, 3, 3, 3, 1],
[3, 3, 3, 3, 3, 2],
[3, 3, 3, 3, 3, 3]])
``````

The following numpy implementation should be approx. 2x the speed of the given answer:

``````def cartesian2(arrays):
arrays = [np.asarray(a) for a in arrays]
shape = (len(x) for x in arrays)
ix = np.indices(shape, dtype=int)
ix = ix.reshape(len(arrays), -1).T
for n, arr in enumerate(arrays):
ix[:, n] = arrays[n][ix[:, n]]
return ix
``````

It looks like you want a grid to evaluate your function, in which case you can use `numpy.ogrid` (open) or `numpy.mgrid` (fleshed out):

``````import numpy
my_grid = numpy.mgrid[[slice(0,1,0.1)]*6]
``````

you can use `np.array(itertools.product(a, b))`

Here’s yet another way, using pure NumPy, no recursion, no list comprehension, and no explicit for loops. It’s about 20% slower than the original answer, and it’s based on np.meshgrid.

``````def cartesian(*arrays):
mesh = np.meshgrid(*arrays)  # standard numpy meshgrid
dim = len(mesh)  # number of dimensions
elements = mesh[0].size  # number of elements, any index will do
flat = np.concatenate(mesh).ravel()  # flatten the whole meshgrid
reshape = np.reshape(flat, (dim, elements)).T  # reshape and transpose
return reshape
``````

For example,

``````x = np.arange(3)
a = cartesian(x, x, x, x, x)
print(a)
``````

gives

``````[[0 0 0 0 0]
[0 0 0 0 1]
[0 0 0 0 2]
...,
[2 2 2 2 0]
[2 2 2 2 1]
[2 2 2 2 2]]
``````