Integer overflow in numpy arrays

Question:

import numpy as np
a = np.arange(1000000).reshape(1000,1000)
print(a**2)

With this code I get the following output. Why do I get negative values?

[[         0          1          4 ...,     994009     996004     998001]
 [   1000000    1002001    1004004 ...,    3988009    3992004    3996001]
 [   4000000    4004001    4008004 ...,    8982009    8988004    8994001]
 ...,
 [1871554624 1873548625 1875542628 ..., -434400663 -432404668 -430408671]
 [-428412672 -426416671 -424420668 ..., 1562593337 1564591332 1566589329]
 [1568587328 1570585329 1572583332 ..., -733379959 -731379964 -729379967]]
Asked By: kame


Answer #1:

On your platform, np.arange returns an array of dtype ‘int32’:

In [1]: np.arange(1000000).dtype
Out[1]: dtype('int32')

Each element of the array is a 32-bit integer. Squaring produces values that do not fit in 32 bits. The result is truncated to its low 32 bits and still interpreted as a signed 32-bit integer, which is why you see negative numbers.
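To make the wraparound concrete, here is a small pure-Python sketch that mimics what happens to each element. The helper name wrap_to_int32 is made up for this illustration and is not a numpy function; it only reproduces the arithmetic, assuming the usual two's-complement representation.

```python
def wrap_to_int32(value):
    """Reduce an exact Python int to what a signed 32-bit integer would hold."""
    low_bits = value & 0xFFFFFFFF  # keep only the low 32 bits
    # Bit patterns at or above 2**31 have the sign bit set, so they read as negative.
    return low_bits - 2**32 if low_bits >= 2**31 else low_bits

print(wrap_to_int32(999 * 999))        # 998001, small enough to fit
print(wrap_to_int32(999999 * 999999))  # -729379967, the last value in the question's output
```

The true square of 999999 is 999998000001, which needs 40 bits, so only its low 32 bits survive.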

Edit: In this case, you can avoid the integer overflow by constructing an array of dtype ‘int64’ before squaring:

a = np.arange(1000000, dtype='int64').reshape(1000, 1000)

Note that the problem you’ve discovered is an inherent danger when working with numpy. You have to choose your dtypes with care and know beforehand that your code will not lead to arithmetic overflows. For the sake of speed, numpy’s array operations do not check for overflow and will not warn you when it occurs.
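Since numpy will not check for you, one option is to check the bounds yourself before the operation. The following is only a sketch for integer arrays: square_checked is a hypothetical helper name, not part of numpy, and it relies on np.iinfo to look up the largest value the dtype can hold.

```python
import numpy as np

def square_checked(arr):
    """Square an integer array, raising instead of silently wrapping around."""
    arr = np.asarray(arr)
    limit = np.iinfo(arr.dtype).max  # largest value the dtype can hold
    # Python ints are exact, so this comparison itself cannot overflow.
    largest = max(abs(int(arr.min())), abs(int(arr.max())))
    if largest * largest > limit:
        raise OverflowError(f"squaring would overflow {arr.dtype}")
    return arr * arr

small = np.arange(1000, dtype=np.int32)
print(square_checked(small)[-1])  # 998001

big = np.arange(1000000, dtype=np.int32)
try:
    square_checked(big)
except OverflowError as exc:
    print(exc)  # squaring would overflow int32

# Widening first makes the same computation safe.
print(square_checked(big.astype(np.int64))[-1])  # 999998000001
```

The check costs an extra pass over the data (min and max), which is the kind of overhead numpy avoids by default.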

This has been discussed on the numpy mailing list.

Answered By: unutbu

Answer #2:

Python integers don’t have this problem, since they have arbitrary precision (in Python 2, an int is automatically promoted to a long when it overflows).

So if you do manage to overflow int64, one solution is to store Python ints in the numpy array by giving it an object dtype:

import numpy
a = numpy.arange(1000, dtype=object) ** 20  # elements are Python ints, so no overflow
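As a rough side-by-side illustration (the variable names here are chosen for this sketch only), the same power applied to a fixed-width array and to an object array gives different results:

```python
import numpy as np

fixed = np.array([10], dtype=np.int64)
exact = np.array([10], dtype=object)  # elements are ordinary Python ints

fixed_result = fixed ** 20  # wraps around: 10**20 does not fit in 64 bits
exact_result = exact ** 20  # computed with Python's arbitrary-precision ints

print(fixed_result)
print(exact_result)  # [100000000000000000000]
```

The trade-off is speed: an object array dispatches to Python for every element, so it loses most of the performance benefit of a native dtype.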

Answered By: suki

Answer #3:

numpy integer types are fixed width and you are seeing the results of integer overflow.

Answer #4:

A solution to this problem is as follows (taken from here):

…change in class StringConverter._mapper (numpy/lib/ from:

 _mapper = [(nx.bool_, str2bool, False),
            (nx.integer, int, -1),
            (nx.floating, float, nx.nan),
            (complex, _bytes_to_complex, nx.nan + 0j),
            (nx.string_, bytes, asbytes('???'))]

to:

 _mapper = [(nx.bool_, str2bool, False),
            (nx.int64, int, -1),
            (nx.floating, float, nx.nan),
            (complex, _bytes_to_complex, nx.nan + 0j),
            (nx.string_, bytes, asbytes('???'))]

This solved a similar problem that I had with numpy.genfromtxt.

Note that the author describes this as a ‘temporary’ and ‘not optimal’ solution. However, I have had no side effects using v2.7 (yet?!).
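A less invasive thing to try before patching the library, not taken from the linked source, is to pass an explicit dtype to numpy.genfromtxt so that it does not have to guess the integer width. This is a sketch with made-up input data, and whether it helps may depend on the numpy version in use:

```python
import io
import numpy as np

# Hypothetical input: the second value does not fit in a 32-bit integer.
text = io.StringIO("1\n3000000000\n")

values = np.genfromtxt(text, dtype=np.int64)  # ask for 64-bit integers explicitly
print(values.dtype, values.tolist())
```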

Answered By: atomh33ls
