inequality comparison of numpy array with nan to a scalar

Question :

I am trying to set members of an array that are below a threshold to nan. This is part of a QA/QC process and the incoming data may already have slots that are nan.

As an example, my threshold might be -1000, so I would want to set -3000 to nan in the following array:

x = np.array([np.nan,1.,2.,-3000.,np.nan,5.])

The following:

x[x < -1000.] = np.nan

produces the correct behavior but also a RuntimeWarning, and the overhead of disabling the warning is kind of heavy and potentially a bit unsafe.

Trying to index twice with fancy indexing as follows has no effect:

nonan = np.where(~np.isnan(x))[0]
x[nonan][x[nonan] < -1000.] = np.nan

I assume this is because a copy is made due to the integer index or the use of indexing twice.

Does anyone have a relatively simple solution? It would be fine to use a masked array in the process, but the final product has to be an ndarray and I can’t introduce new dependencies. Thanks.

Asked By: Eli S


Answer #1:

Any comparison (other than !=) of a NaN to a non-NaN value will always return False:

>>> x < -1000
array([False, False, False,  True, False, False], dtype=bool)

So you can simply ignore the fact that there are NaNs already in your array and do:

>>> x[x < -1000] = np.nan
>>> x
array([ nan,   1.,   2.,  nan,  nan,   5.])

EDIT: I didn’t see any warning when I ran the above, but if you really need to stay away from the NaNs, you can do something like:

mask = ~np.isnan(x)
mask[mask] &= x[mask] < -1000
x[mask] = np.nan
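
For reference, the double-indexing attempt in the question has no effect because x[nonan], with an integer index array, returns a copy, so the second assignment writes into a temporary. Below is a minimal sketch of that behavior, and of one way to fold both selections into a single index so the assignment lands in x itself. The names unchanged and below are only illustrative.

```python
import numpy as np

x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])
nonan = np.where(~np.isnan(x))[0]   # integer positions of the non-NaN entries

# x[nonan] is a copy, so this assignment modifies a temporary and is lost.
x[nonan][x[nonan] < -1000.] = np.nan
unchanged = x[3]                    # x[3] has not been touched

# Fold both selections into one index: pick the positions (within x)
# of the non-NaN entries that fall below the threshold.
below = nonan[x[nonan] < -1000.]
x[below] = np.nan                   # a single indexed assignment writes into x
```

The mask[mask] &= ... line above works for the same reason: an augmented assignment on an indexed array ends with a single indexed assignment back into mask.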
Answered By: Eli S

Answer #2:

One option is to disable the relevant warnings with numpy.errstate:

with numpy.errstate(invalid='ignore'):
    x[x < -1000.] = numpy.nan

To turn off the relevant warnings globally, use numpy.seterr.
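
As a rough sketch of the global route, numpy.seterr returns the previous settings as a dict, which can be passed back to it afterwards to restore them. The try/finally wrapper and the name old_settings are my own additions, not part of the original answer.

```python
import numpy as np

x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])

# Ignore "invalid value" floating-point warnings globally;
# seterr returns the settings that were in effect before the call.
old_settings = np.seterr(invalid='ignore')
try:
    x[x < -1000.] = np.nan
finally:
    # Restore the previous settings so the rest of the program is unaffected.
    np.seterr(**old_settings)
```

Unlike the context manager, the global setting stays in effect until it is changed again, which is why restoring it explicitly is the safer habit.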

Answered By: Jaime

Answer #3:

np.less() has a where argument that controls where the operation will be applied. So you could do:

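x[np.less(x, -1000., where=~np.isnan(x))] = np.nan

Note that without an out argument, the result is left uninitialized wherever the where condition is False. That is harmless here, because those positions already hold NaN, but the resulting mask should not be reused for anything else.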

Answer #4:

I personally ignore the warnings using the np.errstate context manager in the answer already given, as the code clarity is worth the extra time, but here is an alternative.

# given
x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])

# apply NaNs as desired
mask = np.zeros(x.shape, dtype=bool)
np.less(x, -1000, out=mask, where=~np.isnan(x))
x[mask] = np.nan

# expected output and comparison
y = np.array([np.nan, 1., 2., np.nan, np.nan, 5.])
assert np.allclose(x, y, rtol=0., atol=1e-14, equal_nan=True)

The np.less ufunc takes an optional where argument and evaluates the comparison only where it is True, unlike the np.where function, which evaluates both options and then picks the relevant one. Positions where the condition is False keep whatever the out array already held, so initializing out (here to False) sets the result for those positions.
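
The same idea can be wrapped in a small helper that leaves its input untouched. This is only a sketch: nan_below is a hypothetical name, and the errstate(invalid='raise') block is there to turn any invalid-value warning from the comparison into an error, as a way of checking that the NaN slots are skipped.

```python
import numpy as np

def nan_below(x, threshold):
    """Return a copy of x with entries below threshold set to NaN (hypothetical helper)."""
    valid = ~np.isnan(x)
    # Pre-fill with False so skipped (NaN) positions have a defined value.
    mask = np.zeros(x.shape, dtype=bool)
    # Compare only where x is not NaN; other positions of mask are left as False.
    np.less(x, threshold, out=mask, where=valid)
    result = x.copy()
    result[mask] = np.nan
    return result

x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])

# Any "invalid value" floating-point warning would raise here instead of being printed.
with np.errstate(invalid='raise'):
    cleaned = nan_below(x, -1000.)
```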

Answered By: mikewatt

Answer #5:

A little bit late, but this is how I would do it:

x = np.array([np.nan,1.,2.,-3000.,np.nan,5.]) 

Answered By: DStauffman
