I’m trying to implement the binary classification example using the IMDb dataset in Google Colab. I have implemented this model before, but when I tried to run it again a few days later, the load_data() function raised a ValueError: 'Object arrays cannot be loaded when allow_pickle=False'.
I have already tried solving this, referring to an existing answer for a similar problem: How to fix ‘Object arrays cannot be loaded when allow_pickle=False’ in the sketch_rnn algorithm.
But it turns out that just adding an allow_pickle argument isn’t sufficient.
from keras.datasets import imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
ValueError                                Traceback (most recent call last)
<ipython-input-1-2ab3902db485> in <module>()
      1 from keras.datasets import imdb
----> 2 (train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

2 frames
/usr/local/lib/python3.6/dist-packages/keras/datasets/imdb.py in load_data(path, num_words, skip_top, maxlen, seed, start_char, oov_char, index_from, **kwargs)
     57                     file_hash='599dadb1135973df5b59232a0e9a887c')
     58     with np.load(path) as f:
---> 59         x_train, labels_train = f['x_train'], f['y_train']
     60         x_test, labels_test = f['x_test'], f['y_test']
     61

/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in __getitem__(self, key)
    260                 return format.read_array(bytes,
    261                                          allow_pickle=self.allow_pickle,
--> 262                                          pickle_kwargs=self.pickle_kwargs)
    263             else:
    264                 return self.zip.read(key)

/usr/local/lib/python3.6/dist-packages/numpy/lib/format.py in read_array(fp, allow_pickle, pickle_kwargs)
    690         # The array contained Python objects. We need to unpickle the data.
    691         if not allow_pickle:
--> 692             raise ValueError("Object arrays cannot be loaded when "
    693                              "allow_pickle=False")
    694         if pickle_kwargs is None:

ValueError: Object arrays cannot be loaded when allow_pickle=False
Here’s a trick to force imdb.load_data to allow pickle. In your notebook, replace this line:
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
with this:
import numpy as np

# save the original np.load
np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a, **k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to True
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old
This issue is still open on the Keras GitHub repository. I hope it gets solved as soon as possible.
Until then, try downgrading your numpy version to 1.16.1. It seems to solve the problem.
!pip install numpy==1.16.1
import numpy as np
This version of numpy has the default value of allow_pickle set to True.
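To check whether a given environment is affected, one option is to print the installed numpy version together with the default that np.load declares for allow_pickle. This is only a minimal sketch; it assumes np.load is an ordinary Python function whose signature can be introspected, and the variable name is my own.

```python
import inspect

import numpy as np

# Read the default value that this numpy install declares for allow_pickle.
# Assumption: np.load exposes an introspectable signature with that parameter.
default_allow_pickle = inspect.signature(np.load).parameters["allow_pickle"].default

print("numpy version:", np.__version__)
print("np.load allow_pickle default:", default_allow_pickle)
```

If the default prints as False, a load_data() call that relies on a bare np.load(path) can be expected to raise the error from the question.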
Following this issue on GitHub, the official solution is to edit the imdb.py file. This fix worked well for me without the need to downgrade numpy. Find the imdb.py file at tensorflow/python/keras/datasets/imdb.py (the full path for me was C:\Anaconda\Lib\site-packages\tensorflow\python\keras\datasets\imdb.py; other installs will differ) and change line 85 as per the diff:
- with np.load(path) as f:
+ with np.load(path, allow_pickle=True) as f:
The default was changed for security reasons: it prevents the Python equivalent of an SQL injection through a pickled file. The edit above will ONLY affect the imdb data, so you retain the security elsewhere (by not downgrading numpy).
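As an illustration of why loading pickled data from an untrusted file is risky, here is a small sketch using only the standard library. The Payload class is hypothetical, and the callable it names (int) is deliberately harmless; the point is that unpickling calls whatever the file tells it to call.

```python
import pickle


class Payload:
    """Hypothetical stand-in for a malicious object; the callable is harmless."""

    def __reduce__(self):
        # Tell pickle to store "call int('42')" instead of the object's state.
        return (int, ("42",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload: it invokes the callable named in the data.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The loaded value is an int rather than a Payload, which shows that the callable stored in the pickle ran at load time. A hostile file could name a far less friendly callable, which is why allow_pickle=True should be reserved for files you trust.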
I just used allow_pickle = True as an argument to np.load() and it worked for me.
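For anyone who wants to see the behaviour in isolation, the sketch below saves a small object array to a temporary file and loads it back both ways. The file name and the toy data are made up for illustration; only np.save and np.load with their allow_pickle parameter are used.

```python
import os
import tempfile

import numpy as np

# Ragged lists need dtype=object, much like variable-length review sequences.
ragged = np.empty(2, dtype=object)
ragged[0] = [1, 2, 3]
ragged[1] = [4, 5]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ragged.npy")  # hypothetical file name
    np.save(path, ragged, allow_pickle=True)

    try:
        np.load(path)  # relies on the installed default for allow_pickle
        default_load_failed = False
    except ValueError as err:
        default_load_failed = True
        print("default load failed:", err)

    # Explicit opt-in; only do this for files from a trusted source.
    loaded = np.load(path, allow_pickle=True)

print("default load failed:", default_load_failed)
print(loaded[0], loaded[1])
```

Whether the first load fails depends on the numpy version installed; the explicit allow_pickle=True call should succeed either way.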
In my case, it worked with:
I think the answer from cheez (https://stackoverflow.com/users/122933/cheez) is the easiest and most effective one. I’d elaborate on it a little so that it does not leave a numpy function modified for the whole session.
My suggestion is below. I’m using it to download the reuters dataset from keras, which shows the same kind of error:
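import numpy as np
from keras.datasets import reuters

# keep a reference to the original np.load
old = np.load
# temporarily force allow_pickle=True
np.load = lambda *a, **k: old(*a, **k, allow_pickle=True)
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)
# restore the original np.load and drop the temporary name
np.load = old
del old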
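One possible refinement of the same idea is to wrap the patch in a context manager, so that np.load is restored even if the download or load raises. This is a sketch only: np_load_allowing_pickle is a hypothetical helper name, and it uses functools.partial instead of a lambda so that a caller passing allow_pickle explicitly does not trigger a duplicate-keyword error.

```python
import contextlib
import functools

import numpy as np


@contextlib.contextmanager
def np_load_allowing_pickle():
    """Temporarily make np.load default to allow_pickle=True (hypothetical helper)."""
    original = np.load
    np.load = functools.partial(original, allow_pickle=True)
    try:
        yield
    finally:
        # Runs even if the body raises, so the patch never outlives the block.
        np.load = original


# Usage, assuming keras is installed and the dataset can be downloaded:
# from keras.datasets import reuters
# with np_load_allowing_pickle():
#     (train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)
```

Like the snippet above, this relies on the dataset loader looking up np.load on the numpy module at call time.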
You can try changing the flag’s value
None of the solutions listed above worked for me; I run Anaconda with Python 3.7.3.
What worked for me was:
Run “conda install numpy==1.16.1” from Anaconda PowerShell.
Close and reopen the notebook.