Hello all, I’m going through this MOOC on a laptop with a GTX 960 GPU and 16 GB of RAM. I got through lesson 1 by reducing the batch size to 10, but hit a wall in lesson 2 when running the get_data helper function on the training set. I was able to run get_data for the validation set, but it appears to take up 5 GB of RAM. I used bcolz to save the validation data and deleted the variable to free up some memory, but loading the training data still uses all 16 GB of RAM plus 8 GB of swap before my computer hangs and I have to REISUB and restart.
Looking at htop, I see that only one of my computer’s four cores is working at 100% when running the get_data function. After reading the thread “Numpy core affinity”, I thought importing NumPy and SciPy might be interfering with core affinity, but that issue was supposedly resolved in newer versions, and I have the newest versions of both installed.
16GB isn’t enough to use get_data() for this data set. So just use fit_generator instead on the batches (get_data is simply a little time-saver for pre-calculating the resized images and keeping them in RAM).
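A rough back-of-the-envelope check of why the full training set does not fit, assuming the roughly 23,000 training images mentioned later in this thread, each stored as a (3, 224, 224) float32 array (the dtype is an assumption; float64 would double the figure):

```python
# Rough memory estimate for holding every resized image in RAM at once.
# Assumptions: 23,000 training images, shape (3, 224, 224), 4-byte float32.
n_images = 23000
values_per_image = 3 * 224 * 224
bytes_per_value = 4  # float32 (assumed)

total_bytes = n_images * values_per_image * bytes_per_value
total_gib = total_bytes / 2.0 ** 30

print("values per image:", values_per_image)
print("total: %.1f GiB" % total_gib)
```

That figure does not include any temporary copies that may be made while the array is being assembled, so 16 GB leaves very little headroom.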
Thanks for your reply. I was able to fit the model using just the batches; however I ran into an issue when evaluating the model on validation data.
This line:
model.evaluate(val_batches, val_labels)
gives me the error:
Exception: Error when checking model input: data should be a Numpy array, or list/dict of Numpy arrays. Found: <keras.preprocessing.image.DirectoryIterator object at 0x7fba7d106390>..
It looks like get_batches() returns an iterator, unlike load_array(), which returns a NumPy array.
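Keras has generator counterparts for this situation (evaluate_generator alongside fit_generator), though the exact arguments differ between Keras versions, so check the documentation for the installed one. Conceptually, evaluating from an iterator just means scoring batch by batch and averaging, weighted by batch size. A minimal sketch in plain Python, where evaluate_batches, predict_fn and the toy batches are hypothetical stand-ins rather than Keras calls:

```python
def evaluate_batches(batches, predict_fn, n_batches):
    """Accuracy over n_batches drawn from an iterator of (inputs, labels)."""
    correct = 0
    seen = 0
    for _ in range(n_batches):
        inputs, labels = next(batches)
        preds = [predict_fn(x) for x in inputs]
        correct += sum(1 for p, y in zip(preds, labels) if p == y)
        seen += len(labels)
    return correct / float(seen)


# Toy data: the "model" predicts 1 for positive numbers, 0 otherwise.
toy_batches = iter([
    ([-2.0, 3.0, 1.5], [0, 1, 1]),
    ([0.5, -1.0], [0, 0]),
])
accuracy = evaluate_batches(toy_batches, lambda x: 1 if x > 0 else 0, n_batches=2)
print(accuracy)
```

Only one batch is held in memory at a time, which is the point of evaluating from an iterator instead of a full array.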
I think this is similar to the error I’m having in notebook2, so I’m a little confused about how and where to implement fit_generator. Is it called directly on trn_data from this line of code?
trn_data = get_batches(path+'train', shuffle=False, batch_size=1, class_mode=None, target_size=(224,224))
If so, what parameters do we need to pass in? I appreciate your help.
@jeremy Hi Jeremy
I saw your suggestion regarding the fit_generator function.
I’ve tried it, but I think I am doing something wrong; something is missing in my understanding.
Now I am getting dimension mismatch errors (the errors and notebook are here).
I am running it on a p2.xlarge on AWS.
Does everyone have this memory issue on it, or is something wrong with my server setup?
To my limited understanding (correct me if I’m wrong):
fit_generator is a replacement for the model’s fit function. Some of the arguments have been moved around or replaced because it uses a generator to get its data instead of being handed the data directly.
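As a rough illustration of that difference (a plain-Python sketch, not Keras internals): fit receives the whole dataset, whereas a generator-driven epoch pulls one batch at a time until it has seen the requested number of samples. The names batch_generator, run_epoch and train_step are illustrative; samples_per_epoch mirrors the argument name used by older Keras versions, which later versions replaced with a step count.

```python
def batch_generator(data, batch_size):
    """Yield batches forever, wrapping around at the end of the data."""
    i = 0
    while True:
        yield data[i:i + batch_size]
        i += batch_size
        if i >= len(data):
            i = 0


def run_epoch(generator, samples_per_epoch, train_step):
    """Pull batches until samples_per_epoch samples have been seen."""
    seen = 0
    while seen < samples_per_epoch:
        batch = next(generator)
        train_step(batch)  # only this one batch needs to be in memory
        seen += len(batch)
    return seen


data = list(range(10))
batch_sizes = []
seen = run_epoch(batch_generator(data, 4), len(data),
                 lambda batch: batch_sizes.append(len(batch)))
print(seen, batch_sizes)
```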
@katya The linear model is supposed to go from a vector of length 1000 (the predictions of the vgg imagenet model) to a vector of length 2 (the dog and cat categories).
But instead you are feeding it “batches”, which is the image preprocessor, so you are giving the lm model an input of shape (1,3,224,224), which is in fact the input of the vgg16 model.
Basically, what you are expected to do is:
IMAGES (3,224,224) --- pretrained vgg16 ---> 1000-vector of probabilities, one per imagenet category
then, once you have these predictions, use a linear model to:
1000-vector --- lm.fit ---> 2-vector of probabilities for cat and dog
but what you are incorrectly doing is:
IMAGES (3,224,224) --- lm ---> 2-vector of probabilities for cat and dog
Since “lm” expects a 1000-vector (as defined in your code line lm = Sequential([Dense(2, activation='softmax', input_shape=(1000,))])), you receive the error message.
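The shape logic can be checked with a small NumPy sketch. Random numbers stand in for the VGG16 predictions and for the linear model’s weights, so only the shapes are meaningful here; fake_vgg_predict and linear_softmax are hypothetical helpers, not part of Keras or the course utilities.

```python
import numpy as np

rng = np.random.RandomState(0)


def fake_vgg_predict(images):
    """Stand-in for VGG16: (n, 3, 224, 224) images -> (n, 1000) probabilities."""
    scores = rng.rand(images.shape[0], 1000)
    return scores / scores.sum(axis=1, keepdims=True)


def linear_softmax(features, weights):
    """Stand-in for lm: (n, 1000) features -> (n, 2) class probabilities."""
    if features.ndim != 2 or features.shape[1] != weights.shape[0]:
        raise ValueError("expected (n, %d), got %s"
                         % (weights.shape[0], features.shape))
    logits = features.dot(weights)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


images = np.zeros((4, 3, 224, 224), dtype=np.float32)
weights = rng.randn(1000, 2)

features = fake_vgg_predict(images)        # stage 1: (4, 1000)
probs = linear_softmax(features, weights)  # stage 2: (4, 2)
print(features.shape, probs.shape)

try:
    linear_softmax(images, weights)        # raw images skip stage 1
    mismatch = False
except ValueError:
    mismatch = True
print("raw images rejected:", mismatch)
```

Feeding the raw image array to the second stage fails for the same reason the real lm does: it was defined for 1000 input features, not for image tensors.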
I also have 16GB and had an issue with get_data(), but I didn’t want to deviate too much from the lesson’s notebook by using fit_generator. So instead, I constructed the predictions on the training images with a for loop. Here is my code:
Thanks for the tip - this definitely helped, but it took about an hour to build the trn_features array. I think something weird might be going on, because trn_features.shape returned (184000, 1000) instead of (23000, 1000). So, when I run lm.fit, there’s a mismatch between the inputs and targets. Has anyone else run into this issue?
Thanks - I noticed this perfect 8x multiplication also. It took a little bit of diving, but I fixed it! I probably should’ve changed one thing at a time, so I know where the problem was, but I didn’t want to wait and changed any batch_size references (within reason) from 8 to 1. Thanks for the help. It still took a while to load all the images, but once loaded, the model trained insanely quickly! This lesson has had me stuck for a while, but at least I’m learning and getting more familiar with the programs as I go along.
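The 8x factor is consistent with looping once per image while each step returns a whole batch: 23,000 steps of 8 images give 184,000 rows, because the iterator simply wraps around. This is a guess at the cause, since the original loop is not shown. A minimal sketch with a hypothetical wrapping iterator and small numbers (chosen so the batch size divides the item count):

```python
import math


def cycling_batches(n_items, batch_size):
    """Yield lists of item indices forever, wrapping around at the end."""
    i = 0
    while True:
        yield [(i + k) % n_items for k in range(batch_size)]
        i = (i + batch_size) % n_items


def collect(n_steps, batches):
    """Gather the rows produced by n_steps draws from the iterator."""
    rows = []
    for _ in range(n_steps):
        rows.extend(next(batches))
    return rows


n_items, batch_size = 24, 8

# Buggy: one step per image, but every step yields batch_size rows.
too_many = collect(n_items, cycling_batches(n_items, batch_size))

# Fixed: one step per batch (alternatively, keep n_items steps and use batch_size=1).
n_steps = int(math.ceil(n_items / float(batch_size)))
just_right = collect(n_steps, cycling_batches(n_items, batch_size))

print(len(too_many), len(just_right))
```

Setting batch_size to 1, as described above, fixes the count but makes one call per image; stepping once per batch keeps the larger batch size and should need far fewer calls.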
Jeremy suggested using fit_generator and get_batches to replace get_data when people run out of RAM. Can fit_generator and get_batches be used with read_csv for multilabel data? Is there an example of that? (I did search GitHub, but most examples I see are not multilabel.)
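One possible approach, sketched under assumptions: read the labels from the CSV yourself and write a small Python generator that yields (images, multi-hot labels) batches, which is the general form of input that fit_generator consumes. The CSV layout (a filename column plus a space-separated tags column), the tag names, and load_image are all hypothetical; the sketch uses the standard csv module rather than pandas to stay self-contained.

```python
import csv
import io

CSV_TEXT = u"""filename,tags
a.jpg,cat indoor
b.jpg,dog
c.jpg,dog outdoor
"""


def read_labels(csv_file):
    """Return (filenames, tag lists) from a CSV with 'filename' and 'tags' columns."""
    rows = list(csv.DictReader(csv_file))
    return [r["filename"] for r in rows], [r["tags"].split() for r in rows]


def multilabel_batches(filenames, tag_lists, classes, batch_size, load_image):
    """Yield (images, multi-hot labels) batches forever."""
    index = {c: j for j, c in enumerate(classes)}
    i = 0
    while True:
        names = filenames[i:i + batch_size]
        tags = tag_lists[i:i + batch_size]
        images = [load_image(n) for n in names]
        labels = [[0] * len(classes) for _ in names]
        for row, row_tags in zip(labels, tags):
            for t in row_tags:
                row[index[t]] = 1
        yield images, labels  # convert both to NumPy arrays for real use
        i = i + batch_size if i + batch_size < len(filenames) else 0


filenames, tag_lists = read_labels(io.StringIO(CSV_TEXT))
classes = sorted({t for tags in tag_lists for t in tags})
gen = multilabel_batches(filenames, tag_lists, classes, 2,
                         load_image=lambda name: name)  # placeholder loader
images, labels = next(gen)
print(classes, images, labels)
```

With multi-hot targets like these, the final layer would typically use one sigmoid output per tag rather than a softmax, since several tags can be true at once.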