State Farm Full: how not to run Out of Memory with VGG + (da_batches.samples*5)?

Hello!

I run my own server with Ubuntu 16.04 + Theano 0.9 + Keras 2.0 + Python 3.6.
Hardware is an i5 4690K + 16GB RAM + 50GB swap + a GTX 1080 Ti with 11GB VRAM.

When running the State Farm notebook on the full data (i.e. not the sample), I get stuck in the “Pre-Computed Data Augmentation + Dropout” section,
when trying to “create a dataset of convolutional features 5x bigger than the training set”.

I use the following code:

%time da_conv_feat = conv_model.predict_generator(da_batches, (da_batches.samples*5), workers=3)

I get a “Kernel died” message every time, after maybe 15-20 minutes of computing.

Looking at the System Monitor, even when starting the notebook from scratch just to compute that one line, memory use climbs slowly but surely from 3GB RAM and 1GB swap used up to 16GB RAM + 50GB swap, and then the notebook’s kernel dies.
:head_bandage:

The CPU works at 95% due to workers=3, and the GTX 1080 Ti varies between 15% and 100% load, with VRAM never exceeding 45%.

The task also seems outrageously massive: in the previous part, “Imagenet Conv features”, I pass (batches.samples / batch_size) with batch_size = 64, and even (batches.samples * 1) alone already runs OOM.

And here we are trying (batches.samples * 5) in that same argument slot, a 320x multiplier :astonished:
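
To spell out that arithmetic (the image count below is purely illustrative), the argument that previously received samples/64 now receives samples*5:

# illustrative numbers only: assume ~22,000 training images and batch_size = 64
samples, batch_size = 22000, 64
steps_before = samples / batch_size   # ~344 batches in "Imagenet Conv features"
steps_now = samples * 5               # 110,000 batches with (da_batches.samples*5)
print(steps_now / steps_before)       # 320.0, i.e. 5 * 64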

Did anyone manage to run that code with the full dataset on their own PC/server?

The original full code is here: https://github.com/fastai/courses/blob/master/deeplearning1/nbs/statefarm.ipynb

Eric


Your batch tries to augment all of your images at once, and when you add the *5, it gets far too big. I think the augmentation happens on the CPU, and your 16GB of RAM can’t handle it.

Look at https://keras.io/preprocessing/image/
and try to do it manually, in smaller batches that your machine can handle:

for e in range(epochs):
    print('Epoch', e)
    batches = 0
    # draw augmented batches one at a time instead of materialising everything
    for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32):
        model.fit(x_batch, y_batch)   # train on this single augmented batch
        batches += 1
        if batches >= len(x_train) / 32:
            # we need to break the loop by hand because
            # the generator loops indefinitely
            break

I put all the data in a variable first and then use this, but you should be able to read the batches from a directory as well; see the sketch below.
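
Along those lines, here is a minimal sketch for the notebook’s conv-feature step, assuming the conv_model and da_batches objects from the notebook (chunk_steps and the file names are just illustrative): predict the augmented features a few batches at a time and save each chunk to disk, so the full 5x feature array never sits in RAM at once.

import math
import numpy as np

batch_size = 64
n_aug = 5                                 # five augmented copies of the training set
chunk_steps = 100                         # batches per saved chunk; tune to your RAM
total_steps = math.ceil(da_batches.samples / batch_size) * n_aug

feats, chunk = [], 0
for step in range(total_steps):
    x_batch, _ = next(da_batches)                        # one augmented batch of images
    feats.append(conv_model.predict_on_batch(x_batch))   # conv features for that batch
    if len(feats) == chunk_steps or step == total_steps - 1:
        np.save('da_conv_feat_chunk%03d.npy' % chunk, np.concatenate(feats))
        feats, chunk = [], chunk + 1                     # flush to disk, free RAM

The saved chunks can then be loaded (or memory-mapped with np.load(..., mmap_mode='r')) and concatenated when training the dense layers.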


Neat, thanks for the tip, Bahram!

I was simply following @jeremy’s instructions to “just start my notebooks and hit Shift+Enter until you get an error, then make a post, wait for the answer, and you’ll win every Kaggle competition from now on, EZ-PZ”.
Not! :sunglasses:

Eric
