The kernel appears to have died. Running notebook on a large dataset of images


(Said Aspen) #1

I am trying to run the notebooks from Lessons 1 and 2 on a very large dataset of images. The kernel seems to die without logging any error when I call ImageClassifierData.from_csv.

This is what I run.

This part works fine:

tfms = tfms_from_model(resnet50, 199, aug_tfms=transforms_side_on, max_zoom=1.1)

But this part does not:

train_csv = PATH + "train.csv"
data = ImageClassifierData.from_csv(PATH, 'train', train_csv, test_name='test', num_workers=1, 
                  val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=16)

After a minute or two I get this:

Nothing is ever written to the console.
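Since nothing reaches the console, one option is to ask Python itself to dump a traceback to a file if the process crashes. Below is a minimal sketch using the standard-library faulthandler module; the log file name is a placeholder I made up. It only helps if the kernel dies from a fatal signal such as a segfault. If the OS kills the process for running out of memory (SIGKILL), nothing can be written, so an empty log would itself point towards memory.

```python
import faulthandler

# Hypothetical log path (an assumption); any writable location works.
CRASH_LOG = "kernel_crash.log"

# Keep the file object alive for the whole session: faulthandler writes
# directly to its file descriptor when a fatal signal arrives.
crash_log_file = open(CRASH_LOG, "w")

# Dump the Python traceback of every thread on SIGSEGV, SIGFPE, SIGABRT,
# SIGBUS or SIGILL.
faulthandler.enable(file=crash_log_file, all_threads=True)

# Next step: re-run the cell that kills the kernel, then inspect CRASH_LOG
# after restarting.
```

Run this in the first cell of the notebook, before the call that kills the kernel.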

Does anyone know what's going on? I suspect the cause is the size of the dataset, which is very large (~150 GB of images).
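One way to test the dataset-size hunch would be to point the same call at a much smaller CSV and see whether the kernel survives. This is only a sketch: the helper name write_csv_subset and the file name train_small.csv are made up for illustration, and it assumes train.csv has a single header row. It uses only the standard library.

```python
import csv
import random


def write_csv_subset(src_csv, dst_csv, n_rows, seed=0):
    """Copy the header plus a random sample of up to n_rows data rows.

    Hypothetical helper; assumes the first line of src_csv is a header.
    Returns the number of data rows written.
    """
    with open(src_csv, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    # Shuffle with a fixed seed so the subset is reproducible.
    random.Random(seed).shuffle(rows)
    subset = rows[:n_rows]

    with open(dst_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(subset)
    return len(subset)
```

For example, write_csv_subset(PATH + "train.csv", PATH + "train_small.csv", 1000) and then pass the small file as train_csv. Note that val_idxs would need to be recomputed for the smaller file, since the old indices refer to rows of the full CSV. If the small run works, the size of the full dataset becomes the more likely culprit.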


(Susant Bisoi) #2

I often run into the same issue.


(WILLIAM PRIDE) #3

I was able to resolve this issue by re-installing PyTorch from source as described here

Cheers,
Will


(Said Aspen) #4

Tried that, and also updated the NVIDIA drivers and CUDA. Let’s see how it works.


(Ankit) #5

I am getting the same issue too. Were you able to resolve the “The kernel appears to have died” error?

I am on an AWS p2.xlarge and using the below code, but even with a small bs of 16 and sz of 65, it runs out of memory. Any suggestions on how to handle this?