Hi,
This is the source code for the initial training of the FC layer, before unfreezing:
learn = ConvLearner.pretrained(f_model, data, ps=ps, xtra_fc=xtra_fc,
                               precompute=False, metrics=metrics)
learn.fit(lr, 3, cycle_len=1)
Key to this is the data loader:
return ImageClassifierData.from_csv(path, f'{train_class}/train', labels_file_multi_dev, bs, tfms,
                                    suffix='', val_idxs=val_idxs,
                                    test_name=f'{train_class}/test', num_workers=30)
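(tfms is the usual tfms_from_model output with augmentation enabled, roughly along these lines; the size and augmentation list below are placeholders rather than the exact values I use:)

# Roughly how tfms is built (fastai 0.7); sz and the aug_tfms choice are
# placeholders, not the actual settings.
from fastai.transforms import tfms_from_model, transforms_side_on

sz = 224  # placeholder image size
tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_side_on, max_zoom=1.1)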
The initial augmentation starts, then data begins to be fed to the GPU. However, because augmentation keeps running continuously, memory usage keeps climbing, by roughly 500MB every few seconds.
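To put numbers on those jumps, a psutil-based sketch like the one below could be run in a separate thread or notebook cell while fit is going; it sums the RSS of the main process plus the DataLoader worker children (the 5-second interval and sample count are arbitrary choices, not anything from the actual run).

import time
import psutil

def log_rss(interval_secs=5, samples=12):
    proc = psutil.Process()

    def total_rss():
        rss = proc.memory_info().rss
        for c in proc.children(recursive=True):
            try:
                rss += c.memory_info().rss  # DataLoader worker processes
            except psutil.NoSuchProcess:
                pass  # a worker exited between enumeration and measurement
        return rss

    last = total_rss()
    for _ in range(samples):
        time.sleep(interval_secs)
        rss = total_rss()
        print(f'total RSS: {rss / 2**30:.1f} GB ({(rss - last) / 2**20:+.0f} MB)')
        last = rss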
This time around, I can’t reproduce it. Two differences:
- CPU utilisation never went above 800%; previously it was at 2000%
- Memory usage maxed out at 50GB
It makes sense that this process would use a lot of RAM, but I would have thought it could release some of it, since these are randomly generated images (unless they are being cached so they can be resent to the model?). So it would be great to figure this out and get the RAM usage down; for example, if you had 2 GPUs on your own box with 64GB RAM, this would prevent the 2nd GPU from being used.
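One knob I'm planning to try, on the assumption that the growth scales with the per-worker prefetch queues: drop num_workers and see whether total RSS is capped (at the cost of augmentation throughput). Roughly:

# Same loader as above, only with a smaller worker pool; 8 is an arbitrary
# test value, everything else is unchanged.
return ImageClassifierData.from_csv(path, f'{train_class}/train', labels_file_multi_dev, bs, tfms,
                                    suffix='', val_idxs=val_idxs,
                                    test_name=f'{train_class}/test', num_workers=8)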