Hi,
I am trying to run the notebook for lesson 3 on a standard Hetzner EX51-SSD-GPU server (GTX 1080 with 8 GB VRAM, 64 GB RAM), but I keep getting an OOM error when running this cell:
final_model.fit_generator(batches, samples_per_epoch=batches.nb_sample, nb_epoch=1,
                          validation_data=val_batches, nb_val_samples=val_batches.nb_sample)
Error message:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[32,64,226,226]
[[Node: Conv2D_40 = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](transpose_156, transpose_157)]]
[[Node: mul_64/_1211 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_2232_mul_64", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
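For scale, the tensor it fails to allocate is not huge by itself; a quick back-of-the-envelope in plain Python (float32 = 4 bytes per element) puts it around 400 MiB, which only fails because the GPU is already nearly full:

```python
# Size of the tensor TensorFlow failed to allocate: shape [32, 64, 226, 226], float32
elements = 32 * 64 * 226 * 226
size_mib = elements * 4 / 1024**2  # 4 bytes per float32 element
print(round(size_mib))  # -> 399 (MiB)
```

So a single conv activation of ~400 MiB is enough to tip the card over, given nvidia-smi below shows 7843 MiB of 8113 MiB already in use.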
Jeremy noted somewhere that TensorFlow can be made to use GPU memory more efficiently, so I added this option in my first cell (didn't help):
import keras.backend as K

def limit_mem():
    K.get_session().close()
    cfg = K.tf.ConfigProto()
    cfg.gpu_options.allow_growth = True
    K.set_session(K.tf.Session(config=cfg))

limit_mem()
I also tried halving samples_per_epoch and nb_val_samples in the fit call like so (didn't help):

final_model.fit_generator(batches, samples_per_epoch=batches.nb_sample/2, nb_epoch=1,
                          validation_data=val_batches, nb_val_samples=val_batches.nb_sample/2)
nvidia-smi output while the notebook is running:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 381.22 Driver Version: 381.22 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 0000:01:00.0 Off | N/A |
| 33% 33C P8 10W / 180W | 7843MiB / 8113MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 29731 C /usr/bin/python2 7829MiB |
+-----------------------------------------------------------------------------+
Any other ideas I could try?