I'm doing segmentation using a U-Net model with the standard configuration and a dice metric. On fastai 1.0.46, training correctly outputs train_loss, valid_loss, and dice. After upgrading to 1.0.49, valid_loss is always #nan# and the dice values are no longer output.
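For anyone unfamiliar with the metric, here is a minimal plain-Python sketch of a Dice coefficient on two flat binary masks. It is not fastai's implementation; the function name `dice_score` and the choice to return NaN for two empty masks are my own assumptions, made only to show where an undefined 0/0 can come from in this kind of metric.

```python
def dice_score(pred, target):
    """Dice coefficient for two flat binary masks (sequences of 0/1).

    Hypothetical helper for illustration; not the fastai `dice` function.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Number of positions where both masks are 1.
    intersect = sum(p * t for p, t in zip(pred, target))
    # Total number of positive pixels across both masks.
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks are empty, so 2*0/0 is undefined.
        # Returning NaN here is an assumption of this sketch.
        return float("nan")
    return 2.0 * intersect / total
```

Calling `dice_score([1, 1, 0, 0], [1, 0, 0, 0])` divides twice the overlap (1) by the total positive count (3). The NaN branch only covers the metric, so it does not by itself explain a NaN valid_loss.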
Machine configuration:
=== Software ===
python : 3.7.2
fastai : 1.0.49
fastprogress : 0.1.20
torch : 1.0.0
torch cuda : 9.0 / is available
torch cudnn : 7005 / is enabled
=== Hardware ===
torch devices : 2
- gpu0 : GeForce GTX 1080 Ti
- gpu1 : GeForce GTX 1080 Ti
=== Environment ===
platform : Windows-10-10.0.16299-SP0
conda env : fastai_v1
python : C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\python.exe
sys.path : C:\Windows\system32
C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\python37.zip
C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\DLLs
C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\lib
C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1
C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\lib\site-packages
no nvidia-smi is found