Possible error: seed not working with unet learner

@muellerzr I have tried it. However, after loading a saved model, lr_find shows a different plot each time I execute the cell.

@WaterKnight see Sylvain’s response here:

Why would two runs give the same plot? LR Finder runs a mock training that has some randomness + the head of the model is randomly initialized. Unless you go out of your way to set the seeds before the two runs, you won’t get the same graphs/suggestions exactly.

Along with this you should probably re-seed between each call to lr_find as well to be safe. Also please do not make duplicate topics. We’re not ignoring it, we are trying to figure it out ourselves.
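For example, a minimal sketch of re-seeding between runs, assuming fastai2's set_seed helper (discussed further down this thread) and a learner that already exists:

from fastai2.torch_core import set_seed

set_seed(42)      # reset the RNG state right before the first run
learn.lr_find()

set_seed(42)      # reset again before the second run so both use the same randomness
learn.lr_find()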

Sorry, my bad.

Even if the model was loaded previously, does it still add some randomness to it?

You are just throwing two lines of code at us without explaining what you are doing, so we are replying as best we can. You said you are creating a unet_learner (that is where the random part is added) and then running lr_find (which is random anyway).

The base algorithm used to train models is called SGD, and the S stands for stochastic. You should never expect to always get the same results because of that.

To get two identical runs, follow the lines of code I gave in the linked topic. They have been confirmed to work.


This is what I am doing.

# segmentation DataBlock: images + masks, resized and normalized with ImageNet stats
manual = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
                   get_items=partial(get_image_files, folders=[manual_name, test_name]),
                   get_y=get_y_fn,
                   splitter=FuncSplitter(ParentSplitter),
                   item_tfms=Resize((size, size)),
                   batch_tfms=Normalize.from_stats(*imagenet_stats))
dls = manual.dataloaders(path_images, bs=bs)

# pretrained resnet34 U-Net, trained in mixed precision
learn = unet_learner(dls, resnet34, metrics=[Dice(), JaccardCoeff()], wd=1e-2,
                     pretrained=True, normalize=True).to_fp16()

learn.load("stage-1")  # load the previously saved weights

learn.lr_find()

Sorry, didn’t want to make you angry :disappointed_relieved: :disappointed_relieved: :disappointed_relieved:. Okay, I understand that SGD does not make results reproducible.

However, I was just trying to make my notebooks fully reproducible, since this is for my final degree project.

Just add those three lines at the beginning:

set_seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

I tested on code using CAMVID (since I don’t have your dataset) and got the exact same results twice in a row (graph + suggestions).

Make sure you are on fastai’s master with an editable install, since the bug behind this was only fixed yesterday.
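For concreteness, here is a sketch of where those lines would sit relative to the code posted above, assuming the usual fastai2 star import and the same DataBlock and learner as earlier in the thread:

from fastai2.vision.all import *              # brings in set_seed, DataBlock, unet_learner, ...

set_seed(42)                                  # seed python, numpy and torch RNGs
torch.backends.cudnn.deterministic = True    # force deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False       # disable non-deterministic autotuning

dls = manual.dataloaders(path_images, bs=bs)  # `manual` is the DataBlock defined above
learn = unet_learner(dls, resnet34, metrics=[Dice(), JaccardCoeff()], wd=1e-2,
                     pretrained=True, normalize=True).to_fp16()
learn.lr_find()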

Thank you very much.

Which package does set_seed() come from?

set_seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

I was using the function listed above for setting the seed:

I am installing fastai 2 with pip like this:
pip install git+https://github.com/fastai/fastai2

What do you mean by editable install?

That is the editable install (what you were doing I think). For the set_seed functionality see the torch_core module:

No, set_seed comes from the torch_core module.
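Roughly, set_seed just seeds the usual RNGs in one call. A simplified sketch of the idea (not the exact fastai2 source):

import random
import numpy as np
import torch

def set_seed_sketch(s):
    # seed Python, NumPy and PyTorch so shuffling, augmentation and weight init line up
    random.seed(s)
    np.random.seed(s % (2**32 - 1))
    torch.manual_seed(s)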


Ah my bad, fixed the above link :slight_smile: thanks

What does that mean?

A quick look at the FAQ for this forum will explain it:

The exact method he mentions is in the GCP instructions; however, the pip install of the git repository will do the same thing, as I mentioned just a minute ago.

Sorry, my English is bad!

In the Learner base class the default value is Adam.

Is unet_learner using SGD instead of Adam?

No, all Learners default to Adam.

Looks like unet_learner does use Adam by default!

https://dev.fast.ai/vision.learner#unet_learner
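So the optimizer is Adam unless you override it. If you wanted plain SGD instead, you would pass it explicitly; a sketch, assuming fastai2's SGD optimizer function and the dls/metrics from earlier in the thread:

learn = unet_learner(dls, resnet34, opt_func=SGD,   # override the default Adam
                     metrics=[Dice(), JaccardCoeff()])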

I am still seeing strange things that I don't understand.

This first model was trained with fit_one_cycle(10, slice(1e-5,1e-4)):
[image: training results]

This second model was trained with fit_one_cycle(20, slice(1e-5,1e-4)):
[image: training results]

Why aren't the losses the same? It looks like the number of epochs is affecting the results. I have looked at the code of fit_one_cycle and it doesn't look like it should affect them.

If you change the number of epochs, the schedule of the learning rate is not the same, hence the different losses. See the doc of one cycle for more information (GitHub pages are down so I can't point to the docs directly, sorry).
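To make that concrete: with one-cycle, the learning rate at a given batch depends on how many total batches the schedule is stretched over, so the same batch index sees a different learning rate in a 10-epoch run than in a 20-epoch run. A minimal sketch of the idea, assuming the default cosine one-cycle shape with pct_start=0.25, div=25 and div_final=1e5 (not fastai's actual code):

import math

def one_cycle_lr(step, total_steps, lr_max, pct_start=0.25, div=25., div_final=1e5):
    # cosine warm-up from lr_max/div up to lr_max, then cosine anneal down to lr_max/div_final
    warm = int(total_steps * pct_start)
    if step < warm:
        p, lo, hi = step / warm, lr_max / div, lr_max
    else:
        p, lo, hi = (step - warm) / (total_steps - warm), lr_max, lr_max / div_final
    return lo + (hi - lo) * (1 - math.cos(math.pi * p)) / 2

# same batch index, different total lengths -> different learning rates
print(one_cycle_lr(100, total_steps=10 * 50, lr_max=1e-4))   # 10-epoch run, 50 batches/epoch
print(one_cycle_lr(100, total_steps=20 * 50, lr_max=1e-4))   # 20-epoch run, 50 batches/epoch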

Ohhh, that is an interesting behaviour I didn't know about. Good to know! By the doc, do you mean dev.fast.ai?

I have re-run a notebook from a week ago after updating fastai2 and I am getting different plots and errors! Has there been any important update to fastai2?