Possible error: seed not working with unet_learner

I have tried executing the following steps several times:

  • Create a unet_learner
  • Run lr_find()

Each time I have gotten a different plot.

I am setting the seed with the following function:

import random
import numpy as np
import torch

def random_seed(seed_value, use_cuda):
    np.random.seed(seed_value)                     # NumPy
    torch.manual_seed(seed_value)                  # PyTorch (CPU)
    random.seed(seed_value)                        # Python stdlib
    if use_cuda:
        torch.cuda.manual_seed(seed_value)
        torch.cuda.manual_seed_all(seed_value)     # all GPUs
        torch.backends.cudnn.deterministic = True  # needed for reproducibility
        torch.backends.cudnn.benchmark = False
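
For reference, I call it once right after the imports, before building anything (I am on a GPU, so use_cuda=True; the seed 2020 matches the one I pass to the splitter below):

random_seed(2020, use_cuda=True)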

In addition, I am also setting the seed in the splitter: RandomSplitter(valid_pct=0.1, seed=2020)

So I don't know why I am not getting reproducible results.

I have not found a solution so far!

@WaterKnight you should try again on the master branch. This was discovered earlier this week and fixed yesterday.

Ohhhhh, nice!!!

Thank you very much @muellerzr for the information!!!

@muellerzr I have tried it. However, after loading a saved model, lr_find shows a different plot each time I execute the cell.

@WaterKnight see Sylvain’s response here:

Why would two runs give the same plot? LR Finder runs a mock training that has some randomness + the head of the model is randomly initialized. Unless you go out of your way to set the seeds before the two runs, you won’t get the same graphs/suggestions exactly.

Along with this, you should probably re-seed between each call to lr_find as well, to be safe. Also, please do not make duplicate topics. We’re not ignoring it; we are trying to figure it out ourselves.
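
Something like this with your own helper (rough sketch; the point is just resetting the seed immediately before each call):

random_seed(2020, use_cuda=True)
learn.lr_find()  # first run

random_seed(2020, use_cuda=True)
learn.lr_find()  # re-seeded, should reproduce the first plot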

Sorry, my bad.

Even if the model was loaded from a saved checkpoint, does it still add some randomness?

You are just throwing out two lines of code without explaining what you are doing, so we are replying as best we can. You said you are creating a unet_learner (that is where the random part is added) and then running lr_find (which is random anyway).

The base algorithm used to train models is called SGD, and the S stands for stochastic. Because of that, you should never expect to always get the same results.

To get two identical runs, follow the lines of code I gave in the linked topic. They have been confirmed to work.


This is what I am doing.

manual = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
                   get_items=partial(get_image_files, folders=[manual_name, test_name]),
                   get_y=get_y_fn,
                   splitter=FuncSplitter(ParentSplitter),
                   item_tfms=Resize((size, size)),
                   batch_tfms=Normalize.from_stats(*imagenet_stats))

dls = manual.dataloaders(path_images, bs=bs)

learn = unet_learner(dls, resnet34, metrics=[Dice(), JaccardCoeff()], wd=1e-2,
                     pretrained=True, normalize=True).to_fp16()

learn.load("stage-1")

learn.lr_find()

Sorry, I didn’t want to make you angry :disappointed_relieved:. OK, I understand that SGD does not make results reproducible.

However, I was just trying to make my notebooks fully reproducible, since they are for my final degree project.

Just add these three lines at the beginning:

set_seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

I tested this on CAMVID (since I don’t have your dataset) and got the exact same results twice in a row (graph + suggestions).

Make sure you are on fastai’s master with an editable install, since the bug behind this was only fixed yesterday.
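
For reference, the test looked roughly like this (a sketch based on the standard CAMVID_TINY segmentation example, not the exact notebook):

from fastai2.vision.all import *

set_seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

path = untar_data(URLs.CAMVID_TINY)
codes = np.loadtxt(path/'codes.txt', dtype=str)
dls = SegmentationDataLoaders.from_label_func(
    path, bs=8,
    fnames=get_image_files(path/'images'),
    label_func=lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    codes=codes)

learn = unet_learner(dls, resnet34)
learn.lr_find()  # re-running everything from set_seed(42) onwards repeats the plot and suggestion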

Thank you very much.

Which package does set_seed() come from?

set_seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

I was using the random_seed function listed above to set the seed.

I am installing fastai 2 with pip like this:
pip install git+https://github.com/fastai/fastai2

What do you mean by editable install?

That is the editable install (what you were doing, I think). For the set_seed functionality, see the torch_core module:

No, set_seed comes from the torch_core module.
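
In essence it wraps the same calls as your helper. A rough paraphrase (not the exact source, check torch_core for that):

def set_seed(s):
    # roughly what fastai2.torch_core.set_seed does
    torch.manual_seed(s)
    np.random.seed(s % (2**32 - 1))
    random.seed(s)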


Ah my bad, fixed the above link :slight_smile: thanks

What does that mean?

A quick look at the FAQ for this forum will explain it:

The exact method he mentions is in the GCP instructions; however, the pip install from the git repository will do the same thing, as I mentioned a minute ago.
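
Concretely, the recipe looks like this (roughly what the fastai2 README describes; double-check there for the exact command):

git clone https://github.com/fastai/fastai2
cd fastai2
pip install -e ".[dev]"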

Sorry, my English is bad!

In the Learner base class the default optimizer is Adam.

Is unet_learner using SGD instead of Adam?

No, all Learners default to Adam.

Looks like unet_learner does default to Adam!

https://dev.fast.ai/vision.learner#unet_learner
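
And if you ever do want SGD, you can pass the optimizer explicitly (sketch, reusing the dls from above):

learn = unet_learner(dls, resnet34, opt_func=SGD)  # default is Adam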