Trouble in 1.0.55 making unet_learner with TensorDataset

ilgatogelato · July 13, 2019, 6:35pm

Hi all.
I am attempting to make the lesson7-superres notebook work on floating-point data instead of PNG images. I had it working reasonably well with a slightly older fastai version, but it broke when I updated.

By analogy with the lesson5-mnist notebook, I read the data in as numpy arrays, mapped them to tensors, made X_train, Y_train, X_valid, and Y_valid into TensorDatasets, and called DataBunch.create, so that I could pass the resulting DataBunch to unet_learner:

train_ds, valid_ds = TensorDataset(X_train, Y_train), TensorDataset(X_valid, Y_valid)
train_ds.c = 3
valid_ds.c = 3
data2 = DataBunch.create(train_ds, valid_ds, bs=10, num_workers=1)

But when I try to make the unet_learner:

wd = 1e-3
learn = unet_learner(data2, arch, wd=wd, loss_func=feat_loss, callback_fns=LossMetrics,
                     blur=True, norm_type=NormType.Weight)

I get the following error:

/group03/secrawle/Pystuf/python-virtual-environments/env/lib/python3.6/site-packages/fastai/callbacks/hooks.py in dummy_batch(m, size)
    103     "Create a dummy batch to go through `m` with `size`."
    104     ch_in = in_channels(m)
--> 105     return one_param(m).new(1, ch_in, *size).requires_grad_(False).uniform_(-1.,1.)
    106
    107 def dummy_eval(m:nn.Module, size:tuple=(64,64)):

TypeError: new() argument after * must be an iterable, not builtin_function_or_method

I’m not very knowledgeable about Python/PyTorch/etc. Is this a problem because Tensors have a built-in size method? Somehow this worked pretty recently.

With fastai 1.0.52 (I think it was 52?), I had this working. I could train the network, then read new datasets from disk and produce results. (However, presort=True wasn’t implemented yet in the ItemList part, so the dataset would be out of order compared to how it was on disk.)
It stopped working when I upgraded fastai to 1.0.55.

Any suggestions? Has anyone else built a unet_learner with tensor datasets in a more straightforward way? (I’m still padding my (1,128,128) arrays to (3,128,128), for instance, so I’m definitely doing a few clunky things. Have mercy, I’m learning.)

brian · August 28, 2019, 6:52pm

I am having the same issue with version 1.0.57. @ilgatogelato Did you find a fix?

ilgatogelato · August 28, 2019, 7:26pm

Hi Brian

Short answer is no, not really.

For now I am using 1.0.53. It has the “presort” option available in ItemList.from_folder(), but also lets me go from:

train_ds, valid_ds = TensorDataset(X_train, Y_train), TensorDataset(X_valid, Y_valid)

to

data = DataBunch.create(train_ds, valid_ds, bs=10, num_workers=1)

to

learn = unet_learner(data, …etc…)

without giving me an error that necessitates a deep dive to debug.

That said, staying stuck on an old version is obviously not a viable long-term solution.

My hope is that, as I work through Part 2 of the class (just started it yesterday), I’ll get comfortable enough with the guts of fastai (or just PyTorch for that matter) to find a more sustainable solution.

brian · August 28, 2019, 7:28pm

I tracked it down to here:

def unet_learner(data:DataBunch, arch:Callable, pretrained:bool=True, blur_final:bool=True,
                 norm_type:Optional[NormType]=NormType, split_on:Optional[SplitFuncOrIdxList]=None, blur:bool=False,
                 self_attention:bool=False, y_range:Optional[Tuple[float,float]]=None, last_cross:bool=True,
                 bottle:bool=False, cut:Union[int,Callable]=None, **learn_kwargs:Any)->Learner:
    "Build Unet learner from `data` and `arch`."
    meta = cnn_config(arch)
    body = create_body(arch, pretrained, cut)
    try:    size = data.train_ds[0][0].size
    except: size = next(iter(data.train_dl))[0].shape[-2:]

If I run data.train_ds[0][0].size
I get: <function Tensor.size>, i.e. the method itself rather than a size.

If I run next(iter(data.train_dl))[0].shape[-2:]
I get: torch.Size([256, 256])
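
The mechanism can be reproduced without fastai or PyTorch. The sketch below is only an illustration under stated assumptions: `FakeTensor` is a hypothetical stand-in that, like a tensor, exposes `size` as a method and `shape` as plain data, and `pick_size` and `make_dummy_shape` are made-up names that mimic the two lines quoted from unet_learner above and the `new(1, ch_in, *size)` call in the traceback. Reading `.size` without calling it never raises, so the fallback is skipped and the error only appears later, when the value is unpacked with `*`.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor: `size` is a method, `shape` is data."""
    def __init__(self, *shape):
        self.shape = tuple(shape)

    def size(self):
        return self.shape


def pick_size(item):
    # Same try/except pattern as the two lines quoted from unet_learner.
    try:
        return item.size          # attribute access succeeds, so nothing is raised
    except Exception:
        return item.shape[-2:]    # never reached when `size` exists as a method


def make_dummy_shape(ch_in, size):
    # Mimics `new(1, ch_in, *size)`: `size` has to be iterable here.
    return (1, ch_in, *size)


x = FakeTensor(3, 128, 128)
size = pick_size(x)
print(callable(size))             # True: this is the method, not a shape

try:
    make_dummy_shape(3, size)
    unpack_error = None
except TypeError as err:
    unpack_error = err            # unpacking a method with * fails
print(type(unpack_error).__name__)  # TypeError

# Slicing `shape` gives something that unpacks without trouble.
print(make_dummy_shape(3, x.shape[-2:]))  # (1, 3, 128, 128)
```
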

This is the breaking change:

brian · August 28, 2019, 7:41pm

If you are up for a little hacking, you can modify fastai/vision/learner.py to make it work:

Lines 116 and 117 currently look like this:
try: size = data.train_ds[0][0].size
except: size = next(iter(data.train_dl))[0].shape[-2:]

Replace them with:
size = next(iter(data.train_dl))[0].shape[-2:]
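
For anyone who would rather keep both code paths than delete the `try`, a guarded variant of the same idea is to check whether `size` is callable before trusting it. This is a sketch, not fastai code: `spatial_size` is a hypothetical helper name, and `MethodSized` and `PropertySized` are stand-ins for a raw tensor and an image-like wrapper respectively.

```python
def spatial_size(item):
    """Return the last two dimensions of `item` as a tuple (hypothetical helper)."""
    size = getattr(item, "size", None)
    if size is not None and not callable(size):
        return tuple(size)              # image-like object: `size` is already (H, W)
    return tuple(item.shape[-2:])       # tensor-like object: fall back to `shape`


class MethodSized:
    """Stand-in for a raw tensor: `size` is a method."""
    def __init__(self, *shape):
        self.shape = tuple(shape)

    def size(self):
        return self.shape


class PropertySized:
    """Stand-in for an image wrapper: `size` is a plain (H, W) value."""
    def __init__(self, *shape):
        self.shape = tuple(shape)
        self.size = self.shape[-2:]


print(spatial_size(MethodSized(3, 256, 256)))    # (256, 256)
print(spatial_size(PropertySized(3, 256, 256)))  # (256, 256)
```

Either way the result is a two-element size that can safely be unpacked with `*`.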

ilgatogelato · August 29, 2019, 2:53pm

That’s fantastic, thanks for the tip. I will try this next time I’m fastai-ing.

brian · August 30, 2019, 5:32am

I’ve fixed my loader in a more appropriate way. You just need to wrap your data in an Image object.
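
One possible reading of this suggestion, sketched without fastai so the mechanism is visible: wrap each tensor in an object whose `size` is a value rather than a method, and have the dataset return those wrapped items. Everything below is a stand-in and an assumption. `ImageLike` only imitates the one behaviour that matters here (assuming fastai's `Image` reports `size` as the last two dimensions), `WrappedPairs` is a hypothetical dataset class, and `FakeTensor` takes the place of a real tensor.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor: `size` is a method, `shape` is data."""
    def __init__(self, *shape):
        self.shape = tuple(shape)

    def size(self):
        return self.shape


class ImageLike:
    """Imitates an image wrapper: holds the pixels, exposes `size` as (H, W)."""
    def __init__(self, px):
        self.px = px

    @property
    def size(self):
        return tuple(self.px.shape[-2:])


class WrappedPairs:
    """Hypothetical dataset returning (input, target) pairs as wrapped items."""
    def __init__(self, xs, ys):
        assert len(xs) == len(ys)
        self.xs, self.ys = xs, ys

    def __len__(self):
        return len(self.xs)

    def __getitem__(self, i):
        return ImageLike(self.xs[i]), ImageLike(self.ys[i])


xs = [FakeTensor(3, 128, 128) for _ in range(4)]
ys = [FakeTensor(3, 128, 128) for _ in range(4)]
train_ds = WrappedPairs(xs, ys)

size = train_ds[0][0].size
print(size)              # (128, 128)
print((1, 3, *size))     # (1, 3, 128, 128)
```

With real data the wrapper would be fastai's own `Image` around real tensors. Whether the rest of the pipeline accepts such a dataset unchanged is not confirmed in this thread.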

ilgatogelato · August 30, 2019, 9:54pm

So my thought was to include a my_unet_learner() in my Python script, a copy of unet_learner with the edit you suggested, to avoid hacking inside the fastai code.
But that seems to require adding more pieces (cnn_config), or editing vision.py to put cnn_config inside __all__.
This works in the sense that I can train a unet on floating point arrays, but seems bad in the sense that the next update will break it again.

I see there’s a class Image in fastai/vision/image.py that is initialized with a Tensor.
When you say you wrap your data in an Image object, do you mean you do that to all your tensors, and then form a DataBunch/Dataset/whatever with the resulting images? I guess I will experiment with that next.

SaharNasser · June 28, 2021, 4:28am

Hi @brian, could you please elaborate on this?