In that case you may wish to use RandomResizedCropGPU instead (it’s designed for batch transforms).
Hi mueller @muellerzr, I have a question. The original images in my dataset are 6000*6000 and there are only a few of them, so I use the RandomCrop method in fastai v2 for augmentation. How could I run inference on the original 6000*6000 images? I appreciate the help!
the original images are like these:
You probably won’t be able to do that very easily, as a 6000x6000 image will probably give you an OOM (out of memory) error due to its sheer size. The best way to do this would be to split it into, say, 512x512 patches, feed those into your model, and then reconstruct the segmentation mask from the results (I’ve personally never done this, but I know it’s how it’s done).
So:
- Split 6000x6000 into ~100 images or so (depending on the inference resolution you try, this can go up or down, try various sizes and see what you like first on a small scale)
- Get some organized way of feeding in the images to the model so that you know how to reconstruct the overall image
- Feed to model
- Reconstruct segmentation mask
If you want to search the forums more, you’re looking for threads on “image tiles”
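The steps above can be sketched roughly as follows. This is a minimal illustration using NumPy only, not fastai code: `tile_boxes`, `predict_tiled`, and `fake_predict` are hypothetical names, and `fake_predict` stands in for a real model that returns a per-pixel mask for a patch. Edge tiles come out smaller than 512 here; a real model may need them padded.

```python
import numpy as np

def tile_boxes(height, width, tile):
    """Yield (top, left, bottom, right) boxes covering the image, row by row."""
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield top, left, min(top + tile, height), min(left + tile, width)

def predict_tiled(image, predict_tile, tile=512):
    """Run `predict_tile` on each patch and write its mask back in place."""
    height, width = image.shape[:2]
    mask = np.zeros((height, width), dtype=np.uint8)
    for top, left, bottom, right in tile_boxes(height, width, tile):
        patch = image[top:bottom, left:right]
        mask[top:bottom, left:right] = predict_tile(patch)
    return mask

# Hypothetical stand-in for a model: label a pixel 1 when it is bright.
def fake_predict(patch):
    return (patch.mean(axis=-1) > 127).astype(np.uint8)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(1300, 1100, 3), dtype=np.uint8)
mask = predict_tiled(image, fake_predict, tile=512)
```

Because each box records where its patch came from, the reconstruction is just a write back into the same slice. With non-overlapping 512x512 tiles, a 6000x6000 image gives 12 x 12 = 144 patches.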
If anyone wants to help out some, here is that Gaussian keypoint implementation I mentioned (realized today I never made the thing public). Folks who are familiar with pose detection I could certainly use your help. The implementation is based on HRNet:
I’m following the object detection tutorial from Walk with Fastai2 Vision for a private dataset, but am seeing issues with learn.show_results()
I am able to train, no issues there…
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
216 # in this case we return a big list
--> 217 try: return retain_type(torch.cat(xs, dim=dim), xs[0])
218 except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])
TypeError: expected Tensor as element 0 in argument 0, but got int
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
23 frames
<ipython-input-48-c3b657dcc9ae> in <module>()
----> 1 learn.show_results()
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in show_results(self, ds_idx, dl, max_n, shuffle, **kwargs)
252 if dl is None: dl = self.dls[ds_idx].new(shuffle=shuffle)
253 b = dl.one_batch()
--> 254 _,_,preds = self.get_preds(dl=[b], with_decoded=True)
255 self.dls.show_results(b, preds, max_n=max_n, **kwargs)
256
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, inner, reorder, **kwargs)
227 for mgr in ctx_mgrs: stack.enter_context(mgr)
228 self(event.begin_epoch if inner else _before_epoch)
--> 229 self._do_epoch_validate(dl=dl)
230 self(event.after_epoch if inner else _after_epoch)
231 if act is None: act = getattr(self.loss_func, 'activation', noop)
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in _do_epoch_validate(self, ds_idx, dl)
183 with torch.no_grad(): self.all_batches()
184 except CancelValidException: self('after_cancel_validate')
--> 185 finally: self('after_validate')
186
187 @log_args(but='cbs')
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in __call__(self, event_name)
132 def ordered_cbs(self, event): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, event)]
133
--> 134 def __call__(self, event_name): L(event_name).map(self._call_one)
135 def _call_one(self, event_name):
136 assert hasattr(event, event_name)
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
375 else f.format if isinstance(f,str)
376 else f.__getitem__)
--> 377 return self._new(map(g, self))
378
379 def filter(self, f, negate=False, **kwargs):
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
325 @property
326 def _xtra(self): return None
--> 327 def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
328 def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
329 def copy(self): return self._new(self.items.copy())
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
45 return x
46
---> 47 res = super().__call__(*((x,) + args), **kwargs)
48 res._newchk = 0
49 return res
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
316 if items is None: items = []
317 if (use_list is not None) or not _is_array(items):
--> 318 items = list(items) if use_list else _listify(items)
319 if match is not None:
320 if is_coll(match): match = len(match)
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in _listify(o)
252 if isinstance(o, list): return o
253 if isinstance(o, str) or _is_array(o): return [o]
--> 254 if is_iter(o): return list(o)
255 return [o]
256
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
218 if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
219 fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 220 return self.fn(*fargs, **kwargs)
221
222 # Cell
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in _call_one(self, event_name)
135 def _call_one(self, event_name):
136 assert hasattr(event, event_name)
--> 137 [cb(event_name) for cb in sort_by_run(self.cbs)]
138
139 def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in <listcomp>(.0)
135 def _call_one(self, event_name):
136 assert hasattr(event, event_name)
--> 137 [cb(event_name) for cb in sort_by_run(self.cbs)]
138
139 def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
/usr/local/lib/python3.6/dist-packages/fastai2/callback/core.py in __call__(self, event_name)
22 _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
23 (self.run_valid and not getattr(self, 'training', False)))
---> 24 if self.run and _run: getattr(self, event_name, noop)()
25 if event_name=='after_fit': self.run=True #Reset self.run to True at each end of fit
26
/usr/local/lib/python3.6/dist-packages/fastai2/callback/core.py in after_validate(self)
94 "Concatenate all recorded tensors"
95 if self.with_input: self.inputs = detuplify(to_concat(self.inputs, dim=self.concat_dim))
---> 96 if not self.save_preds: self.preds = detuplify(to_concat(self.preds, dim=self.concat_dim))
97 if not self.save_targs: self.targets = detuplify(to_concat(self.targets, dim=self.concat_dim))
98 if self.with_loss: self.losses = to_concat(self.losses)
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs[0].keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs[0].keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs[0].keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs[0].keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs[0].keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs[0].keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
217 try: return retain_type(torch.cat(xs, dim=dim), xs[0])
218 except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])
--> 219 for i in range_of(o_)) for o_ in xs], L())
220
221 # Cell
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
217 try: return retain_type(torch.cat(xs, dim=dim), xs[0])
218 except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])
--> 219 for i in range_of(o_)) for o_ in xs], L())
220
221 # Cell
/usr/local/lib/python3.6/dist-packages/fastcore/utils.py in range_of(x)
170 def range_of(x):
171 "All indices of collection `x` (i.e. `list(range(len(x)))`)"
--> 172 return list(range(len(x)))
173
174 # Cell
TypeError: object of type 'int' has no len()
I have tried to understand what is going on, but am at a loss right now
Any pointers highly appreciated!
As mentioned in other threads, object detection is in its infancy in fastai. show_results won’t work in that example, nor will predict, etc. It requires custom post-processing. This thread has some explanations and some examples:
I’m trying to figure out how to set up a CNN that takes >2 images as input and outputs a single prediction (e.g. classification). The tl;dr is that I’m stuck on how to combine the outputs from the backbone at the output layer.
My ASCII art:
Image1 -> CNN_1a (shared weights with 1b) ---|
|----> FC layers -> Softmax (output)
Image2 -> CNN_1b (shared weights with 1c) ---|
So far, I’ve been able to just add another ImageBlock:
DataBlock(blocks=(ImageBlock, ImageBlock, CategoryBlock),
get_x=[ColReader("file1"),ColReader("file2")],
get_y=ColReader("label"))
Borrowed some tricks from Jeremy’s fastbook and @muellerzr notebook on Siamese. Btw this is “sort of” like a Siamese model, except it takes 2 specific images to make a classification (not trying to predict if 2 images are different).
class NotReallySiameseModel(Module):
    def __init__(self, encoder, head):
        self.encoder, self.head = encoder, head
    def forward(self, x1, x2):
        ftrs = torch.cat([self.encoder(x1), self.encoder(x2)], dim=1)
        return self.head(ftrs)

encoder = create_body(resnet34, cut=-2)
head = create_head(512*4, 2, ps=0.5)
model = NotReallySiameseModel(encoder, head)

def siamese_splitter(model):
    return [params(model.encoder), params(model.head)]
Then finally I get:
learn = Learner(dls, model, splitter=siamese_splitter, metrics=accuracy)
learn.freeze()
learn.summary()
The summary printout indicates my output is correct:
NotReallySiameseModel (Input shape: ['32 x 3 x 224 x 224', '32 x 3 x 224 x 224'])
But my last layer is a 32 x 2 output (batch size is 32, and so it’s still getting 2 separate outputs).
Has anyone worked with similar architectures before? Would appreciate any help on how to combine output from 2 networks into 1. Thanks!
This is very weird. Following up on my post, the setup I mentioned above does train… but how is that possible when I only have one y_label? I extended the model to take in 4 inputs and, as expected, I’m getting a 32 x 4 output.
When I call get_preds, I’m actually getting a 1-d vector of 4 numbers as the output. Is there a bug somewhere? There are only 2 categories in my label, so it’s impossible to create a one-hot encoding that has more than 2 numbers…
How should I go about unpaired style transfer (GAN) between two different domains? I am trying to understand how I should use the DataBlock API to create a dataloader with 2 datasets.
When I had this code in PyTorch, I just had two dataloaders that I iterated through with zip, but I’m not sure how to do this using the fastai API.
If using native PyTorch, you can work with unfold to split the inference image into tiles (sliding window) and create a tensor.
Run inference on the tensor and stitch the output back into a whole image using permute.
You can look at this code for a working example.
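For readers without the linked code, here is a minimal sketch of the unfold/permute round trip in plain PyTorch. It assumes non-overlapping tiles and an image whose height and width are multiples of the tile size; `to_tiles` and `from_tiles` are hypothetical helper names, and the model call that would sit between them is left out.

```python
import torch

def to_tiles(x, k):
    """Split a (B, C, H, W) tensor into non-overlapping k x k tiles.

    Assumes H and W are multiples of k. Returns tiles of shape
    (B * nH * nW, C, k, k) and the grid needed to undo the split.
    """
    b, c, h, w = x.shape
    patches = x.unfold(2, k, k).unfold(3, k, k)      # (B, C, nH, nW, k, k)
    n_h, n_w = patches.shape[2], patches.shape[3]
    tiles = patches.permute(0, 2, 3, 1, 4, 5).reshape(-1, c, k, k)
    return tiles, (b, n_h, n_w)

def from_tiles(tiles, grid):
    """Stitch tiles produced by `to_tiles` back into (B, C, H, W)."""
    b, n_h, n_w = grid
    c, k_h, k_w = tiles.shape[1:]
    out = tiles.reshape(b, n_h, n_w, c, k_h, k_w)
    out = out.permute(0, 3, 1, 4, 2, 5)              # (B, C, nH, k, nW, k)
    return out.reshape(b, c, n_h * k_h, n_w * k_w)

x = torch.arange(2 * 3 * 8 * 12, dtype=torch.float32).reshape(2, 3, 8, 12)
tiles, grid = to_tiles(x, k=4)
# A model would be applied to `tiles` here; this sketch stitches them unchanged.
restored = from_tiles(tiles, grid)
```

The tiles form an ordinary batch, so they can be passed through a model in chunks that fit in memory. If the model changes the channel count (for example, to class scores), `from_tiles` still works because it reads the channel size from the tiles it receives.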
You created your head with 2 outputs (specified in create_head above), so it should give you a batch × 2 output, which it sounds like it did. In this case, I think because you followed the Siamese tutorial, the two outputs were meant for the predicted probabilities of ‘Similar’ / ‘Dissimilar’.
If you are trying to do classification for n classes, your head should have n outputs, i.e. created with:
head = create_head(512*4, n, ps=0.5)
Then your model will output n activations, which get softmax’d to give predicted probabilities, and the class with the highest probability is your predicted class.
I think that’s what you were asking? If not, please provide more details and code snippets…
Yijin
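To make the shapes in the reply above concrete, here is a small plain-PyTorch sketch. It is not the fastai model from the thread: `SharedEncoderClassifier` and its tiny encoder are hypothetical stand-ins for the resnet34 body and `create_head`. It shows that the number of input images only changes the width of the concatenated features, while the output width is whatever the head is built with.

```python
import torch
from torch import nn

class SharedEncoderClassifier(nn.Module):
    """One encoder applied to every image; features concatenated before the head."""
    def __init__(self, n_images, n_classes, n_features=16):
        super().__init__()
        # Toy encoder standing in for a pretrained backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, n_features, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # The head sees n_images feature vectors and emits n_classes activations.
        self.head = nn.Linear(n_features * n_images, n_classes)

    def forward(self, *images):
        feats = torch.cat([self.encoder(im) for im in images], dim=1)
        return self.head(feats)

model = SharedEncoderClassifier(n_images=4, n_classes=2)
batch = [torch.randn(32, 3, 64, 64) for _ in range(4)]
logits = model(*batch)             # one row of n_classes activations per sample
probs = logits.softmax(dim=1)      # predicted probabilities
preds = probs.argmax(dim=1)        # predicted class index per sample
```

A head built with the wrong number of outputs still runs, because nothing in the forward pass checks the output width against the number of labels.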
How did you ‘extend’ the model to take 4 inputs? What did you change in create_head?
Yes, thanks that worked. Apparently I forgot to change the number of outputs which defaulted to 4… but the weird thing is that the model shouldn’t “compile” with the wrong number of outputs.
I did not do anything else fancy. Used the Siamese code, generalized it to take 4 images, and set the right output to 2.
I have saved semantic segmentation masks on disk as uint8 images with values 0 and 1: 0 is the background class and 1 is the other class.
These images look completely black when saved as PNGs. When they are loaded with a dataloader obtained from the DataBlock API, show_batch shows them well. PILMask also shows them well.
manual = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
get_items=partial(get_image_files,folders=[manual_name]),
get_y=get_y_fn,
splitter=RandomSplitter(valid_pct=0.1,seed=2020),
item_tfms=Resize((size,size)),
batch_tfms=Normalize.from_stats(*imagenet_stats)
)
I was wondering if there is a way of saving the images on disk with class 0 as value 0 and class 1 as value 255, so that the selection is shown in white. I have tried this approach and loaded some models that I trained some months ago. I am validating with learn.validate() and I am getting the following error:
CUDA error: device-side assert triggered
However, if the images are saved with values 0 and 1, I don’t get any problems at all.
Is there a way with the DataBlock API to force the mask to be normalized?
I have solved it by using the following ItemTransform:
class TargetMaskConvertTransform(ItemTransform):
    def __init__(self):
        pass
    def encodes(self, x):
        img,mask = x
        # Convert to array
        mask = np.array(mask)
        # Change 255 for 1
        mask[mask==255]=1
        # Back to PILMask
        mask = PILMask.create(mask)
        return img, mask
@muellerzr do you think that is a good approach? Or is it better to save the mask as a single-channel uint8 image with values [0,1], even though the images will look completely black?
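One plausible reading of the error above, offered as an assumption: with two classes the loss expects target values 0 and 1, so a 255 in the mask is an out-of-range class index. A minimal NumPy sketch of converting between the two representations follows; `to_class_indices` and `to_display` are hypothetical helper names, not fastai functions.

```python
import numpy as np

def to_class_indices(mask_255):
    """Map a 0/255 display mask to 0/1 class indices for training."""
    return (np.asarray(mask_255) > 0).astype(np.uint8)

def to_display(mask_01):
    """Map 0/1 class indices to a 0/255 mask that shows up white in viewers."""
    return (np.asarray(mask_01, dtype=np.uint8) * 255).astype(np.uint8)

display = np.array([[0, 255], [255, 0]], dtype=np.uint8)
indices = to_class_indices(display)
round_trip = to_display(indices)
```

Either storage format can work as long as the mask holds class indices by the time it reaches the loss. Keeping 0/1 on disk avoids the extra transform, while keeping 0/255 makes the files easier to inspect by eye.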
Hi all.
I am looking into xresnet and its variants and had a quick question about the attention. It seems they all use spectral norms, but that mainly comes from generative tasks. I can’t really find it used in other multi/self attention uses. Did this come from performance on Imagenette and ImageWoof? Guess I’m asking if it’s another one of those places where fastai is just ahead of the curve =)
Thanks much
Yes it did, and we found self attention helped.
Thanks Zach! Another low-level question, in bag-of-tricks they used bottleneck residual blocks, but we only use those when expansion != 1. Is this yet another case where regular blocks shone out in ImageNette/Woof? (it might be fair to assume the answer to questions like this is almost always yes ><)
Anything related to xresnet, how it works, and its modifications came from Woof/Nette
Hey All, thanks for creating this place
I’m working on a depth estimation project. I used to use fastai v1, but have now decided to upgrade to fastai v2. In v1, I used ImageImageList and seemed to get some results, but I noticed there isn’t a direct equivalent in v2 and set out to create my own data loader. I’m currently using the following one:
class ImageImageDataLoaders(DataLoaders):
    "Basic wrapper around several `DataLoader`s with factory methods for Image to Image problems"
    @classmethod
    @delegates(DataLoaders.from_dblock)
    def from_label_func(cls, path, fnames, label_func, valid_pct=0.2, seed=None, item_tfms=None, batch_tfms=None, **kwargs):
        "Create from list of `fnames` in `path`s with `label_func`."
        dblock = DataBlock(blocks=(ImageBlock, ImageBlock(cls=PILImageBW)),
                           splitter=RandomSplitter(valid_pct, seed=seed),
                           get_y=label_func,
                           item_tfms=item_tfms,
                           batch_tfms=batch_tfms)
        res = cls.from_dblock(dblock, fnames, path=path, **kwargs)
        return res
Where the inputs are 3 channel .jpg images and the outputs are 1 channel BW .png images.
For some reason when I run this on my data (without errors), the outputs of data.show_batch() show that the labels are completely white images.
Is there a way to look at the values the dataloader has for the outputs after it was created, or some other convenient way to debug this?
Thanks!
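One way to inspect the values, sketched under assumptions: pull a batch with `one_batch()` and print basic statistics of the targets. The `summarize` helper below is hypothetical and uses only NumPy; the fastai usage in the comment is untested against this particular dataset.

```python
import numpy as np

def summarize(name, batch):
    """Print and return shape, dtype, value range, and unique-value count of a batch."""
    arr = np.asarray(batch)
    stats = {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "min": float(arr.min()),
        "max": float(arr.max()),
        "n_unique": int(np.unique(arr).size),
    }
    print(name, stats)
    return stats

# With fastai this would look roughly like (assumption, not run here):
#   xb, yb = data.one_batch()
#   summarize("targets", yb.cpu().numpy())
# Stand-in batch: targets that are the same value everywhere.
fake_targets = np.full((4, 1, 8, 8), 1.0, dtype=np.float32)
stats = summarize("targets", fake_targets)
```

If `min` and `max` are equal, the labels really are constant after loading. If they differ but span a narrow or unexpected range, the white images may be a display scaling issue, not a data issue.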