Failing with face landmarks data preparation

(Ilia) #1

Recently I’ve started working on a face landmark detection project. However, I can’t figure out how to prepare the dataset for use with fastai.

I am trying to use the data_block API instead of the custom datasets I usually use. I have a folder with the following structure:

    - 1.jpeg
    - 1.txt
    - 2.jpeg
    - 2.txt

Each .txt file contains two columns of coordinates (Y first, then X). The coordinates are already scaled to the range [-1, 1]. However, as far as I know, fastai rescales coordinates automatically, and I am not sure how to disable that.

I am using the following snippet to construct the data bunch:

def read_landmarks(img_path):
    index = img_path.stem
    pts_path = img_path.parent/f'{index}.txt'
    img = imread(img_path)
    h, w = img.shape[:2]
    pts = np.loadtxt(pts_path, delimiter=',')
    xs, ys = pts[:, 1], pts[:, 0]
    xs, ys = to_absolute(xs, ys, w, h)
    return np.c_[ys, xs]

p = (
    split_by_idx(trn_idx, val_idx).


However, the following error is raised:

~/code/fastai_v1/repo/fastai/vision/ in scale_flow(flow, to_unit)
    417     "Scale the coords in `flow` to -1/1 or the image size depending on `to_unit`."
    418     s = tensor([flow.size[0]/2,flow.size[1]/2])[None]
--> 419     if to_unit: flow.flow = flow.flow/s-1
    420     else:       flow.flow = (flow.flow+1)*s
    421     return flow

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/ in __rdiv__(self, other)
    365     def __rdiv__(self, other):
    366         if self.dtype.is_floating_point:
--> 367             return self.reciprocal() * other
    368         else:
    369             return (self.double().reciprocal() * other).type_as(self)

TypeError: mul(): argument 'other' (position 1) must be Tensor, not numpy.ndarray

The error is raised when I try to print the p object. As far as I can see, it happens because the __repr__ magic method invokes a method that tries to rescale the coordinates and fails.
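
For reference, the arithmetic in the `scale_flow` lines of the traceback can be sketched in plain Python. This is only an illustration, not fastai's code: the names `to_unit` and `to_pixels` are hypothetical, and it assumes points are stored as (y, x) pairs and the image size as (height, width).

```python
def to_unit(points, height, width):
    """Map pixel (y, x) coordinates to [-1, 1], i.e. flow / s - 1 with s = size / 2."""
    sy, sx = height / 2, width / 2
    return [(y / sy - 1, x / sx - 1) for y, x in points]


def to_pixels(points, height, width):
    """Inverse mapping from [-1, 1] back to pixels, i.e. (flow + 1) * s."""
    sy, sx = height / 2, width / 2
    return [((y + 1) * sy, (x + 1) * sx) for y, x in points]


# Hypothetical points on a 224x224 image: top-left corner, centre, bottom-right corner.
pixels = [(0.0, 0.0), (112.0, 112.0), (224.0, 224.0)]
unit = to_unit(pixels, 224, 224)
print(unit)  # [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
```

If the library applies the pixel-to-unit mapping itself, points that are already in [-1, 1] would be scaled a second time, which is presumably why the snippet converts them back to absolute coordinates first.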

Could someone help with these two questions:

  1. How can I disable coordinate rescaling?
  2. Is it possible to solve this issue without disabling rescaling?

=== Software ===
python        : 3.7.1
fastai        : 1.0.39
fastprogress  : 0.1.18
torch         : 1.0.0
nvidia driver : 410.73
torch cuda    : 9.0.176 / is available
torch cudnn   : 7401 / is enabled

=== Hardware ===
nvidia gpus   : 2
torch devices : 2
  - gpu0      : 11177MB | GeForce RTX 2080
  - gpu1      : 7952MB | GeForce GTX 1080 Ti

=== Environment ===
platform      : Linux-4.15.0-43-generic-x86_64-with-debian-buster-sid
distro        : Ubuntu 18.04 Bionic Beaver
conda env     : fastai
python        : /home/ck/anaconda3/envs/fastai/bin/python
sys.path      : /home/ck/code/tasks/face_landmarks_detection

Probably there is some link to the docs that I’ve missed.

(Ilia) #2

Ok, I’ve updated the target reading function a bit so now it returns tensors:

def read_landmarks(img_path):
    index = img_path.stem
    pts_path = img_path.parent/f'{index}.txt'
    img = imread(img_path)
    h, w = img.shape[:2]
    pts = np.loadtxt(pts_path, delimiter=',')
    xs, ys = pts[:, 1], pts[:, 0]
    xs, ys = to_absolute(xs, ys, w, h)
    return torch.FloatTensor(np.c_[ys, xs])

Sure enough, the error disappeared… Not sure it was the only problem though :smile:


You had the error because of a mix of numpy arrays and tensors, so returning tensors was a good idea!
To disable the scaling there is an option in ImageBBox, I think it’s named do_scale. Of course, you’ll have to subclass PointsItemList and override its get method.

How to create PointsItemList with ImageBBox for Localization?
(Ilia) #4

Unfortunately, there is one more problem. The pipeline was successfully instantiated (without any warnings) using this snippet (and the function above):

t = get_transforms(do_flip=True, 
p = (PointsItemList.
     split_by_idxs(trn_idx, val_idx).
     transform(t, tfm_y=True, size=(224, 224), padding_mode='zeros').

However, when I am trying to use it for training:

learn = create_cnn(bunch, models.resnet34)

I am getting the error:

RuntimeError: Traceback (most recent call last):
  File "/home/ck/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/utils/data/", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ck/code/fastai_v1/repo/fastai/", line 105, in data_collate
  File "/home/ck/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/utils/data/", line 232, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/ck/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/utils/data/", line 232, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/ck/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/utils/data/", line 209, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 21 and 20 in dimension 1 at /opt/conda/conda-bld/pytorch_1544176307774/work/aten/src/TH/generic/THTensorMoreMath.cpp:1333

It seems like some of the landmark arrays are missing values. I have 21x2 landmarks per image, but the error message says that some of the samples have fewer. Could this be caused by the transformations? Perhaps some of the landmarks “fall” outside of the image during rotation or warping?

Also, every once in a while I am getting the warning:

/home/ck/code/fastai_v1/repo/fastai/ UserWarning: It's not possible to collate samples of your dataset together in a batch.
Shapes of the inputs/targets:
[[torch.Size([3, 224, 224]), torch.Size([3, 224, 224]) [...] torch.Size([3, 224, 224])
torch.Size([21, 2]), torch.Size([21, 2]), [...] torch.Size([21, 2])]]

So it seems that something happens to the data during transformation, because the original dataset is prepared so that all landmarks are within the face area, and each face always has exactly 21 landmark points.

Or do I need to include more padding, or reduce the magnitude of augmentation parameters?


With the data augmentation, it’s very likely you’re losing points because they fall off the picture. You either need to:

  1. use less data augmentation
  2. use the option remove_out=False in ImagePoints (so the points are not removed); beware that you will then have targets outside of [-1, 1], so adapt the scaling of your last layer accordingly.
  3. pad your inputs to deal with those missing points
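
To make the "Got 21 and 20" mismatch concrete, here is a minimal pure-Python sketch of how a small rotation can push one landmark outside [-1, 1], so that a filter which drops outside points returns fewer targets. It is not fastai's implementation: `rotate`, `drop_outside`, and the landmark values are all hypothetical and chosen only for illustration.

```python
import math


def rotate(points, degrees):
    """Plain 2-D rotation of (y, x) points around the image centre (0, 0)."""
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    return [(y * cos_t + x * sin_t, -y * sin_t + x * cos_t) for y, x in points]


def drop_outside(points):
    """Keep only the points that still lie inside the [-1, 1] x [-1, 1] square."""
    return [(y, x) for y, x in points if -1 <= y <= 1 and -1 <= x <= 1]


# 21 made-up landmarks: 20 near the centre and one close to a corner.
landmarks = [(0.02 * i, -0.02 * i) for i in range(20)] + [(0.95, 0.95)]

kept = drop_outside(rotate(landmarks, 10))
print(len(landmarks), len(kept))  # 21 20
```

A sample that ends up with 20 points cannot be stacked with samples that kept all 21, which matches the collate error shown earlier in the thread.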

(Ilia) #6

Yeah, makes sense! Interestingly, this time (while on master) I am getting a different kind of error:

There seems to be something wrong with your dataset, can't access self.train_ds[i] for all i in 
[65237, 47545, 8078, 53990, ..., 758]

This is with the same code and data as before; I only updated the library.


What’s a minimal repro of this bug?

(Ilia) #8

I’ll create a dummy repo and try to reproduce the issue. I am still not sure whether it is a bug in my code. Basically, it is mostly about reading images from a single folder that contains txt and jpeg files.
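
Before building a repro, one quick way to rule out the data itself is to scan the folder for inconsistent landmark files. The sketch below is a hypothetical helper (`find_bad_samples` is not part of fastai) that assumes the layout described in the first post: each `N.jpeg` has an `N.txt` with 21 comma-separated rows of two numbers. It only checks the text files, not whether the images decode.

```python
from pathlib import Path


def find_bad_samples(folder, n_points=21):
    """List (image name, reason) for samples whose landmarks file looks wrong."""
    problems = []
    for img_path in sorted(Path(folder).glob('*.jpeg')):
        pts_path = img_path.with_suffix('.txt')
        if not pts_path.exists():
            problems.append((img_path.name, 'missing landmarks file'))
            continue
        # Split every non-empty line on commas, the delimiter used with np.loadtxt.
        rows = [line.split(',') for line in pts_path.read_text().splitlines()
                if line.strip()]
        if len(rows) != n_points:
            problems.append((img_path.name, f'{len(rows)} points instead of {n_points}'))
            continue
        try:
            for row in rows:
                y, x = (float(value) for value in row)  # exactly two numbers per row
        except ValueError:
            problems.append((img_path.name, 'row is not two comma-separated numbers'))
    return problems
```

Calling `find_bad_samples('path/to/folder')` returns an empty list when every landmarks file has the expected shape; in that case the files are probably fine and the problem more likely lies in the transforms or the library version.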