ImagePoints problem with Kaggle Cat Dataset (Facial features recognition)

I was trying to use the approach from lesson3-head-pose.ipynb for the Kaggle Cat Dataset.

The annotation data for facial features is stored in text files as a list of coordinates for 9 points (x is the first coordinate, y the second):

  • Left Eye
  • Right Eye
  • Mouth
  • Left Ear-1
  • Left Ear-2
  • Left Ear-3
  • Right Ear-1
  • Right Ear-2
  • Right Ear-3

I wanted to check whether everything was alright with the labels, so I wrote the following functions:

def get_face_features(image_fname):
    feature_txt = path / f'{image_fname}.cat'
    data = np.genfromtxt(feature_txt, dtype=np.float32)
    return tensor(data[1:].reshape((-1, 2)))

def show_cat_with_features(image_fname):
    img = open_image(path/image_fname)
    # y_first=False because all points are given as (x, y), not (y, x).
    img.show(y=ImagePoints(FlowField(img.size, get_face_features(image_fname)), scale=True, y_first=False), figsize=(10, 10))

After calling show_cat_with_features on an image, I got the following result:

As you can see, the features are not at the correct points, so I started looking for the error in my code. After some time I got it to work:

def get_face_features(image_fname):
    feature_txt = path / f'{image_fname}.cat'
    data = np.genfromtxt(feature_txt, dtype=np.float32)
    return tensor(data[1:].reshape((-1, 2))).flip(1)

def show_cat_with_features(image_fname):
    img = open_image(path/image_fname)
    img.show(y=ImagePoints(FlowField(img.size, get_face_features(image_fname)), scale=True), figsize=(10, 10))


The result was the following:

The two changes in the code were:

  1. Add flip(1) to the tensor in get_face_features
  2. Remove y_first=False in show_cat_with_features, since the points are now in (y, x) format.

As far as I understand, these changes shouldn’t make any difference, so I dug into the fastai code and found the following code:

class ImagePoints(Image):
    "Support applying transforms to a `flow` of points."
    def __init__(self, flow:FlowField, scale:bool=True, y_first:bool=True):
        "Create from raw tensor image data `px`."
        if scale: flow = scale_flow(flow)
        if y_first: flow.flow = flow.flow.flip(1)

As you can see, we first apply scale_flow and then flip. In my opinion this could be a bug, as these operations are not commutative.
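The order dependence can be checked with a small NumPy stand-in. This is only a sketch, not fastai's code: `scale_to_unit` and `flip` are hypothetical helpers, and the sketch assumes the scaling step divides the first column by half the image height and the second column by half the image width (that is, it expects (y, x) rows). The image size and the point are made up.

```python
import numpy as np

def scale_to_unit(points, size):
    """Map pixel coordinates to [-1, 1].

    Assumes each row of `points` is (y, x) and `size` is (height, width),
    so column 0 is divided by height/2 and column 1 by width/2.
    """
    half = np.array([size[0] / 2, size[1] / 2])
    return points / half - 1

def flip(points):
    """Swap the two coordinate columns, like `.flip(1)` on an (N, 2) tensor."""
    return points[:, ::-1]

size = (200, 400)                    # hypothetical image: height 200, width 400
pts_xy = np.array([[300.0, 50.0]])   # one point stored as (x, y)

scale_then_flip = flip(scale_to_unit(pts_xy, size))
flip_then_scale = scale_to_unit(flip(pts_xy), size)

print(scale_then_flip)  # x was divided by height/2, so one value leaves [-1, 1]
print(flip_then_scale)  # both coordinates stay inside [-1, 1]
```

Under these assumptions the two orders only agree when the image is square, which may be why the problem is easy to miss on square test images.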

Am I right?

Thanks in advance!

Hi @gurev,
what was the reason you used dtype=np.float32?

data = np.genfromtxt(feature_txt, dtype=np.float32)

Is it just for fast computation and efficiency in storage?

I tried using np.float64 and it did not work. I really can’t figure out why.

Hi, I don’t think there was any reason besides computation speed and memory usage. It is strange that it doesn’t work with np.float64.

I’ll try to find my notebook and check it, but I’m not sure that I still have this notebook in my repo :frowning:
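One possible explanation for the np.float64 problem, offered as an assumption since the thread never confirms it: a PyTorch tensor built from a float64 array stays double precision, while model weights are float32 by default, and mixing the two can raise dtype errors. The sketch below only shows the NumPy side, parsing a made-up annotation line (a point count followed by 9 (x, y) pairs) and casting it explicitly to float32.

```python
import io
import numpy as np

# Hypothetical annotation line: point count, then 9 (x, y) pairs.
sample = "9 175 160 239 162 199 199 149 121 137 78 166 93 281 101 312 96 296 133"

as_f64 = np.genfromtxt(io.StringIO(sample), dtype=np.float64)
as_f32 = as_f64.astype(np.float32)   # explicit cast to single precision

# Drop the leading count and keep one (x, y) row per point.
points = as_f32[1:].reshape(-1, 2)
print(as_f64.dtype, as_f32.dtype, points.shape)
```

If the data has to be read as float64 for some reason, casting before building the tensor keeps the labels in the same precision as the rest of the pipeline.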

Hi @gurev,
I see that you might not have your notebook anymore, but I have a question that you might be able to help with. I am also working through the methodology from the lesson 3 head pose notebook, in my case to detect cars. I am able to get the center points to align correctly on top of the cars, similar to what you have with the points on the cat. However, when I try to build my data bunch with the function I created to match the points to the image (my labels in this case), the data bunch reads in both the images (jpgs) and the text files (txt files that hold the label data), raising a torch size mismatch. I was wondering if you remember how you created your data bunch for this effort. Here is a more detailed writeup of my issue (DataBunch error).


Hi @mlxMantic. Thanks for your question. I have found my notebook on my external drive; please feel free to look at it: cats.ipynb

I didn’t encounter the error you describe, but maybe my notebook will be of help to you.


Thank you @gurev, that lends a little bit of clarity. I’m guessing that for your model the label lengths were constant, so you didn’t run into the issue I am having, where images have anywhere from 0 to n labels. Off the top of your head, would you have an idea of how to modify the collate function to allow for varying numbers of labels (or point coordinates) in an image?
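One common way to batch a varying number of points per image is to pad every point set to the same length and carry a mask that marks the real entries. The following is a minimal NumPy sketch of that idea, not fastai's collate API; `pad_points` is a hypothetical helper and the point values are made up.

```python
import numpy as np

def pad_points(point_sets, pad_value=0.0):
    """Pad a list of (n_i, 2) arrays to one (batch, max_n, 2) array plus a validity mask."""
    max_n = max((len(p) for p in point_sets), default=0)
    batch = np.full((len(point_sets), max_n, 2), pad_value, dtype=np.float32)
    mask = np.zeros((len(point_sets), max_n), dtype=bool)
    for i, pts in enumerate(point_sets):
        n = len(pts)
        if n:  # images without labels keep an all-False mask row
            batch[i, :n] = pts
            mask[i, :n] = True
    return batch, mask

point_sets = [
    np.zeros((0, 2), dtype=np.float32),                               # image with no labels
    np.array([[0.1, -0.2]], dtype=np.float32),                        # one point
    np.array([[0.5, 0.5], [-0.3, 0.8], [0.0, 0.0]], dtype=np.float32),  # three points
]
batch, mask = pad_points(point_sets)
print(batch.shape, mask.sum(axis=1))
```

A loss function would then need to use the mask so that padded entries do not contribute; how to wire that into a data bunch is left open here.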

Look into something like pose estimation. I’m working on bringing over the things for a databunch in fastai2 but essentially instead of points it’s a conv2d heatmap that contains your points.
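To illustrate the heatmap idea, here is a minimal NumPy sketch that renders a set of pixel-space points as a single target image with a Gaussian blob per point. It is an illustration of the general technique, not the fastai2 code mentioned above; `points_to_heatmap`, the `sigma` value, and the coordinates are assumptions chosen for the example.

```python
import numpy as np

def points_to_heatmap(points_xy, height, width, sigma=2.0):
    """Render pixel-space (x, y) points as one heatmap with a Gaussian blob per point."""
    ys, xs = np.mgrid[0:height, 0:width]
    heatmap = np.zeros((height, width), dtype=np.float32)
    for x, y in points_xy:
        blob = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
        # Take the per-pixel maximum so overlapping blobs do not exceed 1.
        heatmap = np.maximum(heatmap, blob.astype(np.float32))
    return heatmap

heatmap = points_to_heatmap([(10, 5), (20, 15)], height=24, width=32)
empty = points_to_heatmap([], height=24, width=32)   # an image with no points
print(heatmap.shape, heatmap.max(), empty.max())
```

Because the target always has the same shape, an image with zero points and an image with many points can share a batch without a custom collate function.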

Hello, the link has expired and the notebook can no longer be downloaded. Could you share it again? I’m trying to find out how to use multiple points with ImagePoints.

Hello, sorry for the delay. I’ll try to find the file over the weekend and post the link here.

Hello again, here it is: cats.ipynb