Hello!
I was trying to apply the approach from [lesson3-head-pose.ipynb](https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson3-head-pose.ipynb) to the Kaggle Cat Dataset.
The annotation data for facial features is stored in text files as a list of coordinates for 9 points (x is the first coordinate, y the second):
- Left Eye
- Right Eye
- Mouth
- Left Ear-1
- Left Ear-2
- Left Ear-3
- Right Ear-1
- Right Ear-2
- Right Ear-3
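A minimal sketch of this layout in plain Python, for reference. It assumes each annotation file is one whitespace-separated line holding a point count followed by the x y pairs in the order listed above. The helper name `parse_cat_annotation` and the sample numbers are made up for illustration and are not part of the dataset or of fastai.

```python
# Hypothetical helper: assumes the annotation text is a point count
# followed by whitespace-separated "x y" pairs, in the order listed above.
POINT_NAMES = [
    "Left Eye", "Right Eye", "Mouth",
    "Left Ear-1", "Left Ear-2", "Left Ear-3",
    "Right Ear-1", "Right Ear-2", "Right Ear-3",
]

def parse_cat_annotation(text):
    """Return {point name: (x, y)} from the text of one annotation file."""
    values = [float(v) for v in text.split()]
    count, coords = int(values[0]), values[1:]
    if count != len(POINT_NAMES) or len(coords) != 2 * count:
        raise ValueError(
            f"expected {len(POINT_NAMES)} points, "
            f"got count={count} with {len(coords)} coordinates"
        )
    # Even positions are x values, odd positions are y values.
    pairs = zip(coords[0::2], coords[1::2])
    return dict(zip(POINT_NAMES, pairs))

# Made-up numbers, only to show the layout.
sample = "9 175 160 239 162 199 199 149 121 137 78 166 93 281 101 312 96 296 133"
points = parse_cat_annotation(sample)
print(points["Left Eye"])   # (x, y) of the left eye
```

Keeping the names next to the coordinates makes it easier to spot a swapped axis when a point lands in the wrong place on the image.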
I wanted to check that the labels were correct, so I wrote the following functions:
from fastai.vision import *  # as in the lesson notebook; `path` is the dataset folder

def get_face_features(image_fname):
    feature_txt = path / f'{image_fname}.cat'
    data = np.genfromtxt(feature_txt, dtype=np.float32)
    # Skip the first value and pair up the rest as (x, y) rows.
    return tensor(data[1:].reshape((-1, 2)))

def show_cat_with_features(image_fname):
    img = open_image(path/image_fname)
    # I used y_first=False, as all points are given as (x, y), not (y, x).
    points = ImagePoints(FlowField(img.size, get_face_features(image_fname)), scale=True, y_first=False)
    img.show(y=points, figsize=(10, 10))
After executing
show_cat_with_features(fname)
I got the following result:
As you can see, the features are not at the correct points, so I started looking for the error in my code. After some time I got it to work:
def get_face_features(image_fname):
    feature_txt = path / f'{image_fname}.cat'
    data = np.genfromtxt(feature_txt, dtype=np.float32)
    # flip(1) swaps the columns, turning (x, y) rows into (y, x).
    return tensor(data[1:].reshape((-1, 2))).flip(1)

def show_cat_with_features(image_fname):
    img = open_image(path/image_fname)
    # y_first is left at its default (True), since the points are now (y, x).
    img.show(y=ImagePoints(FlowField(img.size, get_face_features(image_fname)), scale=True), figsize=(10, 10))

show_cat_with_features(fname)
The result was as follows:
The two changes in the code were:
- Add `flip(1)` to the tensor in `get_face_features`.
- Remove `y_first=False` in `show_cat_with_features`, since the points are now in (y, x) format.
As far as I understand, these two changes should cancel each other out and make no difference, so I dug into the fastai code and found the following in https://github.com/fastai/fastai/blob/c008ea13f808e37a22556ecb88819cb6fae915c5/fastai/vision/image.py#L226
class ImagePoints(Image):
    "Support applying transforms to a `flow` of points."
    def __init__(self, flow:FlowField, scale:bool=True, y_first:bool=True):
        "Create from raw tensor image data `px`."
        if scale: flow = scale_flow(flow)
        if y_first: flow.flow = flow.flow.flip(1)
------------------------------------------------------------
As you can see, `scale_flow` is applied first and the flip second. In my opinion this could be a bug, as these operations are not commutative.
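A minimal sketch of why the order can matter, in plain Python. It assumes that scaling divides the first column by half the image height and the second by half the image width before subtracting 1, so that it expects (y, x) rows. That is an assumption about how `scale_flow` behaves, not something confirmed here, and `scale_to_unit` and `flip` are hypothetical stand-ins rather than fastai functions.

```python
# Assumption: scaling maps column 0 with the height and column 1 with the
# width, i.e. it expects rows in (y, x) order. Hypothetical stand-in only.
def scale_to_unit(points, size):
    h, w = size
    return [(a / (h / 2) - 1, b / (w / 2) - 1) for a, b in points]

def flip(points):
    """Swap the two coordinates of every point."""
    return [(b, a) for a, b in points]

yx = [(25.0, 150.0)]          # one point given as (y, x)

# Non-square image: the two orders disagree.
size = (100, 200)             # (height, width)
scale_then_flip = flip(scale_to_unit(yx, size))
flip_then_scale = scale_to_unit(flip(yx), size)
print(scale_then_flip, flip_then_scale)

# Square image: both axes use the same factor, so the order does not matter.
square = (200, 200)
same = flip(scale_to_unit(yx, square)) == scale_to_unit(flip(yx), square)
print(same)
```

Under that assumption, flipping after scaling only gives the right result when the input is already in the order the scaling step expects, which would explain why the problem shows up on non-square images.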
Am I right?
Thanks in advance!