.get_preds()/.predict() have different results

I’m confused that my learner gets different results from get_preds() and predict().

I’m a beginner trying to solve the facial keypoint detection problem from Kaggle.

Here is my process (get_x() and get_y() omitted):


# Define the DataBlock
dblock = DataBlock(
    blocks=(ImageBlock, PointBlock),
    get_x=get_x,
    get_y=get_y,
    splitter=RandomSplitter(0.1),
)

# Create the DataLoaders
dls = dblock.dataloaders(df, bs=64)

learn.fine_tune(1, 3e-2)
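
(I also skipped the learner creation and export here; it's the fastbook setup, roughly:)

learn = vision_learner(dls, resnet18, y_range=(-1, 1))
learn.export('face_keypoint.pkl')   # the file loaded in the next step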

Then I tried to predict on the test data:

learn = load_learner('face_keypoint.pkl')

test_df = pd.read_csv('./data/test.csv')

img = string_to_img(test_df['Image'][0])

pred = learn.predict(img)
print(pred[0])
print(pred[1])

imgs = test_df['Image'].map(string_to_img)
test_dl = learn.dls.test_dl(imgs)
preds, _ = learn.get_preds(dl=test_dl)

print(preds[0])

but I got this:

TensorPoint([[65.3436, 30.3640],
             [32.2660, 40.7135],
             [57.6765, 34.4008],
             [72.6347, 38.5018],
             [35.3546, 33.9121],
             [24.3648, 40.0208],
             [60.0303, 27.1097],
             [76.1809, 24.6169],
             [36.9387, 28.5674],
             [18.0109, 27.5102],
             [44.6158, 54.4272],
             [63.9060, 74.6134],
             [30.8123, 73.3621],
             [51.1658, 72.5630],
             [44.6618, 87.2385]])
tensor([ 0.3613, -0.3674, -0.3278, -0.1518,  0.2016, -0.2833,  0.5132, -0.1979,
        -0.2634, -0.2935, -0.4924, -0.1662,  0.2506, -0.4352,  0.5871, -0.4871,
        -0.2304, -0.4048, -0.6248, -0.4269, -0.0705,  0.1339,  0.3314,  0.5544,
        -0.3581,  0.5284,  0.0660,  0.5117, -0.0695,  0.8175])
tensor([ 0.3613, -0.3674, -0.3278, -0.1518,  0.2016, -0.2833,  0.5132, -0.1979,
        -0.2634, -0.2935, -0.4924, -0.1662,  0.2506, -0.4352,  0.5871, -0.4871,
        -0.2304, -0.4048, -0.6248, -0.4269, -0.0705,  0.1339,  0.3314,  0.5544,
        -0.3581,  0.5284,  0.0660,  0.5117, -0.0695,  0.8175])

It looks like pred[0] is the REAL keypoint positions, but when I use get_preds() to batch-predict the whole test set I only get small decimals (exactly the same as pred[1]). Is this related to normalization?

How can I get the true positions?

I don’t speak English very well, so please forgive me. Thanks a lot for your help! :smiley:

I once encountered this problem myself. Maybe this will help you:
good = learn.get_preds(dl=test_dl, with_decoded=True)

@JumpyJason
I tried that, but it doesn't seem to work ):

imgs = test_df['Image'].map(string_to_img)
preds = learn.get_preds(dl=learn.dls.test_dl(imgs))
decoded_preds = learn.get_preds(dl=learn.dls.test_dl(imgs), with_decoded=True)

print(preds)
print(decoded_preds)

output:

(tensor([[ 0.3652, -0.2001, -0.4073,  ...,  0.4606,  0.0130,  0.7156],
        [ 0.3540, -0.2094, -0.3962,  ...,  0.4514,  0.0024,  0.7101],
        [ 0.3500, -0.3029, -0.3497,  ...,  0.4191, -0.0114,  0.7321],
        ...,
        [ 0.3605, -0.2154, -0.3960,  ...,  0.4887,  0.0117,  0.7206],
        [ 0.3479, -0.2149, -0.3969,  ...,  0.5086,  0.0184,  0.7256],
        [ 0.3425, -0.2235, -0.3874,  ...,  0.4936,  0.0244,  0.7236]]), None)
(tensor([[ 0.3652, -0.2001, -0.4073,  ...,  0.4606,  0.0130,  0.7156],
        [ 0.3540, -0.2094, -0.3962,  ...,  0.4514,  0.0024,  0.7101],
        [ 0.3500, -0.3029, -0.3497,  ...,  0.4191, -0.0114,  0.7321],
        ...,
        [ 0.3605, -0.2154, -0.3960,  ...,  0.4887,  0.0117,  0.7206],
        [ 0.3479, -0.2149, -0.3969,  ...,  0.5086,  0.0184,  0.7256],
        [ 0.3425, -0.2235, -0.3874,  ...,  0.4936,  0.0244,  0.7236]]), None, tensor([[ 0.3652, -0.2001, -0.4073,  ...,  0.4606,  0.0130,  0.7156],
        [ 0.3540, -0.2094, -0.3962,  ...,  0.4514,  0.0024,  0.7101],
        [ 0.3500, -0.3029, -0.3497,  ...,  0.4191, -0.0114,  0.7321],
        ...,
        [ 0.3605, -0.2154, -0.3960,  ...,  0.4887,  0.0117,  0.7206],
        [ 0.3479, -0.2149, -0.3969,  ...,  0.5086,  0.0184,  0.7256],
        [ 0.3425, -0.2235, -0.3874,  ...,  0.4936,  0.0244,  0.7236]]))

In addition, I found a topic about how to decode with PointScaler:

sclr = PointScaler()
sclr(img)                                       # lets the scaler grab the image size
dp = sclr.decode(TensorPoint.create(pred[1]))   # unscale back to pixel coordinates

This works, but I’m wondering why with_decoded=True doesn’t work as expected :face_with_raised_eyebrow:
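
For the batched get_preds() output, I think the same scaler can be applied row by row; a rough sketch (it assumes all the test images are the same size, which is the case for this dataset):

raw, _ = learn.get_preds(dl=learn.dls.test_dl(imgs))
sclr = PointScaler()
sclr(imgs[0])                                                 # grab the image size from one test image
decoded = [sclr.decode(TensorPoint.create(p)) for p in raw]   # each 30-value row -> (15, 2) pixel points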


I just checked the fastai book and found an interesting fact: the points are squeezed into the range (-1, 1). Maybe that’s why they need special decoding.
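
If that's the case, undoing the scaling by hand should just be something like this (sz being the image width/height, 96 for this dataset if I remember right):

pixel_coords = (scaled_coords + 1) * sz / 2   # maps (-1, 1) back to (0, sz)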

Yep, but my code is copied from fastbook.

learn = vision_learner(dls, resnet18, y_range=(-1, 1))

Same as in the book.