How to predict on the test set

This issue appears to have been fixed in later versions of the fast.ai library, but for completeness' sake, I wanted to mention here what it was. Internally, the test set used to be assigned the label [0]. The library, however, still tried to treat these labels as x and y coordinates, as instructed earlier in the notebook, so one coordinate was missing. In the new versions of the library, the labels in the test set have the same number of (zero) entries as the number of outputs.

Hello Jonathan, thanks for the clear, well-ordered steps. Could you tell me in detail how you matched each image with its correct probability? Did you print os.listdir(path) and the probabilities and then arrange them in Excel?

@poseidon Yep, simple as that. I printed os.listdir(path), copied and pasted it into Excel, then copied and pasted the output probabilities (and transposed them in Excel).

At some stage, I will look at modifying:

data = ImageClassifierData.from_paths(PATH, tfms=tfms)

so that the returned data uses a sorted os.listdir(path) file list. Then both the file list and the associated probabilities will already be in the same order.
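Until then, the Excel step can be replaced with a few lines of plain Python. This is only a minimal sketch, not fastai code: the function names `pair_and_sort` and `rows_to_csv`, the sample file names, and the `id,label` header are all hypothetical, and it assumes the probabilities were produced in exactly the order of the file list that was fed to the model.

```python
import csv
import io

def pair_and_sort(filenames, probs):
    """Pair each file with its probability, then sort the pairs by file name.

    Assumes probs[i] was predicted for filenames[i] (same order as the loader).
    """
    if len(filenames) != len(probs):
        raise ValueError("file list and probabilities differ in length")
    pairs = zip(filenames, (float(p) for p in probs))
    return sorted(pairs, key=lambda pair: pair[0])

def rows_to_csv(rows, header=("id", "label")):
    """Render the rows as CSV text (header names are placeholders)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical data; in practice filenames would come from os.listdir(path)
# in the same order that the predictions were generated.
filenames = ["3.jpg", "1.jpg", "2.jpg"]
probs = [0.9, 0.1, 0.5]

rows = pair_and_sort(filenames, probs)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Note that the sort is lexical, so "10.jpg" comes before "2.jpg"; that is fine as long as the pairing is done before sorting, as it is here.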

nice :smiley:

Regarding dog breeds: I can't submit my file to Kaggle. Could someone please provide the code needed?

You got this error for a test set with bounding boxes, right? I have the same error and was wondering if you managed to solve it. Any help is much appreciated.

Hi guys, I am having trouble converting a 2-unit output into a single probability output.

How should I convert a cat probability and dog probability into a 0-1 probability?

I have no clue. Thanks!

I have solved the problem.
I use the formula 0 * prob(cat) + 1 * prob(dog) to generate one output instead of two separate probabilities.

Well, 0 times prob(cat) will always be zero and 1 times prob(dog) is just prob(dog), so you’re just left with prob(dog).

You are right. Can you tell me what formula you used to do that? Thanks a lot, I really appreciate it.

If your question is about the 2-column log_preds array in Lesson 1: the np.exp of the first and second values of each row sum to 1 (though there is some rounding error).

This line calculates the probability that each image is a dog:

probs = np.exp(log_preds[:,1])

You can see this because log_preds[:,1] extracts the second column for all rows.

Also, you can run this to see that the first 10 rows each sum to 1 (after converting back to probs):

np.sum(np.exp(log_preds[:10]), axis=1)

As a result, prob(cat) = 1 - prob(dog). However, in the Lesson 1 notebook, the probabilities printed above the images are all prob(dog); that’s why values near 0 indicate a cat and values near 1 indicate a dog.
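Here is a small self-contained check of the same idea. The logits are made-up numbers, and the sketch assumes log_preds holds log-softmax outputs with columns ordered [cat, dog], as in the Lesson 1 notebook:

```python
import numpy as np

# Hypothetical raw scores for 3 images; columns are [cat, dog].
logits = np.array([[2.0, 0.5],
                   [0.1, 3.0],
                   [1.0, 1.0]])

# Log-softmax per row, standing in for a log_preds-style array.
log_preds = logits - np.log(np.sum(np.exp(logits), axis=1, keepdims=True))

probs = np.exp(log_preds)      # back to ordinary probabilities
prob_cat = probs[:, 0]         # first column
prob_dog = probs[:, 1]         # second column: the single 0-1 output
row_sums = probs.sum(axis=1)   # each row should sum to 1

print(prob_dog)
print(row_sums)
```

So the single number you want is just prob_dog; prob_cat carries no extra information because it is always 1 - prob_dog.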


I get the same error while trying to predict bounding boxes.
Has someone figured out a solution?

Thanks for this. It helped and made sense too. Just a quick fix: the increment of the index should be inside the loop.

When I try to predict on a learner I built from an ImageDataBunch that included a test dataset, I get:

TypeError: predict() missing 1 required positional argument: 'item'

Thoughts?


Hi,
Did you manage to solve it? I am getting the same error.

Use learn.get_preds() instead as follows:

learn.get_preds(ds_type=DatasetType.Test)

I believe predict() is used to get a prediction for one specific data item, so it expects you to pass that item into the function call.


Hi all, I am working on a multiclass problem with 5 classes and am using the code below to build the data.

data = ImageDataBunch.from_csv(path=path, 
                               folder='train', 
                               csv_labels='train.csv', 
                               test='test', 
                               ds_tfms=get_transforms(), 
                               size=224, 
                               bs=bs).normalize(imagenet_stats)

print(data.classes)
[1, 2, 3, 4, 5]

I am stuck at predicting on the test set: get_preds predicts the target variable in the range 0 to 4, but in my case I am expecting 1 to 5. I am new to fastai. Can someone please guide me?

log_preds, test_labels = learn50.get_preds(ds_type=DatasetType.Test)
preds = np.argmax(log_preds, 1)
preds[:10]

Output:

tensor([4, 1, 1, 0, 4, 0, 0, 0, 4, 0])

The argmax values are positions in data.classes, not the labels themselves, so try something like [data.classes[idx] for idx in preds]
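Using the values posted above, a minimal sketch of that mapping looks like this. It uses plain Python lists in place of the tensor and of data.classes, and the variable names are just for illustration:

```python
# Class list as printed by data.classes in the post above.
classes = [1, 2, 3, 4, 5]

# Predicted indices from the post, written as a plain list instead of a tensor.
pred_idx = [4, 1, 1, 0, 4, 0, 0, 0, 4, 0]

# Each prediction is a position in `classes`; look up the label it refers to.
# int(...) also converts a 0-d tensor element into a plain Python integer.
pred_labels = [classes[int(i)] for i in pred_idx]

print(pred_labels)  # [5, 2, 2, 1, 5, 1, 1, 1, 5, 1]
```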

I have the same issue. Any suggestions? I am working on custom data. I checked the GitHub code for the “predict” function, and it accepts only two parameters, “is_test” and “use_swa”. What is this “item”, then?
