How to predict on the test set

mchmutov · June 20, 2018, 4:32pm

This issue appears to have been fixed in later versions of the fast.ai library, but for completeness' sake I wanted to mention what it was. The test set used to get the labels [0] internally. The library still tried to treat these as the x and y coordinates, as instructed earlier in the notebook, so a coordinate was missing. In the new versions of the library, the labels in the test set have the same number of (zero) entries as the number of outputs.

poseidon · June 28, 2018, 10:38am

Hello Jonathan, thanks for the clear, in-order steps. Could you tell me in detail how you matched each image with its correct probability? Did you print os.listdir(path) and the probabilities and then arrange them in Excel?

jonathan.spiller · July 1, 2018, 12:39pm

@poseidon Yep, simple as that. I printed os.listdir(path), copied and pasted it into Excel, then copied and pasted the output probabilities (and transposed them in Excel).

At some stage, I will look at modifying:

data = ImageClassifierData.from_paths(PATH, tfms=tfms)

So that the data returned can sort the os.listdir(path) files. Then both the file list and associated probabilities will already be ordered.
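Until then, the Excel step can probably be replaced with a few lines of Python. This is only a minimal sketch: pair_files_with_probs and to_csv are hypothetical helper names, the file names and probabilities are made up, and it assumes filenames[i] and probs[i] refer to the same image. os.listdir does not guarantee any particular order, so the file list should come from wherever the library read the files (in the course version of fastai I believe that is data.test_ds.fnames, but treat that as an assumption).

```python
import csv
import io


def pair_files_with_probs(filenames, probs):
    """Pair each file name with its probability, sorted by file name.

    Assumes filenames[i] and probs[i] describe the same image.
    """
    if len(filenames) != len(probs):
        raise ValueError("filenames and probs must have the same length")
    return sorted(zip(filenames, probs))


def to_csv(rows):
    """Render (file, prob) rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["file", "prob"])
    writer.writerows(rows)
    return buf.getvalue()


# Made-up inputs standing in for the real file list and predictions.
filenames = ["dog.3.jpg", "cat.1.jpg", "dog.2.jpg"]
probs = [0.98, 0.03, 0.91]

rows = pair_files_with_probs(filenames, probs)
print(to_csv(rows))
```

The pairing happens before the sort, so each probability stays attached to its own file regardless of the order the files were listed in.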

poseidon · July 2, 2018, 4:44am

nice

Omar · July 3, 2018, 7:32pm

Regarding dogbreed: I can’t submit my file to Kaggle. Could someone please provide the code needed?

Loob · August 8, 2018, 1:32pm

You have gotten this error for a test set on bounding boxes, right? I have the same error and was wondering if you managed to solve it. Help is much appreciated

kachun1017 · September 4, 2018, 3:10am

Hi guys, I am having a problem converting a 2-unit output into a single probability output.

How should I convert a cat probability and a dog probability into a single 0-1 probability?

I have no clue. Thanks!

kachun1017 · September 4, 2018, 3:26am

I have solved the problem.
I use the formula 0 * prob(cat) + 1 * prob(dog) to generate one output instead of two separate probabilities.

stephenjohnson · September 4, 2018, 7:55pm

Well, 0 times prob(cat) will always be zero and 1 times prob(dog) is just prob(dog), so you’re just left with prob(dog).

kachun1017 · September 6, 2018, 7:42am

You are right. So can you tell me what formula you used to do that? Thanks a lot, I really appreciate it.

toddrjohnson · September 30, 2018, 5:34am

If your question is about the two-column log_preds array in Lesson 1, the np.exp of the first and second values of each row sum to 1 (though there is some rounding error).

This line calculates the probability that each image is a dog:

probs = np.exp(log_preds[:,1])

You can see this because log_preds[:,1] extracts the second column for all rows.

Also, you can run this to see that the first 10 rows each sum to 1 (after converting back to probs):

np.sum(np.exp(log_preds[:10]), axis=1)

As a result, prob(cat) = 1 - prob(dog). However, in the lesson 1 notebook, the probabilities printed above the images are all prob(dog), which is why values near 0 indicate a cat and values near 1 indicate a dog.
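To make this concrete, here is a small NumPy sketch. The log_preds values are made up for illustration (three images, columns assumed to be [cat, dog] as in Lesson 1); they are not output from a real model.

```python
import numpy as np

# Made-up log-probabilities for three images; columns are [cat, dog].
log_preds = np.log(np.array([[0.9, 0.1],
                             [0.2, 0.8],
                             [0.5, 0.5]]))

probs = np.exp(log_preds)      # convert log-probabilities back to probabilities
prob_cat = probs[:, 0]         # first column: prob(cat)
prob_dog = probs[:, 1]         # second column: prob(dog)

row_sums = probs.sum(axis=1)   # each row should sum to 1, up to rounding error

# The formula suggested earlier in the thread reduces to prob(dog).
weighted = 0 * prob_cat + 1 * prob_dog

print(row_sums)
print(prob_dog)
```

So the single 0-1 value asked about earlier in the thread is just the dog column: near 0 means cat, near 1 means dog.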

ChristophNeuner · October 22, 2018, 1:24pm

I get the same error while trying to predict bounding boxes.
Has someone figured out a solution?

Ghiya6548 · December 13, 2018, 6:54am

Thanks for this. It helped and made sense too. Just a quick fix: the index increment should be inside the loop.

matthewarthur · February 9, 2019, 4:29pm

When I try to predict on a learner I built from an ImageDataBunch that included a test dataset, I get:

TypeError: predict() missing 1 required positional argument: 'item'

Thoughts?

kakods · April 6, 2019, 4:54pm

Hi,
Did you manage to solve it? I am getting the same error

AhriaR · April 7, 2019, 2:47am

Use learn.get_preds() instead as follows:

learn.get_preds(ds_type=DatasetType.Test)

I believe predict() is used to get a prediction on one specific data item, so it expects you to pass that item into the function call.

chetanambi · June 3, 2019, 8:06am

Hi all, I am working on a multiclass problem with 5 classes and am using the code below for ‘data’.

data = ImageDataBunch.from_csv(path=path, 
                               folder='train', 
                               csv_labels='train.csv', 
                               test='test', 
                               ds_tfms=get_transforms(), 
                               size=224, 
                               bs=bs).normalize(imagenet_stats)

print(data.classes)
[1, 2, 3, 4, 5]

I am stuck at predicting on the test set: get_preds predicts the target variable as 0 to 4, but in my case I expect 1 to 5. I am new to fastai. Can someone please guide me?

log_preds, test_labels = learn50.get_preds(ds_type=DatasetType.Test)
preds = np.argmax(log_preds, 1)
preds[:10]

Output:

tensor([4, 1, 1, 0, 4, 0, 0, 0, 4, 0])

dreambeats · June 3, 2019, 9:10am

Try something like [data.classes[idx] for idx in preds]
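As a sketch of what that does, using plain Python lists in place of data.classes and the preds tensor (the values are copied from the post above; the int() call is an assumption, added so the same line also works when idx is a tensor element rather than a plain int):

```python
# Stand-ins for data.classes and preds, copied from the post above.
classes = [1, 2, 3, 4, 5]
preds = [4, 1, 1, 0, 4, 0, 0, 0, 4, 0]

# Each prediction is a position in the class list, not the label itself.
labels = [classes[int(idx)] for idx in preds]
print(labels)  # [5, 2, 2, 1, 5, 1, 1, 1, 5, 1]
```

The predictions are indices into the class list, so index 0 maps to label 1 and index 4 maps to label 5.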

asunayak · December 29, 2019, 4:27pm

I have the same issue. Any suggestions? I am working on custom data. I checked the GitHub code for the “predict” function; it accepts only 2 parameters, “is_test” and “use_swa”. What is this item then?