How to predict on the test set

(Michael Chmutov) #42

This issue appears to have been fixed in later versions of the library, but for completeness' sake I wanted to mention here what it was. Internally, the test set used to get the labels [0]. The library nevertheless still tried to treat these as the x and y coordinates, as instructed earlier in the notebook, so it was missing a coordinate. In the new versions of the library, the labels in the test set have the same number of (zero) entries as the number of outputs.

(Saksham Gupta) #43

Hello Jonathan, thanks for the clear, well-ordered steps. Could you explain in detail how you managed to match each image with its correct probability? Did you print os.listdir(path) and the probabilities and then arrange them in Excel?

(Jonathan) #44

@poseidon Yep, simple as that. I printed os.listdir(path), copied and pasted it into Excel, then copied and pasted the output probabilities (and transposed them in Excel).

At some stage, I will look at modifying:

data = ImageClassifierData.from_paths(PATH, tfms=tfms)

So that the data returned can sort the os.listdir(path) files. Then both the file list and associated probabilities will already be ordered.
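Until then, the Excel step can be replaced by a few lines of Python. This is only a minimal sketch: pair_files_with_probs is a hypothetical helper (not part of the library), the file names and probabilities are made-up stand-ins, and it assumes the probabilities come back in the same order as the file list they are paired with. Note that os.listdir returns names in arbitrary order, so the pairing has to happen before any sorting.

```python
import numpy as np

def pair_files_with_probs(filenames, probs):
    """Pair each file name with its probability, then sort the pairs by file name.

    Hypothetical helper: assumes probs[i] belongs to filenames[i].
    """
    if len(filenames) != len(probs):
        raise ValueError("need exactly one probability per file")
    # Pair first, sort second, so each probability stays with its file.
    return sorted(zip(filenames, (float(p) for p in probs)))

# Stand-in values; in practice filenames would come from os.listdir(path)
# and probs from the model's predictions on the test set.
filenames = ["7.jpg", "12.jpg", "3.jpg"]
probs = np.array([0.91, 0.08, 0.55])

rows = pair_files_with_probs(filenames, probs)
for name, p in rows:
    print(name, p)
```

The sort is a plain string sort, so "12.jpg" comes before "3.jpg"; what matters is that every probability travels with its own file name.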

(Saksham Gupta) #45

nice :smiley:

(Omar Ayman) #46

Regarding dog breeds: I can’t submit my file to Kaggle. Could someone please provide the code needed?


You have gotten this error for a test set on bounding boxes, right? I have the same error and was wondering if you managed to solve it. Help is much appreciated

(Ben) #49

Hi guys, I am having trouble converting a 2-unit output into a single probability output.

How should I convert a cat probability and dog probability into a 0-1 probability?

I have no clue. Thanks!

(Ben) #50

I have solved the problem.
I use the formula 0 * prob(cat) + 1 * prob(dog) to generate one output instead of two separate probabilities.

(Stephen Johnson) #51

Well, 0 times the prob(cat) will always be zero and 1 times prob(dog) is just prob(dog), so you’re just left with prob(dog).

(Ben) #52

You are right. So can you tell me what formula you used to do that? Thanks a lot, I really appreciate it.

(Todd Richard Johnson) #53

If your question is with respect to the 2 column log_preds array in Lesson 1, the np.exp of the first and second value of each row sums to 1 (though there is some rounding error).

This line calculates the probability that each image is a dog:

probs = np.exp(log_preds[:,1])

You can see this because log_preds[:,1] extracts the second column for all rows.

Also, you can run this to see that the first 10 rows each sum to 1 (after converting back to probs):

np.sum(np.exp(log_preds[:10]), axis=1)

As a result, prob(cat) = 1 - prob(dog). However, in the Lesson 1 notebook, the probabilities printed above the images are all prob(dog); that’s why values near 0 indicate a cat and values near 1 indicate a dog.
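If it helps, the relationship can be checked on a small made-up example. This is only a sketch: the logits array below is invented for illustration, and log_preds is built here with a log-softmax so that it has the same two-column (cat, dog) shape as the array in the notebook.

```python
import numpy as np

# Made-up raw scores for 3 images; columns are (cat, dog).
logits = np.array([[2.0, 0.5],
                   [0.1, 3.0],
                   [1.0, 1.0]])

# Log-softmax over each row, standing in for the notebook's log_preds.
log_preds = logits - np.log(np.sum(np.exp(logits), axis=1, keepdims=True))

probs = np.exp(log_preds)      # back to ordinary probabilities
prob_cat = probs[:, 0]         # first column
prob_dog = probs[:, 1]         # second column, same as np.exp(log_preds[:,1])
row_sums = np.sum(probs, axis=1)

print(row_sums)                # each row sums to 1, up to rounding error
print(prob_dog)                # near 0 means cat, near 1 means dog
```

So a single 0-1 output is just prob_dog; prob_cat carries no extra information.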


I get the same error while trying to predict bounding boxes.
Has someone figured out a solution?

(Bumblebee) #55

Thanks for this. It helped and made sense too. Just a quick fix here: the increment of the index should be included in the loop.

(Matthew Arthur) #56

When I try to predict on a learner I built from an ImageDataBunch that included a test dataset, I get:

TypeError: predict() missing 1 required positional argument: 'item'