Planet Classification Challenge

Wow, that is awesome. I was planning on using 0.5, but something tells me this might (100% definitely) be a better method.

Yeah, a lot of people used 0.2 as a good fixed threshold for all labels in this comp; 0.5 would be too high I think. Using opt_th should of course be even better, as the optimal threshold shouldn’t be the same value for all labels.
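One way to get a different threshold per label is a simple coordinate search over the validation predictions. This is my own sketch with plain NumPy (not the fastai opt_th, which searches a single shared value); `preds` and `targs` are assumed to be probability and 0/1 label matrices from the validation set:

```python
import numpy as np

def f2(preds, targs, ths):
    """Sample-averaged F2 for binary label matrices, given per-label thresholds."""
    p = (preds > ths).astype(float)
    tp = (p * targs).sum(axis=1)
    prec = tp / np.maximum(p.sum(axis=1), 1e-9)
    rec = tp / np.maximum(targs.sum(axis=1), 1e-9)
    return float((5 * prec * rec / np.maximum(4 * prec + rec, 1e-9)).mean())

def per_label_opt_th(preds, targs, lo=0.1, hi=0.5, step=0.01):
    """Coordinate search: tune one label's threshold at a time, keep improvements."""
    ths = np.full(preds.shape[1], 0.2)   # start from the popular fixed value
    for j in range(preds.shape[1]):
        best_score, best_t = f2(preds, targs, ths), ths[j]
        for t in np.arange(lo, hi, step):
            trial = ths.copy()
            trial[j] = t
            score = f2(preds, targs, trial)
            if score > best_score:
                best_score, best_t = score, t
        ths[j] = best_t
    return ths
```

Because each coordinate only moves when the F2 score improves, the result is never worse than the fixed 0.2 starting point on the data it was tuned on.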



(Preamble: There’s an issue with using the planet dataset that’s being discussed here. There’s also a way to side-step the exception trace as long as you’re not producing the submission file!)

Got my first submission in. At 0.8898 on the leaderboard, I think it’s fairly okay. I barely fit the model once, and focused more on getting the submission file done. Here’s my lousy script that produces the file. Nothing fancy, and very Java-ish. I couldn’t, for the life of me, get one of them fancy Python generators working.

mapp = {}
for i in range(17):
    result = data.classes[i]
    mapp[i] = result

so, that mapp is:

{0: 'agriculture',
 1: 'artisinal_mine',
 2: 'bare_ground',
 3: 'blooming',
 4: 'blow_down',
 5: 'clear',
 6: 'cloudy',
 7: 'conventional_mine',
 8: 'cultivation',
 9: 'habitation',
 10: 'haze',
 11: 'partly_cloudy',
 12: 'primary',
 13: 'road',
 14: 'selective_logging',
 15: 'slash_burn',
 16: 'water'}

The script for producing the test file would thus be:

import re
from tqdm import tqdm

tta_test = learn.TTA(is_test=True)
predictions = tta_test[0]
files = list(data.test_ds.fnames)
pattern = re.compile(r'test-jpg\/(.*)\.jpg')

with open("planet_submission.csv.apil", "w") as f:
    f.write("image_name,tags\n")
    for i in tqdm(range(61191)):
        # files[i] is of this form: test-jpg/test_xyze.jpg
        # Only want to extract the 'test_xyze' part for the
        # submission
        name = pattern.match(files[i]).group(1)
        predLine = name + ','
        prediction = predictions[i]

        for j in range(17):
            # only use the prediction if
            # score is greater than 0.2
            if prediction[j] > 0.2:
                predLine += mapp[j] + ' '

        f.write(predLine.strip() + '\n')

Now, all I need to do is figure out how to leverage the opt_th function! Hmmm… Wife’s really mad I’m doing nothing but coding since I got home. Gotta go to bed !!


You can do that after the course finishes :wink:

Thanks for sharing that code. But gosh yes it is java-ish - we’ll try to help fix it up a little…
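For instance, the index-to-class dict and the per-image submission line can each be a single comprehension. A sketch, using small hypothetical stand-ins for the real `data.classes`, predictions and file list:

```python
import re

# hypothetical stand-ins for the real data.classes, predictions and file list
classes = ['agriculture', 'clear', 'primary', 'water']
predictions = [[0.9, 0.1, 0.8, 0.3], [0.05, 0.7, 0.95, 0.1]]
files = ['test-jpg/test_0.jpg', 'test-jpg/test_1.jpg']

# a dict comprehension replaces the explicit index loop
mapp = {i: c for i, c in enumerate(classes)}

# one submission row per image: name, comma, space-joined tags above 0.2
rows = []
for fname, pred in zip(files, predictions):
    name = re.match(r'test-jpg/(.*)\.jpg', fname).group(1)
    tags = ' '.join(mapp[j] for j, p in enumerate(pred) if p > 0.2)
    rows.append(name + ',' + tags)
# rows == ['test_0,agriculture primary water', 'test_1,clear primary']
```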


Dear Python/PyTorch gurus out there,

Is this just me or is this something most people can reproduce? I ran it on an AWS instance created by a script from version 1 of this course and also on a new one that’s created via the new part 1 v2 AMI and saw the same result. Also, I downloaded data from Kaggle on each instance rather than copying them over.

I did “git pull” and “conda update --all”. But I get No module named 'fast_gen' in planet_cv.ipynb.

@Moody, it might be an issue of where you are running the notebook from, as this error often happens when the fastai directory is not where the notebook expects to find it, based on your current working directory.

Yeah we haven’t covered planet_cv yet, so it’s not in working condition. You should change the imports to be the same as our lesson1 and lesson2 notebooks. Probably some other changes needed too.

So it looks like opt_th is for the validation set. Once you move from validation to test, how do you determine what threshold you should use?

I haven’t tested it extensively yet, but I think we have to just go with the thresholds based on how they did on the validation set, since we don’t really know in advance how they will perform on the test set. However, if you were to retrain with all of the data, have you checked to see if opt_th still works? That would probably be an even better indication of which thresholds to use. Looking at the code now: for the targs parameter, couldn’t you just feed it the y values from the whole dataset (training with all images, no validation set)?

Yeah, you could figure out the optimal threshold on all of the known values. So you would just do the actuals vs. predicted of all the training images, and that would give you the threshold; then you would just have to apply it as your testing threshold and hope it carries over. I think for my first submission I’m just going to use 0.2. That can be v2 for me.

this may help for the predictions

prob_preds, y = learn.TTA(is_test=True)
classes = np.array(data.classes, dtype=str)
res = [" ".join(classes[np.where(pp > 0.5)]) for pp in prob_preds] 
test_fnames = [os.path.basename(f).split(".")[0] for f in data.test_ds.fnames]
test_df = pd.DataFrame(res, index=test_fnames, columns=['tags'])
test_df.to_csv('planet-amazon-from-the-space_Deb.csv', index_label='image_name')

That is such an elegant way to handle that. I have been hacking together a solution for the past hour and this just got me my last piece I needed. Especially liked this line:

res = [" ".join(classes[np.where(pp > 0.5)]) for pp in prob_preds] 

Short, but there is a lot going on there and it is really powerful.
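To unpack it: `pp > 0.5` gives a boolean mask, `np.where` turns that mask into indices, and fancy indexing on the `classes` array picks out the matching names. A tiny standalone example with made-up values:

```python
import numpy as np

classes = np.array(['agriculture', 'clear', 'primary', 'water'])
pp = np.array([0.9, 0.1, 0.8, 0.3])

idx = np.where(pp > 0.5)       # indices of the scores above 0.5
tags = " ".join(classes[idx])  # fancy indexing picks those class names
# tags == 'agriculture primary'
```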

You will probably want to lower that 0.5 to somewhere around 0.2 is the only change I would recommend based on other conversations.

I was able to get a 0.93045 score on my first submission. Pretty happy with that as a starting point. That would put me around 47th when the competition was going on. My next step is going to be recreating without looking at other code snippets (or at least keeping that to a minimum).


Umm, why are you trying to be so modest, Kevin? That’s a crazy good score for this competition!! LOL


Thank you Kevin for the kind words. Also thanks for the tip on the threshold. I’ll try my hand at 0.2 and opt_th.

James, when I went through the opt_th function, it seems to return one threshold. Could you please explain how it can return a threshold per label? Do we use an optimizer?
Also, the f2 function itself applies a constant threshold between 0.17 and 0.24 and returns the max. Does this mean different thresholds were applied to different batches?
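As far as I can tell, it returns a single scalar: it grid-searches one shared threshold over that range and keeps whichever maximises F2 on the whole set, not different thresholds per batch. A rough re-implementation of the idea in plain NumPy (my own sketch, not the fastai source):

```python
import numpy as np

def f2_score(preds, targs, th):
    """Sample-averaged F2 with one shared threshold."""
    p = (preds > th).astype(float)
    tp = (p * targs).sum(axis=1)
    prec = tp / np.maximum(p.sum(axis=1), 1e-9)
    rec = tp / np.maximum(targs.sum(axis=1), 1e-9)
    return float((5 * prec * rec / np.maximum(4 * prec + rec, 1e-9)).mean())

def opt_th_sketch(preds, targs, start=0.17, end=0.24, step=0.01):
    """Try each candidate threshold on all predictions, return the best one."""
    cands = np.arange(start, end, step)
    scores = [f2_score(preds, targs, t) for t in cands]
    return cands[int(np.argmax(scores))]
```

So each candidate threshold is applied to every prediction at once, and only the winning scalar comes back.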

@jeremy In the source code, the lower and upper bounds are set as 0.17 and 0.24 with step = 0.01 (i.e. 8 steps in total). Is it a rule of thumb? Or do we need to reset the lower and upper bounds depending on the number of variables?

For this challenge, there are 17 possible tags: agriculture, artisinal_mine, bare_ground, blooming, blow_down, clear, cloudy, conventional_mine, cultivation, habitation, haze, partly_cloudy, primary, road, selective_logging, slash_burn, water. Four of them form a weather group (i.e. clear, cloudy, partly_cloudy and haze), which should be mutually exclusive and included in every prediction. Should we treat the weather group and non-weather group separately?
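One simple way to try that is to take the argmax over the four weather labels, so exactly one weather tag is always emitted, and threshold the rest as usual. A sketch with a made-up class order and the usual 0.2 cutoff (both assumptions):

```python
import numpy as np

# hypothetical class list; first four indices are the weather group
classes = ['clear', 'cloudy', 'haze', 'partly_cloudy', 'primary', 'water']
weather_idx = [0, 1, 2, 3]

pp = np.array([0.6, 0.1, 0.2, 0.3, 0.9, 0.4])  # made-up probabilities

# pick exactly one weather tag (the most probable one)
weather = classes[weather_idx[int(np.argmax(pp[weather_idx]))]]
# threshold the remaining, non-exclusive tags as usual
others = [c for j, c in enumerate(classes)
          if j not in weather_idx and pp[j] > 0.2]
tags = ' '.join([weather] + others)
# tags == 'clear primary water'
```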

Got my F2 score up to 0.93189 on tonight’s run. Biggest change was removing my validation set and also running longer after unfrozen. I’ll get my code uploaded on github and share the link here.


This question may have been asked somewhere already: can someone explain, on an abstract level, what the f2 and opt_th functions do?
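On an abstract level: F2 is the F-beta score with beta = 2, a combination of precision and recall that weights recall four times as heavily as precision, so missing a true tag hurts more than predicting a spurious one. opt_th then searches for the probability threshold that maximises that score on known labels. A worked example for a single image with made-up tags:

```python
# true tags: {clear, primary, water}; predicted tags: {clear, primary, road}
tp, fp, fn = 2, 1, 1           # hits, spurious tag, missed tag
precision = tp / (tp + fp)     # 2/3
recall = tp / (tp + fn)        # 2/3
beta = 2
f2 = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
# f2 == 2/3 here; with beta = 2 a low recall drags the score down
# much faster than a low precision does
```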