Poor scoring on dog breed competition

agielchinsky · June 3, 2018, 12:17pm

I’ve watched lessons 1-3 and tried the Kaggle dog breed competition, but my results are pretty bad right now. My best score is 11.7 (for reference, if I submit a CSV full of 1/120s I score 4.8). Is there anything wrong with my code? My training accuracy is in the low 80%s. Thanks

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai.imports import *
from fastai.transforms import *
from fastai.conv_learner import *
from fastai.model import *
from fastai.dataset import *
from fastai.sgdr import *
from fastai.plots import *

PATH = "data/dogbreed/"
sz=224

images = !ls {PATH}/train
images = pd.DataFrame(images, columns=["Files"])
n_validation_files=int(len(images)*0.2)
validation_idx = images.sample(n=n_validation_files, replace=False).index
# Row indices held out for validation (20% of the training images)
val_list = list(validation_idx)

arch=resnet34
tfms=tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_csv(PATH, 
                                    folder='train', 
                                    csv_fname=f'{PATH}labels2.csv', 
                                    tfms=tfms,
                                    test_name='test1',
                                    val_idxs=val_list)
learn = ConvLearner.pretrained(arch, data, precompute=False)
learn.fit(0.01, 3)

learn.precompute=False
learn.fit( 1e-2, 3)
learn.fit( 1e-2, 3)

test_names = !ls {PATH}/test1
x = []
for i in range(0,len(test_names)):
    x.append(test_names[i].replace('.jpg',''))
test_names = pd.DataFrame(x, columns=['id'])

log_pred=learn.predict(is_test=True)
probs = pd.DataFrame(np.exp(log_pred), columns=data.classes)
output = pd.concat([test_names, probs],axis=1, sort=False)
output.to_csv(f'{PATH}test_predictions.csv', index=False)

ZachL · June 3, 2018, 8:27pm

Did you use the learning rate finder? Try using the learning rate finder to set the learning rate, and also try using cycle_mult and cycle_len params.
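To make the cycle_len and cycle_mult suggestion concrete, here is a minimal sketch of how those two parameters shape a training run with restarts. It is a plain-Python illustration of the idea, not fastai's own scheduler code; the helper names cycle_lengths and cosine_lr are made up for this example.

```python
import math

def cycle_lengths(n_cycle, cycle_len=1, cycle_mult=1):
    """Epochs in each cycle: cycle_len for the first, then multiplied by cycle_mult each restart."""
    return [cycle_len * cycle_mult ** i for i in range(n_cycle)]

def cosine_lr(max_lr, step, steps_in_cycle):
    """Cosine-annealed learning rate: starts at max_lr and decays toward 0 over one cycle."""
    return max_lr / 2 * (1 + math.cos(math.pi * step / steps_in_cycle))

# Three cycles with cycle_len=1 and cycle_mult=2 train for 1 + 2 + 4 epochs.
lengths = cycle_lengths(n_cycle=3, cycle_len=1, cycle_mult=2)
total_epochs = sum(lengths)

# Learning rate at the start, middle and end of a 100-step cycle.
lr_start = cosine_lr(1e-2, 0, 100)
lr_mid = cosine_lr(1e-2, 50, 100)
lr_end = cosine_lr(1e-2, 100, 100)
```

In the fastai version used in this thread, the corresponding calls would be along the lines of learn.lr_find() followed by learn.sched.plot() to pick a rate, then learn.fit(lr, 3, cycle_len=1, cycle_mult=2), which runs 7 epochs in total rather than 3.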

Mariam · June 4, 2018, 4:58am

I think that test_names and probs are not in the same order.
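A quick way to see why a row mismatch fits the reported numbers: the competition is scored with multi-class log loss, so a uniform 1/120 submission scores ln(120) ≈ 4.79, and confident predictions attached to the wrong image score far worse than that. The sketch below uses made-up labels and probabilities, and the clipping value eps is an assumption for illustration.

```python
import math

def multiclass_log_loss(true_labels, probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class of each row."""
    total = 0.0
    for label, row in zip(true_labels, probs):
        p = min(max(row[label], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(true_labels)

n_classes = 120
labels = [0, 1, 2, 3]  # hypothetical true classes for four test images

# Baseline: every class gets probability 1/120.
uniform = [[1.0 / n_classes] * n_classes for _ in labels]
baseline = multiclass_log_loss(labels, uniform)

def confident_row(predicted_class, p=0.99):
    """Put probability p on one class and spread the rest evenly over the others."""
    row = [(1 - p) / (n_classes - 1)] * n_classes
    row[predicted_class] = p
    return row

# Correct, confident predictions in the right row order.
aligned = multiclass_log_loss(labels, [confident_row(c) for c in labels])

# The same predictions, but each one attached to the wrong image.
shifted = labels[1:] + labels[:1]
misaligned = multiclass_log_loss(labels, [confident_row(c) for c in shifted])
```

Here baseline is ln(120), aligned is close to 0, and misaligned is well above the baseline, which is the pattern a score of 11.7 against a 4.8 baseline suggests.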

agielchinsky · June 4, 2018, 8:54am

That would explain it. How do I tell what order the predictions are ordered in?

Mariam · June 4, 2018, 10:00am

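Try: id_raw = data.test_dl.dataset.fnames (the test file names in the order the test data loader reads them, which is the order of the predictions)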
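As a sketch of how that list can be used, the ids can be derived from the file names themselves so that each id sits next to its own row of probabilities. The helper build_submission and the sample file names, classes and predictions are hypothetical stand-ins; in the notebook above, fnames would come from data.test_dl.dataset.fnames, log_preds from learn.predict(is_test=True) and classes from data.classes.

```python
import os

import numpy as np
import pandas as pd

def build_submission(fnames, log_preds, classes):
    """Pair each row of predictions with the id of the file it was computed from."""
    # 'test1/abc123.jpg' -> 'abc123'
    ids = [os.path.splitext(os.path.basename(f))[0] for f in fnames]
    submission = pd.DataFrame(np.exp(log_preds), columns=classes)
    submission.insert(0, 'id', ids)
    return submission

# Made-up stand-ins; note the file order is not alphabetical.
fnames = ['test1/zzz.jpg', 'test1/aaa.jpg']
classes = ['beagle', 'pug']
log_preds = np.log(np.array([[0.9, 0.1],
                             [0.2, 0.8]]))

submission = build_submission(fnames, log_preds, classes)
# submission.to_csv('test_predictions.csv', index=False)
```

Building the id column from the same list that drove the predictions avoids relying on the order returned by ls, which need not match.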

agielchinsky · June 4, 2018, 11:19am

that worked. thanks!