Lesson 1 discussion

(Jeremy Howard) #245

That means you’re using Python 3. You may run into other issues too, since the course was written for Python 2.

(Jeremy Howard) #246

One is with batchnorm, one without. They are saved in .keras/models in your home directory automatically.

(Adam Smith) #247

In the lesson 1 notebook when creating the VGG model from scratch we use:
model = Sequential()
model.add(Lambda(vgg_preprocess, input_shape=(3,224,224)))

I’ve looked at the images in the data set, and they’re of different sizes and none seem to be 224 x 224.

Should the Cats and Dogs images be cropped to 224 x 224 prior to running them through the model? If so, what’s the best practice for standardizing images of various sizes?

(Angel) #248

@adamontherun The ImageDataGenerator (https://keras.io/preprocessing/image/) used in the get_batches function does the image conversion automatically for you.

(Vijay Kumar) #249

Please help me print the CSV file after prediction. I am having trouble printing the CSV file.
Can someone share the code?

(Vijay Kumar) #250

Please share the code for printing the CSV file after prediction. I am getting stuck at that step.

(Matthew Kleinsmith) #251


Saving predictions to a CSV

Approach 1:

import numpy as np

isdog = preds[:,1]
filenames = batches.filenames
ids = np.array([int(f[8:f.find('.')]) for f in filenames])
subm = np.stack([ids,isdog], axis=1)
submission_file_name = 'submission1.csv'
np.savetxt(submission_file_name, subm, fmt='%d,%.5f', header='id,label', comments='')

where batches and preds are the output of vgg.test.

Submission properties: Probabilities as answers; two classes
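
The slice f[8:f.find('.')] in Approach 1 only works when the folder prefix is exactly 8 characters long. A minimal sketch of a variant that does not depend on the prefix length, assuming filenames look like 'unknown/123.jpg' (the folder name and the sample filenames here are assumptions for illustration):

```python
import os

def image_id(filename):
    """Return the numeric id from a path such as 'unknown/123.jpg'."""
    # Drop the directory part, then the extension, whatever their lengths.
    stem = os.path.splitext(os.path.basename(filename))[0]
    return int(stem)

# Hypothetical sample of what batches.filenames might contain.
filenames = ['unknown/1.jpg', 'unknown/10.jpg', 'unknown/123.jpg']

ids = [image_id(f) for f in filenames]

# The original slice gives the same ids here because 'unknown/' is 8 characters.
sliced = [int(f[8:f.find('.')]) for f in filenames]
```

If the test images sit in a folder with a different name, image_id should still work, while the fixed offset of 8 would need adjusting.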

See the end of this notebook: https://github.com/fastai/courses/blob/master/deeplearning1/nbs/dogs_cats_redux.ipynb

Approach 2:

import csv
import numpy as np

filenames = test_batches.filenames
ids = [filename.split('/')[1].split('.')[0] + ".ppm" for filename in filenames]
classes = [np.argmax(prob) for prob in probs]  # index of the most likely class
pairs = zip(ids, classes)
with open(results_path+"submission--lesson-1--vgg16--12-epochs--peak.csv", "w") as f:
    writer = csv.writer(f, delimiter=";")
    writer.writerows(pairs)  # one "id;class" row per test image

where test_batches and probs are the output of vgg.test.

Submission properties: Labels as answers; 43 classes

See the end of this notebook: https://github.com/MatthewKleinsmith/fast-ai-MOOC/blob/master/german-traffic-signs.ipynb

(Angel) #253

When doing the fine-tuning, why is validation accuracy higher than training accuracy?

(Vijay Kumar) #254

Thanks Matthew

(WG) #255

How can I improve my Kaggle score?

I’m currently in the top 29%. Best results achieved by running 10 epochs and changing the maximum and minimum values for my “is_dog” probability to fit between .0455 and .0955.

Wondering … what can I do/try to improve the score?
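
For readers wondering why limiting the minimum and maximum probability helps: assuming the competition is scored by log loss, a single confidently wrong prediction is penalized very heavily, and clipping caps that penalty. A minimal sketch with made-up labels and predictions; the bounds 0.05 and 0.95 are illustrative assumptions, not the values from the post above:

```python
import math

def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary log loss; predictions are nudged away from exactly 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def clip(probs, low, high):
    """Force every probability into the range [low, high]."""
    return [min(max(p, low), high) for p in probs]

# Made-up example: the second prediction is confidently wrong.
y_true = [1, 1, 0, 0]
raw = [1.0, 0.0, 0.0, 0.0]
clipped = clip(raw, 0.05, 0.95)

raw_loss = log_loss(y_true, raw)
clipped_loss = log_loss(y_true, clipped)
```

Clipping costs a little on every correct prediction but saves a lot on each wrong one, so the best bounds depend on how often the model is wrong.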


(Rachel Thomas) #256

@wgpubs There are more tips for improving Kaggle scores in later lessons, and you may want to return to this competition as you proceed with the course.

(Rachel Thomas) #257

@Gelu74 That is because we are using dropout on the training set (throwing away some information so as to avoid overfitting), but not on the validation set (because we want to be as accurate as possible on the validation and test sets). We’ll talk more about this idea in Lesson 3.
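
To make the training/validation difference concrete, here is a minimal sketch of inverted dropout in plain Python. It is an illustration under simple assumptions, not the Keras implementation, and the function name and sample values are hypothetical:

```python
import random

def dropout(activations, rate, training, rng=random):
    """Inverted dropout: zero each value with probability `rate` during training."""
    if not training:
        # Validation/test time: every activation is used unchanged.
        return list(activations)
    keep = 1.0 - rate
    # Survivors are scaled by 1/keep so the expected value stays the same.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

activations = [1.0, 2.0, 3.0, 4.0]

# Training: some values are thrown away at random.
train_out = dropout(activations, rate=0.5, training=True, rng=random.Random(0))

# Validation: nothing is thrown away, so the model sees all the information.
valid_out = dropout(activations, rate=0.5, training=False)
```

Because the network is handicapped only while training, its training accuracy can come out lower than its validation accuracy.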

(Rachel Thomas) #258

@noodles Thanks for the heads up! I just found 3 links on the wiki that I updated. If you find more outdated links, could you please reply listing their locations?

(Rachel Thomas) #259

@adamontherun Keras can resize images. This is set by the target_size parameter in flow_from_directory. This function is being called from our get_batches method with target_size=(224,224), so it is already being handled.
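
As a rough sketch of what that looks like in code, here is a simplified version of a get_batches helper. It assumes the Keras versions used by the course, where ImageDataGenerator.flow_from_directory takes a target_size argument; the defaults shown are illustrative and the course's own utils.py may differ in detail:

```python
def get_batches(dirname, target_size=(224, 224), batch_size=4,
                class_mode='categorical', shuffle=True):
    """Return an iterator over image batches read from dirname."""
    # Imported inside the function so this sketch loads without Keras installed.
    from keras.preprocessing import image

    gen = image.ImageDataGenerator()
    return gen.flow_from_directory(
        dirname,
        target_size=target_size,  # every image is resized to (height, width)
        class_mode=class_mode,
        shuffle=shuffle,
        batch_size=batch_size,
    )
```

Note that the images are resized to 224 x 224 rather than cropped, so images with a different aspect ratio will be stretched or squashed.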

(Elle Idan) #260

I am also encountering the cPickle "module not found" error. I am running Python 3. Should I downgrade to Python 2 to avoid future issues, or can those issues be easily fixed as I get to them?
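
For this particular error, one common workaround is an import that works on both versions, since Python 3 has no separate cPickle module. A minimal sketch of what the import line in utils.py could be changed to (the sample data is made up):

```python
try:
    import cPickle as pickle  # Python 2: the faster C implementation
except ImportError:
    import pickle  # Python 3: the C implementation is used automatically

# Quick round trip to confirm the module works the same under either name.
data = {'classes': ['cats', 'dogs']}
restored = pickle.loads(pickle.dumps(data))
```

This only addresses the cPickle import; other Python 2 idioms in the course code may still need their own fixes.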

(Elle Idan) #261

Thanks for the tip. That works for me too. I just hope there won’t be many more such issues since I’m also working with Python 3, which is apparently the cause of this problem.

(Elle Idan) #262

Woops, I spoke too soon. In the blink of an eye, the error reappeared. I closed and re-opened Jupyter Notebook, but that didn’t work :-/

Even though I edited the utils.py file, it is still referring to the line which I had commented out
i.e. import cPickle as pickle

What’s going on??

C:\Users\Admin\Documents\Fast_AI\courses\deeplearning1\nbs\utils.py in <module>()
1 from __future__ import division,print_function
2 import math, os, json, sys, re
----> 3 import cPickle as pickle
4 from glob import glob
5 import numpy as np

(chenjun) #263

How long should it take to train the model?

I ran vgg.fit() and after two hours it is still in the first epoch. The training data is not very big: 200 images.

vgg.fit(batches, val_batches, nb_epoch=1)
Epoch 1/1

Did something go wrong?


That is quite strange… reloading jupyter should take care of it and reload(utils) should also take care of it. Maybe you’ve edited a different file or forgot to save?
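
One thing to watch for on Python 3: reload is no longer a builtin and lives in importlib. A minimal sketch of reloading an edited module, using a throwaway module named demo_utils as a hypothetical stand-in for utils.py:

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

# Create a throwaway module on disk (hypothetical stand-in for utils.py).
workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, 'demo_utils.py')
sys.path.insert(0, workdir)

with open(module_path, 'w') as f:
    f.write('VERSION = "old"\n')
importlib.invalidate_caches()
import demo_utils
before = demo_utils.VERSION

# Edit and save the file, then reload so the running session sees the change.
with open(module_path, 'w') as f:
    f.write('VERSION = "edited"\n')
importlib.reload(demo_utils)
after = demo_utils.VERSION
```

In a notebook the equivalent would be importlib.reload(utils) after saving the edited utils.py. Names pulled in earlier with "from utils import ..." keep pointing at the old objects until that import is run again.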

(Elle Idan) #265

Ahh! It’s a different error, I didn’t see it:

ImportError: No module named 'bcolz'

How do I fix that?