Thanks! Sorry, I didn’t notice it was an old question. I am just starting to go through the lessons.
yes, it’s hard to see - but awesome of you to add the answer in-line for future readers
How come the training data for this competition is around 600MB, but when saved as a numpy array with bcolz the file size is around 6GB? Is that normal?
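A rough back-of-the-envelope sketch of why this can happen: image files on disk are compressed, while the saved array holds raw pixel values. The numbers below are assumptions, not figures from the thread: images resized to 224x224 RGB, stored as 32-bit floats, and a hypothetical count of 10,000 images.

```python
import numpy as np

# Assumptions (not from the thread): 224x224 RGB images stored as float32,
# and a hypothetical dataset of 10,000 images.
height, width, channels = 224, 224, 3
n_images = 10000

bytes_per_image = height * width * channels * np.dtype(np.float32).itemsize
total_bytes = n_images * bytes_per_image

print(bytes_per_image)      # 602112 bytes, roughly 0.6 MB per image
print(total_bytes / 1e9)    # roughly 6 GB for the whole array
```

Under these assumptions a tenfold increase is plausible, since every pixel is stored as a 4-byte float regardless of how well the original file compressed.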
Is anyone familiar with the CSS Jeremy is using for these notebooks?
@Gelu74 thanks. I had stumbled across that, but was also looking for the color scheme, widening the boxes, etc. Most themes I’ve found are for old IPython notebooks and don’t work in Jupyter notebooks any more.
@jeremy In the video lecture for Lesson 2, at 1h 16m 21s, when discussing derivatives and defining the function upd(), it seems that you are taking derivatives of the loss function. At first it seemed very confusing to see dydb = 2 * (y_pred - y), but then I understood that it was actually not dydb but rather dLossdb, as we are interested in how the loss value will change when we change our
Take a look at these: https://github.com/ipython-contrib/IPython-notebook-extensions/wiki/Codefolding
I found the course “Computational Photography” on Udacity very relevant. See lessons 100-130 on topics such as:
- Cross correlation
- Mean and median filtering
- Gaussian filters
Thanks so much, I enjoyed the lectures. I learnt the following:
- Image smoothing/normalization using kernel and neighbourhood computations
- Cross correlation
- Mean and median filtering
- Convolution method and properties
- Difference between convolution and correlation
- Gaussian Filter
- Linear Filtering
- Image gradients
- Detect Features in Images
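For the difference between convolution and correlation mentioned in the list above, here is a minimal 1-D sketch with NumPy (the arrays are made-up illustration values). Correlation slides the kernel over the signal as-is, while convolution slides the flipped kernel, so the two only differ when the kernel is asymmetric.

```python
import numpy as np

signal = np.array([1.0, 2.0, 3.0])
kernel = np.array([0.0, 1.0, 0.5])   # deliberately asymmetric

corr = np.correlate(signal, kernel, mode="full")   # kernel used as-is
conv = np.convolve(signal, kernel, mode="full")    # kernel is flipped first

# Convolving with a kernel equals correlating with the reversed kernel.
flipped = np.correlate(signal, kernel[::-1], mode="full")

print(corr)
print(conv)
print(np.allclose(conv, flipped))
```

For a symmetric kernel, such as a Gaussian filter, flipping changes nothing and both operations give the same result.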
So I’m back at it again. I’ve been working through lesson 2. I understand the idea of using training data and validation data, how to set up the directory structure, and how to use the linear model on it in lesson 2. My real question is: what’s the best way to use the test data after the validation and training data have been used with the model? I’ve looked in the dogs and cats redux notebook to try to understand how to call the test data, and the line I see is:
batches, preds = vgg.test(test_path, batch_size = batch_size*2)
However, I also see a model.predict or model.predict_generator for batches. Do I simply use test_batches = get_batches(…) on the test data and then run model.predict_generator on the test_batches to get the predictions on the unlabeled test data?
Thanks ahead of time.
def test(self, path, batch_size=8):
    test_batches = self.get_batches(path, shuffle=False, batch_size=batch_size, class_mode=None)
    return test_batches, self.model.predict_generator(test_batches, test_batches.nb_sample)
It just includes another step that sets up the batches for prediction. So you could do it either way. vgg.test is just a helper function that makes it easy.
Thanks Even. I’ll try that tonight when I get home.
So I’m not sure it’s working exactly like I think it should.
import utils; reload(utils)
from utils import *
from __future__ import division, print_function
import os, json
from glob import glob
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.metrics import confusion_matrix
from matplotlib import pyplot as plt
import utils; reload(utils)
from utils import plots, get_batches, plot_confusion_matrix, get_data
from numpy.random import random, permutation
from scipy import misc, ndimage
from scipy.ndimage.interpolation import zoom
from keras import backend as K
path = "dogscats/sample/"
test_path = "dogscats/sample/test/"
#path = "dogscats/"
model_path = path + 'models/'
if not os.path.exists(model_path): os.mkdir(model_path)
from vgg16 import Vgg16
vgg = Vgg16()
model = vgg.model
def save_array(fname, arr): c=bcolz.carray(arr, rootdir=fname, mode='w'); c.flush()
def load_array(fname): return bcolz.open(fname)[:]
def onehot(x): return np.array(OneHotEncoder().fit_transform(x.reshape(-1,1)).todense())
val_batches = get_batches(path+'valid', shuffle=False, batch_size=1)
batches = get_batches(path+'train', shuffle=False, batch_size=1)
val_data = get_data(val_batches)
trn_data = get_data(batches)
save_array(model_path+'train_data.bc', trn_data)
save_array(model_path+'valid_data.bc', val_data)
trn_data = load_array(model_path+'train_data.bc')
val_data = load_array(model_path+'valid_data.bc')
val_classes = val_batches.classes
trn_classes = batches.classes
val_labels = onehot(val_classes)
trn_labels = onehot(trn_classes)
trn_features = model.predict(trn_data, batch_size=batch_size)
val_features = model.predict(val_data, batch_size=batch_size)
save_array(model_path+'train_lastlayer_features.bc', trn_features)
save_array(model_path+'valid_lastlayer_features.bc', val_features)
trn_features = load_array(model_path+'train_lastlayer_features.bc')
val_features = load_array(model_path+'valid_lastlayer_features.bc')
lm = Sequential([ Dense(2, activation='softmax', input_shape=(1000,)) ])
lm.compile(optimizer=RMSprop(lr=0.01), loss='categorical_crossentropy', metrics=['accuracy'])
lm.fit(trn_features, trn_labels, nb_epoch=1, batch_size=batch_size,
       validation_data=(val_features, val_labels))
test_batches, predz = vgg.test(test_path, batch_size=batch_size*2)
Output from code above:
All zeros? I’m not sure that makes sense. Not sure if I’m doing this right…
Found 104 images belonging to 1 classes.
[[ 2.3819e-06 8.2677e-07 1.5837e-07 …, 2.0192e-06 3.4587e-05 2.7597e-04]
[ 8.6598e-08 7.3182e-07 3.6479e-07 …, 1.6141e-08 1.6603e-05 3.5059e-03]
[ 3.9418e-06 8.7915e-06 3.6428e-05 …, 1.1809e-06 7.2487e-05 6.3038e-03]
[ 4.1993e-09 4.5927e-09 3.1021e-09 …, 1.2422e-08 9.2751e-08 3.1926e-06]
[ 8.2603e-08 9.9292e-07 1.6900e-07 …, 1.2691e-09 2.0147e-06 3.2606e-05]
[ 7.2147e-08 3.3645e-06 3.7729e-07 …, 1.0056e-07 2.1949e-04 6.8866e-04]]
Does this output make any sense? Running it on just a sample of the data…
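One way to sanity-check an array like this, sketched with synthetic data standing in for `predz` (the real array comes from `vgg.test`; the random values below are made up for illustration). Values such as 2.3819e-06 are in scientific notation: small, but not zero. If each row holds softmax probabilities over the 1000 ImageNet classes, most entries should be tiny and each row should sum to about 1.

```python
import numpy as np

rng = np.random.RandomState(0)

# Synthetic stand-in for `predz`: softmax over random scores,
# 104 images x 1000 classes (values are illustrative only).
scores = rng.randn(104, 1000) * 3
exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
fake_predz = exp_scores / exp_scores.sum(axis=1, keepdims=True)

print(fake_predz.shape)                # (104, 1000)
print(fake_predz.sum(axis=1)[:3])      # each row sums to about 1
print((fake_predz == 0).all())         # False: entries are small, not zero
print(fake_predz.argmax(axis=1)[:3])   # most likely class index per image
```

Since `lm` in the post above was trained on 1000-column inputs of this kind, passing the real `predz` through `lm.predict` would presumably be the step that turns these outputs into cat/dog predictions.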
Here is the weight update method from sgd.ipynb that was introduced in Lesson 2:
def upd():
    global a_guess, b_guess
    y_pred = lin(a_guess, b_guess, x)
    dydb = 2 * (y_pred - y)
    dyda = x*dydb
    a_guess -= lr*dyda.mean()
    b_guess -= lr*dydb.mean()
I can see that dyda and dydb are going to be vectors containing as many partial derivatives as the number of points. Can someone please explain why we are taking the mean()? How do we interpret it geometrically?
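One way to see why `mean()` appears, as a sketch under the assumption that the loss being minimized is the mean squared error over all points (the data, seed, and the names `loss` and `eps` below are illustrative, not from the notebook). Each entry of `dydb` is the derivative of one point's squared error with respect to b. Because the overall loss is the average of the per-point errors, its derivative is the average of the per-point derivatives. Geometrically, each point pulls the line in its own direction and the mean is the net pull. A finite-difference check:

```python
import numpy as np

def lin(a, b, x):
    return a * x + b

rng = np.random.RandomState(42)
x = rng.rand(30)
y = lin(3.0, 8.0, x)          # synthetic targets from a known line
a_guess, b_guess = -1.0, 1.0

def loss(a, b):
    # Assumed loss: mean of squared errors over all points.
    return ((lin(a, b, x) - y) ** 2).mean()

# Per-point derivatives, named as in upd(); each entry is the derivative
# of one point's squared error, not of y itself.
y_pred = lin(a_guess, b_guess, x)
dydb = 2 * (y_pred - y)
dyda = x * dydb

# Central finite differences of the mean loss.
eps = 1e-6
num_db = (loss(a_guess, b_guess + eps) - loss(a_guess, b_guess - eps)) / (2 * eps)
num_da = (loss(a_guess + eps, b_guess) - loss(a_guess - eps, b_guess)) / (2 * eps)

print(np.isclose(dydb.mean(), num_db))
print(np.isclose(dyda.mean(), num_da))
```

If both checks print True, the averaged vectors are the gradient of the mean loss with respect to b and a.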
Here is the animate method from sgd.ipynb that was introduced in Lesson 2:
fig = plt.figure(dpi=100, figsize=(5, 4))
plt.scatter(x,y)
line, = plt.plot(x,lin(a_guess,b_guess,x))
plt.close()

def animate(i):
    line.set_ydata(lin(a_guess,b_guess,x))
    for i in range(100): upd()
    return line,

ani = animation.FuncAnimation(fig, animate, np.arange(0, 40), interval=100)
ani
In this snippet, every time the animate method is called, the weight update method is called 100 times. Is 100 chosen for a particular reason? Should we not check for the gradient to be all 0s (meaning we have reached a minimum) and stop then?
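As a sketch of the stopping rule suggested above (this is not how the notebook does it): instead of a fixed number of updates, loop until the gradient is close to zero. In floating point the gradient rarely becomes exactly 0, so a small tolerance is used. The data, `lr`, `tol`, and `max_steps` below are assumed values for illustration. In the snippet itself, the fixed count of 100 simply sets how many updates happen per animation frame.

```python
import numpy as np

def lin(a, b, x):
    return a * x + b

rng = np.random.RandomState(0)
x = rng.rand(30)
y = lin(3.0, 8.0, x)          # noise-free synthetic data

a_guess, b_guess = -1.0, 1.0
lr = 0.1                      # assumed learning rate
tol = 1e-6                    # stop once the gradient norm falls below this
max_steps = 100000            # safety cap so the loop always terminates

for step in range(max_steps):
    y_pred = lin(a_guess, b_guess, x)
    dLdb = (2 * (y_pred - y)).mean()
    dLda = (2 * (y_pred - y) * x).mean()
    if np.hypot(dLda, dLdb) < tol:
        break                 # gradient is (nearly) zero: at a minimum
    a_guess -= lr * dLda
    b_guess -= lr * dLdb

print(step, a_guess, b_guess)
```

With noisy data the gradient still goes to zero at the best-fit line, but the fitted values will not match the generating parameters exactly.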
I am a bit lost in the part where @jeremy discusses using a linear model with the imagenet probabilities as inputs. Text in the notebook:
“They ignore information available in the predictions; for instance, if the models predicts that there is a bone in the image, it’s more likely to be a dog than a cat.”
Here, “they” refers to the manual mappings of the 1000 class probabilities to 2 classes (dog and cat). I am not able to understand how the class probabilities encode this kind of information.
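A toy sketch of the idea, using four made-up classes in place of the 1000 ImageNet ones and hand-picked weights (all names and numbers here are hypothetical). A manual mapping only looks at the probabilities of classes that are literally dogs or cats, whereas a linear model can also give weight to related classes such as a bone.

```python
import numpy as np

# Hypothetical 4-class stand-in for the 1000 ImageNet probabilities.
classes = ["beagle", "tabby_cat", "bone", "scratching_post"]
probs = np.array([0.10, 0.10, 0.70, 0.10])   # the model mostly sees a bone

# Manual mapping: only classes that are dogs or cats count.
manual_dog = probs[0]
manual_cat = probs[1]
print(manual_dog, manual_cat)   # a tie: the bone evidence is ignored

# Linear model: one weight per input class for each output (dog, cat).
# Hand-picked for illustration; a trained model would learn these.
W = np.array([
    [4.0, 0.0],   # beagle          -> dog
    [0.0, 4.0],   # tabby_cat       -> cat
    [3.0, 0.0],   # bone            -> evidence for dog
    [0.0, 3.0],   # scratching_post -> evidence for cat
])
logits = probs.dot(W)
dog_cat = np.exp(logits) / np.exp(logits).sum()
print(dog_cat)                  # dog probability is now larger than cat
```

The information is in the pattern of probabilities across all classes, and the learned weights are what turn that pattern into a dog/cat decision.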
I’m busy rewriting the lesson 2 notebook in my own code. I’ve arrived at the part where we pop the last layer (1000 classes) and replace it with a 2-class dense fully connected layer. When I run fit_model the validation accuracy is stuck at .5, or 50%. I don’t understand what is wrong here. What could be causing that? All the steps before that ran fine and produced the same results as in the original notebook.
Could it be that class information is missing from the training set? Did anyone else run into this problem? Or are the training and validation classes ‘misaligned’?
earlier, linear model approach =>
- keep the last layer, which outputs probabilities across 1000 labels that sum to one for each example we run the prediction on
- stick another layer on top of that, one that takes a linear combination of those predictions
- vgg claims it likely sees a bone, some animal and maybe a ball in the image -> our new last layer combines those predictions and may be inclined to indicate this is a picture of a dog
now, fine-tuning approach =>
- get rid of the last layer, the output layer predicting across 1000 labels
- stick a layer with two classes on top of the model
- train the model to go directly from the features it produces in the last-but-one layer to predictions across two classes, cats and dogs
- the rest of the model is frozen - we are only training the topmost layer (for the time being at least)
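The second set of steps above can be sketched in plain NumPy. This is a toy stand-in and not the VGG or Keras code: a fixed random layer plays the role of the frozen pretrained body, and only a new two-class softmax layer on top is trained. All data, sizes, and names below are made up for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy data: 200 points in 5 dimensions; label is 1 when the first input is positive.
X = rng.randn(200, 5)
labels = (X[:, 0] > 0).astype(int)
Y = np.eye(2)[labels]                  # one-hot targets

# "Frozen" body: fixed random weights standing in for the pretrained layers.
W_frozen = rng.randn(5, 16)

def features(inputs):
    # ReLU features from the frozen layer; W_frozen is never updated.
    return np.maximum(inputs.dot(W_frozen), 0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# New top layer with two outputs: the only trainable parameters.
W_top = np.zeros((16, 2))
b_top = np.zeros(2)

F = features(X)                        # computed once, since the body is frozen
lr = 0.05
for epoch in range(1000):
    P = softmax(F.dot(W_top) + b_top)
    grad_logits = (P - Y) / len(X)     # gradient of the mean cross-entropy
    W_top -= lr * F.T.dot(grad_logits)
    b_top -= lr * grad_logits.sum(axis=0)

accuracy = (softmax(F.dot(W_top) + b_top).argmax(axis=1) == labels).mean()
print(accuracy)
```

Because the body is frozen, its features can be precomputed once and reused every epoch, which is the same reason the thread's code saves the last-layer features to disk before fitting the small model.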