Wiki: Lesson 3

I don’t know what you are doing, but it is working fine in my case.


Try running all the cells again and check that your fastai repo is up to date.


No, I didn’t find out. I just assumed it was simply an error, as you say.


Hi

Did you manage to make any progress with State Farm? No matter what I do I’m completely stuck, whether with resnet50, resnext50 or resnext101.
I can’t get my model to achieve decent training/validation accuracy.

BTW, my earlier attempts used the wrong validation set prep; I actually had the same drivers in both the validation and training sets, unlike what Kaggle asks for. In those attempts I achieved much better results (even on the test set) than what I achieve now with a proper validation split, where each driver appears in only one of the training or validation sets.
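
For anyone else setting up a driver-wise validation split, here is a minimal sketch of what I mean by “proper separation”. It assumes the competition’s driver_imgs_list.csv (columns subject, classname, img); the path and the choice of three held-out drivers are just placeholders.

import numpy as np
import pandas as pd

# driver_imgs_list.csv maps every training image to the driver ("subject") shown in it
drivers = pd.read_csv('data/statefarm/driver_imgs_list.csv')

# hold out a few whole drivers so no driver appears in both training and validation
np.random.seed(42)
val_drivers = np.random.choice(drivers.subject.unique(), size=3, replace=False)

val_mask = drivers.subject.isin(val_drivers)
val_fnames = set(drivers.loc[val_mask, 'img'])   # move/copy these files into your valid/ folders
print(f'{len(val_fnames)} validation images from drivers {sorted(val_drivers)}')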

I am doing the Planet: Understanding the Amazon from Space competition, which Jeremy showed in lesson2-image_models.ipynb.
At the end of the notebook, learn.TTA() is called to get the log probabilities.

Unlike cats vs dogs / dog breeds, why is the output from learn.TTA() not converted to actual probabilities by calling np.exp() in this notebook?
I assumed learn.TTA() always gives log probabilities…

I printed the model in both the Planet and Dog Breeds notebooks.
The Planet model uses Sigmoid as the last activation layer, whereas
the Dog Breeds model uses LogSoftmax as the last activation layer.

Below is the output from both my notebooks.
Planet (Amazon from Space) notebook (lesson2-image_models.ipynb):
learn = ConvLearner.pretrained(f_model, data, metrics=metrics)
learn

(10): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True)
(11): Dropout(p=0.25)
(12): Linear(in_features=1024, out_features=512, bias=True)
(13): ReLU()
(14): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True)
(15): Dropout(p=0.5)
(16): Linear(in_features=512, out_features=17, bias=True)
(17): Sigmoid()

Dog-Breeds Identification Challenge:

Sequential(
(0): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True)
(1): Dropout(p=0.25)
(2): Linear(in_features=1024, out_features=512, bias=True)
(3): ReLU()
(4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True)
(5): Dropout(p=0.5)
(6): Linear(in_features=512, out_features=120, bias=True)
(7): LogSoftmax()
)

Is that the reason why, in the Planet notebook, the output from learn.TTA() is not converted to probabilities using np.exp(): because it is already in that form, as given by Sigmoid()?

Whereas in dog breeds the output is in log probabilities, as returned by LogSoftmax(), and hence is converted to probabilities using np.exp().

Is my understanding correct? Is the Sigmoid() function defined by PyTorch?
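
For anyone who wants to check this themselves, here is a minimal PyTorch sketch (the sizes 17 and 120 just mirror the two notebooks above): a Sigmoid head already outputs values in [0, 1], one per label, while a LogSoftmax head outputs log probabilities that need np.exp() to become probabilities.

import numpy as np
import torch
import torch.nn as nn

planet_head = nn.Sigmoid()(torch.randn(1, 17))            # multi-label: independent probabilities per label
breeds_head = nn.LogSoftmax(dim=1)(torch.randn(1, 120))   # single-label: log probabilities over the classes

print(planet_head.min().item() >= 0, planet_head.max().item() <= 1)   # True True: already probabilities
print(np.exp(breeds_head.numpy()).sum())                              # ~1.0: exp() recovers a probability distribution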

@jeremy you mentioned in the video that the Keras transfer learning example behaves strangely compared to the fast.ai/PyTorch version. I think the reason is that the Keras example doesn’t have an equivalent of bn_freeze.

Setting trainable = False on BN layers in Keras freezes the trainable parameters, but it doesn’t cause Keras to use the running mean/variance; it still uses the batch mean/variance. To fix this, you need to do

from keras import backend as K
...
K.set_learning_phase(0)  # build the frozen part in inference mode so BN layers use their running statistics
model = Model(inputs=base_model.input, outputs=predictions)
K.set_learning_phase(1)  # switch back to training mode for the layers you actually train

Disabling the learning phase will cause Keras to use the running mean/var instead of the batch mean/var: https://github.com/keras-team/keras/blob/master/keras/layers/normalization.py#L202

The only other way would be to set training=False when calling the BN layer: https://github.com/keras-team/keras/blob/master/keras/layers/normalization.py#L130, but this only works if you modify the code for the base model.
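
For completeness, a hedged sketch of that second option (the layer stack here is made up; in practice you would have to edit the base model’s own code to pass the flag):

from keras.layers import Input, Conv2D, BatchNormalization, Activation

inp = Input(shape=(224, 224, 3))
x = Conv2D(32, 3, padding='same')(inp)
x = BatchNormalization()(x, training=False)  # always use the stored moving mean/variance, even inside fit()
x = Activation('relu')(x)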

Do you know why bn_freeze has such an impact on performance? I’ve been trying to retrain MobileNet, and it didn’t work at all until I used the Keras equivalent of bn_freeze. Without bn_freeze, Keras reports reasonable train and validation accuracy, but the accuracy plummets in testing.

I’m getting the following error when I run the planet lesson 2 notebook:

data = get_data(256)

ParserError: Error tokenizing data. C error: Expected 1 fields in line 6, saw 2

I’m unsure what the issue is here… ?

Disregard. It was a corrupt file. Another download of the CSV fixed it.

It is crystal clear what the filters from the input to Conv1 do: they are intended to detect top edges and side edges.

I might have missed it in the video, but how were the filters from Conv1 to Conv2 created? Those

0.5 0.3 0.3
0.9 -0.5 0
0.8 0.01 -0.7

etc. Are they also parameters?

Maybe this will help someone else …

When I re-created the ‘Quick Dogs v Cats’ notebook, I ran into an error while calculating accuracy. Specifically:
torch.max received an invalid combination of arguments - got (numpy.ndarray, dim=int), but expected one of: ...

The solution: use accuracy_np(np.mean((log_preds),0),y) instead of accuracy(np.mean((log_preds),0),y)
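
In context, assuming the usual log_preds / y names from the notebook, the working version looks roughly like this:

log_preds, y = learn.TTA()
preds = np.mean(log_preds, 0)   # average the predictions over the TTA augmentations
acc = accuracy_np(preds, y)     # numpy-based metric; accuracy() expects torch tensors, hence the original error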

Trying to run the imports for the keras_lesson_1 notebook and receiving an error message:

ModuleNotFoundError: No module named 'keras'

Using paperspace GPU+ hourly machine and wondering if I should just try pip installing keras from the CLI there?

Update: I tried Jeremy’s recommendation of running `pip install tensorflow-gpu keras`, then ran into another error (below).

I did some ‘stackoverflowing’ and came across this discussion: https://github.com/tensorflow/tensorflow/issues/15604

It recommended uninstalling the general tensorflow-gpu and then reinstalling a specific version (1.4). I tried doing that, and also just pip install keras, but I still get the same error (below).

Any help is much appreciated!

---------------------------------------------------------------------------

ImportError Traceback (most recent call last)
~/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
57
---> 58 from tensorflow.python.pywrap_tensorflow_internal import *
59 from tensorflow.python.pywrap_tensorflow_internal import __version__

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in <module>()
27 return _mod
---> 28 _pywrap_tensorflow_internal = swig_import_helper()
29 del swig_import_helper

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in swig_import_helper()
23 try:
---> 24 _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
25 finally:

~/anaconda3/envs/fastai/lib/python3.6/imp.py in load_module(name, file, filename, details)
242 else:
--> 243 return load_dynamic(name, filename, file)
244 elif type_ == PKG_DIRECTORY:

~/anaconda3/envs/fastai/lib/python3.6/imp.py in load_dynamic(name, path, file)
342 name=name, loader=loader, origin=path)
--> 343 return _load(spec)
344

ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

ImportError Traceback (most recent call last)
in ()
1 import numpy as np
----> 2 from keras.preprocessing.image import ImageDataGenerator
3 from keras.preprocessing import image
4 from keras.layers import Dropout, Flatten, Dense
5 from keras.applications import ResNet50

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/__init__.py in <module>()
1 from __future__ import absolute_import
2
----> 3 from . import utils
4 from . import activations
5 from . import applications

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/utils/__init__.py in <module>()
4 from . import data_utils
5 from . import io_utils
----> 6 from . import conv_utils
7
8 # Globally-importable utils.

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/utils/conv_utils.py in <module>()
7 from six.moves import range
8 import numpy as np
----> 9 from .. import backend as K
10
11

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/backend/__init__.py in <module>()
87 elif _BACKEND == 'tensorflow':
88 sys.stderr.write('Using TensorFlow backend.\n')
---> 89 from .tensorflow_backend import *
90 else:
91 # Try and load external backend.

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in <module>()
3 from __future__ import print_function
4
----> 5 import tensorflow as tf
6 from tensorflow.python.framework import ops as tf_ops
7 from tensorflow.python.training import moving_averages

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/__init__.py in <module>()
22
23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
25 # pylint: enable=wildcard-import
26

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/python/__init__.py in <module>()
47 import numpy as np
48
---> 49 from tensorflow.python import pywrap_tensorflow
50
51 # Protocol buffers

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
70 for some common reasons and solutions. Include the entire stack trace
71 above this error message when asking for help.""" % traceback.format_exc()
---> 72 raise ImportError(msg)
73
74 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.

I am trying to run lesson2-image_models.ipynb from fastai/courses/dl1; however, I am getting this error. Can anybody enlighten me on how to solve it?


PermissionError Traceback (most recent call last)
in
2
3 os.makedirs('data/planet/models', exist_ok=True)
----> 4 os.makedirs('/cache/planet/tmp', exist_ok=True)
5
6 get_ipython().system('ln -s /datasets/kaggle/planet-understanding-the-amazon-from-space/train-jpg {PATH}')

~/anaconda2/envs/fastai/lib/python3.7/os.py in makedirs(name, mode, exist_ok)
209 if head and tail and not path.exists(head):
210 try:
--> 211 makedirs(head, exist_ok=exist_ok)
212 except FileExistsError:
213 # Defeats race condition when another thread created the path

~/anaconda2/envs/fastai/lib/python3.7/os.py in makedirs(name, mode, exist_ok)
209 if head and tail and not path.exists(head):
210 try:
--> 211 makedirs(head, exist_ok=exist_ok)
212 except FileExistsError:
213 # Defeats race condition when another thread created the path

~/anaconda2/envs/fastai/lib/python3.7/os.py in makedirs(name, mode, exist_ok)
219 return
220 try:
--> 221 mkdir(name, mode)
222 except OSError:
223 # Cannot rely on checking for EEXIST, since the operating system

PermissionError: [Errno 13] Permission denied: '/cache'

What is the purpose of this line?

data = data.resize(int(sz*1.3), '/tmp')

And also of these path variables?

TMP_PATH = "/tmp/tmp"
MODEL_PATH = "/tmp/model/"

@vitality:

In his Excel example, Jeremy just made up the top/side kernels as simple demonstrations.
I assume he did the same with the Conv1 -> Conv2, but I’m not sure.
In Real Life, they would all be learned. (Unless you’re very good. :) )
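
If it helps, a minimal PyTorch sketch (made-up channel counts) showing that those inter-layer kernels are ordinary learned parameters:

import torch.nn as nn

conv2 = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3)  # a "Conv1 -> Conv2" style layer
print(conv2.weight.shape)          # torch.Size([2, 2, 3, 3]): one 3x3 kernel per (output, input) channel pair
print(conv2.weight.requires_grad)  # True: updated by gradient descent, not hand-picked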

@ceceshao1:

The following worked for me on Paperspace:

  1. Use Gradient and launch the FastAI course notebook (v0.7).
  2. Launch a terminal from the “+” menu.
  3. pip install tensorflow-gpu keras
  4. Open the lesson*.ipynb notebook.
  5. Possibly add these lines before the imports (I need them on subsequent notebook launches, but not the first time):
    import sys
    sys.path.append('../../')
  6. Then I am able to import tensorflow as tf without error.

@hasib_zunair:

I believe the .resize() call does: (a) select only images up to sz*1.3; (b) resize them to sz; (c) save in /tmp; (d) overwrite data with this subset. I’m pretty sure the paths (a) show where to get the subset data and (b) where to get the model trained on the subset.


Hi there. I’m trying to use the submission code posted at the beginning of the lesson, but ran into errors. I found suggestions here: Lesson1.ipynb - Error - TypeError: torch.max received an invalid combination of arguments - got (numpy.ndarray, dim=int), but they have not helped, I’m afraid.

When calling

accuracy_np(probs, y), metrics.log_loss(y, probs)

I receive

~/src/fastai/courses/dl1/AaronsWorkbook/fastai/metrics.py in accuracy_np(preds, targs)
     13 def accuracy_np(preds, targs):
     14     preds = np.argmax(preds, 1)
---> 15     return (preds==targs).mean()
     16 
     17 def accuracy_thresh(thresh):

AttributeError: 'bool' object has no attribute 'mean'

I’m passing it the np.mean results of learn.TTA

log_preds, y = learn.TTA()
probs = np.mean(np.exp(log_preds),0)
accuracy_np(probs, y), metrics.log_loss(y, probs)

Is there another function or something else I should be using? I realize this is in the old fastai folder, given that I’m using v2.

So I ended up working around that error by just using the multi predictions from TTA(is_test=True), then writing custom code to line up the prediction numbers with their image files and tags. The problem is that the test-jpgv2 folder has 40,669 images, while the competition expects 61,000. There are another 20,522 in the test-jpg-additional folder, but that would be too many in total. Has anyone figured out the right setup to do a submission?
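
In case it helps someone, this is roughly what my lining-up code looks like. It is only a sketch: the 0.2 threshold, the multi_preds name from TTA(is_test=True), and the reliance on data.test_ds.fnames / data.classes are my own assumptions, not something from the course.

import numpy as np
import pandas as pd

test_preds = np.mean(multi_preds, 0)                     # average the TTA copies
classes = np.array(data.classes)
tags = [' '.join(classes[p > 0.2]) for p in test_preds]  # multi-label: keep every tag above the threshold

# strip directory and extension so the names match the sample submission format
names = [f.split('/')[-1].rsplit('.', 1)[0] for f in data.test_ds.fnames]
pd.DataFrame({'image_name': names, 'tags': tags}).to_csv('planet_submission.csv', index=False)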

I am getting an error for the bn_freeze function.

AttributeError Traceback (most recent call last)
in ()
1 learn.unfreeze()
----> 2 learn.bn_freeze(True)
3 get_ipython().magic('time learn.fit([1e-5,1e-4,1e-2], 1, cycle_len=1)')

AttributeError: 'ConvLearner' object has no attribute 'bn_freeze'

What am I doing wrong? Is this functionality not available any more?

For the planet dataset, what can I do if I want the labels themselves instead of the probabilities of the labels? What threshold should I use?
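
One way to do it, sketched under the assumption that the Planet model’s Sigmoid head means learn.TTA() already returns probabilities, and with an arbitrary 0.2 threshold (you would normally tune the threshold on the validation set, e.g. by maximizing the F2 score):

import numpy as np

probs, y = learn.TTA()
probs = np.mean(probs, 0)                              # shape: (n_images, n_labels)
classes = np.array(data.classes)
predicted_labels = [' '.join(classes[p > 0.2]) for p in probs]   # keep every label above the threshold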

Hi,

I am a little unclear about segmentation. My understanding of segmentation is that we use pre-labeled datasets and get colored images where pixels belonging to the same class are colored differently. Thus, we use the validation set within the dataset only.

What do I do if I wish to use segmentation on my own validation set?

For example, if I want to segment traffic lights or stop signs only, I feel that after training the neural network on the CamVid dataset, the model might segment irrelevant information like other cars, sky, trees, etc.

Is my line of thinking correct? If yes, how do I stop irrelevant information from being segmented by the model?