I replaced the dogs and cats collection with buildings and pictures,
and now get an error in the image classifier:
ValueError: zero-size array to reduction operation maximum which has no identity
Two points: I am not sure how to adjust sz yet,
and I am also not sure what I should deduce from the following data:
img[:4,:4]
array([[[ 64, 121, 198],
[ 64, 121, 198],
[ 64, 121, 198],
[ 64, 121, 198]],
[[ 64, 121, 198],
[ 64, 121, 198],
[ 64, 121, 198],
[ 64, 121, 198]],
[[ 65, 122, 199],
[ 65, 122, 199],
[ 65, 122, 199],
[ 65, 122, 199]],
[[ 65, 122, 199],
[ 65, 122, 199],
[ 65, 122, 199],
[ 65, 122, 199]]], dtype=uint8)
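One way to read this output, as a minimal sketch: the slice img[:4,:4] is the top-left 4x4 block of pixels, and each innermost triple is one pixel's red, green and blue value in the 0 to 255 range of uint8. The array below is rebuilt by hand from the printed values (it is not loaded from the original file), and the name `patch` is only for illustration.

```python
import numpy as np

# Rebuild the printed 4x4 patch by hand: two rows of (64, 121, 198)
# followed by two rows of (65, 122, 199).
row_a = [[64, 121, 198]] * 4
row_b = [[65, 122, 199]] * 4
patch = np.array([row_a, row_a, row_b, row_b], dtype=np.uint8)

print(patch.shape)   # (rows, columns, colour channels)
print(patch[0, 0])   # RGB triple of the top-left pixel

# Mean of each channel over the patch. Blue is the largest here, so this
# corner of the picture is a nearly uniform blue region.
channel_means = patch.reshape(-1, 3).mean(axis=0)
print(channel_means)
```

So the data itself looks like a normal RGB image; nothing in this slice points to a problem with the picture.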
…
Image classification with Convolutional Neural Networks
Welcome to the first week of the second deep learning certificate! We’re going to use convolutional neural networks (CNNs) to allow our computer to see - something that is only possible thanks to deep learning.
Introduction to our first task: ‘Dogs vs Cats’
We’re going to try to create a model to enter the Dogs vs Cats competition at Kaggle. There are 25,000 labelled dog and cat photos available for training, and 12,500 in the test set that we have to try to label for this competition. According to the Kaggle web-site, when this competition was launched (end of 2013): “State of the art: The current literature suggests machine classifiers can score above 80% accuracy on this task”. So if we can beat 80%, then we will be at the cutting edge as of 2013!
Put these at the top of every notebook, to get automatic reloading and inline plotting
%reload_ext autoreload
%autoreload 2
%matplotlib inline
Here we import the libraries we need. We’ll learn about what each does during the course.
This file contains all the main external libs we’ll use
from fastai.imports import *
from fastai.transforms import *
from fastai.conv_learner import *
from fastai.model import *
from fastai.dataset import *
from fastai.sgdr import *
from fastai.plots import *
PATH is the path to your data - if you use the recommended setup approaches from the lesson, you won’t need to change this. sz is the size that the images will be resized to in order to ensure that the training runs quickly. We’ll be talking about this parameter a lot during the course. Leave it at 224 for now.
PATH = "data/bp1/"
sz=224
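To get a feel for what sz means for a particular picture, here is a rough sketch that scales the shorter side of an image to sz while keeping the aspect ratio. This resizing policy is an assumption made for illustration, not a statement of what the fastai transforms do internally, and `scale_min_side` is a hypothetical helper. The 259 x 194 input is the shape of the sample image printed later in this post.

```python
def scale_min_side(height, width, sz):
    """Return (new_height, new_width) after scaling the shorter side to sz."""
    ratio = sz / min(height, width)
    return round(height * ratio), round(width * ratio)

sz = 224
# Shape of the sample image from this post, without the channel axis.
new_h, new_w = scale_min_side(259, 194, sz)
print(new_h, new_w)
```

Under this assumption a small image like this one is scaled up rather than down, so leaving sz at 224 should not by itself produce an empty dataset.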
It’s important that you have a working NVidia GPU set up. The programming framework used behind the scenes to work with NVidia GPUs is called CUDA. Therefore, you need to ensure the following line returns True before you proceed. If you have problems with this, please check the FAQ and ask for help on the forums.
torch.cuda.is_available()
True
In addition, NVidia provides special accelerated functions for deep learning in a package called CuDNN. Although not strictly necessary, it will improve training performance significantly, and is included by default in all supported fastai configurations. Therefore, if the following does not return True, you may want to look into why.
torch.backends.cudnn.enabled
True
Extra steps if NOT using Crestle or Paperspace or our scripts
The dataset is available at http://files.fast.ai/data/dogscats.zip. You can download it directly on your server by running the following line in your terminal: wget http://files.fast.ai/data/dogscats.zip. You should put the data in a subdirectory of this notebook’s directory, called data/. Note that this data is already available in Crestle and the Paperspace fast.ai template.
Extra steps if using Crestle
Crestle has the datasets required for fast.ai in /datasets, so we’ll create symlinks to the data we want for this competition. (NB: we can’t write to /datasets, but we need a place to store temporary files, so we create our own writable directory to put the symlinks in, and we also take advantage of Crestle’s /cache/ faster temporary storage space.)
To run these commands (which you should only do if using Crestle) remove the # characters from the start of each line.
os.makedirs('data/bp1/models', exist_ok=True)
!ln -s /datasets/fast.ai/bp1/train {PATH}
!ln -s /datasets/fast.ai/bp1/test {PATH}
!ln -s /datasets/fast.ai/bp1/valid {PATH}
os.makedirs('/cache/tmp', exist_ok=True)
!ln -fs /cache/tmp {PATH}
ln: failed to create symbolic link ‘data/bp1/train’: File exists
ln: failed to create symbolic link ‘data/bp1/test’: File exists
ln: failed to create symbolic link ‘data/bp1/valid’: File exists
os.makedirs('/cache/tmp', exist_ok=True)
!ln -fs /cache/tmp {PATH}
First look at cat pictures
!rm {PATH}.ipynb_checkpoints
rm: cannot remove ‘data/bp1/.ipynb_checkpoints’: No such file or directory
Our library will assume that you have train and valid directories. It also assumes that each dir will have subdirs for each class you wish to recognize (in this case, ‘cats’ and ‘dogs’).
os.listdir(PATH)
['test', 'valid', 'models', 'train', 'tmp']
os.listdir(f'{PATH}valid')
['p', 'b']
files = os.listdir(f'{PATH}valid/b')[:3]
files
['B16.jpg', 'b7.jpg', 'b3.jpg']
files = os.listdir(f'{PATH}train/b')[:5]
files
['bt3.jpg', 'bt5.jpg', 'bt1.jpg', 'bt4.jpg', 'bt2.jpg']
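The listings above can be gathered into one check. This is a minimal sketch, assuming the train/valid layout with one subfolder per class described above; `count_files_per_class` is a hypothetical helper, not part of fastai. It counts the visible files in every class folder so that an empty or missing class stands out.

```python
import os

def count_files_per_class(path, split):
    """Map each class subfolder of path/split to its number of visible files."""
    split_dir = os.path.join(path, split)
    counts = {}
    for label in sorted(os.listdir(split_dir)):
        label_dir = os.path.join(split_dir, label)
        # Skip stray files and hidden folders such as .ipynb_checkpoints.
        if label.startswith('.') or not os.path.isdir(label_dir):
            continue
        counts[label] = sum(
            1 for name in os.listdir(label_dir) if not name.startswith('.')
        )
    return counts

# Only runs where the data folder from this post exists.
for split in ('train', 'valid'):
    if os.path.isdir(os.path.join('data/bp1/', split)):
        print(split, count_files_per_class('data/bp1/', split))
```

If a split prints an empty dict, or a class shows 0 files, that folder is worth inspecting before building the classifier data.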
img = plt.imread(f'{PATH}train/b/{files[1]}')
plt.imshow(img);
Here is what the raw data looks like
img.shape
(259, 194, 3)
img[:4,:4]
array([[[ 64, 121, 198],
[ 64, 121, 198],
[ 64, 121, 198],
[ 64, 121, 198]],
[[ 64, 121, 198],
[ 64, 121, 198],
[ 64, 121, 198],
[ 64, 121, 198]],
[[ 65, 122, 199],
[ 65, 122, 199],
[ 65, 122, 199],
[ 65, 122, 199]],
[[ 65, 122, 199],
[ 65, 122, 199],
[ 65, 122, 199],
[ 65, 122, 199]]], dtype=uint8)
Our first model: quick start
We’re going to use a pre-trained model, that is, a model created by someone else to solve a different problem. Instead of building a model from scratch to solve a similar problem, we’ll use a model trained on ImageNet (1.2 million images and 1000 classes) as a starting point. The model is a Convolutional Neural Network (CNN), a type of Neural Network that builds state-of-the-art models for computer vision. We’ll be learning all about CNNs during this course.
We will be using the resnet34 model. resnet34 is a version of the model that won the 2015 ImageNet competition. Here is more info on resnet models. We’ll be studying them in depth later, but for now we’ll focus on using them effectively.
Here’s how to train and evaluate a dogs vs cats model in 3 lines of code, and under 20 seconds:
#Uncomment the below if you need to reset your precomputed activations
shutil.rmtree(f'{PATH}tmp', ignore_errors=True)
!rm {PATH}tmp
!rmdir {PATH}.ipynb_checkpoints
arch=resnet34
data = ImageClassifierData.from_paths (PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 3)
ValueError Traceback (most recent call last)
in ()
1 arch=resnet34
----> 2 data = ImageClassifierData.from_paths (PATH, tfms=tfms_from_model(arch, sz))
3 learn = ConvLearner.pretrained(arch, data, precompute=True)
4 learn.fit(0.01, 3)
~/courses/fastai/courses/dl1/fastai/dataset.py in from_paths(cls, path, bs, tfms, trn_name, val_name, test_name, num_workers)
342 trn,val = [folder_source(path, o) for o in (trn_name, val_name)]
343 test_fnames = read_dir(path, test_name) if test_name else None
--> 344 datasets = cls.get_ds(FilesIndexArrayDataset, trn, val, tfms, path=path, test=test_fnames)
345 return cls(path, datasets, bs, num_workers, classes=trn[2])
346
~/courses/fastai/courses/dl1/fastai/dataset.py in get_ds(fn, trn, val, tfms, test, **kwargs)
289 def get_ds(fn, trn, val, tfms, test=None, **kwargs):
290 res = [
--> 291 fn(trn[0], trn[1], tfms[0], **kwargs), # train
292 fn(val[0], val[1], tfms[1], **kwargs), # val
293 fn(trn[0], trn[1], tfms[1], **kwargs), # fix
~/courses/fastai/courses/dl1/fastai/dataset.py in __init__(self, fnames, y, transform, path)
164 self.y=y
165 assert(len(fnames)==len(y))
--> 166 super().__init__(fnames, transform, path)
167 def get_y(self, i): return self.y[i]
168 def get_c(self): return self.y.shape[1]
~/courses/fastai/courses/dl1/fastai/dataset.py in __init__(self, fnames, transform, path)
141 def __init__(self, fnames, transform, path):
142 self.path,self.fnames = path,fnames
--> 143 super().__init__(transform)
144 def get_n(self): return len(self.y)
145 def get_sz(self): return self.transform.sz
~/courses/fastai/courses/dl1/fastai/dataset.py in __init__(self, transform)
91 self.transform = transform
92 self.n = self.get_n()
---> 93 self.c = self.get_c()
94 self.sz = self.get_sz()
95
~/courses/fastai/courses/dl1/fastai/dataset.py in get_c(self)
170
171 class FilesIndexArrayDataset(FilesArrayDataset):
--> 172 def get_c(self): return int(self.y.max())+1
173
174
/usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py in _amax(a, axis, out, keepdims)
24 # small reductions
25 def _amax(a, axis=None, out=None, keepdims=False):
---> 26 return umr_maximum(a, axis, None, out, keepdims)
27
28 def _amin(a, axis=None, out=None, keepdims=False):
27
28 def _amin(a, axis=None, out=None, keepdims=False):
ValueError: zero-size array to reduction operation maximum which has no identity
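The last frames of the traceback show get_c calling self.y.max(), and NumPy raises this exact error when max() is called on an array with no elements. Here is a minimal sketch that reproduces it outside fastai; `labels` is only an illustrative stand-in for self.y. It suggests, but does not prove, that the label array built from the train folder came out empty.

```python
import numpy as np

labels = np.array([], dtype=int)   # stand-in for an empty label array

try:
    # Same expression as get_c in the traceback.
    n_classes = int(labels.max()) + 1
except ValueError as err:
    message = str(err)
    print(message)

# With at least one label the same expression works.
labels = np.array([0, 1, 1, 0])
n_classes = int(labels.max()) + 1
print(n_classes)
```

If that reading is right, the thing to check is whether any files are actually found under the train folder's class subfolders, rather than the value of sz.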