Other references
- Video lesson 8
- Part 2 Lesson 8 wiki
- Thread from @timlee : Notes on the first hour of lesson 8
- Post from @hiromi : Deep Learning 2: Part 2 Lesson 8
Reminder
- Part 1 was full of best practical techniques for DL (differentiable layers, transfer learning, architecture design, handling over-fitting, embeddings), grouped by application area (computer vision, structured data, natural language)
- Part 2 is cutting edge : the same ideas in more depth, with current techniques, but without 100% confidence (still active research)
- Lesson 8 (part 2) : object detection (creating much richer CNN structure)
- Differentiable layers : (from Yann Lecun) DL = Differentiable Programming (setting up a differentiable function - the loss function that describes right scores - and using it to update weights in the NN)
What we’ve learned so far :
- Transfer learning : the single most important thing for using DL effectively (starting from “random weights” - from zero - is not)
1.1) replace the last layer
1.2) fine-tune the new layers
1.3) fine-tune more layers (up to the whole model)
- Architecture design : choose to fit the dataset
** CNNs for fixed-size ordered data
** RNNs for sequences
** Softmax for a single categorical outcome (sigmoids if you’ve got multiple outcomes)
** ReLU for inner activations
- Avoid overfitting in 5 steps (but start with a model that overfits, then improve it) :
3.1) more data
3.2) data augmentation
3.3) normalization : generalizable architectures (Batch Norm layers, DenseNets (Densely Connected Convolutional Networks))
3.4) regularization (weight decay, dropout)
3.5) (last resort) reduce architecture complexity (fewer layers, fewer activations)
- Embeddings allow us to use categorical data (widely used since the release of part 1)
From part 1 to part 2 :
- requires significant fastai customization
- need to understand python well
- other code resources will generally be research quality
- code samples online nearly always have problems
- each lesson will be incomplete - ideas to explore
Jupyter notebook
- don’t copy/paste : type the code yourself
- you must practice !
Get a GPU local or use an online GPU
Read academic papers
There are many opportunities for you in this class
- Your homework will be at the cutting edge
- There are few DL practitioners who know what you know now
- Experiment lots, especially in your area of expertise
- Much of what you find will not have been written about before
- Don’t wait to be perfect before you start communicating
- If you don’t have a blog, try medium.com
What we’ll study in part 2
- Generative models
** CNNs beyond classification : localization, enhancement (colorization, super-resolution, artistic style)
** NLP beyond classification (translation, Seq2seq, attention, large vocabularies)
- Large data sets (large images, lots of data points, large outputs)
- but not time-series and tabular data (cf. part 1 and the ML course), as we already covered most of those techniques
Object detection
- We want to classify multiple objects (multi-class) that can overlap : detect bounding boxes and label them
- bounding box : a rectangle that contains an object (almost) entirely, together with its class label
- Step 1 : classify and localize the largest object in each image (classify, localize, classify & localize in a bounding box)
1) Analyse training data
- We use the notebook pascal.ipynb
- Download data (see HowTo)
- Choose a GPU if you have several : `torch.cuda.set_device(x)`
- `PATH.iterdir()` returns a generator (for example, get the list of files in a folder : `list(PATH.iterdir())`)
- How to load a json file (in the video : get its content with `json.load` and `open`, giving a dictionary with `dict_keys`…) : `trn_j = json.load((PATH/'pascal_train2007.json').open())`
- Create a dictionary with the image id as key and the annotations (a list with bounding-box values and category id) as value : `defaultdict`
- MAKE THINGS CONSISTENT : to be numpy compatible, we convert the VOC `[X, Y, Width, Height]` coordinates of the bounding box to `[Y, X, Y+H-1, X+W-1]` (top-left and bottom-right corners).
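A minimal sketch of this conversion; the helper name `hw_bb` follows the lesson notebook:

```python
import numpy as np

def hw_bb(bb):
    # VOC bbox is [x, y, width, height]; convert to [y1, x1, y2, x2]
    # (top-left and bottom-right corners, numpy row/column order)
    return np.array([bb[1], bb[0], bb[3] + bb[1] - 1, bb[2] + bb[0] - 1])

# e.g. a 10x20 box whose top-left corner is at (x=5, y=3):
hw_bb([5, 3, 10, 20])  # -> array([ 3,  5, 22, 14])
```

Subtracting 1 makes the second corner the last pixel *inside* the box rather than one past it.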
How to navigate through the Fastai code by using an editor
- Visual Studio Code : good choice as editor
- Setup : download and install, open the fastai folder, change the interpreter to the python of your fastai environment (`CTRL+SHIFT+P`), open the integrated terminal
- vscode shortcuts
2) Visualize training images
- Some libs take VOC-format bounding boxes, so this lets us convert back when required :
def bb_hw(a): return np.array([a[1],a[0],a[3]-a[1]+1,a[2]-a[0]+1])
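`bb_hw` is the inverse of the numpy-style conversion; a quick round-trip check (`hw_bb` is reproduced here so the snippet is self-contained):

```python
import numpy as np

def hw_bb(bb):  # VOC [x, y, w, h] -> numpy [y1, x1, y2, x2]
    return np.array([bb[1], bb[0], bb[3] + bb[1] - 1, bb[2] + bb[0] - 1])

def bb_hw(a):   # numpy [y1, x1, y2, x2] -> VOC [x, y, w, h]
    return np.array([a[1], a[0], a[3] - a[1] + 1, a[2] - a[0] + 1])

voc = [155, 96, 196, 174]                # [x, y, w, h]
assert (bb_hw(hw_bb(voc)) == voc).all()  # round trip recovers the original
```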
- import an image : use `open_image()`, a fastai function that uses OpenCV, which is 5 to 10 times faster than PyTorch Vision (Pillow/PIL is faster than PyTorch Vision, but not as fast as OpenCV) : `im = open_image(IMG_PATH/im0_d[FILE_NAME])`
- display an image : use matplotlib :
** `plt.subplots()` returns `fig` and `ax`. A really useful wrapper for creating plots, regardless of whether you have more than one subplot.
** Create functions to draw bounding boxes (rectangles) with labels on training images (a white line outlined in black, so it contrasts with any background)
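One possible version of those drawing helpers; the `patheffects` stroke gives the black-outlined white line described above (helper names follow the lesson notebook):

```python
import matplotlib.pyplot as plt
import matplotlib.patheffects as patheffects

def draw_outline(o, lw):
    # black stroke behind the white artist -> visible on any background
    o.set_path_effects([patheffects.Stroke(linewidth=lw, foreground='black'),
                        patheffects.Normal()])

def draw_rect(ax, b):
    # b is a VOC-style box [x, y, width, height]
    patch = ax.add_patch(plt.Rectangle(b[:2], *b[-2:], fill=False,
                                       edgecolor='white', lw=2))
    draw_outline(patch, 4)

def draw_text(ax, xy, txt, sz=14):
    text = ax.text(*xy, txt, verticalalignment='top', color='white',
                   fontsize=sz, weight='bold')
    draw_outline(text, 1)
```

Call `draw_rect(ax, bb_hw(box))` then `draw_text(ax, bb_hw(box)[:2], label)` on an axis showing the image.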
Largest item classifier
- Define a function `get_lrg(b)` that returns the largest bounding box of an image
- Create a DataFrame with columns fn (file names) and category.
- Create a CSV file from this DataFrame.
- Then, you can use the fastai function `ImageClassifierData.from_csv` to get a classifier model !
- In the `tfms` used to create the data object (md), do not crop ! Use `crop_type=CropType.NO`
- Fit the model as usual.
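The DataFrame-to-CSV steps above might look like this; the filenames, labels, CSV path, and column names are illustrative assumptions:

```python
import os
import pandas as pd

# hypothetical mapping: image filename -> label of its largest object
largest = {'000012.jpg': 'car', '000017.jpg': 'person'}

df = pd.DataFrame({'fn': list(largest.keys()),
                   'cat': list(largest.values())}, columns=['fn', 'cat'])

os.makedirs('tmp', exist_ok=True)
df.to_csv('tmp/lrg.csv', index=False)  # index=False: keep only the fn,cat columns
```

A CSV like this is exactly what `ImageClassifierData.from_csv` expects: one column of file names and one of labels.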
Python debugger
- `pdb.set_trace()` to drop into the python debugger (list of commands)
- `%debug` (Jupyter magic) to debug after an exception has been raised
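A tiny illustration of where a breakpoint goes (the function here is a made-up example; the commented-out line is where `set_trace` would pause execution):

```python
import pdb

def running_total(xs):
    total = 0
    for x in xs:
        # pdb.set_trace()  # uncomment to drop into the debugger on each iteration
        total += x
    return total

# inside the debugger: 'n' = next line, 's' = step into, 'c' = continue,
# 'p total' = print a variable, 'l' = list source, 'u'/'d' = move up/down the stack
running_total([1, 2, 3])  # -> 6
```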
Do you know how to create a bounding box model?
- Now we’ll try to find the bounding box of the largest object. This is simply a regression with 4 outputs !
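A minimal PyTorch sketch of such a 4-output regression head; the backbone feature size (512×7×7, typical of a ResNet's last conv block) and the layer names are illustrative assumptions, not the lesson's exact model:

```python
import torch
import torch.nn as nn

# a single linear layer on top of flattened conv features,
# producing 4 numbers: the bounding-box coordinates
head_reg4 = nn.Sequential(
    nn.Flatten(),
    nn.Linear(512 * 7 * 7, 4),
)

feats = torch.randn(2, 512, 7, 7)  # a fake batch of conv activations
bbox = head_reg4(feats)            # would be trained with e.g. L1 loss vs. true boxes
print(bbox.shape)                  # torch.Size([2, 4])
```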