This thread allows the participants of the “Deep Learning Study Group of Brasília (DLB)” to study lesson 8 (part 2) of the fast.ai course collectively (in face-to-face and online meetings), but of course, it is open to all. All the course videos are online at http://course.fast.ai/part2.html.
The language of this thread is mainly Portuguese (and English when it avoids useless translations).
Please use this thread to ask and answer questions about lesson 8 (part 2), but before posting, please read: Part 2 Lesson 8 wiki
(from @jeremy : Part 2 Lesson 8 wiki)
Lesson resources
Tips
Returning to AWS?
- Log in to AWS and get your public IP (xx.xx.xx.xx). Then run the following commands:
```bash
ssh ubuntu@xx.xx.xx.xx -L8888:localhost:8888
git pull
conda env update
jupyter notebook
```
- For PyCharm and Mac users: equivalents of the shortcuts Jeremy showed for Visual Studio Code (action, then the PyCharm Mac shortcut):
- Command palette: Shift + Command + A
- Select interpreter (for the fastai env): Shift + Command + A, then look for “interpreter”
- Select terminal shell: Shift + Command + A, then look for “terminal shell”
- Go to symbol: Option + Command + O
- Find references: Command + G (go down the references), Command + Shift + G (go up), Command + Function + F7 (look for all)
- Go to definition: Command + Down Arrow Key
- Go back: Command + [
- View documentation: Option + Space (view source), Function + F1 (view documentation)
- Zen mode: Control + Command + F (same shortcut to get out)
- Hide sidebar: Command + 1 (pressing it again brings it back)
- You can find them all with the command palette (Shift + Command + A).
Timeline for videos
Resources related to things Jeremy mentioned we can learn / Homework
Others
Other references
Reminder
- Part 1 covered the best practical techniques of DL (differentiable layers, transfer learning, architecture design, handling over-fitting, embeddings), grouped into concepts (computer vision, structured data, natural language)
- Part 2 is cutting edge: it goes into more detail on current techniques, but without 100% confidence (they are still under research)
- Lesson 8 (part 2): object detection (creating a much richer CNN structure)
- Differentiable layers: (from Yann LeCun) DL = Differentiable Programming (setting up a differentiable function, the loss function that describes the right scores, and using it to update the weights of the NN); see the sketch below
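To make the idea concrete, here is a minimal “differentiable programming” sketch in plain PyTorch (an illustration only, not code from the lesson): a differentiable loss is defined over a weight, and its gradient is used to update that weight.

```python
import torch

# Toy data: learn y = 2x with a single weight.
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([2.0, 4.0, 6.0])

w = torch.randn(1, requires_grad=True)   # the weight we will learn

for step in range(100):
    y_hat = w * x                        # differentiable forward pass
    loss = ((y_hat - y) ** 2).mean()     # differentiable loss: how wrong the scores are
    loss.backward()                      # autograd computes d(loss)/d(w)
    with torch.no_grad():
        w -= 0.05 * w.grad               # gradient step updates the weight
        w.grad.zero_()

print(w.item())  # ends up close to 2.0
```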
What we’ve learned so far:
1) Transfer learning: the single most important thing for using DL effectively (starting from “random weights”, i.e. from zero, is rarely what matters most)
1.1) replace the last layer
1.2) fine-tune the new layers
1.3) fine-tune more layers (up to the whole model)
2) Architecture design: choose one adapted to the dataset
** CNNs for fixed-size ordered data
** RNNs for sequences
** softmax for a single categorical outcome (sigmoids if you’ve got multiple outcomes)
** ReLU for inner activations
3) Avoid overfitting in 5 steps (but start with a model that overfits, then improve it)
3.1) more data
3.2) data augmentation
3.3) normalization: generalizable architectures (Batch Norm layers, DenseNets (Densely Connected Convolutional Networks))
3.4) regularization (weight decay, dropout)
3.5) (last resort) reduce architecture complexity (fewer layers, fewer activations)
4) Embeddings allow us to use categorical data (they have been widely used since the release of part 1); see the sketch below
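As a reminder of how embeddings handle categorical data, a minimal PyTorch sketch (an illustration only; the sizes are arbitrary):

```python
import torch
import torch.nn as nn

# 10 possible categories, each mapped to a learnable 4-dimensional vector.
emb = nn.Embedding(num_embeddings=10, embedding_dim=4)

# A batch of category ids becomes a batch of dense vectors that the
# network trains by backpropagation, like any other weights.
ids = torch.tensor([0, 3, 7])
vectors = emb(ids)
print(vectors.shape)  # torch.Size([3, 4])
```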
From part 1 to part 2:
- requires significant fastai customization
- you need to understand Python well
- other code resources will generally be of research quality
- code samples online nearly always have problems
- each lesson will be incomplete: ideas to explore
Jupyter notebook
- don’t copy/paste: try it yourself by typing the code
- you must practice!
Get a local GPU or use an online GPU
Read academic papers
There are many opportunities for you in this class
- Your homework will be at the cutting edge
- There are few DL practitioners who know what you know now
- Experiment a lot, especially in your area of expertise
- Much of what you find will not have been written about before
- Don’t wait to be perfect before you start communicating
- If you don’t have a blog, try medium.com
What we’ll study in part 2
- Generative models
** CNNs beyond classification: localization, enhancement (colorization, super-resolution, artistic style)
** NLP beyond classification (translation, seq2seq, attention, large vocabularies)
- Large datasets (large images, lots of data points, large outputs)
- but not time series and tabular data (cf. part 1 and the ML course), as we have already seen almost all of those techniques
Object detection
- We want to classify multiple objects (multi-class) that can overlap (detect bounding boxes and label them)
- bounding box: a rectangle that contains an object almost entirely, together with its label
- Step 1: classify and localize the largest object in each image (classify, localize, then classify & localize in a bounding box)
1) Analyse training data
- We use the notebook pascal.ipynb
- Download the data (see HowTo)
- Choose the GPU if you have several: torch.cuda.set_device(x)
- PATH.iterdir(): a generator (for example, you can get the list of files in a folder with list(PATH.iterdir()))
- How to deal with a JSON file (in the video: get the content with json.load and open, and get a dictionary with dict_keys…): trn_j = json.load((PATH/'pascal_train2007.json').open())
- Create a dictionary with the image id as key and the annotations (a list of bounding-box values and category ids) as value: defaultdict; see the sketch after this list
- MAKE THINGS CONSISTENT: to be numpy compatible, we change the [X, Y, Width, Height] coordinates of the bounding box to [Y, X, Y+H-1, X+W-1] (top-left / bottom-right corners)
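A minimal sketch of these last two steps, loosely following the lesson notebook (the JSON key names and the data path are assumptions based on pascal.ipynb conventions):

```python
import json
import collections
import numpy as np
from pathlib import Path

PATH = Path('data/pascal')                                # assumed data location
IMG_ID, BBOX, CAT_ID = 'image_id', 'bbox', 'category_id'  # COCO-style JSON keys

def hw_bb(bb):
    # VOC [x, y, width, height] -> numpy-friendly [y0, x0, y1, x1] corners
    return np.array([bb[1], bb[0], bb[3] + bb[1] - 1, bb[2] + bb[0] - 1])

trn_j = json.load((PATH / 'pascal_train2007.json').open())

# Map each image id to the list of its (bounding box, category id) annotations.
trn_anno = collections.defaultdict(list)
for o in trn_j['annotations']:
    if not o.get('ignore', 0):
        trn_anno[o[IMG_ID]].append((hw_bb(o[BBOX]), o[CAT_ID]))
```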
How to navigate through the fastai code using an editor
2) Visualize training images
- Some libs take VOC-format bounding boxes, so this lets us convert back when required:
def bb_hw(a): return np.array([a[1],a[0],a[3]-a[1]+1,a[2]-a[0]+1])
- import an image: use open_image(), a fastai function that uses OpenCV, which is 5 to 10 times faster than PyTorch Vision (Pillow/PIL is faster than PyTorch Vision, but not as fast as OpenCV): im = open_image(IMG_PATH/im0_d[FILE_NAME])
- display an image: use matplotlib:
** plt.subplots() returns fig and ax. A really useful wrapper for creating plots, regardless of whether you have more than one subplot.
** Create functions to draw bounding boxes (rectangles) with labels on the training images (a white line with a black outline, so it contrasts on any background); see the sketch below
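A rough sketch of those drawing helpers, modeled on the lesson’s matplotlib path-effects trick (it assumes im and a box bb in [y0, x0, y1, x1] format from the previous steps; the helper names are illustrative):

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import matplotlib.patheffects as patheffects

def draw_outline(o, lw):
    # Add a black stroke behind the artist so white lines/text stay visible.
    o.set_path_effects([patheffects.Stroke(linewidth=lw, foreground='black'),
                        patheffects.Normal()])

def draw_rect(ax, b):
    # b is [x, y, w, h], i.e. the bb_hw format above.
    patch = ax.add_patch(patches.Rectangle(b[:2], *b[-2:], fill=False,
                                           edgecolor='white', lw=2))
    draw_outline(patch, 4)

def draw_text(ax, xy, txt, sz=14):
    text = ax.text(*xy, txt, verticalalignment='top',
                   color='white', fontsize=sz, weight='bold')
    draw_outline(text, 1)

fig, ax = plt.subplots(figsize=(6, 6))
ax.imshow(im)                    # im loaded with open_image() above
b = bb_hw(bb)                    # convert [y0, x0, y1, x1] -> [x, y, w, h]
draw_rect(ax, b)
draw_text(ax, b[:2], 'label')
plt.show()
```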
Largest item classifier
- Define a function get_lrg(b) that picks the largest bounding box
- Create a DataFrame with the columns fn (file names) and category.
- Create a CSV file from this DataFrame.
- Then, you can use the fastai function ImageClassifierData.from_csv to get a classifier model!
- In the tfms used to create the data object (md), do not crop! Use crop_type=CropType.NO
- Fit the model as usual (see the sketch after this list).
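Putting those steps together, a rough sketch in the fastai 0.7 API (the CSV path, JPEGS folder, model, and hyperparameters are assumptions in the spirit of the notebook, not prescriptions):

```python
from fastai.conv_learner import *   # fastai 0.7 star import: resnet34, tfms, ConvLearner, ...

JPEGS = 'VOCdevkit/VOC2007/JPEGImages'   # assumed image folder under PATH
CSV = PATH / 'tmp/lrg.csv'               # the CSV written from the DataFrame above
f_model = resnet34
sz, bs = 224, 64

# No cropping: the largest object must stay entirely visible in the image.
tfms = tfms_from_model(f_model, sz, crop_type=CropType.NO)
md = ImageClassifierData.from_csv(PATH, JPEGS, CSV, tfms=tfms, bs=bs)

learn = ConvLearner.pretrained(f_model, md, metrics=[accuracy])
learn.fit(1e-2, 1)   # fit as usual
```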
Python debugger
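The lesson shows Python’s standard debugger, pdb, used directly inside the notebook; a minimal illustration (the breakpoint location is just an example):

```python
import pdb

def bb_area(bb):
    pdb.set_trace()   # execution stops here: inspect variables, then continue
    return (bb[2] - bb[0] + 1) * (bb[3] - bb[1] + 1)
```

Useful pdb commands once stopped: n (next line), s (step into), c (continue), u (up the call stack), p x (print x), l (list the source around the current line).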
Do you know how to create a bounding box model?
- Now we’ll try to find the bounding box of the largest object. This is simply a regression with 4 outputs! (see the sketch below)
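A rough sketch of that bounding-box regressor in the fastai 0.7 API, continuing the imports from the sketch above (the custom head and sizes are assumptions: 25088 = 512*7*7 assumes a resnet34 backbone with 224x224 inputs):

```python
# BB_CSV would hold the 4 box coordinates as labels; continuous=True makes
# fastai treat them as regression targets instead of categories.
BB_CSV = PATH / 'tmp/bb.csv'   # hypothetical CSV of bounding-box coordinates
tfms = tfms_from_model(f_model, sz, crop_type=CropType.NO)
md = ImageClassifierData.from_csv(PATH, JPEGS, BB_CSV, tfms=tfms, bs=bs, continuous=True)

# Custom head: flatten the conv features and regress straight to 4 numbers.
head_reg4 = nn.Sequential(Flatten(), nn.Linear(25088, 4))
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4)
learn.opt_fn = optim.Adam
learn.crit = nn.L1Loss()       # L1 distance between predicted and true box
learn.fit(2e-3, 2)
```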