Deep Learning Brasília - Review (lessons 1, 2, 3 and 4)


(Pierre Guillou) #1

<<< Post: Lesson 4 | Post: Lesson 5 >>>


Review

A review of the content covered so far, both theoretical and practical, with presentations by @pierreguillou, @saulberardo and @thgomes.

Deep Learning toolbox

Exercise 1

  • [UPDATE 21/04/18] Use this Jupyter notebook: https://github.com/piegu/fastai-projects/blob/master/lesson1-quick.ipynb
  • Goal: write the code without copy/paste
  • How to get it? In a terminal on your GPU machine (local or online):
    wget --no-check-certificate https://raw.githubusercontent.com/piegu/fastai-projects/master/lesson1-quick.ipynb
    or
    git clone https://github.com/piegu/fastai-projects.git

Exercise 2

  • [UPDATE 21/04/18] Use this Jupyter notebook: https://github.com/piegu/fastai-projects/blob/master/lesson1-DogBreed.ipynb
  • Goal: understand the code
  • How to get it? In a terminal on your GPU machine (local or online):
    wget --no-check-certificate https://raw.githubusercontent.com/piegu/fastai-projects/master/lesson1-DogBreed.ipynb
    or
    git clone https://github.com/piegu/fastai-projects.git

Key points

Source: Wiki thread: Intro workshop

  1. Terminal (an interface in which you can type and execute text-based commands)
  2. GPU
  3. Git
  4. Python
  5. Python environment
  6. Jupyter notebook (1h45m54s: video by @jeremy on how to use a Jupyter Notebook + Jupyter Notebook Commands & Shortcuts)
  7. Our group
  8. Deep Learning (What you need to do deep learning)
  9. Pytorch (A practitioner’s guide to PyTorch)
  10. Kaggle (script kaggle-cli to download/upload images)

1) Terminal

2) GPU (Graphic Processing Units)

GPU local

GPU online

  • NVIDIA GPUs use CUDA, a GPU programming platform used by most DL libraries (TensorFlow, PyTorch…).
  • PyTorch installs CUDA automatically.
  • Crestle
  • Paperspace
  • Amazon Web Services (AWS)
  • Clouderizer + Google Colab
Paperspace
  • Paperspace setup
  • Use the fastai template, choose P4000 + Public IP + promo code FASTAI15 or FASTAI3BDG ($15 credit) + Auto-shutdown (1h)
  • Launch + use the default password from the Paperspace email
  • Change the password in a terminal with the command: passwd
  • Update conda with the command: conda update --all
  • Update the fastai files with the 2 commands: cd fastai + git pull
  • Update the fastai library with the command (in the fastai folder): conda env update
  • Don’t forget to switch off your Paperspace machine!!! Use the command: sudo shutdown -h now
Clouderizer + Google Colab

3) Git

  • Git is a version control system that lets you work on a project with checkpoint backups.
  • GitHub is a Git repository hosting service that lets you share your project (GitHub guide).
  • Main commands :
    git clone (ex: git clone https://github.com/fastai/fastai.git)
    git pull

4) Python

5) Setup a python environment

  • Anaconda (https://anaconda.org/): global environment to run Python scripts within Jupyter notebooks
    – Update the conda environment in a terminal: conda update --all
  • Why? To practice Python 3.6, pandas, numpy, Jupyter notebooks… on your computer.
  • How to create a virtual environment? In a terminal:
    conda create -n envname python=3.6
    source activate envname (to start the virtual environment)
    source deactivate (to stop the virtual environment)
  • Update a virtual environment:
    – Go to the folder (cd ...)
    – Type in a terminal: conda env update

6) Jupyter Notebook

7) Fastai

8) O nosso grupo

9) Deep Learning

What is Deep Learning? A kind of Machine Learning.

The Universal Approximation Theorem & Examples using Deep Learning

  1. An infinitely flexible function that can solve any problem. This is the Neural Network: a stack of linear and non-linear layers that can approximate any given function if we have enough data to train it (Universal Approximation Theorem). In practice we use networks with multiple hidden layers, which are needed to get better accuracy.
  2. A way to set the parameters of this general-purpose function for each use. This is Gradient Descent: the learning method for the parameters of our DL algorithm (after each epoch, we apply it to the loss function to improve our parameter values by searching for a minimum of the loss, often a local rather than the global minimum).
  3. Fast and scalable computation. This is the GPU: GPUs are roughly 10 times faster than CPUs, and cheaper. GPUs are necessary to train our DL models in a reasonable time.
    – DL use cases: generating automatic answers to emails, Skype Translator, Semantic Style Transfer, cancer detection in medical images.
    – A lot of opportunities to solve problems with DL.
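
The gradient-descent idea in point 2 can be sketched in a few lines of plain Python (a toy one-parameter example, not the fastai code): we minimize f(x) = (x - 3)², whose minimum is at x = 3, by repeatedly stepping opposite the derivative.

```python
# Toy gradient descent: minimize f(x) = (x - 3)^2, minimum at x = 3.
def f_prime(x):
    return 2 * (x - 3)  # derivative of the loss

def gradient_descent(x0, lr=0.1, epochs=100):
    x = x0
    for _ in range(epochs):
        x -= lr * f_prime(x)  # step in the direction opposite the gradient
    return x

print(gradient_descent(x0=0.0))  # converges very close to 3.0
```

With a well-chosen learning rate the error shrinks by a constant factor each step, which is why a few dozen epochs suffice here.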

More examples using Deep Learning & CNN

Non linearity, Gradient Descent & Learning rate

  • Link: https://www.youtube.com/embed/IPBSB1HLNLo?hl=en_US&cc_lang_pref=en_US&cc_load_policy=1&autoplay=1&start=3733&end=4102
  • Duration: 6 min 9s (01:02:13 to 01:08:22)
  • Topics:
    – Adding a non-linear layer to our model, sigmoid or ReLU (rectified linear unit): the key point is that this allows us to build any kind of function and so to solve any kind of problem.
    – SGD (Stochastic Gradient Descent): to find the minimum of a loss function, we move from a point on the curve in the direction opposite to the gradient (derivative).
    – But we need to take a small step (though not too small…): this is the learning rate.
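
The two points above can be illustrated with a toy sketch in plain Python (not the course code): ReLU as the non-linearity, and gradient-descent steps on f(x) = x² showing that a small learning rate converges toward the minimum while a too-large one overshoots and diverges.

```python
# ReLU non-linearity: zero for negative inputs, identity otherwise.
def relu(x):
    return max(0.0, x)

# n gradient-descent steps on f(x) = x^2 (derivative 2x) with learning rate lr.
def sgd_steps(x, lr, n):
    for _ in range(n):
        x = x - lr * 2 * x  # step opposite the gradient
    return x

small = sgd_steps(1.0, lr=0.1, n=20)  # shrinks toward the minimum at 0
large = sgd_steps(1.0, lr=1.1, n=20)  # overshoots: |x| grows at every step
print(abs(small) < 0.1, abs(large) > 1.0)  # True True
```

Each step multiplies x by (1 - 2·lr), so any lr above 1 makes the factor larger than 1 in magnitude and the iterates blow up.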

What “sees” a CNN & Learning rate finder

  • Link: https://www.youtube.com/embed/IPBSB1HLNLo?hl=en_US&cc_lang_pref=en_US&cc_load_policy=1&autoplay=1&start=4103&end=4899
  • Duration: 13 min 16s (01:08:23 to 01:21:39)
  • Topics:
    – A paper on “Visualizing and Understanding Convolutional Networks”: on the left, 9 kernels (filters) are applied to the input layer, producing 9 feature maps in the first hidden layer (layer 1). Each filter searches for a specific basic shape (an edge), and on the right are the 9 top activations for each filter (within the image set used).
    – Key point: the filters (the 9 numbers of each matrix) are learned, not programmed.
    – Layer 2 is the result of applying a new series of filters (16) to the 9 feature maps of layer 1. Again, we show the 9 top activations for each filter. For example, we can now see activations for photos of sunsets, horizontal lines or corners: layer 2 detects more complex things than layer 1.
    – Layer 3: its filters recognize text or human faces (only 3 layers to perform that detection!).
    – Layer 5: its filters recognize animals…
    – Implementation in ‘lesson1.ipynb’
    – ‘Cyclical Learning Rates for Training Neural Networks’, implemented in the fastai library as “lr_find” (the learning rate finder): it lets you find, semi-automatically, the right learning rate to train your neural network.
    – Why training sometimes starts but stops before 100%: use the Learner Schedule Finder.
    – How many epochs to run? (an epoch is one pass through the entire dataset, batch by batch). As many as you want, until the accuracy gets worse.
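
The idea behind the learning rate finder can be sketched in toy Python (the function `lr_finder` and its parameters are made up for illustration; this is not the fastai implementation): train while increasing the lr exponentially, record the loss at each step, and pick an lr where the loss is still decreasing steeply, before the curve blows up.

```python
# Toy learning-rate-finder sketch on the loss f(x) = x^2.
def lr_finder(grad, x0=1.0, lr_start=1e-4, lr_end=2.0, steps=50):
    x = x0
    lrs, losses = [], []
    for i in range(steps):
        # exponentially increasing learning rate from lr_start to lr_end
        lr = lr_start * (lr_end / lr_start) ** (i / (steps - 1))
        lrs.append(lr)
        losses.append(x * x)   # record the loss before the step
        x = x - lr * grad(x)   # one SGD step at the current lr
    return lrs, losses

lrs, losses = lr_finder(lambda x: 2 * x)  # gradient of f(x) = x^2
# The recorded losses first decrease, then shoot up once the lr gets too large.
```

Plotting `losses` against `lrs` (log scale) gives the characteristic lr_find curve; fastai suggests choosing an lr about an order of magnitude below the minimum of that curve.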

Brasília part 1 group
(Pierre Guillou) #2

Hi everyone,
want to understand how a CNN works?

3 articles to study:


(Pierre Guillou) #3

My post about ResNet and why it works:
Understand how works Resnet… without talking about residual


(Pierre Guillou) #4

Video by Yann LeCun on Convolutional Neural Networks: https://www.youtube.com/watch?v=xgqm6TDhjrQ


(Pierre Guillou) #5


(Pierre Guillou) #6

On Saturday, I talked about the problem of getting predictions on the test dataset when you forget to put the name of its folder in the data object (and thus before training your learn object).

I just found a post giving a solution, which I published last year.