Deep Learning Brasília - Review (lessons 1, 2, 3 and 4)


(Pierre Guillou) #1

<<< Post: Lesson 4 | Post: Lesson 5 >>>

Review

Review of the content covered so far, both theoretical and practical, with presentations by @pierreguillou, @saulberardo and @thgomes.

Deep Learning toolbox

Exercise 1

  • [UPDATE 02/06/18] Use this Jupyter notebook: https://github.com/piegu/fastai-projects/blob/master/lesson1-quick.ipynb
  • Goal: write the code without copy/paste
  • How to get it? In a terminal on your GPU machine (local or online):
    wget --no-check-certificate https://raw.githubusercontent.com/piegu/fastai-projects/master/lesson1-quick.ipynb
    or
    git clone https://github.com/piegu/fastai-projects.git

Exercise 2

  • [UPDATE 21/04/18] Use this Jupyter notebook: https://github.com/piegu/fastai-projects/blob/master/lesson1-DogBreed.ipynb
  • Goal: understand the code
  • How to get it? In a terminal on your GPU machine (local or online):
    wget --no-check-certificate https://raw.githubusercontent.com/piegu/fastai-projects/master/lesson1-DogBreed.ipynb
    or
    git clone https://github.com/piegu/fastai-projects.git

Key points

Source: Wiki thread: Intro workshop

  1. Terminal (it is an interface in which you can type and execute text based commands)
  2. GPU
  3. Git
  4. Python
  5. Python environment
  6. Jupyter notebook (1h45m54s: video by @jeremy on how to use a Jupyter Notebook + Jupyter Notebook Commands & Shortcuts)
  7. Our group
  8. Deep Learning (What you need to do deep learning)
  9. Pytorch (A practitioner’s guide to PyTorch)
  10. Kaggle (kaggle-cli script to download/upload images)

1) Terminal

  • Try an online terminal
  • Ubuntu (Linux terminal on Windows 10)
  • Cygwin (an (almost) Linux terminal on Windows versions other than 10)
  • iTerm2 (Mac): a better terminal than the one included with the Mac
  • Homebrew (brew in a Mac terminal = apt in an Ubuntu terminal)
  • Tmux (multiple terminal windows: A Quick and Easy Guide to tmux)
  • Editors in a Linux terminal
    – Vim (https://benmccormick.org/2014/07/14/learning-vim-in-2014-configuring-vim/)
    – Nano (https://www.hostinger.com.br/tutoriais/como-instalar-editor-de-texto-nano/)
    – Cat (https://www.computerhope.com/unix/ucat.htm)
  • 50 Most Frequently Used UNIX / Linux Commands (With Examples)
    %%bash (at the top of a Jupyter notebook cell, followed by a list of commands: runs all the listed bash commands as in a terminal, but the notebook's current folder does not change after these commands have run)
    ls (what is inside a folder)
    ls -l (what is inside a folder, with details about file sizes, permissions and symlinks)
    ls -alt (same details, including hidden files, sorted by date)
    cd (go to)
    pwd (where I am)
    history (list of all my commands)
    – alt enter (full screen)
    – up/down arrows to get commands already typed
    wget (download a file)
    mkdir (create a folder)
    mv (move or rename a file or folder)
    find path -name 'file_name' (find a file by name in all subfolders of a path)
    grep -r 'text' path (find a text in all files of all subfolders of a path)
    bash (run a bash script *.sh)
    source .bashrc (reload your shell configuration from the home folder)
    cd ~ (go to your home folder in Linux)
    cd /mnt/ (in Ubuntu on Windows, go to the folder where the Windows drives are mounted, e.g. /mnt/c)
    which python (get the path to python in use)
    source activate xxx (activate a virtual environment)
    jupyter notebook (launch a jupyter notebook)
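
Since the course is in Python, it can help to see how two of the commands above (find and grep -r) look when written with the standard library. This is only a minimal sketch: the helper names find_by_name and grep_recursive and the file names are made up for the illustration, and the real commands have many more options.

```python
from pathlib import Path
import tempfile


def find_by_name(root, file_name):
    """Rough equivalent of: find root -name 'file_name'."""
    return sorted(Path(root).rglob(file_name))


def grep_recursive(root, text):
    """Rough equivalent of: grep -r 'text' root (returns the matching files only)."""
    matches = []
    for p in Path(root).rglob("*"):
        if p.is_file():
            try:
                if text in p.read_text(errors="ignore"):
                    matches.append(p)
            except OSError:
                continue  # unreadable file: skip it
    return sorted(matches)


# Small demonstration in a throwaway folder (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data" / "train").mkdir(parents=True)  # like: mkdir -p data/train
    (root / "data" / "train" / "notes.txt").write_text("learning rate = 0.01")
    (root / "readme.txt").write_text("hello")
    found = [p.name for p in find_by_name(root, "notes.txt")]
    grepped = [p.name for p in grep_recursive(root, "learning rate")]
    print(found, grepped)
```

In a notebook, the terminal commands themselves remain the quickest option; the Python version is useful when the result has to be reused in code.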

2) GPU (Graphics Processing Unit)

Local GPU

Online GPU

  • NVIDIA GPUs use CUDA, NVIDIA's GPU programming platform, which most DL libraries rely on (TensorFlow, PyTorch…).
  • Installing PyTorch automatically installs the CUDA libraries it needs.
  • Crestle
  • Paperspace ($15 credit)
  • Clouderizer + Google Colab (FREE!)
  • Google Cloud Platform ($300 credit)
  • Amazon Web Services (AWS)
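
Whichever machine you choose, a quick way to check that PyTorch actually sees a CUDA GPU is sketched below. The function name cuda_status is made up for the illustration, and the sketch assumes nothing about your setup: if PyTorch is not installed, it simply says so.

```python
def cuda_status():
    """Return a short description of GPU availability as seen by PyTorch."""
    try:
        import torch  # assumption: PyTorch may or may not be installed here
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        return "CUDA GPU available: " + torch.cuda.get_device_name(0)
    return "PyTorch is installed but no CUDA GPU is visible"


status = cuda_status()
print(status)
```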
Paperspace ($15 credit)
  • Paperspace setup
  • Use the fastai template, choose P4000 + Public IP + promo code FASTAI15 or FASTAI3BDG ($15 credit) + Auto-shutdown (1h)
  • Launch, then log in with the default password sent in the Paperspace email
  • Change the password in a terminal with the command: passwd
  • Update conda with the command: conda update --all
  • Update the fastai files with the 2 commands: cd fastai, then git pull
  • Update the fastai library with the command (in the fastai folder): conda env update
  • Don’t forget to switch off your Paperspace machine with the command: sudo shutdown -h now

3) Git

  • Git is a version control system: it lets you work on a project while saving checkpoints (commits) that you can go back to.
  • GitHub is a hosting service for Git repositories that lets you share your project (GitHub guide).
  • Main commands:
    git clone (ex: git clone https://github.com/fastai/fastai.git)
    git pull

4) Python

5) Set up a Python environment

  • Anaconda (https://anaconda.org/): a global environment to run Python scripts within Jupyter notebooks
    – Update the conda environment in a terminal: conda update --all
  • Why? To practice Python 3.6, pandas, numpy, Jupyter notebooks… on your own computer.
  • How to create a virtual environment? In a terminal:
    conda create -n envname python=3.6
    source activate envname (to start the virtual environment)
    source deactivate (to stop the virtual environment)
  • Update a virtual environment:
    – Go to the folder that contains the environment.yml file (cd ...)
    – Type in a terminal: conda env update
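
To confirm from inside Python (or a notebook cell) which environment is active, a small check like the one below can be used. It is a sketch with a made-up function name, describe_environment, and it plays the same role as which python in the terminal.

```python
import sys


def describe_environment():
    """Report which interpreter is running, similar to `which python`."""
    return {
        "executable": sys.executable,               # path of the Python in use
        "version": "%d.%d" % sys.version_info[:2],  # for example "3.6"
        "prefix": sys.prefix,                       # root folder of the active environment
    }


info = describe_environment()
for key, value in info.items():
    print(key, ":", value)
```

If the printed path does not point inside the environment you activated, the notebook was probably launched from another environment.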

6) Jupyter Notebook

7) Fastai

8) Our group

9) Deep Learning

What is Deep Learning? A kind of Machine Learning.

The Universal Approximation Theorem & Examples using Deep Learning

  1. An infinitely flexible function that can solve any problem. This is the Neural Network: a stack of linear layers and non-linear layers that can approximate any continuous function as closely as we want, provided it has enough parameters (Universal Approximation Theorem) and we have enough data to train it. In practice we use a network with multiple hidden layers, because depth gives better accuracy.
  2. A way to fit the parameters of this general function for each use (all-purpose parameter fitting). This is Gradient Descent: the method that learns the parameters of our DL model. After each mini-batch, we use the gradient of the loss function to improve the parameter values, searching for the global minimum of the loss even if in practice we often end in a local minimum.
  3. Fast and scalable computation. This is the GPU: for this kind of computation, GPUs are roughly 10 times faster than CPUs, and cheaper. GPUs are necessary to train our DL models in a reasonable time.
    – DL uses: automatically generated email replies, Skype Translator, Semantic Style Transfer, cancer detection in medical images.
    – There are a lot of opportunities to solve problems with DL.
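
To make point 1 concrete, here is a minimal sketch of a network made of a linear layer, a ReLU and a second linear layer. The weights are chosen by hand for the illustration (nothing is learned here), and the function it represents, the absolute value, is one that no single linear layer can produce.

```python
def relu(x):
    """Non-linear layer: keep positive values, replace negative ones by 0."""
    return max(0.0, x)


def tiny_network(x):
    """Linear layer (2 units) -> ReLU -> linear layer, with hand-picked weights."""
    hidden = [relu(1.0 * x), relu(-1.0 * x)]  # first linear layer followed by ReLU
    return 1.0 * hidden[0] + 1.0 * hidden[1]  # second linear layer


outputs = [tiny_network(x) for x in (-2.0, -0.5, 0.0, 3.0)]
print(outputs)  # [2.0, 0.5, 0.0, 3.0]
```

With more hidden units, the same construction can follow more complicated curves; in a real network the weights are found by gradient descent instead of being written by hand.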

More examples using Deep Learning & CNN

Non linearity, Gradient Descent & Learning rate

  • Link : https://www.youtube.com/embed/IPBSB1HLNLo?hl=en_US&cc_lang_pref=en_US&cc_load_policy=1&autoplay=1&start=3733&end=4102
  • Duration : 6 mn 9s (01:02:13 to 01:08:22)
  • Topics :
    – Adding a non-linear layer to our model, sigmoid or ReLU (rectified linear unit): the key point is that it allows us to create any kind of function and so to solve any kind of problem.
    – SGD (Stochastic Gradient Descent): to find the minimum of a loss function, we move from a point on the curve in the direction opposite to the gradient (derivative).
    – But we need to take a small step (though not too small…): the size of this step is set by the learning rate.
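
The effect of the learning rate can be seen on a toy example. The sketch below uses plain gradient descent (no mini-batches, so it is not stochastic) on a loss function invented for the illustration, (w - 3)², whose minimum is at w = 3.

```python
def loss(w):
    """Toy loss function with its minimum at w = 3 (chosen for illustration)."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(learning_rate, steps=50, w=0.0):
    """Repeatedly step in the direction opposite to the gradient."""
    for _ in range(steps):
        w = w - learning_rate * gradient(w)
    return w


w_small = gradient_descent(0.001)  # too small: barely moves in 50 steps
w_good = gradient_descent(0.1)     # reasonable: ends very close to 3
w_large = gradient_descent(1.1)    # too large: overshoots more at every step
print(w_small, w_good, w_large)
```

The three runs differ only by the learning rate: the first is still far from the minimum, the second reaches it, and the third diverges.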

What “sees” a CNN & Learning rate finder

  • Link : https://www.youtube.com/embed/IPBSB1HLNLo?hl=en_US&cc_lang_pref=en_US&cc_load_policy=1&autoplay=1&start=4103&end=4899
  • Duration : 13 mn 16s (01:08:23 to 01:21:39)
  • Topics :
    – A paper on “Visualizing and Understanding Convolutional Networks”: on the left, there are 9 kernels (filters) applied to the input layer that result in 9 feature maps in the first hidden layer (layer 1). Each filter looks for a specific basic shape (an edge), and on the right, there are the 9 top activations for each filter (within the image set used).
    — Key point: the filters are learned (the 9 numbers of its matrix), not programmed.
    — Layer 2 is the result of applying a new series of filters (16) to the 9 feature maps of layer 1. Again, we show the 9 top activations for each filter. For example, we can see that some filters are now activated by photos of sunsets, horizontal lines or corners: layer 2 detects more complicated things than layer 1.
    — Layer 3: its filters recognize text or human faces (only 3 layers to perform that detection!).
    — Layer 5: its filters recognize animals…
    – Implementation in ‘lesson1.ipynb’
    – ‘Cyclical Learning Rates for Training Neural Networks’ is implemented in the Fastai library as “lr_find”, the learning rate finder: it gives us, semi-automatically, the right learning rate to train our neural network :slight_smile:
    – Why does it start training a model but stop before 100%? This is the learning rate finder (Learner Schedule Finder) at work.
    – How many epochs to run? (epoch: one pass through the entire dataset, batch by batch). As many as you want, until the accuracy starts to get worse.
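
To see what "applying a filter" means, here is a minimal sketch in pure Python. The filter values and the toy image are written by hand for the illustration (in a CNN the 9 numbers are learned), and the operation is the sliding multiply-and-sum that DL libraries call a convolution.

```python
def convolve2d(image, kernel):
    """Slide a 3x3 kernel over the image (no padding, stride 1)."""
    height, width = len(image), len(image[0])
    output = []
    for i in range(height - 2):
        row = []
        for j in range(width - 2):
            total = 0
            for di in range(3):
                for dj in range(3):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hand-written filter that responds to vertical edges (dark left, bright right).
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

# 4x5 toy image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 0, 1, 1],
         [0, 0, 0, 1, 1],
         [0, 0, 0, 1, 1],
         [0, 0, 0, 1, 1]]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 3, 3], [0, 3, 3]]
```

The feature map is 0 where the image is uniform and 3 where the window covers the dark-to-bright edge: this is the kind of activation shown for layer 1.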

(Pierre Guillou) #2

Everyone,
do you want to understand how a CNN works?

3 articles to study:


(Pierre Guillou) #3

My post about Resnet and the reason why it works:
Understand how works Resnet… without talking about residual


(Pierre Guillou) #4

Video by Yann LeCun on Convolutional Neural Networks: https://www.youtube.com/watch?v=xgqm6TDhjrQ


(Pierre Guillou) #5


(Pierre Guillou) #6

On Saturday, I talked about the problem of getting predictions on the test dataset when we forgot to put the name of its folder in the data object (and therefore before training the learn object).

I just found a post giving a solution that I published last year :slight_smile:


(Pierre Guillou) #7

Video by Otavio Good: A visual and intuitive understanding of deep learning


(Pierre Guillou) #8

An exceptional thread to understand why we can change the image size with the Resnet network (but not with the VGG network!): Changing Image Size during training!


(Pierre Guillou) #9

Fastai part 1: 30+ Best Practices :slight_smile:


(Pierre Guillou) #10

Good evening, today was truly a hands-on class!

With Fernando's dataset and Welton's script to automatically create training and validation sets based on a percentage, we managed to build a neural network that recognizes senators :slight_smile:
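
For readers who were not in the class, the idea of splitting a dataset by percentage can be sketched as follows. This is not Welton's script, only an illustration of the principle: the function name split_train_valid, the 20% value and the file names are assumptions.

```python
import random


def split_train_valid(file_names, valid_pct=0.2, seed=42):
    """Shuffle the file names and hold out valid_pct of them for validation."""
    names = sorted(file_names)  # sort first so the result is reproducible
    random.Random(seed).shuffle(names)
    n_valid = int(len(names) * valid_pct)
    return names[n_valid:], names[:n_valid]


# Hypothetical file names standing in for one class folder of a dataset.
files = ["img_%02d.jpg" % i for i in range(10)]
train_files, valid_files = split_train_valid(files, valid_pct=0.2)
print(len(train_files), len(valid_files))  # 8 2
```

A real script would then move or copy each file into the train and valid folders, one subfolder per class, which is the layout used by the lesson 1 notebook.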

For those who need the lines of code that the group developed today (inside the lesson1.ipynb notebook), and Lucas Bessa in particular, I updated my page https://github.com/piegu/fastai-projects/blob/master/lesson1-quick.ipynb, created in this review class 1 month ago.

  • Updated paragraphs:
    ** “Setup the path to data”
    ** “9) Get the prediction of a specific image”
    ** “Confusion Matrix”
  • Topics of the new lines of code:
    ** “Visualize an image after transformation by tfms_from_model()”
    ** “Display the scatter diagram of the predictions (probabilities) for the tested image”
    ** “Increase the display size of the confusion matrix”
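
As a reminder of what a confusion matrix contains, here is a minimal sketch in pure Python. It is not the code of the notebook (which plots the matrix); the labels and predictions below are invented for the illustration.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are the true classes, columns are the predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical labels for six validation images of two classes.
classes = ["cat", "dog"]
actual = ["cat", "cat", "cat", "dog", "dog", "dog"]
predicted = ["cat", "cat", "dog", "dog", "dog", "cat"]

cm = confusion_matrix(actual, predicted, classes)
print(cm)  # [[2, 1], [1, 2]]
```

The diagonal counts the correct predictions; every other cell shows which class was confused with which.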

(Pierre Guillou) #11

One more way to use an online GPU with Fastai installed: “Running fast.ai notebooks with Amazon SageMaker”. :slight_smile:


(Pierre Guillou) #12

Hi everyone,

my Medium post on “Fastai | How to start ?”. I hope it can help new participants.

Feel free to ask me for more information.


(Pierre Guillou) #13

Step-by-step guide to installing Fastai on your Windows computer using the CPU (not the GPU)

Read the document https://github.com/fastai/fastai/blob/master/README.md but follow the steps below:

  1. Install Anaconda for Windows
  2. Open the “Anaconda Prompt” terminal (installed by Anaconda) and type the following commands
  3. mkdir fastai (to create the fastai folder)
  4. cd fastai (to enter the fastai folder)
  5. git clone https://github.com/fastai/fastai.git (to download the Fastai files, including the notebooks and the file used to install the fastai-cpu virtual environment: pytorch, the numpy, pandas, bcolz libraries, etc.), then cd fastai (to enter the cloned repository, where the following commands must be run)
  6. conda env update -f environment-cpu.yml (IMPORTANT: use the environment-cpu.yml file because you want to use your CPU, not a GPU)
  7. conda activate fastai-cpu (to activate the fastai-cpu virtual environment)
  8. cd courses\ml1 (enter the ml1 folder)
  9. del fastai (delete the fastai symlink that was created to work in a bash environment)
  10. mklink /d fastai ..\..\fastai (create the Windows symlink fastai pointing to the fastai folder that holds the Fastai library files)
  11. cd ..\.. (leave the ml1 folder and go back to the root of the cloned fastai repository)
  12. jupyter notebook (launch Jupyter Notebook, which opens in a web browser)

“Et voilà” :slightly_smiling_face: You now have the Fastai library (and its notebooks) installed on your computer and can run all the notebooks in the ml1 folder.

Note: steps 8 to 10 create a symlink between the ml1 folder and the fastai folder that holds all the Fastai library files. If you want to run the notebooks in the dl1 folder, for example, you also have to create the symlink between dl1 and the fastai folder by following steps 8 to 10:
** 8) cd courses\dl1
** 9) del fastai
** 10) mklink /d fastai ..\..\fastai
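
If you prefer to do steps 8 to 10 for several course folders at once, the same idea can be sketched in Python. This is an untested-on-Windows illustration with a made-up helper name, link_library; it runs here on a throwaway folder structure, and on Windows creating symlinks from Python may require an administrator prompt or Developer Mode.

```python
import os
import tempfile
from pathlib import Path


def link_library(course_dir, library_dir):
    """Replace course_dir/fastai by a directory symlink to library_dir."""
    link = Path(course_dir) / "fastai"
    if link.is_symlink() or link.is_file():
        link.unlink()  # same role as: del fastai
    os.symlink(library_dir, link, target_is_directory=True)  # same role as: mklink /d
    return link


# Demonstration on a temporary copy of the folder layout (not the real repository).
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "fastai").mkdir()  # stands for the library folder
    for course in ("ml1", "dl1"):
        course_dir = repo / "courses" / course
        course_dir.mkdir(parents=True)
        (course_dir / "fastai").write_text("../../fastai")  # placeholder file to replace
        link_library(course_dir, repo / "fastai")
    linked = sorted(c for c in ("ml1", "dl1")
                    if (repo / "courses" / c / "fastai").is_dir())
    print(linked)  # ['dl1', 'ml1']
```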


(Kadu M Pires) #14

Excellent post Pierre, thanks for the help… Although I followed your instructions, I ran into problems with the PyTorch package: I kept getting a Conda HTTPError.

After a lot of research, trial and error, I managed to work around this problem, and I described step by step how to solve it in this post.

I hope it is useful…