Fast.ai v3 2019 course notes (Chinese edition)

Daniel · March 8, 2019, 8:55pm

Lesson 2: Creating your own dataset

How to use the forum and contribute to fastai

start - 4:26
resources
* How to contribute to fastai - Part 1 (2019)
* Doc Maintenance | fastai
where is the most important information on the forum?
- official updates and resources
- start from here
how not to be intimidated by the overwhelming forum
- click the summary button

How to return to work

- with [kaggle](https://course.fast.ai/update_kaggle.html)
	- click into kernels
- on your local machine
	- `git pull`
	- `conda update conda` outside the conda environment
	- `conda install -c fastai fastai`

What students have done after the first week

4:26-12:56
- use an NN to clean up images downloaded from WhatsApp
- use an NN to beat the state of the art at recognizing background noise
- new state-of-the-art result on the Devanagari handwritten character dataset (DHCD)
- turn point mutations of tumors into images and beat the state of the art
- automatically play science communication games with transfer learning and fastai
- James Dellinger: do useful things without reading math equations (Greek)
- Daniel R. Armstrong: wants to contribute to the library; step by step, you will get there
- project to classify zucchinis (39 images) and cucumbers (47 images)
- use PCA to build a hairlessness classifier for dogs and cats
- classifier for new and old special buses
- models that classify 110 cities from satellite images
- models that classify complete and incomplete construction sites

What is the course structure and teaching philosophy?

12:56 - 16:20
* recursive learning in curriculum
* Perkins’s theory (Chinese version)
* code first
* whole game with videos
* concepts not details
* keep moving forward

How to create your own dataset for a classifier

16:20-23:47
inspired by PyImageSearch, a great resource
project to classify teddy bears, grizzly bears and black bears
search “teddy bear” in Google Images
- open the browser's JavaScript console with ctrl+shift+j or cmd+opt+j
paste the code and save the image URLs into a file in your directory
how to create three sets of folders experimentally
- create variables for a folder and url.txt
- create the folder path
- download the images into the folder
- do it three times for the three kinds of bears
How to check for problematic images with `verify_images`?
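
The folder setup above can be sketched with the standard library alone. This is only an illustration: the helper `make_class_folders`, the class names and the `urls_<class>.txt` file names are assumptions, and the download and verification steps (done with fastai helpers in the lesson) are left out.

```python
import tempfile
from pathlib import Path

def make_class_folders(root, classes):
    """Create one sub-folder per class and return {class: (folder, url_file)}."""
    layout = {}
    for name in classes:
        dest = Path(root) / name
        dest.mkdir(parents=True, exist_ok=True)  # safe to re-run
        # hypothetical naming scheme for the saved URL lists
        layout[name] = (dest, Path(root) / f"urls_{name}.txt")
    return layout

# a temporary directory stands in for the real data path
root = Path(tempfile.mkdtemp()) / "bears"
layout = make_class_folders(root, ["teddy", "grizzly", "black"])
for name, (dest, url_file) in layout.items():
    print(name, dest.is_dir(), url_file.name)
```

Running the same helper once per class mirrors the "do it three times" step without repeating notebook cells by hand.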

How to create a DataBunch from a single folder of images

23:47-25:42
- how to set the training set from the single folder
- how to split off a validation set from the single folder
- why set a random seed before creating the DataBunch?
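
Why the seed matters can be shown without fastai. The sketch below is an assumption-laden stand-in (the function `split_train_valid` and the file names are made up): with a fixed seed the same images land in the validation set on every run, so results from different experiments stay comparable.

```python
import random

def split_train_valid(items, valid_pct=0.2, seed=42):
    """Shuffle with a fixed seed, then hold out valid_pct of the items."""
    rng = random.Random(seed)          # local, seeded generator
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

files = [f"img_{i:03d}.jpg" for i in range(100)]   # hypothetical file names
train_a, valid_a = split_train_valid(files)
train_b, valid_b = split_train_valid(files)
print(len(train_a), len(valid_a))   # sizes of the two sets
print(valid_a == valid_b)           # same seed, same validation set
```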

How to check images, labels, and sizes of the training and validation sets

25:42-26:49
* How to display images from a batch
* How to check labels and classes
* How to count the size of train_ds and valid_ds?

How to train and save the model

26:49-27:41
- how to create a CNN model with ResNet34 and plot error-rate
- how to train the model for 4 epochs
- how to save the trained model

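How to find the optimal learning rate

How to read the optimal learning-rate range from the plot

27:36-29:39

Which kind of downward slope marks a genuinely useful learning-rate range?
- a “bumpy” stretch is not good; a smooth, steep slope is better?
- rely mainly on experiments to build up a feel for it and good intuition
How was (3e-5, 3e-4) chosen?
- once 3e-5 is settled, the other end is usually 1e-4 or 3e-4
- again, this relies on experiments and accumulated experience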

How to interpret the model

29:39-29:57
how to read the confusion matrix and the most-confused pairs?
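
A most-confused list is just the off-diagonal cells of the confusion matrix, sorted by count. Here is a NumPy sketch with made-up counts; the function below is written for illustration and is not fastai's implementation.

```python
import numpy as np

def most_confused(cm, classes, min_val=1):
    """List (actual, predicted, count) for off-diagonal cells, largest first."""
    pairs = []
    for i, actual in enumerate(classes):
        for j, predicted in enumerate(classes):
            if i != j and cm[i, j] >= min_val:
                pairs.append((actual, predicted, int(cm[i, j])))
    return sorted(pairs, key=lambda p: p[2], reverse=True)

classes = ["black", "grizzly", "teddy"]
# rows = actual class, columns = predicted class (made-up counts)
cm = np.array([[30, 3, 0],
               [1, 28, 0],
               [0, 0, 35]])
result = most_confused(cm, classes)
print(result)
```

The diagonal holds the correct predictions, so only the other cells say which classes the model mixes up.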

Noisy data and model output

29:57-31:31
What does noisy data mean?
- such as mislabelled data
What problems can noisy data cause for a model?
- mislabelled data are unlikely to be predicted correctly with high confidence
- so the items the model gets wrong with high confidence are likely to be mislabelled
Solution approach
- combine a domain expert with machine automation

How to clean up noisy data with a widget

31:31-35:32
How to work with the widget to clean mislabelled data manually?

How to build an ipywidget for your notebook

35:12-37:37
how to read the source code of the widget?
how to build a tool for notebook experimenters?
exciting to create tools for fellow practitioners
encouraged to dig into the ipywidgets docs
not a production web app

What is biased noise?

37:35-38:32
* most of the time, removing mislabelled data improves the model only a little
* this is normal, as the model can handle some level of noise by itself
* what is toxic is biased noise, not randomly noisy data

How to put a model into a production web app

38:32-45:50
* why run production on CPU rather than GPU?
* the time per prediction for a CPU web app vs a GPU server is about 0.2s vs 0.01s
* how to prepare your model for production use?
* it is very easy and free to do, with instructions on the course wiki
* try to turn all your classifiers into web apps

99% of the time, all we need to tune for CV is the learning rate and the number of epochs

46:05-53:09
experiment: what happens when lr is very high
- no way to undo it; the model has to be recreated
experiment: what happens when lr is too low
- the loss goes down very slowly
validation loss is lower than training loss
- lr is too low
- too few epochs
too many epochs
- overfitting: the model learns the specific images of teddy bears
- signal: the loss goes down but then goes up again
- but it is difficult to make our model overfit

What is the math behind an image and its classification?

53:09-62:15
what is behind the `learn.predict` source code?
what does `np.argmax` do?
what is the `error_rate` source code?
what is behind the `accuracy` function?
which dataset does a metric apply to?
`doc` is more than a nicely printed `?`, because it may include examples
why is the 3 in 3e-5 used so often?
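
The relationship between `np.argmax`, accuracy and error rate can be sketched in NumPy. The two functions below are simplified stand-ins written for this note, not fastai's source, and the probabilities are made up.

```python
import numpy as np

def accuracy(probs, targets):
    """Fraction of rows whose highest-scoring class equals the target."""
    preds = np.argmax(probs, axis=1)   # index of the largest value per row
    return float(np.mean(preds == targets))

def error_rate(probs, targets):
    """Error rate is simply one minus accuracy."""
    return 1.0 - accuracy(probs, targets)

# made-up class probabilities for 4 images and 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
targets = np.array([0, 1, 2, 1])
print(np.argmax(probs, axis=1))      # predicted class index per image
print(accuracy(probs, targets), error_rate(probs, targets))
```

Three of the four predictions match their targets here, so accuracy is 0.75 and the error rate is 0.25.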

What is a linear function, and how does matrix multiplication fit in?

62:15-68:23
* Khan Academy for basic and advanced math
* replace b with a_2*x_2 (where x_2 is 1)
* there are lots of examples (x1, y1), (x2, y2), …
* Rachel’s best linear algebra course
* vectorization, dot product, matrix product to avoid loops and speed things up
* matrix multiplication visualized
Q&A on data size, unbalanced data, model architecture and weights

68:32-74:14
How do we know we don’t have enough data?
* the lr is already good; it can’t go a little higher or lower
* training for a few more epochs makes the validation loss worse
* then we may need to get more data
* most of the time you need less data than you think
How do you deal with unbalanced data?
- do nothing, it always works
What is ResNet34 as a function?
- a function framework without numbers or weights
- the pretrained model comes with weights

How to create the simplest NN (tensor, rank)

74:14-101:10
what is the simplest architecture?
what is SGD?
how to generate some data for a simple linear function?
how to use matrix product @ to create the linear architecture?
what is a tensor?
- array
what is a rank?
- rank 1 tensor is a vector
how to create the X features?
how to create the coefficients or the weights?
how to plot the x and y (ignoring x_2 as it is just 1)?
what about matplotlib?
how to create an MSE function?
how to do a scatter plot?
how to do gradient descent?
how to calculate the derivative with PyTorch?
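
The steps above can be put together in a short sketch. Note the substitution: the lesson computes the derivative with PyTorch, while this version uses NumPy and a gradient derived by hand so it runs without extra dependencies. The data, starting weights and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.ones((n, 2))
X[:, 0] = rng.uniform(-1.0, 1.0, n)         # x_1 is random, x_2 stays 1
true_a = np.array([3.0, 2.0])
y = X @ true_a + rng.normal(0.0, 0.1, n)    # noisy targets

def mse(y_hat, y):
    """Mean squared error between predictions and targets."""
    return float(np.mean((y_hat - y) ** 2))

a = np.array([-1.0, 1.0])                   # starting guess for the weights
lr = 0.1
losses = []
for _ in range(200):
    y_hat = X @ a                           # linear architecture
    losses.append(mse(y_hat, y))
    grad = 2.0 * X.T @ (y_hat - y) / n      # d(MSE)/da, derived by hand
    a = a - lr * grad                       # gradient descent step

print(a, losses[0], losses[-1])
```

The weights start far from (3, 2) and move toward them as the loss falls.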

Why do we need a learning rate at all?

101:10-105:47
- the derivative tells us the direction and how much
- but a step of that size may not be the best way to reduce the loss
- we need a learning rate to bring the loss down appropriately
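
A toy illustration of the point, using f(x) = x**2 as a stand-in loss (an assumption for this note, not the lesson's example): the derivative gives the direction, and the learning rate decides whether the steps crawl, converge or overshoot.

```python
def descend(lr, steps=20, x=1.0):
    """Gradient descent on f(x) = x**2, whose derivative is 2*x."""
    for _ in range(steps):
        x = x - lr * 2 * x
    return x

# too low: slow progress; moderate: close to the minimum at 0; too high: diverges
for lr in (0.01, 0.1, 1.1):
    print(lr, descend(lr))
```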

How to animate the graphs

106:21-108:09

Why do mini-batches make training more efficient?

108:09-109:49
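
A minimal sketch of mini-batch updates, assuming the same kind of toy linear data as in the gradient descent part of the lesson (sizes, seed and learning rate are illustrative). Each update looks at only 32 rows, so the weights are adjusted many times per pass over the data instead of once.

```python
import numpy as np

rng = np.random.default_rng(0)
n, batch_size = 1000, 32
X = np.ones((n, 2))
X[:, 0] = rng.uniform(-1.0, 1.0, n)
y = X @ np.array([3.0, 2.0]) + rng.normal(0.0, 0.1, n)

a = np.array([-1.0, 1.0])
lr = 0.1
n_updates = 0
for epoch in range(5):
    order = rng.permutation(n)                   # reshuffle every epoch
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]    # one mini-batch of rows
        xb, yb = X[idx], y[idx]
        grad = 2.0 * xb.T @ (xb @ a - yb) / len(idx)
        a = a - lr * grad                        # update after each batch
        n_updates += 1

print(a, n_updates)
```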

What new vocabulary was learnt?

109:49-111:43
Learning rate
epoch: too many epochs, easily overfit
mini batch: more efficient than full batch training
SGD : GD with mini-batch
Model/Architecture: y = x@a, Resnet34, matrix product
parameters: weights
loss function

Summary

111:43-114:43
DL as function approximation
You are a math person

What are overfitting, regularization and the validation set?

114:43-end
- what is the training dataset on the graph?
- which model/graph is underfitting the training set?
	- doing badly, with a worse loss
- which model/graph is overfitting the training set?
	- doing well, with a low loss
- both are different from the right model
	- both have a bad loss on a new/validation dataset
- false assumption
	- more parameters -> overfitting
	- fewer parameters -> underfitting
- truth
	- overfitting and underfitting have nothing to do with the number of parameters
- boss and org
	- the training set can tell an underfitting model from overfitting and OK models
	- the validation set can tell an overfitting model from an OK model
	- use a validation set to avoid being sold snake oil
- further study
	- Rachel’s blog post
	- Rachel’s courses
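
A toy sketch of how the two datasets are used, assuming noisy quadratic data and three curve fits chosen for illustration (a straight line, a quadratic and a degree-10 polynomial). It only shows the mechanics of comparing training and validation loss; it makes no claim about parameter counts in neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Sample a quadratic trend plus noise (made-up data)."""
    x = rng.uniform(-1.0, 1.0, n)
    y = x ** 2 + rng.normal(0.0, 0.1, n)
    return x, y

x_train, y_train = make_data(15)
x_valid, y_valid = make_data(200)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 2, 10):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares fit on training data only
    results[degree] = (mse(coeffs, x_train, y_train),
                       mse(coeffs, x_valid, y_valid))
    print(degree, results[degree])   # (training loss, validation loss)
```

The straight line has a poor loss on both sets. The degree-10 fit gets the lowest training loss, which is why the validation column is the one to compare before trusting a model.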