fast.ai news and notes
A collection of noteworthy news and insights from the fast.ai world
Insights from the fastai team
Interviews with DL heroes
Interview with Sylvain by Sanyam Bhutani, thanks to @init_27
The following Q&As are copied from @init_27's post above
How did Sylvain get started with fastai?
I kind of forgot about it (neural net) until October 2017… I was curious to see how the field had progressed — of course, I had heard all the hype around it — so I followed the MOOC version 1…I instantly loved the top-down approach… I have a strong background in Math, but it’s my love for coding practical things that kept me going.
What is it like to work with Jeremy Howard?
We never sleep, but that’s mostly because we both have toddlers!..I’ve improved a lot as a coder and I keep on learning new things from him. Just seeing how he iterates through your code to refactor it in a simpler or more elegant way is always fascinating. And I really love how he is never satisfied with anything short of perfect, always pushing to polish this bit of code or this particular API until it’s as easy to use as possible.
Could you tell us more about your role at fast.ai and how does a day at fast.ai look like?
Since I am based in New York City, we mostly work in parallel. We chat a lot on Skype to coordinate and the rest of the time is spent coding or reviewing code, whether it’s to make the library better or try a new research idea.
As for my role, it’s a mix of reviewing the latest papers to see what we could use, as well as helping Jeremy develop new functionality in the library and prepare the next course.
What more can we expect next from the awesome library?
we’ll try to make it easier to put fastai models into production, we’ll focus on the applications we didn’t have time to finalize during the first part of the course (object detection, translation, sequence labeling), we’ll find some way to deal with very big datasets that don’t always fit in RAM, and also play with some research ideas we didn’t get to investigate (training on rectangular images for instance).
How do you discover these ideas, what is the methodology of experimentation at fast.ai?
The methodology could be summarized into: “try blah!”, as Jeremy said in one of the courses. We try to have an intuitive understanding of what happens when training a given model, then we experiment all the ideas we think of to see if they work empirically.
Very often, research papers focus on the Math first and come with this one new theory that is going to revolutionize everything. When you try to apply it though, you often don’t get any good results. We’re more interested in things that work in practice.
How do you stay up to date with the cutting edge?
By experimenting a lot! The fastai library isn’t just a great tool for the beginner, its high flexibility makes it super easy when I want to implement a research article to see if its suggestion results in a significant improvement. The callbacks system or the data block API allow you to do pretty much anything with just a few lines of code.
Any advice for beginners?
Start a blog, where you explain what you have learned. Explaining things is often the best way to realize you hadn’t fully understood them; you may discover there were tons of small details you hadn’t dug enough into.
Chinese community news
meetups
Shanghai meetup sign-ups are open, starting 2019.3.4, thanks to @royam0820. Lucky Shanghai folks! The meetup provides a WeChat group and Slack for discussion.
Tips on getting started with GPUs
Technical applications
Competition write-ups
Jupyter Notebook tips
Tips and tricks, thanks to @stas
Welcome to Kaggle Kernels!
Kaggle is an online community of data scientists and machine-learning practitioners, owned by Google. Kaggle lets users search and publish datasets, build and train models in a web environment, collaborate online with other data scientists and ML engineers, and enter competitions to solve data-science problems. Kaggle started out hosting competitions and has grown into a public cloud platform for data-science practice ( more ).
Kaggle Kernels still have their limitations; see Resources and limitations. If you are returning to Kaggle, just go to Your Kernels and click the kernel you want to resume working on.
Running the fast.ai v3 course notebooks on Kaggle Kernels
Kaggle Kernels ship with the fastai library. William Horton @wdhorton
and Sanyam Bhutani @init_27
ported the course notebooks to Kaggle Kernels. Sanyam Bhutani maintains these kernels; for questions, see the discussion thread here.
There is no setup or installation required: just click “fork” and run the notebook.
Kernel index
- Lesson-1 Pets
- Lesson 2 Download
- Lesson 2 SGD
- Lesson 3 Camvid-tiramisu
- Lesson 3 Camvid
- Lesson 3 Head-Pose
- Lesson 3 Planet
- Lesson 3 Tabular
- Lesson 4 Collab
- Lesson 4 Tabular
- Lesson 5 SGD-MNIST
- Lesson 6 Pets-more
- Rossmann data clean
- Lesson 6 Rossmann
- Lesson 7 Human-numbers
- Lesson 7 Resnet MNIST
First-time setup
Step 1: Create a Kaggle account
Sign up for Kaggle here and confirm via the email. Once confirmed, you can log in to your account.
Step 2: Navigate to the relevant notebook (kernel)
Click any of the course notebook links above; once the page opens, click fork to start using it.
Step 3: You're all set, dive in!
We have already set up all the datasets and prerequisites the course needs, so you can use a Kaggle kernel just like a jupyter notebook in your local environment.
Resources and limitations
- Kaggle Kernels are completely free
- The notebooks are not updated as frequently as the fastai repository
- These notebooks are not officially maintained by fast.ai (Sanyam Bhutani keeps maintaining them; see the discussion thread)
- GPU time limit (K-80 instance) is 6 hours per session
- Disk usage = 5 GB/kernel
- RAM = 14 GB/kernel
My local setup
How to download large files in a Kaggle kernel without committing (forum post)
How to make a keyboard shortcut for a code snippet (forum post)
How to make your first documentation-improvement PR (dictation)
How to create your first multi-line code snippet (forum post)
My shortcut settings
How to take a screenshot on a Mac
Install Kite for vim
- install Kite
- select vim as the editor during the installation process
- go to local settings and install the vim and neovim plugins
- then Kite is ready to use with vim
vim cursor moving
0 = go to start of a line
$ = go to end of a line
H = go to top of a window
L = go to bottom of a window
M = go to middle of a window
G = go to the end of a file
gg = go to the first line of a file
20G = go to the 20th line of a file
e = next word
b = previous word
( = previous sentence
) = next sentence
{ = previous paragraph or block
} = next paragraph or block
`` = go to previous edit place
Monosnap for video
- set 5 frames/second
- high quality
- capture mouse cursor and clicks
- the resulting file will be small enough
- it can also create a gif from a movie
How to use git merge
git help merge # to check out how to use git merge
# inside exp branch by `git checkout exp`, run the following to merge with master
git merge master
# then run `git commit -a -m "merge"` to finish it up
.pdbrc.py
"""
This is an example configuration file for pdb++.
Actually, it is what the author uses daily :-). Put it into ~/.pdbrc.py to use
it.
"""
import readline
import pdb
class Config(pdb.DefaultConfig):
    filename_color = pdb.Color.yellow
    line_number_color = pdb.Color.red
    truncate_long_lines = False  # so you see the full content
    highlight = True
    sticky_by_default = True
    use_pygments = True
    bg = 'light'
    current_line_color = 1  # white arrow
Looper for YouTube
- chrome extension: Looper for YouTube
- enable "automatically loop all videos"
Use Atom with Hydrogen
- Atom core packages
- install Hydrogen and its extensions (may not use them at all, though)
- install Python autocompletion
- atom-beautify
- source activate fastai
- go to a folder and then run atom
How to convert YouTube sbv subtitles to srt
- download your translated sbv file from the YouTube subtitle translation page
- convert it at https://captionsconverter.com/
- upload your subtitles on Bilibili's subtitle page
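If you would rather script the conversion than use the website, here is a minimal sketch of the sbv-to-srt transformation (the helper names are my own, not from any library):

```python
def sbv_time_to_srt(t):
    # sbv "H:MM:SS.mmm" -> srt "HH:MM:SS,mmm"
    h, m, s = t.split(":")
    sec, ms = s.split(".")
    return f"{int(h):02d}:{m}:{sec},{ms}"

def sbv_to_srt(sbv_text):
    # sbv blocks: "start,end" on one line, caption lines below, blank line between
    blocks = [b for b in sbv_text.strip().split("\n\n") if b.strip()]
    out = []
    for i, block in enumerate(blocks, 1):
        lines = block.splitlines()
        start, end = lines[0].split(",")
        out.append(f"{i}\n{sbv_time_to_srt(start)} --> {sbv_time_to_srt(end)}\n"
                   + "\n".join(lines[1:]))
    return "\n\n".join(out) + "\n"
```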
Common shortcuts for translating YouTube subtitles
- keep the cursor in the main input box and translate there
- shift + space = pause/play
- shift + left/right arrow = rewind/forward
- to make corrections, go to the specific subtitle box and edit
How to remove lag while translating YouTube subtitles
- first download the blank sbv subtitle file YouTube provides
- delete all the timestamps
- then upload it back
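Deleting the timestamps by hand is tedious; here is a small sketch (my own helper, assuming the standard sbv timestamp format) that strips them from the file:

```python
import re

# matches sbv timestamp lines like "0:00:01.000,0:00:03.500"
TIMESTAMP = re.compile(r"^\d+:\d{2}:\d{2}\.\d{3},\d+:\d{2}:\d{2}\.\d{3}$")

def strip_sbv_timestamps(sbv_text):
    # keep only the caption text lines, dropping the timing lines
    kept = [ln for ln in sbv_text.splitlines() if not TIMESTAMP.match(ln.strip())]
    return "\n".join(kept)
```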
How to split panes and jump between them in iTerm2
shift + cmd + d = split the screen horizontally
opt + cmd + up/down arrow = jump between panes
cmd + w = close a pane
Most-used terminal commands
# find out the size of directory folders
du -sh *
# move the cursor to the front or end of a line
ctrl + a = to the start of a line
ctrl + e = to the end of a line
ctrl + u = clear the line before the cursor
ctrl + k = clear the line after the cursor
cmd + k = clear the terminal
ctrl + f = move forward a character
ctrl + b = move backward a character
esc + f = move forward by a word
esc + b = move backward by a word
How to install fastai locally
How to install the common software
- download the conda installer
- double-click to install
- update (outside any conda env): conda update conda
- create a dedicated working environment: conda create -n fastai python=3 (or pin a specific version such as 3.5)
- conda activate fastai to enter the environment
- conda deactivate to leave the environment
- conda remove --name fastai --all to delete the environment
- install pdbpp (python 3.6 or 3.7 both work; probably any fastai dev version is fine)
- conda install pdbpp is a must, not pip3 install pdbpp
- install Jupyter notebook
- update pip: python3 -m pip install --upgrade pip
- install Jupyter: python3 -m pip install jupyter
- install the PyTorch and fastai libraries
- one-step install: conda install -c pytorch -c fastai fastai pytorch
- update (outside env): conda update conda -y
- update (inside env): conda update -c fastai fastai
- check: conda list, pip show
- uninstall: conda uninstall fastai
- developer install:
git clone fastai-fork
cd fastai-fork
tools/run-after-git-clone
pip install -e ".[dev]"
Vim basics to get started
learnt from this video by tutorialLinux
:q ; just quit
:w ; save
:wq ; save and quit
:q! ; quit without saving
i ; go into insert mode to write code
esc ; go back to command mode
dd ; from command mode to delete a line
3dd ; delete 3 lines
u ; undo last action
ctrl + r ; redo action
/search_word ; to search a word inside a file
n ; to move to the next finding of your search
shift + n ; to move back the previous finding
:%s/search_word/replace_word/gc ; replace one by one
:%s/search_word/replace_word/g ; replace all at once
simple workflow
- use search to move around quickly, i to insert, and dd to delete
How to jump 5 lines with the up/down arrow keys
go to .vimrc and add the following
noremap <Up> 5k
noremap <Down> 5j
then, just use arrow up or down
Find the code lines running in pdbpp with vim
- :find folder/filename
- press esc
- type the line number
- shift + g
How to cut, copy, paste, and save in vim
; in normal mode:
; 1. put the cursor where you want to start
; 2. press v and move the cursor to select characters, or press V to select lines
; 3. press d to cut, or y to copy
; 4. move to where you want to paste
; 5. press P to paste before the cursor, or p to paste after the cursor
; 6. back in normal mode, type :w and Enter to save
How to do a regular search in vim
Searching | Vim Tips Wiki | FANDOM powered by Wikia
/ls ;; searches for ls; there must be no space before the pattern
?*.ls
/path.ls
;; inside .vimrc
set ignorecase
How to quit vim
:q ; quit (fails if there are unsaved changes)
:q! ; quit without saving
:wq ; save and quit
How to tag a directory
; in the directory, run vim from the terminal
; then type :MT
; try searching for untar_data
:tag untar ; tab to complete
How to explore code
; put the cursor on the code you want to explore
ctrl + ] ;= dive in
ctrl + t ;= pull back
ctrl + w, ctrl + ] ;= dive in from another horizontal split
ctrl + w, up or dn ;= switch between splits
ctrl + \ ;= dive in from a new tab
ctrl + a, left or right ;= switch between tabs
How to find files and search directories
:find pathlib ; find the file where pathlib lives
- ; go up to the parent directory
:b# ; jump back to the previously opened file
:tag Path ; search again once inside a file
How to fold and unfold
za ; with the cursor on the +/- fold column
How to find the current file's path
:F ; tab to complete and enter
Install vim
brew install vim
brew upgrade vim
vim # to run vim
Set up the vim source file
nano ~/.vimrc
Install ctags
brew install ctags
My .vimrc
set tags=tags
set foldcolumn=3
set foldmethod=indent
set ignorecase
command FileAddress echo expand('%:p')
syntax on
set background=dark
filetype indent plugin on
""""" current millenium
set nocompatible
syntax enable
filetype plugin on
""""" file finder or fuzzy search
set path+=**
""""" display all matching files when tab
set wildmenu
""""" Tag Jumping
command! MakeTags !ctags -R .
command MT MakeTags
""""" tag jump with new tab horizontally or vertically
map <C-\> :tab split<CR>:exec("tag ".expand("<cword>"))<CR>
"""" switch tabs in vim
map <C-a><up> :tabr<cr>
map <C-a><down> :tabl<cr>
map <C-a><left> :tabp<cr>
map <C-a><right> :tabn<cr>
"""""""""""""""" make presentation with vim files
au VimEnter no_plugins.vim setl window=66
au VimEnter no_plugins.vim normal 8Gzz
au VimEnter no_plugins.vim command! GO normal M17jzzH
au VimEnter no_plugins.vim command! BACK normal M17kzzH
au VimEnter no_plugins.vim command! RUN execute getline(".")
" au VimEnter no_plugins.vim unmap H
" au VimEnter no_plugins.vim unmap L
" why dont these work :(
au VimEnter no_plugins.vim nnoremap ^f :GO<CR>
au VimEnter no_plugins.vim nnoremap ^b :BACK<CR>
Conda
# download miniconda https://docs.conda.io/en/latest/miniconda.html
conda --version # check the version
conda update conda # update conda (run outside any env)
conda create -n mesa-abm python=3.6 anaconda # build an environment
source activate mesa-abm
source deactivate
conda info --envs # check envs
conda env list # list all envs
conda create --name new_env --clone existed_env # clone an env
conda remove --name old_env --all # delete an env
conda env export > environment.yml # export an env
conda env create -f environment.yml # build an env from yml
Jupyter notebook install
# If you have Python 3 installed (which is recommended):
python3 -m pip install --upgrade pip
python3 -m pip install jupyter
jupyter notebook # to start
How to undo local and pushed commits
git checkout -- filename # discard uncommitted changes
git reset --hard HEAD~1 # undo already-committed changes
git push origin +master # force-push the rewound branch
How to avoid retyping your username and password
# Permanently authenticating with Git repositories
$ git config credential.helper store
$ git push https://github.com/repo.git
Username for 'https://github.com': <USERNAME>
Password for 'https://USERNAME@github.com': <PASSWORD>
How to git push quickly
# in one step
lazygit 'message'
# step by step
# create a new repo on github
# go to your Mac directory
git init
git add README.md
git commit -m "first commit"
git remote add origin official-repo.git
git push -u origin master
git reset # to undo git add .
How to stay in sync between the original fastai repo and your fork
# in one step
lazyupdate
# step by step
# step 1: fork from the official repo
# step 2: git clone your fork
git clone https://github.com/EmbraceLife/my-fork
cd my_fork
tools/run-after-git-clone # fastai tools
git remote add upstream official-url-git # link to official repo
git remote -v # check all branches local and remote
git pull upstream master # pull from official repo, or
######## suggested by fastai is better I guess
git fetch upstream
git checkout master
git merge --no-edit upstream/master
git push
######## suggested by fastai
git push # update my-fork
git pull # pull from my-fork
How to create a branch, make changes, and push it to the cloud
git branch # check all branches
git branch new_branch_name # create a branch from where we are
git branch -m a_new_name # rename
git branch -d branch_to_go # delete
git checkout new_branch # switch to a new branch
# make changes, commit, then push with the code below; it won't affect the master branch!!!
git push --set-upstream origin new-branch-name #
git push origin --delete new_branch_name # to delete a branch remote in github
svn checkout repo-url-with-tree/master-replaced-by-trunk # download only part of a repo
How to modify the source and test it
conda uninstall -y fastai
cd fastai-fork
tools/run-after-git-clone
pip install -e ".[dev]"
# modify the source, then run the tests
## to test the source code
make test
pytest
## to test docs_src (not needed!!!)
cd docs_src
./run_tests.sh
ipdb
python -m pdb file-name.py
# no need any more to insert `import pdb; pdb.set_trace()` into the code to debug
sticky # view the code in context
ll # jump back from debug view to the full code listing
# l 20
# l 1, 20: see lines 1 to 20
s # step into a function
n # run the next line
w # call stack: where I started and where I am in the source; d goes down a stack frame, u goes up
b 88 # run to line 88, then pause
# b file.py:41 or b func_name
# b 11, this_year==2017: conditional breakpoint at line 11, triggered if this_year == 2017
cl 1 # delete the first breakpoint
r # run until the current function returns
c # continue until the end
q # quit
? # view the docs
hit return # repeat the last command
pp variable_name # pretty-print the variable
# finish the current loop: until
Building a bash_profile
cd # go to the home directory
nano .bash_profile # open .bash_profile:
alias ex='cd /Users/Natsume/Documents/experiments; conda activate fastai'
alias ft='cd /Users/Natsume/Documents/fastai_treasures/plantseedling/; conda activate fastai'
alias v3='cd /Users/Natsume/Documents/course-v3/nbs/dl1; conda activate fastai'
alias fastai='cd /Users/Natsume/Documents/fastai; conda activate fastai'
alias sfastai='cd /Users/Natsume/miniconda3/envs/fastai/lib/python3.7/site-packages/fastai'
alias pdbpp='python -m pdb'
alias de='conda deactivate'
alias xcode="open -a Xcode"
alias jn='jupyter notebook'
function lazygit() {
    git add .
    git commit -a -m "$1"
    git push
}
export PS1="\w "
export LC_ALL=zh_CN.UTF-8
export LANG=zh_CN.UTF-8
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
# added by Anaconda3 5.2.0 installer
export PATH="/anaconda3/bin:$PATH"
# added by Miniconda3 4.5.12 installer
# >>> conda init >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$(CONDA_REPORT_ERRORS=false '/Users/Natsume/miniconda3/bin/conda' shell.bash hook 2> /dev/null)"
if [ $? -eq 0 ]; then
\eval "$__conda_setup"
else
if [ -f "/Users/Natsume/miniconda3/etc/profile.d/conda.sh" ]; then
. "/Users/Natsume/miniconda3/etc/profile.d/conda.sh"
CONDA_CHANGEPS1=false conda activate base
else
\export PATH="/Users/Natsume/miniconda3/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda init <<<
Building a .pdbrc
## lives in the home directory, named .pdbrc; no need to source it, just save it
alias dr pp dir(%1) # inspect everything underneath the object
alias dt pp %1.__dict__ # inspect the object's dictionaries
alias pdt for k, v in %1.items(): print(k, ": ", v) # inspect a plain python dictionary
alias loc locals().keys() # local variables
alias doc from inspect import getdoc; from pprint import pprint; pprint(getdoc(%1)) # documentation
alias sources from inspect import getsourcelines; from pprint import pprint; pprint(getsourcelines(%1)) # source code
alias module from inspect import getmodule; from pprint import pprint; pprint(getmodule(%1)) # module name
alias fullargs from inspect import getfullargspec; from pprint import pprint; pprint(getfullargspec(%1)) # all argument names
alias opt_param optimizer.param_groups[0]['params'][%1] # all parameters
alias opt_grad optimizer.param_groups[0]['params'][%1].grad # all gradients of parameters
Jupyter notebook extensions
3 steps to install
conda install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextension enable toc2/main # in terminal or notebook cell, both are fine
# edit/notebook_config (at the bottom of the dropdown list)
Jupyter notebook color theme
conda install jupyterthemes
jt -t onedork
#| grade3 | oceans16 | chesterish | monokai | solarizedl | solarizedd
youtube-dl
--write-sub Write subtitle file
--write-auto-sub Write automatic subtitle file (YouTube only)
--all-subs Download all the available subtitles of the video
--list-subs List all available subtitles for the video
--sub-format FORMAT Subtitle format, accepts formats preference, for example: "srt" or "ass/srt/best"
--sub-lang LANGS Languages of the subtitles to download (optional), separated by commas; use IETF language tags like 'en'
youtube-dl --write-auto-sub --sub-lang en --sub-format srt https://youtu.be/1ZhtwInuOD0
youtube-dl -f 'best[ext=mp4]' --write-auto-sub --sub-lang en --sub-format srt https://www.youtube.com/playlist?list=PLfYUBJiXbdtSIJb-Qd3pw0cqCbkGeS0xn
Other reference links
How to Customize your Terminal Prompt | OSXDaily
inspect — Inspect live objects — Python 3.7.2 documentation
20 Terminal shortcuts developers need to know - TechRepublic
How to record a voice-over for a video
- use ytcropper to clip the video and loop it
- turn the Mac volume down to the lowest
- use QuickTime to record the screen while narrating; adjust the narration volume to a comfortable level
Tabular models
Required libraries
from fastai.tabular import *
pandas is a must: tabular data should be in a Pandas DataFrame.
Download the data
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
Prepare dep_var, cat_names, cont_names, procs
dep_var = 'salary'
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [FillMissing, Categorify, Normalize]
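As a rough illustration of what the three procs do (plain Python for clarity; this is not fastai's actual implementation):

```python
from statistics import median, mean, pstdev

ages = [39.0, None, 50.0]                        # a continuous column with a missing value
workclass = ["Private", "State-gov", "Private"]  # a categorical column

# FillMissing: fill missing continuous values with the median, adding a _na flag column
age_na = [a is None for a in ages]
med = median(a for a in ages if a is not None)
ages = [med if a is None else a for a in ages]

# Categorify: map category strings to integer codes
categories = sorted(set(workclass))
codes = [categories.index(w) for w in workclass]

# Normalize: standardize the continuous column to mean 0, std 1
mu, sd = mean(ages), pstdev(ages)
ages = [(a - mu) / sd for a in ages]
```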
Build the test data source
test = TabularList.from_df(df.iloc[800:1000].copy(),
                           path=path,
                           cat_names=cat_names,
                           cont_names=cont_names)
Build a DataBunch from df and the test data source
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
        .split_by_idx(list(range(800,1000)))
        .label_from_df(cols=dep_var)
        .add_test(test)
        .databunch())
Show a batch of 10 rows
data.show_batch(rows=10)
workclass | education | marital-status | occupation | relationship | race | education-num_na | age | fnlwgt | education-num | target |
---|---|---|---|---|---|---|---|---|---|---|
Private | HS-grad | Never-married | Sales | Not-in-family | White | False | -1.2158 | 1.1004 | -0.4224 | <50k |
? | HS-grad | Widowed | ? | Not-in-family | White | False | 1.8627 | 0.0976 | -0.4224 | <50k |
Self-emp-not-inc | HS-grad | Never-married | Craft-repair | Own-child | Black | False | 0.0303 | 0.2092 | -0.4224 | <50k |
Private | HS-grad | Married-civ-spouse | Protective-serv | Husband | White | False | 1.5695 | -0.5938 | -0.4224 | <50k |
Private | HS-grad | Married-civ-spouse | Handlers-cleaners | Husband | White | False | -0.9959 | -0.0318 | -0.4224 | <50k |
Private | 10th | Married-civ-spouse | Farming-fishing | Wife | White | False | -0.7027 | 0.6071 | -1.5958 | <50k |
Private | HS-grad | Married-civ-spouse | Machine-op-inspct | Husband | White | False | 0.1036 | -0.0968 | -0.4224 | <50k |
Private | Some-college | Married-civ-spouse | Exec-managerial | Own-child | White | False | -0.7760 | -0.6653 | -0.0312 | >=50k |
State-gov | Some-college | Never-married | Tech-support | Own-child | White | False | -0.8493 | -1.4959 | -0.0312 | <50k |
Private | 11th | Never-married | Machine-op-inspct | Not-in-family | White | False | -1.0692 | -0.9516 | -1.2046 | <50k |
Build a tabular learner model
learn = tabular_learner(data, layers=[200,100], metrics=accuracy)
Train
learn.fit(1, 1e-2)
Total time: 00:03
epoch | train_loss | valid_loss | accuracy |
---|---|---|---|
1 | 0.354604 | 0.378520 | 0.820000 |
Inference
How to predict on tabular data
row = df.iloc[0]
learn.predict(row)
(Category >=50k, tensor(1), tensor([0.4402, 0.5598]))
MNIST SGD
Required libraries
%matplotlib inline
from fastai.basics import *
Download the dataset
Get the ‘pickled’ MNIST dataset from http://deeplearning.net/data/mnist/mnist.pkl.gz. We’re going to treat it as a standard flat dataset with fully connected layers, rather than using a CNN.
Check the data folder
path = Config().data_path()/'mnist'
path.ls()
[PosixPath('/home/ubuntu/.fastai/data/mnist/mnist.pkl.gz')]
Unpack the pkl data
with gzip.open(path/'mnist.pkl.gz', 'rb') as f:
    ((x_train, y_train), (x_valid, y_valid), _) = pickle.load(f, encoding='latin-1')
Show an image and the training data shape
plt.imshow(x_train[0].reshape((28,28)), cmap="gray")
x_train.shape
(50000, 784)
Convert training and validation data to torch tensors
x_train,y_train,x_valid,y_valid = map(torch.tensor, (x_train,y_train,x_valid,y_valid))
n,c = x_train.shape
x_train.shape, y_train.min(), y_train.max()
(torch.Size([50000, 784]), tensor(0), tensor(9))
In lesson2-sgd we did these things ourselves:
x = torch.ones(n,2)
def mse(y_hat, y): return ((y_hat-y)**2).mean()
y_hat = x@a
Now instead we’ll use PyTorch’s functions to do it for us, and also to handle mini-batches (which we didn’t do last time, since our dataset was so small).
Combine the X and y tensors into a TensorDataset
bs=64
train_ds = TensorDataset(x_train, y_train)
valid_ds = TensorDataset(x_valid, y_valid)
Combine the training and validation TensorDatasets into a DataBunch
data = DataBunch.create(train_ds, valid_ds, bs=bs)
Pull batches from the training DataBunch one at a time
x,y = next(iter(data.train_dl))
x.shape,y.shape
(torch.Size([64, 784]), torch.Size([64]))
Create the model's forward pass
class Mnist_Logistic(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(784, 10, bias=True)
    def forward(self, xb): return self.lin(xb)
Enable the GPU
model = Mnist_Logistic().cuda()
View the model
model
Mnist_Logistic(
(lin): Linear(in_features=784, out_features=10, bias=True)
)
Access the model's lin layer
model.lin
Linear(in_features=784, out_features=10, bias=True)
Shape of the model's output
model(x).shape
torch.Size([64, 10])
Get each layer's parameters and check their shapes
[p.shape for p in model.parameters()]
[torch.Size([10, 784]), torch.Size([10])]
Set the learning rate
lr=2e-2
Use the classification loss function
loss_func = nn.CrossEntropyLoss()
A single forward/backward pass, explained
def update(x,y,lr):
    wd = 1e-5
    y_hat = model(x)
    # set up weight decay: accumulate the sum of squared weights
    w2 = 0.
    for p in model.parameters(): w2 += (p**2).sum()
    # add the weight-decay term to the regular loss
    loss = loss_func(y_hat, y) + w2*wd
    # compute gradients
    loss.backward()
    # update the parameters with the gradients
    with torch.no_grad():
        for p in model.parameters():
            p.sub_(lr * p.grad)
            p.grad.zero_()
    # return the loss value
    return loss.item()
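The parameter update inside torch.no_grad() is plain SGD with weight decay folded into the loss; on a single scalar parameter (toy values below, not the notebook's) the step reduces to:

```python
# one SGD step on a scalar parameter: loss = (p - 3)**2 + wd * p**2
p, lr, wd = 0.0, 0.1, 1e-5
grad = 2 * (p - 3) + 2 * wd * p   # d(loss)/dp, computed by hand
p -= lr * grad                    # the p.sub_(lr * p.grad) step
```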
Do one forward/backward pass (i.e. SGD) for each mini-batch of the training set, collecting the losses
losses = [update(x,y,lr) for x,y in data.train_dl]
Plot the losses
plt.plot(losses);
Build a 2-layer model with a ReLU after the first layer
class Mnist_NN(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(784, 50, bias=True)
        self.lin2 = nn.Linear(50, 10, bias=True)
    def forward(self, xb):
        x = self.lin1(xb)
        x = F.relu(x)
        return self.lin2(x)
Enable the GPU
model = Mnist_NN().cuda()
Compute the training-set losses with SGD and plot them
losses = [update(x,y,lr) for x,y in data.train_dl]
plt.plot(losses);
Put the model on the GPU again
model = Mnist_NN().cuda()
Replace the manual update rule with the Adam optimizer and opt.step()
def update(x,y,lr):
    opt = optim.Adam(model.parameters(), lr)
    y_hat = model(x)
    loss = loss_func(y_hat, y)
    loss.backward()
    opt.step()
    opt.zero_grad()
    return loss.item()
Run SGD over the training set, collect the losses, and plot them
losses = [update(x,y,1e-3) for x,y in data.train_dl]
plt.plot(losses);
Build the model with a fastai Learner
learn = Learner(data, Mnist_NN(), loss_func=loss_func, metrics=accuracy)
Plot to find a good learning rate
learn.lr_find()
learn.recorder.plot()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
Pick a good learning rate and train
learn.fit_one_cycle(1, 1e-2)
Total time: 00:03
epoch | train_loss | valid_loss | accuracy |
---|---|---|---|
1 | 0.129131 | 0.125927 | 0.963500 |
Plot the losses (training vs validation)
learn.recorder.plot_losses()
Lesson 6: pets revisited
The three magic lines and required libraries
%reload_ext autoreload
%autoreload 2
%matplotlib inline
from fastai.vision import *
Set the batch size
bs = 64
Download the data and get the image folder path
path = untar_data(URLs.PETS)/'images'
Data augmentation
Apply specific transforms to the images
tfms = get_transforms(max_rotate=20, max_zoom=1.3, max_lighting=0.4, max_warp=0.4,
                      p_affine=1., p_lighting=1.)
View the get_transforms docs
doc(get_transforms)
Build the data source
src = ImageList.from_folder(path).random_split_by_pct(0.2, seed=2)
Create a helper function to build a DataBunch
def get_data(size, bs, padding_mode='reflection'):
    return (src.label_from_re(r'([^/]+)_\d+.jpg$')
            .transform(tfms, size=size, padding_mode=padding_mode)
            .databunch(bs=bs).normalize(imagenet_stats))
Show transform variations of one image (padding_mode='zeros')
data = get_data(224, bs, 'zeros')
def _plot(i,j,ax):
    x,y = data.train_ds[3]
    x.show(ax, y=y)
plot_multi(_plot, 3, 3, figsize=(8,8))
Show transform variations of one image (padding_mode='reflection')
data = get_data(224,bs)
plot_multi(_plot, 3, 3, figsize=(8,8))
Train a model
Free memory
gc.collect()
Build a transfer-learning model (bn_final=True)
learn = create_cnn(data, models.resnet34, metrics=error_rate, bn_final=True)
Train the model (pct_start=0.8)
learn.fit_one_cycle(3, slice(1e-2), pct_start=0.8)
Total time: 01:22
epoch | train_loss | valid_loss | error_rate |
---|---|---|---|
1 | 2.573282 | 1.364505 | 0.271989 |
2 | 1.545074 | 0.377077 | 0.094046 |
3 | 0.937992 | 0.270508 | 0.068336 |
Unfreeze and train again with max_lr=slice(1e-6,1e-3)
learn.unfreeze()
learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-3), pct_start=0.8)
Total time: 00:55
epoch | train_loss | valid_loss | error_rate |
---|---|---|---|
1 | 0.721187 | 0.294177 | 0.058187 |
2 | 0.675999 | 0.285875 | 0.050744 |
Change the image size of the data
data = get_data(352,bs)
learn.data = data
Train again with max_lr=slice(1e-6,1e-4)
learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-4))
Total time: 01:37
epoch | train_loss | valid_loss | error_rate |
---|---|---|---|
1 | 0.627055 | 0.286791 | 0.058863 |
2 | 0.602765 | 0.286951 | 0.058863 |
Save the model
learn.save('352')
Convolution kernel
Reduce the batch size
data = get_data(352,16)
Load the previously trained model
learn = create_cnn(data, models.resnet34, metrics=error_rate, bn_final=True).load('352')
Show the first validation item (image and label)
idx=0
x,y = data.valid_ds[idx]
x.show()
data.valid_ds.y[idx]
Category american_pit_bull_terrier
Create a kernel (filter)
k = tensor([
[0. ,-5/3,1],
[-5/3,-5/3,1],
[1. ,1 ,1],
]).expand(1,3,3,3)/6
k
tensor([[[[ 0.0000, -0.2778, 0.1667],
[-0.2778, -0.2778, 0.1667],
[ 0.1667, 0.1667, 0.1667]],
[[ 0.0000, -0.2778, 0.1667],
[-0.2778, -0.2778, 0.1667],
[ 0.1667, 0.1667, 0.1667]],
[[ 0.0000, -0.2778, 0.1667],
[-0.2778, -0.2778, 0.1667],
[ 0.1667, 0.1667, 0.1667]]]])
k.shape
torch.Size([1, 3, 3, 3])
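What F.conv2d computes at each output location can be shown in plain Python: elementwise-multiply the kernel with the 3x3 patch under it and sum (toy values below, not the notebook's kernel):

```python
# a toy 3x3 edge-detection kernel and an image patch (hypothetical values)
kernel = [[ 0, -1,  0],
          [-1,  4, -1],
          [ 0, -1,  0]]
patch  = [[1, 1, 1],
          [1, 5, 1],
          [1, 1, 1]]

# one output pixel: the sum of elementwise products
out = sum(kernel[i][j] * patch[i][j] for i in range(3) for j in range(3))
```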
Extract one image tensor from the validation data
t = data.valid_ds[0][0].data; t.shape
torch.Size([3, 352, 352])
Turn the 3D tensor into 4D
t[None].shape
torch.Size([1, 3, 352, 352])
Apply the filter to the 4D tensor
edge = F.conv2d(t[None], k)
Show the filtered result
show_image(edge[0], figsize=(5,5));
Check data.c
data.c
37
View the model architecture
learn.model
Sequential(
(0): Sequential(
(0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
(3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(4): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(5): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(6): Sequential(
(0): BasicBlock(
(conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(4): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(5): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(7): Sequential(
(0): BasicBlock(
(conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(1): Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=1)
(mp): AdaptiveMaxPool2d(output_size=1)
)
(1): Flatten()
(2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.25)
(4): Linear(in_features=1024, out_features=512, bias=True)
(5): ReLU(inplace)
(6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.5)
(8): Linear(in_features=512, out_features=37, bias=True)
(9): BatchNorm1d(37, eps=1e-05, momentum=0.01, affine=True, track_running_stats=True)
)
)
Print the model summary
print(learn.summary())
======================================================================
Layer (type)         Output Shape          Param #    Trainable
======================================================================
Conv2d               [16, 64, 176, 176]    9408       False
BatchNorm2d          [16, 64, 176, 176]    128        True
ReLU                 [16, 64, 176, 176]    0          False
MaxPool2d            [16, 64, 88, 88]      0          False
... (the frozen Conv2d/BatchNorm2d/ReLU blocks of the resnet34 body repeat, shrinking to [16, 512, 11, 11]; only the BatchNorm2d layers are trainable) ...
AdaptiveAvgPool2d    [16, 512, 1, 1]       0          False
AdaptiveMaxPool2d    [16, 512, 1, 1]       0          False
Flatten              [16, 1024]            0          False
BatchNorm1d          [16, 1024]            2048       True
Dropout              [16, 1024]            0          False
Linear               [16, 512]             524800     True
ReLU                 [16, 512]             0          False
BatchNorm1d          [16, 512]             1024       True
Dropout              [16, 512]             0          False
Linear               [16, 37]              18981      True
BatchNorm1d          [16, 37]              74         True
______________________________________________________________________

Total params: 21831599
Total trainable params: 563951
Total non-trainable params: 21267648
Heatmap
Get the model for forward-pass computation (eval mode)
m = learn.model.eval();
Grab a single data point (X part only)
xb,_ = data.one_item(x)
Denormalize the X part of the data point, then convert it to an image
xb_im = Image(data.denorm(xb)[0])
Move the X part onto the GPU
xb = xb.cuda()
Import everything from callbacks.hooks
from fastai.callbacks.hooks import *
Define a function that captures the model's activations (and gradients) with hooks
def hooked_backward(cat=y):
    with hook_output(m[0]) as hook_a:
        with hook_output(m[0], grad=True) as hook_g:
            preds = m(xb)
            preds[0,int(cat)].backward()
    return hook_a,hook_g
hook_a,hook_g = hooked_backward()
Take the stored activations and average over the channel dimension
acts = hook_a.stored[0].cpu()
acts.shape
torch.Size([512, 11, 11])
avg_acts = acts.mean(0)
avg_acts.shape
torch.Size([11, 11])
Define a heatmap plotting function
def show_heatmap(hm):
    _,ax = plt.subplots()
    xb_im.show(ax)
    ax.imshow(hm, alpha=0.6, extent=(0,352,352,0),
              interpolation='bilinear', cmap='magma');
show_heatmap(avg_acts)
Grad-CAM
The heatmap method proposed in the paper
Paper: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Example 1
grad = hook_g.stored[0][0].cpu()
grad_chan = grad.mean(1).mean(1)
grad.shape,grad_chan.shape
(torch.Size([512, 11, 11]), torch.Size([512]))
mult = (acts*grad_chan[...,None,None]).mean(0)
show_heatmap(mult)
Example 2
fn = path/'../other/bulldog_maine.jpg' #Replace with your own image
x = open_image(fn); x
xb,_ = data.one_item(x)
xb_im = Image(data.denorm(xb)[0])
xb = xb.cuda()
hook_a,hook_g = hooked_backward()
acts = hook_a.stored[0].cpu()
grad = hook_g.stored[0][0].cpu()
grad_chan = grad.mean(1).mean(1)
mult = (acts*grad_chan[...,None,None]).mean(0)
show_heatmap(mult)
Example 3: by switching the target class, the heatmap's focus moves from the cat to the dog
data.classes[0]
'american_bulldog'
hook_a,hook_g = hooked_backward(0)
acts = hook_a.stored[0].cpu()
grad = hook_g.stored[0][0].cpu()
grad_chan = grad.mean(1).mean(1)
mult = (acts*grad_chan[...,None,None]).mean(0)
show_heatmap(mult)
Lesson 6 Rossmann sales prediction
Two magic commands
%reload_ext autoreload
%autoreload 2
Required libraries
from fastai.tabular import *
Data preparation
Feature engineering
To create the feature-engineered train_clean and test_clean from the Kaggle competition data, run rossman_data_clean.ipynb. One important step that deals with time series is this:
add_datepart(train, "Date", drop=False)
add_datepart(test, "Date", drop=False)
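To see roughly what add_datepart does, here is a rough pure-pandas sketch of a few of the columns it derives from the Date column (column names follow fastai's; fastai's version creates more columns and can also drop the original):

```python
import pandas as pd

# hypothetical miniature of the Rossmann Date column
df = pd.DataFrame({"Date": pd.to_datetime(["2015-07-31", "2015-08-01"])})

# a few of the date parts add_datepart expands Date into
df["Year"] = df.Date.dt.year
df["Month"] = df.Date.dt.month
df["Day"] = df.Date.dt.day
df["Dayofweek"] = df.Date.dt.dayofweek        # Friday -> 4, matching the table above
df["Is_month_end"] = df.Date.dt.is_month_end
df["Elapsed"] = df.Date.astype("int64") // 10 ** 9  # unix seconds
```

These derived columns are what let a tabular model pick up seasonality and calendar effects without seeing raw timestamps.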
Load the data after feature engineering
path = Config().data_path()/'rossmann'
train_df = pd.read_pickle(path/'train_clean')
Inspect the data
train_df.head().T
0 | 1 | 2 | 3 | 4 | |
---|---|---|---|---|---|
index | 0 | 1 | 2 | 3 | 4 |
Store | 1 | 2 | 3 | 4 | 5 |
DayOfWeek | 5 | 5 | 5 | 5 | 5 |
Date | 2015-07-31 | 2015-07-31 | 2015-07-31 | 2015-07-31 | 2015-07-31 |
Sales | 5263 | 6064 | 8314 | 13995 | 4822 |
Customers | 555 | 625 | 821 | 1498 | 559 |
Open | 1 | 1 | 1 | 1 | 1 |
Promo | 1 | 1 | 1 | 1 | 1 |
StateHoliday | False | False | False | False | False |
SchoolHoliday | 1 | 1 | 1 | 1 | 1 |
Year | 2015 | 2015 | 2015 | 2015 | 2015 |
Month | 7 | 7 | 7 | 7 | 7 |
Week | 31 | 31 | 31 | 31 | 31 |
Day | 31 | 31 | 31 | 31 | 31 |
Dayofweek | 4 | 4 | 4 | 4 | 4 |
Dayofyear | 212 | 212 | 212 | 212 | 212 |
Is_month_end | True | True | True | True | True |
Is_month_start | False | False | False | False | False |
Is_quarter_end | False | False | False | False | False |
Is_quarter_start | False | False | False | False | False |
Is_year_end | False | False | False | False | False |
Is_year_start | False | False | False | False | False |
Elapsed | 1438300800 | 1438300800 | 1438300800 | 1438300800 | 1438300800 |
StoreType | c | a | a | c | a |
Assortment | a | a | a | c | a |
CompetitionDistance | 1270 | 570 | 14130 | 620 | 29910 |
CompetitionOpenSinceMonth | 9 | 11 | 12 | 9 | 4 |
CompetitionOpenSinceYear | 2008 | 2007 | 2006 | 2009 | 2015 |
Promo2 | 0 | 1 | 1 | 0 | 0 |
Promo2SinceWeek | 1 | 13 | 14 | 1 | 1 |
... | ... | ... | ... | ... | ... |
Min_Sea_Level_PressurehPa | 1015 | 1017 | 1017 | 1014 | 1016 |
Max_VisibilityKm | 31 | 10 | 31 | 10 | 10 |
Mean_VisibilityKm | 15 | 10 | 14 | 10 | 10 |
Min_VisibilitykM | 10 | 10 | 10 | 10 | 10 |
Max_Wind_SpeedKm_h | 24 | 14 | 14 | 23 | 14 |
Mean_Wind_SpeedKm_h | 11 | 11 | 5 | 16 | 11 |
Max_Gust_SpeedKm_h | NaN | NaN | NaN | NaN | NaN |
Precipitationmm | 0 | 0 | 0 | 0 | 0 |
CloudCover | 1 | 4 | 2 | 6 | 4 |
Events | Fog | Fog | Fog | NaN | NaN |
WindDirDegrees | 13 | 309 | 354 | 282 | 290 |
StateName | Hessen | Thueringen | NordrheinWestfalen | Berlin | Sachsen |
CompetitionOpenSince | 2008-09-15 | 2007-11-15 | 2006-12-15 | 2009-09-15 | 2015-04-15 |
CompetitionDaysOpen | 2510 | 2815 | 3150 | 2145 | 107 |
CompetitionMonthsOpen | 24 | 24 | 24 | 24 | 3 |
Promo2Since | 1900-01-01 | 2010-03-29 | 2011-04-04 | 1900-01-01 | 1900-01-01 |
Promo2Days | 0 | 1950 | 1579 | 0 | 0 |
Promo2Weeks | 0 | 25 | 25 | 0 | 0 |
AfterSchoolHoliday | 0 | 0 | 0 | 0 | 0 |
BeforeSchoolHoliday | 0 | 0 | 0 | 0 | 0 |
AfterStateHoliday | 57 | 67 | 57 | 67 | 57 |
BeforeStateHoliday | 0 | 0 | 0 | 0 | 0 |
AfterPromo | 0 | 0 | 0 | 0 | 0 |
BeforePromo | 0 | 0 | 0 | 0 | 0 |
SchoolHoliday_bw | 5 | 5 | 5 | 5 | 5 |
StateHoliday_bw | 0 | 0 | 0 | 0 | 0 |
Promo_bw | 5 | 5 | 5 | 5 | 5 |
SchoolHoliday_fw | 7 | 1 | 5 | 1 | 1 |
StateHoliday_fw | 0 | 0 | 0 | 0 | 0 |
Promo_fw | 5 | 1 | 5 | 1 | 1 |
93 rows × 5 columns
n = len(train_df); n
844338
Experimenting with a small sample
idx = np.random.permutation(range(n))[:2000]
idx.sort()
small_train_df = train_df.iloc[idx[:1000]]
small_test_df = train_df.iloc[idx[1000:]]
small_cont_vars = ['CompetitionDistance', 'Mean_Humidity']
small_cat_vars = ['Store', 'DayOfWeek', 'PromoInterval']
small_train_df = small_train_df[small_cat_vars + small_cont_vars + ['Sales']]
small_test_df = small_test_df[small_cat_vars + small_cont_vars + ['Sales']]
small_train_df.head()
Store | DayOfWeek | PromoInterval | CompetitionDistance | Mean_Humidity | Sales | |
---|---|---|---|---|---|---|
267 | 268 | 5 | NaN | 4520.0 | 67 | 7492 |
604 | 606 | 5 | NaN | 2260.0 | 61 | 7187 |
983 | 986 | 5 | Feb,May,Aug,Nov | 620.0 | 61 | 7051 |
1636 | 525 | 4 | NaN | 1870.0 | 55 | 9673 |
2348 | 123 | 3 | NaN | 16760.0 | 50 | 10007 |
small_test_df.head()
Store | DayOfWeek | PromoInterval | CompetitionDistance | Mean_Humidity | Sales | |
---|---|---|---|---|---|---|
420510 | 829 | 3 | NaN | 110.0 | 55 | 6802 |
420654 | 973 | 3 | Jan,Apr,Jul,Oct | 330.0 | 59 | 6644 |
420990 | 194 | 2 | Feb,May,Aug,Nov | 16970.0 | 55 | 4720 |
421308 | 512 | 2 | Mar,Jun,Sept,Dec | 590.0 | 72 | 6248 |
421824 | 1029 | 2 | NaN | 1590.0 | 64 | 8004 |
Apply Categorify to the categorical and continuous variables
categorify = Categorify(small_cat_vars, small_cont_vars)
categorify(small_train_df)
categorify(small_test_df, test=True)
small_test_df.head()
Store | DayOfWeek | PromoInterval | CompetitionDistance | Mean_Humidity | Sales | |
---|---|---|---|---|---|---|
420510 | NaN | 3 | NaN | 110.0 | 55 | 6802 |
420654 | 973.0 | 3 | Jan,Apr,Jul,Oct | 330.0 | 59 | 6644 |
420990 | NaN | 2 | Feb,May,Aug,Nov | 16970.0 | 55 | 4720 |
421308 | 512.0 | 2 | Mar,Jun,Sept,Dec | 590.0 | 72 | 6248 |
421824 | 1029.0 | 2 | NaN | 1590.0 | 64 | 8004 |
View the categories as text and as numeric codes
small_train_df.PromoInterval.cat.categories
Index(['Feb,May,Aug,Nov', 'Jan,Apr,Jul,Oct', 'Mar,Jun,Sept,Dec'], dtype='object')
small_train_df['PromoInterval'].cat.codes[:5]
267 -1
604 -1
983 0
1636 -1
2348 -1
dtype: int8
Handle missing values
fill_missing = FillMissing(small_cat_vars, small_cont_vars)
fill_missing(small_train_df)
fill_missing(small_test_df, test=True)
small_train_df[small_train_df['CompetitionDistance_na'] == True]
Store | DayOfWeek | PromoInterval | CompetitionDistance | Mean_Humidity | Sales | CompetitionDistance_na | |
---|---|---|---|---|---|---|---|
185749 | 622 | 2 | NaN | 2300.0 | 93 | 4508 | True |
Preparing full data set
Load the full training and test sets from the pickle files
train_df = pd.read_pickle(path/'train_clean')
test_df = pd.read_pickle(path/'test_clean')
len(train_df),len(test_df)
(844338, 41088)
Set up the preprocessing steps and the full cat_vars and cont_vars
procs=[FillMissing, Categorify, Normalize]
cat_vars = ['Store', 'DayOfWeek', 'Year', 'Month', 'Day', 'StateHoliday', 'CompetitionMonthsOpen',
            'Promo2Weeks', 'StoreType', 'Assortment', 'PromoInterval', 'CompetitionOpenSinceYear', 'Promo2SinceYear',
            'State', 'Week', 'Events', 'Promo_fw', 'Promo_bw', 'StateHoliday_fw', 'StateHoliday_bw',
            'SchoolHoliday_fw', 'SchoolHoliday_bw']
cont_vars = ['CompetitionDistance', 'Max_TemperatureC', 'Mean_TemperatureC', 'Min_TemperatureC',
             'Max_Humidity', 'Mean_Humidity', 'Min_Humidity', 'Max_Wind_SpeedKm_h',
             'Mean_Wind_SpeedKm_h', 'CloudCover', 'trend', 'trend_DE',
             'AfterStateHoliday', 'BeforeStateHoliday', 'Promo', 'SchoolHoliday']
Assemble the training dataframe
dep_var = 'Sales'
df = train_df[cat_vars + cont_vars + [dep_var,'Date']].copy()
Find the test set's date range
test_df['Date'].min(), test_df['Date'].max()
('2015-08-01', '2015-09-17')
Work out how much validation data we need from the size of the test set
cut = train_df['Date'][(train_df['Date'] == train_df['Date'][len(test_df)])].index.max()
cut
41395
valid_idx = range(cut)
df[dep_var].head()
0 5263
1 6064
2 8314
3 13995
4 4822
Name: Sales, dtype: int64
Build a DataBunch from the df with TabularList
data = (TabularList.from_df(df, path=path, cat_names=cat_vars, cont_names=cont_vars, procs=procs)
        .split_by_idx(valid_idx)
        .label_from_df(cols=dep_var, label_cls=FloatList, log=True)
        .add_test(TabularList.from_df(test_df, path=path, cat_names=cat_vars, cont_names=cont_vars))
        .databunch())
Look up how FloatList and log are used
doc(FloatList)
Model
Compute the y range
max_log_y = np.log(np.max(train_df['Sales'])*1.2)
y_range = torch.tensor([0, max_log_y], device=defaults.device)
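Why build y_range this way: when y_range is given, fastai's tabular model squeezes the final activation through a sigmoid rescaled to that interval, so predicted log-sales can never leave plausible bounds (hence the 1.2 headroom above the observed maximum). A minimal numpy sketch of that rescaling, with a hypothetical scaled_sigmoid helper:

```python
import numpy as np

def scaled_sigmoid(x, y_min, y_max):
    # rescale a sigmoid to (y_min, y_max): the regression head's raw
    # activation x is mapped into the interval, never outside it
    return y_min + (y_max - y_min) / (1 + np.exp(-x))
```

Whatever the raw activation, the output stays inside (y_min, y_max), which stabilizes training for bounded regression targets.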
Build a tabular learner
learn = tabular_learner(data, layers=[1000,500], ps=[0.001,0.01], emb_drop=0.04,
                        y_range=y_range, metrics=exp_rmspe)
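Since we trained on log(Sales), the exp_rmspe metric undoes the log before scoring with root mean squared percentage error, the metric the Rossmann Kaggle competition uses. A minimal numpy sketch (a standalone re-implementation for illustration, not fastai's exact code):

```python
import numpy as np

def exp_rmspe(log_pred, log_targ):
    # undo the log transform, then root mean squared percentage error
    pred, targ = np.exp(log_pred), np.exp(log_targ)
    return float(np.sqrt(np.mean(((targ - pred) / targ) ** 2)))
```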
Inspect the model
learn.model
TabularModel(
(embeds): ModuleList(
(0): Embedding(1116, 81)
(1): Embedding(8, 5)
(2): Embedding(4, 3)
(3): Embedding(13, 7)
(4): Embedding(32, 11)
(5): Embedding(3, 3)
(6): Embedding(26, 10)
(7): Embedding(27, 10)
(8): Embedding(5, 4)
(9): Embedding(4, 3)
(10): Embedding(4, 3)
(11): Embedding(24, 9)
(12): Embedding(9, 5)
(13): Embedding(13, 7)
(14): Embedding(53, 15)
(15): Embedding(22, 9)
(16): Embedding(7, 5)
(17): Embedding(7, 5)
(18): Embedding(4, 3)
(19): Embedding(4, 3)
(20): Embedding(9, 5)
(21): Embedding(9, 5)
(22): Embedding(3, 3)
(23): Embedding(3, 3)
)
(emb_drop): Dropout(p=0.04)
(bn_cont): BatchNorm1d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(layers): Sequential(
(0): Linear(in_features=233, out_features=1000, bias=True)
(1): ReLU(inplace)
(2): BatchNorm1d(1000, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.001)
(4): Linear(in_features=1000, out_features=500, bias=True)
(5): ReLU(inplace)
(6): BatchNorm1d(500, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.01)
(8): Linear(in_features=500, out_features=1, bias=True)
)
)
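The embedding widths printed above (81 for the 1116 stores, 5 for the 8 day-of-week values, and so on) come from fastai v1's rule of thumb for sizing an embedding from a category's cardinality. A sketch, assuming the min(600, round(1.6 * n**0.56)) heuristic:

```python
def emb_sz_rule(n_cat):
    # fastai v1's heuristic: embedding width grows sublinearly with the
    # number of categories, capped at 600
    return min(600, round(1.6 * n_cat ** 0.56))
```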
Look at details of the data
len(data.train_ds.cont_names)
16
Plot to find a good learning rate
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
learn.recorder.plot()
Train the model with lr=1e-3, wd=0.2
learn.fit_one_cycle(5, 1e-3, wd=0.2)
Total time: 11:27
epoch | train_loss | valid_loss | exp_rmspe |
---|---|---|---|
1 | 0.023587 | 0.020941 | 0.140551 |
2 | 0.017678 | 0.023431 | 0.132211 |
3 | 0.017453 | 0.016929 | 0.120169 |
4 | 0.012608 | 0.016296 | 0.109245 |
5 | 0.010222 | 0.011238 | 0.105433 |
learn.save('1')
Plot the losses
learn.recorder.plot_losses(last=-1)
learn.load('1');
Train twice more, 5 epochs each, at lr=3e-4
learn.fit_one_cycle(5, 3e-4)
Total time: 11:32
epoch | train_loss | valid_loss | exp_rmspe |
---|---|---|---|
1 | 0.012223 | 0.014312 | 0.116988 |
2 | 0.012001 | 0.017789 | 0.117619 |
3 | 0.011402 | 0.035596 | 0.114396 |
4 | 0.010067 | 0.015125 | 0.113652 |
5 | 0.009148 | 0.031326 | 0.116344 |
learn.fit_one_cycle(5, 3e-4)
Total time: 11:31
epoch | train_loss | valid_loss | exp_rmspe |
---|---|---|---|
1 | 0.011840 | 0.013236 | 0.110483 |
2 | 0.010765 | 0.057664 | 0.129586 |
3 | 0.010101 | 0.042744 | 0.111584 |
4 | 0.008820 | 0.116893 | 0.135458 |
5 | 0.009144 | 0.017969 | 0.126323 |
Predict and generate a submission
(10th place in the competition was 0.108)
test_preds=learn.get_preds(DatasetType.Test)
test_df["Sales"]=np.exp(test_preds[0].data).numpy().T[0]
test_df[["Id","Sales"]]=test_df[["Id","Sales"]].astype("int")
test_df[["Id","Sales"]].to_csv("rossmann_submission.csv",index=False)
How Jeremy explains learning rates in the course
Lesson 2
How to read the optimal learning-rate range from the plot?
Why do we need a learning rate at all?
Lesson 3
How to pick a learning rate carefully?
Lesson 5
What is fit_one_cycle?
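fit_one_cycle warms the learning rate up and then anneals it back down over the run. A simplified sketch of the schedule's shape only (fastai also cycles momentum in the opposite direction, and its exact annealing details may differ; the parameter names here are illustrative):

```python
import math

def one_cycle_lr(step, total_steps, lr_max, pct_start=0.3, div=25.0, final_div=1e4):
    # cosine warmup from lr_max/div up to lr_max over the first pct_start
    # of training, then cosine annealing down to lr_max/final_div
    warm = max(1, int(total_steps * pct_start))
    if step < warm:
        t, lo, hi = step / warm, lr_max / div, lr_max
    else:
        t, lo, hi = (step - warm) / max(1, total_steps - warm), lr_max, lr_max / final_div
    return lo + (hi - lo) * (1 - math.cos(math.pi * t)) / 2
```

The large middle-of-training learning rate acts as a regularizer, while the low final rates let the model settle into a good minimum.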
Lesson 7: ResNets from scratch; U-net; Generative (adversarial) networks
Overview
In the final lesson of Practical Deep Learning for Coders we’ll study one of the most important techniques in modern architectures: the skip connection. This is most famously used in the resnet, which is the architecture we’ve used throughout this course for image classification, and appears in many cutting edge results. We’ll also look at the U-net architecture, which uses a different type of skip connection to greatly improve segmentation results (and also for similar tasks where the output structure is similar to the input).
We’ll then use the U-net architecture to train a super-resolution model. This is a model which can increase the resolution of a low-quality image. Our model won’t only increase resolution—it will also remove jpeg artifacts, and remove unwanted text watermarks.
In order to make our model produce high quality results, we will need to create a custom loss function which incorporates feature loss (also known as perceptual loss), along with gram loss. These techniques can be used for many other types of image generation task, such as image colorization.
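The gram loss mentioned above compares "style" statistics of feature maps: the correlations between channels, captured in a gram matrix. A minimal numpy sketch:

```python
import numpy as np

def gram_matrix(fmap):
    # fmap: (channels, h, w) feature map from some network layer;
    # gram loss compares these channel-correlation matrices between
    # the generated image's features and the target's
    c, h, w = fmap.shape
    x = fmap.reshape(c, h * w)
    return x @ x.T / (c * h * w)
```

Because spatial positions are summed out, the gram matrix describes texture and style rather than layout, which is why it complements the spatially-aware feature loss.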
Finally, we’ll learn about a recent loss function known as generative adversarial loss (used in generative adversarial networks, or GANs), which can improve the quality of generative models in some contexts, at the cost of speed.
最后,我们将学习一个较新的损失函数,称为 generative adversarial loss(用于生成对抗网络模型,即 GANs)。它能在某些场景下提升生成模型的质量,但以速度为代价。
The techniques we show in this lesson include some unpublished research that:
- Let us train GANs more quickly and reliably than standard approaches, by leveraging transfer learning
- Combines architectural innovations and loss function approaches that haven’t been used in this way before.
The results are stunning, and train in just a couple of hours (compared to previous approaches that take a couple of days).
本课展示的技巧包含一些尚未正式发表的研究成果:
- 利用迁移学习,比常规方法更快、更可靠地训练 GANs
- 把结构设计创新与损失函数方法以前所未有的方式结合起来
训练效果非常出色,而且只需数小时(之前的常规方法需要数天)。
Lesson Resources 课程资源
- 非常详尽的第七课课程笔记 - 感谢 @hiromi
- Notebooks:
- Lesson 7 in-class discussion thread
- Lesson 7 advanced discussion
Other Resources 其他资源
- 可视化神经网络的损失值的地表风景图 论文
- Convolution arithmetic 在课程中展示的论文
- Perceptual Losses for Real-Time Style Transfer and Super-Resolution 论文
- Github对Jeremy的采访
- ipyexperiments - @stas 提供的比 gc.collect 更便捷的 GPU 内存释放工具
- Documentation improvements thread(请帮助我们一起把文档做得更好!)
I will translate all 7 lesson md files here. Could you help me contribute them to the official course-v3 repo without me doing the PR (I find the process a bit tedious)? Is it possible?
thanks!
Lesson 6: Regularization; Convolutions; Data ethics
第六课:正则化,卷积,数据伦理
Overview 综述
Today we discuss some powerful techniques for improving training and avoiding over-fitting:
- Dropout: remove activations at random during training in order to regularize the model
- Data augmentation: modify model inputs during training in order to effectively increase data size
- Batch normalization: adjust the parameterization of a model in order to make the loss surface smoother.
今天我们学习讨论一些帮助我们改进训练和避免过拟合的强大技巧:
- Dropout: 随机去除一些激活层上的值,目的是对模型做正则处理
- Data augmentation 数据增强:训练时调整模型输入值,从而有效扩充数据量
- Batch normalization 批量归一化:调整模型的参数化方式,让损失值曲面更平滑
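以上技巧中,Dropout 最容易用代码说明。下面是一个 inverted dropout 的极简 numpy 示意(概念草图,并非 PyTorch/fastai 的实际实现):

```python
import numpy as np

def dropout(x, p, training, rng):
    """Inverted dropout:训练时以概率 p 把激活值置零,
    幸存的值除以 (1-p) 使期望不变;推理时是恒等映射。"""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p  # 以概率 p 被置零
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
acts = np.ones(1000)
dropped = dropout(acts, p=0.5, training=True, rng=rng)
# 大约一半激活值变成 0,其余被放大为 2.0,整体均值仍接近 1.0
print(dropped.mean())
```

除以 (1-p) 的"inverted"写法让推理时不需要任何缩放,这也是主流框架采用的形式。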
Next up, we’ll learn all about convolutions, which can be thought of as a variant of matrix multiplication with tied weights, and are the operation at the heart of modern computer vision models (and, increasingly, other types of models too).
接下来,我们会学习 convolutions 卷积。卷积可以理解为一种使用捆绑(共享)权重的矩阵乘法变体,是当下机器视觉模型的核心运算(也正越来越多地用于其他类别的模型)。
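"卷积是使用捆绑(共享)权重的矩阵乘法变体"这一点可以直接验证:把每个滑窗位置的像素摊平成一行(即 im2col),再与同一个摊平的卷积核相乘,结果与直接滑窗卷积完全一致(numpy 概念示意):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """直接滑窗实现的 2D 'valid' 卷积(严格说是互相关)。"""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = (img[i:i+kh, j:j+kw] * kernel).sum()
    return out

def conv2d_as_matmul(img, kernel):
    """同样的结果,写成一次矩阵乘法:每个输出位置对应一行
    摊平的像素块(im2col),所有行都乘以同一个摊平的卷积核,
    这就是'捆绑/共享权重'的含义。"""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    patches = np.array([img[i:i+kh, j:j+kw].ravel()
                        for i in range(oh) for j in range(ow)])
    return (patches @ kernel.ravel()).reshape(oh, ow)

rng = np.random.default_rng(0)
img = rng.normal(size=(5, 5))
k = rng.normal(size=(3, 3))
print(np.allclose(conv2d_valid(img, k), conv2d_as_matmul(img, k)))  # True
```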
We’ll use this knowledge to create a class activation map, which is a heat-map that shows which parts of an image were most important in making a prediction.
我们将利用这些知识创建 class activation map(类激活图):这是一种热力图,能凸显图片中对预测结果最重要的部位。
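Class activation map 的计算本身很简单:用目标类别的分类器权重,对最后一层卷积的各通道特征图做加权求和(numpy 概念示意;实际使用时还要把热力图上采样到原图尺寸):

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """feature_maps: (channels, height, width) 的激活值
    class_weights: (channels,),即全局平均池化之后
    连接各通道到目标类别的分类器权重。"""
    return np.tensordot(class_weights, feature_maps, axes=1)  # (h, w)

rng = np.random.default_rng(0)
feats = rng.random((4, 7, 7))  # 假设这是最后一层卷积的输出
w = rng.random(4)              # 预测类别对应的权重
cam = class_activation_map(feats, w)
print(cam.shape)  # (7, 7) 的热力图
```

某通道对该类别权重越大、且在某位置激活越强,该位置在热力图上就越"热"。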
Finally, we’ll cover a topic that many students have told us is the most interesting and surprising part of the course: data ethics. We’ll learn about some of the ways in which models can go wrong, with a particular focus on feedback loops, why they cause problems, and how to avoid them. We’ll also look at ways in which bias in data can lead to biased algorithms, and discuss questions that data scientists can and should be asking to help ensure that their work doesn’t lead to unexpected negative outcomes.
最后,我们将学习数据伦理,许多学员认为这是课程中最有趣、最出乎意料的部分。我们会了解模型在哪些情况下会出问题,并着重讲解 feedback loops 反馈循环:它们为什么会造成问题,以及如何避免。我们还会看到数据中的偏差如何导致带偏差(歧视)的算法,并探讨数据科学家能够且应该提出哪些问题,以帮助确保他们的工作不会导致意想不到的负面后果。
Lesson Resources 课程资源
- 第六课详细笔记 - 感谢 @hiromi
- Notebooks:
- 第六课 课内探讨 thread
- 第六课 深入探讨 (高级)
Other Resources 其他资源
- platform.ai 平台讨论
- 50 Years of Test (Un)fairness: Lessons for Machine Learning
- Convolutions: 什么是卷积
- Convolution Arithmetic: 卷积运算可视化解读
- Normalization: 如何理解 normalization
- Cross entropy loss: 如何理解entropy loss
- How CNNs work: CNN 如何工作
- Image processing and computer vision: 图片处理与机器视觉
- “Yes you should understand backprop”: 如何理解反向传递
- BERT state-of-the-art language model for NLP: 理解当下最先进的语言模型结构
- Hubel and Wiesel: 从脑神经角度理解视觉
- Perception: CNNbook中perception的解读
Lesson 5: Back propagation; Accelerated SGD; Neural net from scratch
第五课:反向传递,加速版随机梯度下降,手写神经网络
Overview 综述
In lesson 5 we put all the pieces of training together to understand exactly what is going on when we talk about back propagation. We’ll use this knowledge to create and train a simple neural network from scratch.
在本课中,我们会把训练的各个环节拼到一起,准确理解我们说反向传递 back propagation 时到底发生了什么。在此基础上,我们会从零手写并训练一个简单的神经网络。
We’ll also see how we can look inside the weights of an embedding layer, to find out what our model has learned about our categorical variables. This will let us get some insights into which movies we should probably avoid at all costs…
我们还将深入观察 embedding 层的权重,看看模型从类别变量中学到了什么。这能帮我们获得一些洞见,比如哪些电影我们大概需要全力回避…
Although embeddings are most widely known in the context of word embeddings for NLP, they are at least as important for categorical variables in general, such as for tabular data or collaborative filtering. They can even be used with non-neural models with great success.
尽管 embeddings 在自然语言的 word embeddings 领域知名度最高,但对于一般的类别变量问题(如表格数据或协同过滤推荐算法),它们同样重要。它们甚至可以在非神经网络模型中大获成功。
Resources 资源
Lesson resources 课程资源
- 第五课 笔记 - 感谢 @PoonamV
- 第五课 详尽笔记 - 感谢 @hiromi
- Notebooks:
- Excel spreadsheets:
- collab_filter.xlsx;Google Sheets full version;在需要运行 solver 时,请使用 Google Sheets short-cut 版本并按照 @Moody 的指南操作
- graddesc: Excel version;Google sheets version
- entropy_example.xlsx
- Lesson 5 in-class discussion thread
- Lesson 5 advanced discussion
- Links to different parts in video by @melonkernel
Other resources 其他资源
- NY Times Article - Finally, a Machine That Can Finish Your Sentence
- Netflix and Chill: Building a Recommendation System in Excel - Latent Factor Visualization in Excel blog post
- An overview of gradient descent optimization algorithms - Sebastian Ruder
Hey @Daniel, I could walk you through the PR process if you would like to. If you still find it too complicated, I can certainly submit the PR for you.
Lesson 1: Image classification
第一课:图片分类
You can click the blue arrow buttons on the left and right panes to hide them and make more room for the video. You can search the transcript using the text box at the bottom. Scroll down this page for links to many useful resources. If you have any other suggestions for links, edits, or anything else, you’ll find an “edit” link at the bottom of this (and every) notes panel.
点击左右两侧的蓝色箭头按钮可以隐藏面板,给视频留出更多空间。你可以用屏幕下方的文本框搜索字幕。向下滚动本页面可以找到大量有用资源的链接。如果你有其他链接、修改等建议,可以使用本笔记面板(以及每个笔记面板)最下方的"编辑"链接。
Overview 综述
To follow along with the lessons, you’ll need to connect to a cloud GPU provider which has the fastai library installed (recommended; it should take only 5 minutes or so, and cost under $0.50/hour), or set up a computer with a suitable GPU yourself (which can take days to get working if you’re not familiar with the process, so we don’t recommend it). You’ll also need to be familiar with the basics of the Jupyter Notebook environment we use for running deep learning experiments. Up to date tutorials and recommendations for these are available from the course website.
跟随课程学习,你需要连接一个预装了 fastai 库的云端 GPU 服务(推荐;设置大约只需 5 分钟,费用低于每小时 0.5 美元),或者自己配置一台带合适 GPU 的电脑(如果不熟悉流程,可能要花好几天才能跑通,因此不推荐)。你还需要熟悉我们用来运行深度学习实验的 Jupyter Notebook 基本环境。最新的教程和推荐配置可以在课程官网查看。
The key outcome of this lesson is that we’ll have trained an image classifier which can recognize pet breeds at state of the art accuracy. The key to this success is the use of transfer learning, which will be a key platform for much of this course. We’ll also see how to analyze the model to understand its failure modes. In this case, we’ll see that the places where the model is making mistakes is in the same areas that even breeding experts can make mistakes.
本课的核心目标是训练一个图片分类器,把宠物品种识别做到顶尖(state of the art)的精确度。成功的关键是迁移学习 transfer learning,它也是本课程大部分内容的重要基础。我们还会学习如何分析模型、理解它的出错模式,并将看到:模型犯错的地方,往往也是宠物品种鉴定专家容易判断出错的地方。
We’ll discuss the overall approach of the course, which is somewhat unusual in being top-down rather than bottom-up. So rather than starting with theory, and only getting to practical applications later, instead we start with practical applications, and then gradually dig deeper and deeper in to them, learning the theory as needed. This approach takes more work for teachers to develop, but it’s been shown to help students a lot, for example in education research at Harvard by David Perkins.
我们还将探讨本课程的授课模式:自上而下,而非自下而上。也就是说,我们从实际应用开始,再根据需要逐步深入理论,而不是先讲完理论才慢慢开始实践。这种方法对老师而言备课更费工夫,但已被证明对学生帮助很大,例如哈佛大学 David Perkins 的教育研究(education research at Harvard by David Perkins)所示。
We also discuss how to set the most important hyper-parameter when training neural networks: the learning rate, using Leslie Smith’s fantastic learning rate finder method. Finally, we’ll look at the important but rarely discussed topic of labeling, and learn about some of the features that fastai provides for allowing you to easily add labels to your images.
我们还将讨论训练神经网络时最重要的超参数 hyper-parameter:学习率,并使用 Leslie Smith 出色的 learning rate finder 方法来设置它。最后,我们将研究很少被讨论但非常重要的 labeling 数据标注,并学习 fastai 提供的轻松为图片添加标注的功能。
If you want to more deeply understand how PyTorch really works, you may want to check out this official PyTorch tutorial by Jeremy—although we’d only suggest doing that once you’ve completed a few lessons.
如果你想更深入地理解 PyTorch 的实际工作原理,可以参看 this official PyTorch tutorial by Jeremy。不过建议你先学完本课程的几节课再看。
Links 链接
Lesson resources 课程资源
- Course site, 课程官网包含了所有平台的GPU设置指南
- Course repo 课程的github repo
- fastai docs library文档
- fastai datasets 课程用到的所有数据集
- Notebooks:
- 第一课 详尽笔记 - 感谢 @hiromi
- 第一课笔记 - 感谢 @PoonamV (wiki thread - 欢迎大家贡献共建!)
- 课程探讨 thread
Other resources 其他资源
- Thread on creating your own image dataset
- What you need to do deep learning (fast.ai 博客讲解了什么是GPU以及它们的必要性)
- Original Paper for Oxford-IIIT Pet Dataset
- The Oxford-IIIT Pet Dataset
- What the Regular Expressions in the notebook meant
- Understanding Regular Expressions (12 分钟视频)
- Visualize Regular Expressions
- Interactive tutorial to learn Regular Expressions
- Beginners Tutorial of Regular Expression
- One-Cycle Policy Fitting paper
- Visualizing and Understanding Convolutional Networks (paper)
How to scrape images 如何从网页爬取图片
- 官方课程指南
- Tips for building large image datasets
- Generating image datasets quickly
- How to scrape the web for images?
编辑此页面
编辑此页面, 点击这里. 你会进入GitHub一个页面让你上交修改。它们会自动生成 pull request 然后经由管理员审核后发布。
Hello Yangdf, we have set up a fastai study group in Shanghai. See the QR code.
We also have a Slack channel for collaborative work. Here is an invitation to the channel.
Our first Meetup will take place in two Shanghai locations: Pudong and Hongqiao, on March 24th, 2019.
Welcome to the club! 欢迎来到俱乐部!
Thanks a lot! I would like you to do the submission for me. Do you mind just using what I posted here, or would you like me to send you the translations as md files?
Lesson 4: NLP; Tabular data; Collaborative filtering; Embeddings
第四课:自然语言,表格数据,推荐系统算法collab, 嵌入层embeddings
Overview 综述
In lesson 4 we’ll dive in to natural language processing (NLP), using the IMDb movie review dataset. In this task, our goal is to predict whether a movie review is positive or negative; this is called sentiment analysis. We’ll be using the ULMFiT algorithm, which was originally developed during the fast.ai 2018 course, and became part of a revolution in NLP during 2018 which led the New York Times to declare that new systems are starting to crack the code of natural language. ULMFiT is today the most accurate known sentiment analysis algorithm.
本课里我们将通过 IMDb 电影评论数据集深入学习自然语言处理 NLP。我们的任务是预测一条影评是正面还是负面,也就是情绪分析 sentiment analysis。我们将采用 ULMFiT 算法,它最初是在 fast.ai 2018 年课程中开发的,随后成为 2018 年 NLP 革命性进展的一部分,纽约时报还因此发文称新系统正在开始破解自然语言的密码。如今 ULMFiT 是已知最准确的情绪分析算法。
The basic steps are:
- Create (or, preferred, download a pre-trained) language model trained on a large corpus such as Wikipedia (a “language model” is any model that learns to predict the next word of a sentence)
- Fine-tune this language model using your target corpus (in this case, IMDb movie reviews)
- Extract the encoder from this fine tuned language model, and pair it with a classifier. Then fine-tune this model for the final classification task (in this case, sentiment analysis).
基本步骤:
- 创建(或者,更推荐:下载一个预训练好的)语言模型 language model,它在维基百科等大型语料库上训练而来(所谓"语言模型",就是学习预测句子下一个词的模型)
- 用你的目标数据集微调这个语言模型(在我们的案例中,目标数据集是IMDb影评数据)
- 从这个微调的语言模型中提取encoder, 再给配上一个分类器。然后为最后的分类任务(也就是情绪判断)来微调模型。
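上面三个步骤用 fastai v1 的 API 大致可以写成下面的样子。这只是一份未实际运行的示意草图:path、'texts.csv' 等均为假设的占位,具体参数请以课程 notebook 和官方文档为准。

```python
from fastai.text import *  # fastai v1,课程所用版本

# 步骤 1+2:在目标语料上微调预训练的 AWD_LSTM 语言模型
data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')
learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn_lm.fit_one_cycle(1, 1e-2)
learn_lm.save_encoder('ft_enc')

# 步骤 3:取出微调后的 encoder,接上分类器,再为情绪分析微调
data_clas = TextClasDataBunch.from_csv(path, 'texts.csv', vocab=data_lm.vocab)
learn_clas = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn_clas.load_encoder('ft_enc')
learn_clas.fit_one_cycle(1, 1e-2)
```

关键点在 save_encoder / load_encoder 这一对调用:分类器复用了语言模型学到的文本表示。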
After our journey into NLP, we’ll complete our practical applications for Practical Deep Learning for Coders by covering tabular data (such as spreadsheets and database tables), and collaborative filtering (recommendation systems).
完成NLP后,我们还会覆盖表格数据问题如excel和数据库中的表格,以及解决推荐系统问题的collaborative filtering 算法。到此为止,我们覆盖了全课程所有的深度学习应用。
For tabular data, we’ll see how to use categorical and continuous variables, and how to work with the fastai.tabular module to set up and train a model.
就表格数据而言,我们会学到如何使用类别和连续变量,如何使用fastai.tabular
模块来设置和训练模型。
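fastai.tabular 的典型用法大致如下。这同样是 fastai v1 API 的示意草图,未实际运行:df、cat_names、cont_names、valid_idx、dep_var 等变量均为假设的占位。

```python
from fastai.tabular import *  # fastai v1

# 预处理步骤:填补缺失值、把类别变量转为整数编码、标准化连续变量
procs = [FillMissing, Categorify, Normalize]

data = (TabularList.from_df(df, cat_names=cat_names,
                            cont_names=cont_names, procs=procs)
        .split_by_idx(valid_idx)        # 划分验证集
        .label_from_df(cols=dep_var)    # 指定因变量列
        .databunch())

learn = tabular_learner(data, layers=[200, 100], metrics=accuracy)
learn.fit_one_cycle(1, 1e-2)
```

类别变量会自动获得各自的 embedding 层,连续变量则直接进入后续的全连接层。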
Then we’ll see how collaborative filtering models can be built using similar ideas to those for tabular data, but with some special tricks to get both higher accuracy and more informative model interpretation.
随后我们将用表格数据问题所学来构建collaborative filtering模型,但是在使用了特殊技巧后,模型的准确度不仅更高,而且更具解释性。
This brings us to the half-way point of the course, where we have looked at how to build and interpret models in each of these key application areas:
- Computer vision
- NLP
- Tabular
- Collaborative filtering
到此,我们已完成了一半的课程,覆盖了全部的应用领域:
- 机器视觉
- 自然语言
- 表格数据
- 推荐系统的 collaborative filtering
For the second half of the course, we’ll learn about how these models really work, and how to create them ourselves from scratch. For this lesson, we’ll put together some of the key pieces we’ve touched on so far:
- Activations
- Parameters
- Layers (affine and non-linear)
- Loss function.
在课程的后半段,我们将学习这些模型到底是如何工作的,以及如何从零手写它们。本节课,我们将对以下已接触过的核心概念做梳理:
- 激活层(值)
- 参数(权重)
- 层(affine线性 和非线性)
- 损失函数
We’ll be coming back to each of these in lots more detail during the remaining lessons. We’ll also learn about a type of layer that is important for NLP, collaborative filtering, and tabular models: the embedding layer. As we’ll discover, an “embedding” is simply a computational shortcut for a particular type of matrix multiplication (a multiplication by a one-hot encoded matrix).
我们会在后续的课时中进一步探索以上概念的相关细节。我们会学到对自然语言,Collaborative filtering, 以及表格数据模型都很重要的一种神经网络层设计:嵌入层 embedding layer。 我们会发现,其实“嵌入层”就是一种特殊数组乘法matrix multiplication (基于one-hot encoded 数组乘法)的简化算法。
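"embedding 只是 one-hot 矩阵乘法的计算捷径"这句话可以直接验证:one-hot 向量乘以权重矩阵,恰好取出矩阵的某一行,与数组索引完全等价(numpy 示意):

```python
import numpy as np

rng = np.random.default_rng(0)
n_categories, emb_dim = 5, 3
emb_weights = rng.normal(size=(n_categories, emb_dim))  # embedding 层的权重

idx = 2  # 某个类别变量的整数编码

# 方式一:one-hot 编码后做矩阵乘法
one_hot = np.zeros(n_categories)
one_hot[idx] = 1.0
via_matmul = one_hot @ emb_weights

# 方式二:直接按行取值 -- embedding 层实际做的事
via_lookup = emb_weights[idx]

print(np.allclose(via_matmul, via_lookup))  # True
```

按行取值省去了构造 one-hot 矩阵和整次乘法的开销,但梯度和结果与矩阵乘法版本完全相同。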
Resources 资源
Lesson resources 课程资源
- Notebooks:
- Excel spreadsheets:
- 视频节点清单 by @melonkernel
- 第四课 笔记 by @PoonamV
- 第四课 详尽笔记 by @hiromi
- 简介版笔记 from @boy1729
- 课内探讨
- 课内高阶探讨
Other resources 其他资源
- QCon.ai keynote on Analyzing & Preventing Unconscious Bias in Machine Learning
- PyBay keynote with case studies of what can go wrong, and steps toward solutions
- Workshop on Word Embeddings, Bias in ML, Why You Don’t Like Math, & Why AI Needs You
- AI Ethics Resources (includes links to experts to follow)
- What HBR Gets Wrong About Algorithms and Bias
- When Data Science Destabilizes Democracy and Facilitates Genocide
- Vim and Ctags for fast function definition lookup
Lesson 3: Data blocks; Multi-label classification; Segmentation
第三课:Data blocks;多标签分类;图像分割
Overview 综述
Lots to cover today! We start lesson 3 looking at an interesting dataset: Planet’s Understanding the Amazon from Space. In order to get this data in to the shape we need it for modeling, we’ll use one of fastai’s most powerful (and unique!) tools: the data block API. We’ll be coming back to this API many times over the coming lessons, and mastery of it will make you a real fastai superstar! Once you’ve finished this lesson, if you’re ready to learn more about the data block API, have a look at this great article: Finding Data Block Nirvana, by Wayde Gilliam.
本节课内容很多!一开始我们要看一个非常有趣的数据集:Planet’s Understanding the Amazon from Space. 为了让数据能“喂给”模型,我们需要用fastai强大且独特的data block API工具来处理数据。在后续的课时中,我们也会反复使用这个API,熟练掌握它能让你成为真正的fastai超级明星!当你完成本节课,如果你准备好学习更多data block API,可以看看这篇很棒的文章Finding Data Block Nirvana, 作者是 Wayde Gilliam.
One important feature of the Planet dataset is that it is a multi-label dataset. That is: each satellite image can contain multiple labels, whereas previous datasets we’ve looked at have had exactly one label per image. We’ll look at what changes we need to make to work with multi-label datasets.
planet数据集一个重要特征是多标签multi-label。也就是说:每张卫星图片可以包含多个标签/标注,而之前的数据集我们面对的是一张图对应一个标注。我们会学到需要做哪些调整来处理这个多标签问题。
Next, we will look at image segmentation, which is the process of labeling every pixel in an image with a category that shows what kind of object is portrayed by that pixel. We will use similar techniques to the earlier image classification models, with a few tweaks. fastai makes image segmentation modeling and interpretation just as easy as image classification, so there won’t be too many tweaks required.
接下来,我们将学习 image segmentation 图像分割,也就是给图片中的每个像素标注类别,从而知道该像素描绘的是哪类物体。我们会沿用之前图片分类模型的类似技巧,只需少量调整。fastai 把图像分割的建模和解读做得跟图片分类一样简单,因此不会有太多需要调整的地方。
We will be using the popular Camvid dataset for this part of the lesson. In future lessons, we will come back to it and show a few extra tricks. Our final Camvid model will have dramatically lower error than any model we’ve been able to find in the academic literature!
我们将用著名的 Camvid 数据集来做图像分割。后续课时中还会回到它,展示更多技巧。我们最终的 Camvid 模型,错误率将大幅低于我们在学术文献中能找到的任何模型!
What if your dependent variable is a continuous value, instead of a category? We answer that question next, looking at a keypoint dataset, and building a model that predicts face keypoints with high accuracy.
如果你的目标变量是连续的,而非类别,怎么办?我们将用下一个数据集keypoint来回答,我们将构建一个模型做高精度的脸部关键点预测。
Resources资源
Lesson resources 课程资源
- 第三课笔记 from @PoonamV
- 第三课 详尽笔记 by @hiromi
- 课程 notebooks 需要 fastai 1.0.21 或更新版本。请用 conda install -c fastai fastai(或其他适合你平台的命令)升级,并别忘了用 git pull 更新 notebooks
- Notebooks:
- Lesson 3 in-class discussion
- Links to different parts in video by @melonkernel
Other resources 其他资源
- 介绍ML背景知识的在线课程:
– Introduction to Machine Learning for Coders 作者 @jeremy
– Machine Learning 作者 Andrew Ng (coursera)
- Video Browser with Searchable Transcripts Password: deeplearningSF2018 (do not share outside the forum group) - PRs welcome.
- Quick and easy model deployment using Render
- Introduction to Kaggle API in Google Colab (Part-I) 作者 @mmiakashs
- Data block API
- Python partials
- MoviePy @rachel提到的 python 视频剪辑工具
- WebRTC example for web video 作者 @etown
- Nov 14 Meetup (wait list) Conversation between Jeremy Howard and Leslie Smith
- List of transforms in vision.transform package
Further reading 深入阅读
- Cyclical Learning Rates for Training Neural Networks Leslie Smith的论文
- ULMFit fine-tuning for NLP Classification used in language_model_learner()
- Michael Nielsen’s book