fast.ai v3 2019 Course Notes (Chinese edition)

Lesson 6: pets revisited

Three magic lines and the required library

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai.vision import *
Set the batch size

bs = 64
Download the data and get the image folder path

path = untar_data(URLs.PETS)/'images'

Data augmentation

Apply specific transforms to the images

tfms = get_transforms(max_rotate=20, max_zoom=1.3, max_lighting=0.4, max_warp=0.4,
                      p_affine=1., p_lighting=1.)
View the get_transforms documentation

doc(get_transforms)
Build the data src

src = ImageList.from_folder(path).random_split_by_pct(0.2, seed=2)
Create a custom function to build a DataBunch

def get_data(size, bs, padding_mode='reflection'):
    return (src.label_from_re(r'([^/]+)_\d+.jpg$')
           .transform(tfms, size=size, padding_mode=padding_mode)
           .databunch(bs=bs).normalize(imagenet_stats))
Show the same image under various transforms (padding_mode='zeros')

data = get_data(224, bs, 'zeros')
def _plot(i,j,ax):
    x,y = data.train_ds[3]
    x.show(ax, y=y)

plot_multi(_plot, 3, 3, figsize=(8,8))

Show the same image under various transforms (padding_mode='reflection')

data = get_data(224,bs)
plot_multi(_plot, 3, 3, figsize=(8,8))

Train a model

Free up memory

gc.collect()
Build a model with transfer learning (bn_final=True)

learn = create_cnn(data, models.resnet34, metrics=error_rate, bn_final=True)
Train the model (pct_start=0.8)

learn.fit_one_cycle(3, slice(1e-2), pct_start=0.8)

Total time: 01:22

epoch train_loss valid_loss error_rate
1 2.573282 1.364505 0.271989
2 1.545074 0.377077 0.094046
3 0.937992 0.270508 0.068336
Unfreeze and train again with max_lr=slice(1e-6,1e-3)

learn.unfreeze()
learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-3), pct_start=0.8)

Total time: 00:55

epoch train_loss valid_loss error_rate
1 0.721187 0.294177 0.058187
2 0.675999 0.285875 0.050744
Change the image size of the data

data = get_data(352,bs)
learn.data = data
Train again with max_lr=slice(1e-6,1e-4)

learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-4))

Total time: 01:37

epoch train_loss valid_loss error_rate
1 0.627055 0.286791 0.058863
2 0.602765 0.286951 0.058863
Save the model

learn.save('352')

Convolution kernel

Shrink the batch size

data = get_data(352,16)
Load the previously trained model

learn = create_cnn(data, models.resnet34, metrics=error_rate, bn_final=True).load('352')
Show the first item of the validation set (image and label)

idx=0
x,y = data.valid_ds[idx]
x.show()
data.valid_ds.y[idx]
Category american_pit_bull_terrier

Create a kernel (filter)

k = tensor([
    [0.  ,-5/3,1],
    [-5/3,-5/3,1],
    [1.  ,1   ,1],
]).expand(1,3,3,3)/6  # replicate the 3x3 kernel across the 3 input channels -> shape (1, 3, 3, 3)
k
tensor([[[[ 0.0000, -0.2778,  0.1667],
          [-0.2778, -0.2778,  0.1667],
          [ 0.1667,  0.1667,  0.1667]],

         [[ 0.0000, -0.2778,  0.1667],
          [-0.2778, -0.2778,  0.1667],
          [ 0.1667,  0.1667,  0.1667]],

         [[ 0.0000, -0.2778,  0.1667],
          [-0.2778, -0.2778,  0.1667],
          [ 0.1667,  0.1667,  0.1667]]]])
k.shape
torch.Size([1, 3, 3, 3])
Extract the image tensor of one validation item

t = data.valid_ds[0][0].data; t.shape
torch.Size([3, 352, 352])
Turn the 3D tensor into a 4D one

t[None].shape
torch.Size([1, 3, 352, 352])
Apply the filter to the 4D tensor

edge = F.conv2d(t[None], k)
Show the result of the filtering

show_image(edge[0], figsize=(5,5));

Check data.c (the number of classes)

data.c
37
View the model architecture

learn.model
Sequential(
  (0): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (4): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (5): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (6): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (4): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (5): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (7): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (1): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=1)
      (mp): AdaptiveMaxPool2d(output_size=1)
    )
    (1): Flatten()
    (2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25)
    (4): Linear(in_features=1024, out_features=512, bias=True)
    (5): ReLU(inplace)
    (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.5)
    (8): Linear(in_features=512, out_features=37, bias=True)
    (9): BatchNorm1d(37, eps=1e-05, momentum=0.01, affine=True, track_running_stats=True)
  )
)
Print the model summary

print(learn.summary())
======================================================================
Layer (type)         Output Shape         Param #    Trainable 
======================================================================
Conv2d               [16, 64, 176, 176]   9408       False     
BatchNorm2d          [16, 64, 176, 176]   128        True      
ReLU                 [16, 64, 176, 176]   0          False     
MaxPool2d            [16, 64, 88, 88]     0          False     
Conv2d               [16, 64, 88, 88]     36864      False     
BatchNorm2d          [16, 64, 88, 88]     128        True      
ReLU                 [16, 64, 88, 88]     0          False     
Conv2d               [16, 64, 88, 88]     36864      False     
BatchNorm2d          [16, 64, 88, 88]     128        True      
Conv2d               [16, 64, 88, 88]     36864      False     
BatchNorm2d          [16, 64, 88, 88]     128        True      
ReLU                 [16, 64, 88, 88]     0          False     
Conv2d               [16, 64, 88, 88]     36864      False     
BatchNorm2d          [16, 64, 88, 88]     128        True      
Conv2d               [16, 64, 88, 88]     36864      False     
BatchNorm2d          [16, 64, 88, 88]     128        True      
ReLU                 [16, 64, 88, 88]     0          False     
Conv2d               [16, 64, 88, 88]     36864      False     
BatchNorm2d          [16, 64, 88, 88]     128        True      
Conv2d               [16, 128, 44, 44]    73728      False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
ReLU                 [16, 128, 44, 44]    0          False     
Conv2d               [16, 128, 44, 44]    147456     False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
Conv2d               [16, 128, 44, 44]    8192       False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
Conv2d               [16, 128, 44, 44]    147456     False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
ReLU                 [16, 128, 44, 44]    0          False     
Conv2d               [16, 128, 44, 44]    147456     False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
Conv2d               [16, 128, 44, 44]    147456     False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
ReLU                 [16, 128, 44, 44]    0          False     
Conv2d               [16, 128, 44, 44]    147456     False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
Conv2d               [16, 128, 44, 44]    147456     False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
ReLU                 [16, 128, 44, 44]    0          False     
Conv2d               [16, 128, 44, 44]    147456     False     
BatchNorm2d          [16, 128, 44, 44]    256        True      
Conv2d               [16, 256, 22, 22]    294912     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
ReLU                 [16, 256, 22, 22]    0          False     
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
Conv2d               [16, 256, 22, 22]    32768      False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
ReLU                 [16, 256, 22, 22]    0          False     
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
ReLU                 [16, 256, 22, 22]    0          False     
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
ReLU                 [16, 256, 22, 22]    0          False     
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
ReLU                 [16, 256, 22, 22]    0          False     
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
ReLU                 [16, 256, 22, 22]    0          False     
Conv2d               [16, 256, 22, 22]    589824     False     
BatchNorm2d          [16, 256, 22, 22]    512        True      
Conv2d               [16, 512, 11, 11]    1179648    False     
BatchNorm2d          [16, 512, 11, 11]    1024       True      
ReLU                 [16, 512, 11, 11]    0          False     
Conv2d               [16, 512, 11, 11]    2359296    False     
BatchNorm2d          [16, 512, 11, 11]    1024       True      
Conv2d               [16, 512, 11, 11]    131072     False     
BatchNorm2d          [16, 512, 11, 11]    1024       True      
Conv2d               [16, 512, 11, 11]    2359296    False     
BatchNorm2d          [16, 512, 11, 11]    1024       True      
ReLU                 [16, 512, 11, 11]    0          False     
Conv2d               [16, 512, 11, 11]    2359296    False     
BatchNorm2d          [16, 512, 11, 11]    1024       True      
Conv2d               [16, 512, 11, 11]    2359296    False     
BatchNorm2d          [16, 512, 11, 11]    1024       True      
ReLU                 [16, 512, 11, 11]    0          False     
Conv2d               [16, 512, 11, 11]    2359296    False     
BatchNorm2d          [16, 512, 11, 11]    1024       True      
AdaptiveAvgPool2d    [16, 512, 1, 1]      0          False     
AdaptiveMaxPool2d    [16, 512, 1, 1]      0          False     
Flatten              [16, 1024]           0          False     
BatchNorm1d          [16, 1024]           2048       True      
Dropout              [16, 1024]           0          False     
Linear               [16, 512]            524800     True      
ReLU                 [16, 512]            0          False     
BatchNorm1d          [16, 512]            1024       True      
Dropout              [16, 512]            0          False     
Linear               [16, 37]             18981      True      
BatchNorm1d          [16, 37]             74         True      
======================================================================
Total params: 21831599
Total trainable params: 563951
Total non-trainable params: 21267648

Heatmap

Put the model in eval mode for the forward pass

m = learn.model.eval();
Grab one data item (the x part only)

xb,_ = data.one_item(x)
Denormalize x, then convert it back into an Image

xb_im = Image(data.denorm(xb)[0])
Move x to the GPU

xb = xb.cuda()
Import everything from callbacks.hooks

from fastai.callbacks.hooks import *
Build a function to hook the model's activations and gradients

def hooked_backward(cat=y):
    # hook the convolutional body m[0]: store its output activations,
    # and (with grad=True) the gradients flowing back into that output
    with hook_output(m[0]) as hook_a: 
        with hook_output(m[0], grad=True) as hook_g:
            preds = m(xb)
            preds[0,int(cat)].backward()  # backprop from the score of class `cat`
    return hook_a,hook_g
hook_a,hook_g = hooked_backward()
Extract the stored activations and average them over the channel dimension

acts  = hook_a.stored[0].cpu()
acts.shape
torch.Size([512, 11, 11])
avg_acts = acts.mean(0)
avg_acts.shape
torch.Size([11, 11])
Build a heatmap plotting function

def show_heatmap(hm):
    _,ax = plt.subplots()
    xb_im.show(ax)
    ax.imshow(hm, alpha=0.6, extent=(0,352,352,0),
              interpolation='bilinear', cmap='magma');
show_heatmap(avg_acts)

Grad-CAM

The heatmap method proposed in the paper

Paper: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

Example 1

grad = hook_g.stored[0][0].cpu()  # gradients w.r.t. the body's output activations
grad_chan = grad.mean(1).mean(1)  # average over the spatial dims -> one weight per channel
grad.shape,grad_chan.shape
(torch.Size([512, 11, 11]), torch.Size([512]))
mult = (acts*grad_chan[...,None,None]).mean(0)  # Grad-CAM: channel-weighted average of the activations
show_heatmap(mult)

Example 2

fn = path/'../other/bulldog_maine.jpg' #Replace with your own image
x = open_image(fn); x
xb,_ = data.one_item(x)
xb_im = Image(data.denorm(xb)[0])
xb = xb.cuda()
hook_a,hook_g = hooked_backward()
acts = hook_a.stored[0].cpu()
grad = hook_g.stored[0][0].cpu()

grad_chan = grad.mean(1).mean(1)
mult = (acts*grad_chan[...,None,None]).mean(0)
show_heatmap(mult)

Example 3: by changing the target class, the heatmap shifts its focus from the cat to the dog

data.classes[0]
'american_bulldog'
hook_a,hook_g = hooked_backward(0)
acts = hook_a.stored[0].cpu()
grad = hook_g.stored[0][0].cpu()

grad_chan = grad.mean(1).mean(1)
mult = (acts*grad_chan[...,None,None]).mean(0)
show_heatmap(mult)

Lesson 6: Rossmann sales prediction

Two magic lines

%reload_ext autoreload
%autoreload 2
Required library

from fastai.tabular import *

Data preparation

Feature engineering

To create the feature-engineered train_clean and test_clean from the Kaggle competition data, run rossman_data_clean.ipynb. One important step that deals with time series is this:

add_datepart(train, "Date", drop=False)
add_datepart(test, "Date", drop=False)
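
A rough sketch of what add_datepart produces (the toy dataframe is illustrative, but the generated column names match fastai v1 and the train_df.head().T output below):

import pandas as pd
from fastai.tabular import *

df = pd.DataFrame({'Date': pd.to_datetime(['2015-07-31']), 'Sales': [5263]})
add_datepart(df, 'Date', drop=False)  # drop=False keeps the original Date column
# adds Year, Month, Week, Day, Dayofweek, Dayofyear, Is_month_end, Is_month_start,
# Is_quarter_end, Is_quarter_start, Is_year_end, Is_year_start and Elapsed
print([c for c in df.columns if c not in ('Date', 'Sales')])
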
Load the feature-engineered data

path = Config().data_path()/'rossmann'
train_df = pd.read_pickle(path/'train_clean')
Inspect the data

train_df.head().T
0 1 2 3 4
index 0 1 2 3 4
Store 1 2 3 4 5
DayOfWeek 5 5 5 5 5
Date 2015-07-31 2015-07-31 2015-07-31 2015-07-31 2015-07-31
Sales 5263 6064 8314 13995 4822
Customers 555 625 821 1498 559
Open 1 1 1 1 1
Promo 1 1 1 1 1
StateHoliday False False False False False
SchoolHoliday 1 1 1 1 1
Year 2015 2015 2015 2015 2015
Month 7 7 7 7 7
Week 31 31 31 31 31
Day 31 31 31 31 31
Dayofweek 4 4 4 4 4
Dayofyear 212 212 212 212 212
Is_month_end True True True True True
Is_month_start False False False False False
Is_quarter_end False False False False False
Is_quarter_start False False False False False
Is_year_end False False False False False
Is_year_start False False False False False
Elapsed 1438300800 1438300800 1438300800 1438300800 1438300800
StoreType c a a c a
Assortment a a a c a
CompetitionDistance 1270 570 14130 620 29910
CompetitionOpenSinceMonth 9 11 12 9 4
CompetitionOpenSinceYear 2008 2007 2006 2009 2015
Promo2 0 1 1 0 0
Promo2SinceWeek 1 13 14 1 1
... ... ... ... ... ...
Min_Sea_Level_PressurehPa 1015 1017 1017 1014 1016
Max_VisibilityKm 31 10 31 10 10
Mean_VisibilityKm 15 10 14 10 10
Min_VisibilitykM 10 10 10 10 10
Max_Wind_SpeedKm_h 24 14 14 23 14
Mean_Wind_SpeedKm_h 11 11 5 16 11
Max_Gust_SpeedKm_h NaN NaN NaN NaN NaN
Precipitationmm 0 0 0 0 0
CloudCover 1 4 2 6 4
Events Fog Fog Fog NaN NaN
WindDirDegrees 13 309 354 282 290
StateName Hessen Thueringen NordrheinWestfalen Berlin Sachsen
CompetitionOpenSince 2008-09-15 2007-11-15 2006-12-15 2009-09-15 2015-04-15
CompetitionDaysOpen 2510 2815 3150 2145 107
CompetitionMonthsOpen 24 24 24 24 3
Promo2Since 1900-01-01 2010-03-29 2011-04-04 1900-01-01 1900-01-01
Promo2Days 0 1950 1579 0 0
Promo2Weeks 0 25 25 0 0
AfterSchoolHoliday 0 0 0 0 0
BeforeSchoolHoliday 0 0 0 0 0
AfterStateHoliday 57 67 57 67 57
BeforeStateHoliday 0 0 0 0 0
AfterPromo 0 0 0 0 0
BeforePromo 0 0 0 0 0
SchoolHoliday_bw 5 5 5 5 5
StateHoliday_bw 0 0 0 0 0
Promo_bw 5 5 5 5 5
SchoolHoliday_fw 7 1 5 1 1
StateHoliday_fw 0 0 0 0 0
Promo_fw 5 1 5 1 1

93 rows × 5 columns

n = len(train_df); n
844338
Experimenting with a sample

idx = np.random.permutation(range(n))[:2000]
idx.sort()
small_train_df = train_df.iloc[idx[:1000]]
small_test_df = train_df.iloc[idx[1000:]]
small_cont_vars = ['CompetitionDistance', 'Mean_Humidity']
small_cat_vars =  ['Store', 'DayOfWeek', 'PromoInterval']
small_train_df = small_train_df[small_cat_vars + small_cont_vars + ['Sales']]
small_test_df = small_test_df[small_cat_vars + small_cont_vars + ['Sales']]
small_train_df.head()
Store DayOfWeek PromoInterval CompetitionDistance Mean_Humidity Sales
267 268 5 NaN 4520.0 67 7492
604 606 5 NaN 2260.0 61 7187
983 986 5 Feb,May,Aug,Nov 620.0 61 7051
1636 525 4 NaN 1870.0 55 9673
2348 123 3 NaN 16760.0 50 10007
small_test_df.head()
Store DayOfWeek PromoInterval CompetitionDistance Mean_Humidity Sales
420510 829 3 NaN 110.0 55 6802
420654 973 3 Jan,Apr,Jul,Oct 330.0 59 6644
420990 194 2 Feb,May,Aug,Nov 16970.0 55 4720
421308 512 2 Mar,Jun,Sept,Dec 590.0 72 6248
421824 1029 2 NaN 1590.0 64 8004
Apply Categorify to cat_vars and cont_vars

categorify = Categorify(small_cat_vars, small_cont_vars)
categorify(small_train_df)
categorify(small_test_df, test=True)
small_test_df.head()
Store DayOfWeek PromoInterval CompetitionDistance Mean_Humidity Sales
420510 NaN 3 NaN 110.0 55 6802
420654 973.0 3 Jan,Apr,Jul,Oct 330.0 59 6644
420990 NaN 2 Feb,May,Aug,Nov 16970.0 55 4720
421308 512.0 2 Mar,Jun,Sept,Dec 590.0 72 6248
421824 1029.0 2 NaN 1590.0 64 8004
View the categories as text and as numeric codes

small_train_df.PromoInterval.cat.categories
Index(['Feb,May,Aug,Nov', 'Jan,Apr,Jul,Oct', 'Mar,Jun,Sept,Dec'], dtype='object')
small_train_df['PromoInterval'].cat.codes[:5]
267    -1
604    -1
983     0
1636   -1
2348   -1
dtype: int8
Handle missing values

fill_missing = FillMissing(small_cat_vars, small_cont_vars)
fill_missing(small_train_df)
fill_missing(small_test_df, test=True)
small_train_df[small_train_df['CompetitionDistance_na'] == True]
Store DayOfWeek PromoInterval CompetitionDistance Mean_Humidity Sales CompetitionDistance_na
185749 622 2 NaN 2300.0 93 4508 True

Preparing full data set

Load the full training and test data from the pickle files

train_df = pd.read_pickle(path/'train_clean')
test_df = pd.read_pickle(path/'test_clean')
len(train_df),len(test_df)
(844338, 41088)
Set up the preprocessing procs and the full cat_vars and cont_vars

procs=[FillMissing, Categorify, Normalize]
cat_vars = ['Store', 'DayOfWeek', 'Year', 'Month', 'Day', 'StateHoliday', 'CompetitionMonthsOpen',
    'Promo2Weeks', 'StoreType', 'Assortment', 'PromoInterval', 'CompetitionOpenSinceYear', 'Promo2SinceYear',
    'State', 'Week', 'Events', 'Promo_fw', 'Promo_bw', 'StateHoliday_fw', 'StateHoliday_bw',
    'SchoolHoliday_fw', 'SchoolHoliday_bw']

cont_vars = ['CompetitionDistance', 'Max_TemperatureC', 'Mean_TemperatureC', 'Min_TemperatureC',
   'Max_Humidity', 'Mean_Humidity', 'Min_Humidity', 'Max_Wind_SpeedKm_h', 
   'Mean_Wind_SpeedKm_h', 'CloudCover', 'trend', 'trend_DE',
   'AfterStateHoliday', 'BeforeStateHoliday', 'Promo', 'SchoolHoliday']
Select the columns for the training data

dep_var = 'Sales'
df = train_df[cat_vars + cont_vars + [dep_var,'Date']].copy()
Find the test set's date range

test_df['Date'].min(), test_df['Date'].max()
('2015-08-01', '2015-09-17')
Work out how much validation data we need from the size of the test set

cut = train_df['Date'][(train_df['Date'] == train_df['Date'][len(test_df)])].index.max()
# rows are sorted by date descending, so this marks off the most recent stretch of
# training data, the same length as the test period, to use as the validation set
cut
41395
valid_idx = range(cut)
df[dep_var].head()
0     5263
1     6064
2     8314
3    13995
4     4822
Name: Sales, dtype: int64
Build a DataBunch from df with TabularList

data = (TabularList.from_df(df, path=path, cat_names=cat_vars, cont_names=cont_vars, procs=procs,)
                .split_by_idx(valid_idx)
                .label_from_df(cols=dep_var, label_cls=FloatList, log=True)
                .add_test(TabularList.from_df(test_df, path=path, cat_names=cat_vars, cont_names=cont_vars))
                .databunch())
Look up the use of FloatList and log

doc(FloatList)

Model

Compute the y range

max_log_y = np.log(np.max(train_df['Sales'])*1.2)
y_range = torch.tensor([0, max_log_y], device=defaults.device)
Build the tabular learner

learn = tabular_learner(data, layers=[1000,500], ps=[0.001,0.01], emb_drop=0.04, 
                        y_range=y_range, metrics=exp_rmspe)
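
For reference, a sketch of the exp_rmspe metric used above (this mirrors fastai v1's definition: undo the log applied to the targets, then compute root mean squared percentage error):

import torch

def rmspe_from_log(log_pred, log_targ):
    pred, targ = torch.exp(log_pred), torch.exp(log_targ)  # back to the Sales scale
    pct_var = (targ - pred) / targ                         # per-item percentage error
    return torch.sqrt((pct_var ** 2).mean())
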
View the model

learn.model
TabularModel(
  (embeds): ModuleList(
    (0): Embedding(1116, 81)
    (1): Embedding(8, 5)
    (2): Embedding(4, 3)
    (3): Embedding(13, 7)
    (4): Embedding(32, 11)
    (5): Embedding(3, 3)
    (6): Embedding(26, 10)
    (7): Embedding(27, 10)
    (8): Embedding(5, 4)
    (9): Embedding(4, 3)
    (10): Embedding(4, 3)
    (11): Embedding(24, 9)
    (12): Embedding(9, 5)
    (13): Embedding(13, 7)
    (14): Embedding(53, 15)
    (15): Embedding(22, 9)
    (16): Embedding(7, 5)
    (17): Embedding(7, 5)
    (18): Embedding(4, 3)
    (19): Embedding(4, 3)
    (20): Embedding(9, 5)
    (21): Embedding(9, 5)
    (22): Embedding(3, 3)
    (23): Embedding(3, 3)
  )
  (emb_drop): Dropout(p=0.04)
  (bn_cont): BatchNorm1d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layers): Sequential(
    (0): Linear(in_features=233, out_features=1000, bias=True)
    (1): ReLU(inplace)
    (2): BatchNorm1d(1000, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.001)
    (4): Linear(in_features=1000, out_features=500, bias=True)
    (5): ReLU(inplace)
    (6): BatchNorm1d(500, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.01)
    (8): Linear(in_features=500, out_features=1, bias=True)
  )
)
Check the number of continuous variables in the data

len(data.train_ds.cont_names)
16
Plot to find a good learning rate

learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
learn.recorder.plot()

Train the model (lr=1e-3, wd=0.2)

learn.fit_one_cycle(5, 1e-3, wd=0.2)

Total time: 11:27

epoch train_loss valid_loss exp_rmspe
1 0.023587 0.020941 0.140551
2 0.017678 0.023431 0.132211
3 0.017453 0.016929 0.120169
4 0.012608 0.016296 0.109245
5 0.010222 0.011238 0.105433
learn.save('1')
Plot the losses

learn.recorder.plot_losses(last=-1)

learn.load('1');
Train twice more, 5 epochs each, at lr=3e-4

learn.fit_one_cycle(5, 3e-4)

Total time: 11:32

epoch train_loss valid_loss exp_rmspe
1 0.012223 0.014312 0.116988
2 0.012001 0.017789 0.117619
3 0.011402 0.035596 0.114396
4 0.010067 0.015125 0.113652
5 0.009148 0.031326 0.116344
learn.fit_one_cycle(5, 3e-4)

Total time: 11:31

epoch train_loss valid_loss exp_rmspe
1 0.011840 0.013236 0.110483
2 0.010765 0.057664 0.129586
3 0.010101 0.042744 0.111584
4 0.008820 0.116893 0.135458
5 0.009144 0.017969 0.126323
Predict and generate the submission

(10th place in the competition was 0.108)

test_preds=learn.get_preds(DatasetType.Test)
test_df["Sales"]=np.exp(test_preds[0].data).numpy().T[0]
test_df[["Id","Sales"]]=test_df[["Id","Sales"]].astype("int")
test_df[["Id","Sales"]].to_csv("rossmann_submission.csv",index=False)

How Jeremy explains the learning rate across the course

Lesson 2

How do you read the optimal learning-rate range off the plot?

Why do we need a learning rate at all?
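
A toy sketch of the answer (plain PyTorch, not from the lesson notebooks): each SGD step subtracts the gradient scaled by the learning rate, so the learning rate controls the step size; too large and the loss diverges, too small and training barely moves.

import torch

lr = 0.1
w = torch.tensor([0.0], requires_grad=True)
for _ in range(20):
    loss = (3 * w - 6).pow(2).sum()   # toy loss with its minimum at w = 2
    loss.backward()
    with torch.no_grad():
        w -= lr * w.grad              # the learning rate scales every update
        w.grad.zero_()
print(w)                              # converges towards 2.0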

Lesson 3

How to pick the learning rate carefully?

Lesson 5

What is fit_one_cycle?

Lesson 7: ResNets from scratch; U-net; Generative (adversarial) networks

Overview

In the final lesson of Practical Deep Learning for Coders we’ll study one of the most important techniques in modern architectures: the skip connection. This is most famously used in the resnet, which is the architecture we’ve used throughout this course for image classification, and appears in many cutting edge results. We’ll also look at the U-net architecture, which uses a different type of skip connection to greatly improve segmentation results (and also for similar tasks where the output structure is similar to the input).

We’ll then use the U-net architecture to train a super-resolution model. This is a model which can increase the resolution of a low-quality image. Our model won’t only increase resolution—it will also remove jpeg artifacts, and remove unwanted text watermarks.

In order to make our model produce high quality results, we will need to create a custom loss function which incorporates feature loss (also known as perceptual loss), along with gram loss. These techniques can be used for many other types of image generation task, such as image colorization.
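
As a rough sketch of the gram part (a standard formulation; the lesson notebook has its own variant): the gram matrix holds channel-to-channel correlations of a feature map, and gram loss compares these matrices between the generated and target images.

import torch

def gram_matrix(x):
    # x: (batch, channels, height, width) feature maps from a pretrained network
    b, c, h, w = x.shape
    f = x.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)  # (batch, channels, channels)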

Finally, we’ll learn about a recent loss function known as generative adversarial loss (used in generative adversarial networks, or GANs), which can improve the quality of generative models in some contexts, at the cost of speed.

The techniques we show in this lesson include some unpublished research that:

  • Let us train GANs more quickly and reliably than standard approaches, by leveraging transfer learning
  • Combines architectural innovations and loss function approaches that haven’t been used in this way before.

The results are stunning, and train in just a couple of hours (compared to previous approaches that take a couple of days).

Lesson Resources

Other Resources



Hi @PegasusWithoutWinds,

I will translate all 7 lesson md files here. Could you help me contribute them to the official course-v3 repo without me doing the PR myself (I find the process a bit tedious)? Is that possible?
Thanks!

Lesson 6: Regularization; Convolutions; Data ethics

Overview

Today we discuss some powerful techniques for improving training and avoiding over-fitting:

  • Dropout: remove activations at random during training in order to regularize the model (see the sketch just after this list)
  • Data augmentation: modify model inputs during training in order to effectively increase data size
  • Batch normalization: adjust the parameterization of a model in order to make the loss surface smoother.
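
A tiny illustration of the dropout bullet above (plain PyTorch, not from the lesson notebook): during training, activations are zeroed at random and the survivors rescaled by 1/(1-p); at inference, dropout does nothing.

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)
drop.train()
print(drop(x))   # roughly half zeroed, the rest scaled to 2.0
drop.eval()
print(drop(x))   # identity at inference time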

Next up, we’ll learn all about convolutions, which can be thought of as a variant of matrix multiplication with tied weights, and are the operation at the heart of modern computer vision models (and, increasingly, other types of models too).
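
A minimal sketch of that "matrix multiplication with tied weights" view (plain PyTorch; the kernel is an arbitrary example): im2col unfolds every 3x3 patch into a column, and a single shared weight row multiplies all of them, reproducing the conv2d result.

import torch
import torch.nn.functional as F

img = torch.rand(1, 1, 5, 5)                    # batch, channels, height, width
k = torch.tensor([[[[-1., 0., 1.],
                    [-1., 0., 1.],
                    [-1., 0., 1.]]]])           # (out_ch, in_ch, 3, 3)
out = F.conv2d(img, k, padding=1)

cols = F.unfold(img, kernel_size=3, padding=1)  # (1, 9, 25): one column per patch
out2 = (k.view(1, -1) @ cols).view(1, 1, 5, 5)  # one tied weight row times every column
assert torch.allclose(out, out2, atol=1e-5)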

We'll use this knowledge to create a class activation map, which is a heat-map that shows which parts of an image were most important in making a prediction.

Finally, we’ll cover a topic that many students have told us is the most interesting and surprising part of the course: data ethics. We’ll learn about some of the ways in which models can go wrong, with a particular focus on feedback loops, why they cause problems, and how to avoid them. We’ll also look at ways in which bias in data can lead to biased algorithms, and discuss questions that data scientists can and should be asking to help ensure that their work doesn’t lead to unexpected negative outcomes.

Lesson Resources

Other Resources



Lesson 5: Back propagation; Accelerated SGD; Neural net from scratch

Overview

In lesson 5 we put all the pieces of training together to understand exactly what is going on when we talk about back propagation. We’ll use this knowledge to create and train a simple neural network from scratch.

We’ll also see how we can look inside the weights of an embedding layer, to find out what our model has learned about our categorical variables. This will let us get some insights into which movies we should probably avoid at all costs…

Although embeddings are most widely known in the context of word embeddings for NLP, they are at least as important for categorical variables in general, such as for tabular data or collaborative filtering. They can even be used with non-neural models with great success.

Resources

Lesson resources

Other resources



Hey @Daniel, I could walk you through the PR process if you'd like. If you still find it too complicated, I can certainly submit the PR for you.

Lesson 1: Image classification

You can click the blue arrow buttons on the left and right panes to hide them and make more room for the video. You can search the transcript using the text box at the bottom. Scroll down this page for links to many useful resources. If you have any other suggestions for links, edits, or anything else, you’ll find an “edit” link at the bottom of this (and every) notes panel.

Overview

To follow along with the lessons, you’ll need to connect to a cloud GPU provider which has the fastai library installed (recommended; it should take only 5 minutes or so, and cost under $0.50/hour), or set up a computer with a suitable GPU yourself (which can take days to get working if you’re not familiar with the process, so we don’t recommend it). You’ll also need to be familiar with the basics of the Jupyter Notebook environment we use for running deep learning experiments. Up to date tutorials and recommendations for these are available from the course website.

The key outcome of this lesson is that we’ll have trained an image classifier which can recognize pet breeds at state of the art accuracy. The key to this success is the use of transfer learning, which will be a key platform for much of this course. We’ll also see how to analyze the model to understand its failure modes. In this case, we’ll see that the places where the model is making mistakes is in the same areas that even breeding experts can make mistakes.

We’ll discuss the overall approach of the course, which is somewhat unusual in being top-down rather than bottom-up. So rather than starting with theory, and only getting to practical applications later, instead we start with practical applications, and then gradually dig deeper and deeper in to them, learning the theory as needed. This approach takes more work for teachers to develop, but it’s been shown to help students a lot, for example in education research at Harvard by David Perkins.

We also discuss how to set the most important hyper-parameter when training neural networks: the learning rate, using Leslie Smith’s fantastic learning rate finder method. Finally, we’ll look at the important but rarely discussed topic of labeling, and learn about some of the features that fastai provides for allowing you to easily add labels to your images.

If you want to more deeply understand how PyTorch really works, you may want to check out this official PyTorch tutorial by Jeremy—although we’d only suggest doing that once you’ve completed a few lessons.

Links

Lesson resources

Other resources

How to scrape images


Hello Yangdf, we have set up a fastai study group in Shanghai. See the QR code.

We also have a slack channel for collaborative work. Here is an invitation to this slack channel.

Our first Meetup will take place in two Shanghai locations: Pudong and Hongqiao, on March 24th, 2019.

Welcome to the club!

Thanks a lot! I would like you to do the submission for me. Do you mind just using what I posted here, or would you like me to send you the translations in md files?

Lesson 4: NLP; Tabular data; Collaborative filtering; Embeddings

Overview

In lesson 4 we’ll dive in to natural language processing (NLP), using the IMDb movie review dataset. In this task, our goal is to predict whether a movie review is positive or negative; this is called sentiment analysis. We’ll be using the ULMFiT algorithm, which was originally developed during the fast.ai 2018 course, and became part of a revolution in NLP during 2018 which led the New York Times to declare that new systems are starting to crack the code of natural language. ULMFiT is today the most accurate known sentiment analysis algorithm.

The basic steps (sketched in code just after the list) are:

  1. Create (or, preferred, download a pre-trained) language model trained on a large corpus such as Wikipedia (a “language model” is any model that learns to predict the next word of a sentence)
  2. Fine-tune this language model using your target corpus (in this case, IMDb movie reviews)
  3. Extract the encoder from this fine tuned language model, and pair it with a classifier. Then fine-tune this model for the final classification task (in this case, sentiment analysis).
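
A minimal sketch of those three steps with the fastai v1 text API (assuming the IMDB_SAMPLE csv layout; the real notebook trains for longer and tunes more carefully):

from fastai.text import *

path = untar_data(URLs.IMDB_SAMPLE)

# 1. language model data + a language model pre-trained on Wikipedia
data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')
learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)

# 2. fine-tune the language model on the target corpus, then keep its encoder
learn_lm.fit_one_cycle(1, 1e-2)
learn_lm.save_encoder('ft_enc')

# 3. pair the fine-tuned encoder with a classifier and fine-tune for sentiment
data_clas = TextClasDataBunch.from_csv(path, 'texts.csv', vocab=data_lm.train_ds.vocab)
learn_clas = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn_clas.load_encoder('ft_enc')
learn_clas.fit_one_cycle(1, 1e-2)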

After our journey into NLP, we’ll complete our practical applications for Practical Deep Learning for Coders by covering tabular data (such as spreadsheets and database tables), and collaborative filtering (recommendation systems).

For tabular data, we’ll see how to use categorical and continuous variables, and how to work with the fastai.tabular module to set up and train a model.
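
A minimal fastai v1 tabular sketch (dataset, column names, and split taken from the lesson notebook's ADULT_SAMPLE example; the exact choices here are illustrative):

from fastai.tabular import *

path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
procs = [FillMissing, Categorify, Normalize]  # fill gaps, encode categories, normalize

data = (TabularList.from_df(df, path=path,
                            cat_names=['workclass', 'education', 'marital-status'],
                            cont_names=['age', 'fnlwgt', 'education-num'],
                            procs=procs)
        .split_by_idx(list(range(800, 1000)))
        .label_from_df(cols='salary')
        .databunch())
learn = tabular_learner(data, layers=[200, 100], metrics=accuracy)
learn.fit_one_cycle(1, 1e-2)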

Then we’ll see how collaborative filtering models can be built using similar ideas to those for tabular data, but with some special tricks to get both higher accuracy and more informative model interpretation.
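
And a matching collaborative-filtering sketch (MovieLens sample, as in the lesson; n_factors and y_range follow the notebook's choices):

from fastai.collab import *

path = untar_data(URLs.ML_SAMPLE)
ratings = pd.read_csv(path/'ratings.csv')
data = CollabDataBunch.from_df(ratings, seed=42)
# a y_range slightly wider than the 0-5 rating scale suits the sigmoid output
learn = collab_learner(data, n_factors=50, y_range=(0., 5.5))
learn.fit_one_cycle(3, 5e-3)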

This brings us to the half-way point of the course, where we have looked at how to build and interpret models in each of these key application areas:

  • Computer vision
  • NLP
  • Tabular
  • Collaborative filtering

For the second half of the course, we’ll learn about how these models really work, and how to create them ourselves from scratch. For this lesson, we’ll put together some of the key pieces we’ve touched on so far:

  • Activations
  • Parameters
  • Layers (affine and non-linear)
  • Loss function.

We’ll be coming back to each of these in lots more detail during the remaining lessons. We’ll also learn about a type of layer that is important for NLP, collaborative filtering, and tabular models: the embedding layer. As we’ll discover, an “embedding” is simply a computational shortcut for a particular type of matrix multiplication (a multiplication by a one-hot encoded matrix).
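
A tiny sketch of that equivalence (illustrative sizes): multiplying a one-hot row by the weight matrix selects one row, which is exactly what the embedding lookup returns.

import torch
import torch.nn as nn

vocab_sz, emb_sz = 5, 3
emb = nn.Embedding(vocab_sz, emb_sz)

idx = torch.tensor([2])
one_hot = torch.zeros(1, vocab_sz)
one_hot[0, 2] = 1.

# same result, but the lookup skips the wasteful multiply
assert torch.allclose(one_hot @ emb.weight, emb(idx))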

Resources

Lesson resources

Other resources



Lesson 3: Data blocks; Multi-label classification; Segmentation

Overview

Lots to cover today! We start lesson 3 looking at an interesting dataset: Planet’s Understanding the Amazon from Space. In order to get this data in to the shape we need it for modeling, we’ll use one of fastai’s most powerful (and unique!) tools: the data block API. We’ll be coming back to this API many times over the coming lessons, and mastery of it will make you a real fastai superstar! Once you’ve finished this lesson, if you’re ready to learn more about the data block API, have a look at this great article: Finding Data Block Nirvana, by Wayde Gilliam.

One important feature of the Planet dataset is that it is a multi-label dataset. That is: each satellite image can contain multiple labels, whereas previous datasets we’ve looked at have had exactly one label per image. We’ll look at what changes we need to make to work with multi-label datasets.

Next, we will look at image segmentation, which is the process of labeling every pixel in an image with a category that shows what kind of object is portrayed by that pixel. We will use similar techniques to the earlier image classification models, with a few tweaks. fastai makes image segmentation modeling and interpretation just as easy as image classification, so there won’t be too many tweaks required.
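
A hedged sketch of what that setup looks like in fastai v1 (CamVid paths and the _P label-file naming follow the lesson notebook; the split, size, and batch size here are illustrative):

from fastai.vision import *

path = untar_data(URLs.CAMVID)
codes = np.loadtxt(path/'codes.txt', dtype=str)
get_y_fn = lambda x: path/'labels'/f'{x.stem}_P{x.suffix}'  # one mask file per image

data = (SegmentationItemList.from_folder(path/'images')
        .random_split_by_pct(0.2)
        .label_from_func(get_y_fn, classes=codes)
        .transform(get_transforms(), size=128, tfm_y=True)  # transform the masks too
        .databunch(bs=8)
        .normalize(imagenet_stats))
learn = unet_learner(data, models.resnet34)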

We will be using the popular Camvid dataset for this part of the lesson. In future lessons, we will come back to it and show a few extra tricks. Our final Camvid model will have dramatically lower error than an model we’ve been able to find in the academic literature!

What if your dependent variable is a continuous value, instead of a category? We answer that question next, looking at a keypoint dataset, and building a model that predicts face keypoints with high accuracy.

Resources

Lesson resources

Other resources

Further reading



Lesson 2: Data cleaning and production; SGD from scratch

Overview

We start today’s lesson learning how to build your own image classification model using your own data, including topics such as:

  • Image collection
  • Parallel downloading
  • Creating a validation set, and
  • Data cleaning, using the model to help us find data problems.

I’ll demonstrate all these steps as I create a model that can take on the vital task of differentiating teddy bears from grizzly bears. Once we’ve got our data set in order, we’ll then learn how to productionize our teddy-finder, and make it available online.

We’ve had some great additions since this lesson was recorded, so be sure to check out:

  • The production starter kits on the course web site, such as this one for deploying to Render.com
  • The new interactive GUI in the lesson notebook for using the model to find and fix mislabeled or incorrectly-collected images.

In the second half of the lesson we’ll train a simple model from scratch, creating our own gradient descent loop. In the process, we’ll be learning lots of new jargon, so be sure you’ve got a good place to take notes, since we’ll be referring to this new terminology throughout the course (and there will be lots more introduced in every lesson from here on).
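To give a taste of where that part of the lesson lands, here is a self-contained gradient-descent loop in plain PyTorch (our own minimal sketch, fitting a line to noisy data, not the notebook cell itself):

import torch

n = 100
x = torch.ones(n, 2); x[:,0].uniform_(-1., 1.)   # inputs with a bias column
a_true = torch.tensor([3., 2.])
y = x @ a_true + torch.randn(n) * 0.1            # noisy targets
a = torch.randn(2, requires_grad=True)           # the parameters we learn
lr = 0.1
for t in range(100):
    loss = ((x @ a - y) ** 2).mean()             # MSE
    loss.backward()                              # compute gradients
    with torch.no_grad():
        a -= lr * a.grad                         # the gradient-descent step
        a.grad.zero_()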

Resources

Lesson resources

Other resources



Seven principles behind whole-game teaching

Harvard Graduate School of Education professor David Perkins lays these out in his 2008 book, Making Learning Whole: How Seven Principles of Teaching Can Transform Education.

  1. Play the whole game

There are two unfortunate tendencies in education. One can be called "elementitis": spending lots of time learning the component parts of a thing while neglecting their overall meaning and value. The other is dwelling on the theory behind a subject while ignoring its real-world application and value. Just as "practicing your swing for a year doesn't mean you can play baseball", students are made to spend huge amounts of time grinding through contrived exercises with a single right answer, yet never get a feel for what mathematics is actually for.

So how do you play the whole game in practice?
- Design a junior version of the full game of baseball: the fewest players and the simplest setting that still cover the essential rules and convey the character and joy of the sport
- Instead of arithmetic worksheets, give children money and have them work out whether it is enough to buy the food on a shopping list

  2. Make the game worth playing
    Surface the real-world value behind the knowledge to spark students' curiosity, so that learning it becomes its own motivation

  3. Work on the hard parts
    Patient, deliberate grinding on the difficult skills is what builds real expertise, and it cannot be skipped

  4. Play out of town
    Change the context of application, to test whether the knowledge has really sunk in and can be transferred

  5. Play the hidden game
    See through to the deeper game behind the surface one, the way mining the statistics behind baseball to support decisions can win a team more games at its current level of talent

  6. Learn from the team
    Social learning and learning as a team matter a great deal too

  7. Learn the game of learning
    Let students take charge of their own learning plan, and realize this is their fight, on their home ground

Lesson 7 ResNet on MNIST

Jeremy's video walkthrough: 1:59-11:02

MNIST CNN

The three magic lines
%reload_ext autoreload
%autoreload 2
%matplotlib inline
Required libraries
from fastai.vision import *

Data

Download the full MNIST

path = untar_data(URLs.MNIST)
path.ls()
[PosixPath('/home/ubuntu/.fastai/data/mnist_png/training'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/testing')]
Build the DataBunch step by step with the data block API

il = ImageList.from_folder(path, convert_mode='L')
# from_folder recursively collects the image files and their label info; convert_mode='L' opens them as single-channel greyscale
The contents of an ImageList can be inspected via items

il.items[0] 
PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/4/44688.png')
Set fastai's display colormap to binary

defaults.cmap='binary' # images normally display in colour; 'binary' suits greyscale MNIST
Display the ImageList

il 
ImageList (70000 items)
[Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28)]...
Path: /home/ubuntu/.fastai/data/mnist_png
Show the first item in the ImageList (an image)

il[0].show()

output_18_0

How to split the dataset into training and validation sets

sd = il.split_by_folder(train='training', valid='testing')
# the 'testing' folder here holds labelled data (i.e. a validation set), not a true unlabelled test set
Inspect the split ItemLists

sd
ItemLists;

Train: ImageList (60000 items)
[Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28)]...
Path: /home/ubuntu/.fastai/data/mnist_png;

Valid: ImageList (10000 items)
[Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28)]...
Path: /home/ubuntu/.fastai/data/mnist_png;

Test: None
Look inside the 'training' folder

(path/'training').ls() # one subfolder per class
[PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/4'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/6'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/8'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/0'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/9'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/1'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/3'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/2'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/5'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_png/training/7')]
How to label from the folder names

ll = sd.label_from_folder() 
ll # note which parts are LabelLists, CategoryList and ImageList
LabelLists;

Train: LabelList
y: CategoryList (60000 items)
[Category 4, Category 4, Category 4, Category 4, Category 4]...
Path: /home/ubuntu/.fastai/data/mnist_png
x: ImageList (60000 items)
[Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28)]...
Path: /home/ubuntu/.fastai/data/mnist_png;

Valid: LabelList
y: CategoryList (10000 items)
[Category 4, Category 4, Category 4, Category 4, Category 4]...
Path: /home/ubuntu/.fastai/data/mnist_png
x: ImageList (10000 items)
[Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28), Image (1, 28, 28)]...
Path: /home/ubuntu/.fastai/data/mnist_png;

Test: None
Pull out one training sample to inspect

x,y = ll.train[0]
x.show()
print(y,x.shape)
4 torch.Size([1, 28, 28])

output_18_0

How to set up the transforms for the training and validation sets

tfms = ([*rand_pad(padding=3, size=28, mode='zeros')], [])
# rand_pad does a random pad then crop; note the args
# * unpacks the several transforms that rand_pad returns
# the first [] is for the training set, the second for the validation set
ll = ll.transform(tfms)
When not to normalize with imagenet_stats when building a DataBunch

bs = 128
# when not starting from a pretrained model, don't use imagenet_stats; normalize from the data itself
data = ll.databunch(bs=bs).normalize()
How to show one training sample from a DataBunch

x,y = data.train_ds[0]
x.show()
print(y)
4

output_39_1

Show the effect of the transforms on a training sample

  • Going from the LabelLists to a DataBunch adds the transforms, batching and normalization
  • Because the padding is random, every draw looks slightly different
def _plot(i,j,ax): data.train_ds[0][0].show(ax, cmap='gray')
plot_multi(_plot, 3, 3, figsize=(8,8))

output_41_0

How to grab one batch of training samples

xb,yb = data.one_batch()
xb.shape,yb.shape
(torch.Size([128, 1, 28, 28]), torch.Size([128]))
How to show the images and labels in a batch

data.show_batch(rows=3, figsize=(5,5))

output_45_0

How to build a CNN by hand

Basic CNN with batchnorm

A custom conv layer (fixed kernel size, stride and padding)

def conv(ni,nf): return nn.Conv2d(ni, nf, kernel_size=3, stride=2, padding=1)
Build the model by hand

model = nn.Sequential(
    conv(1, 8),   # 1 channel in, 8 filters out; feature map size 28/2 = 14
    nn.BatchNorm2d(8),
    nn.ReLU(),
    conv(8, 16),  # the filter count is our choice; fm size 14/2 = 7
    nn.BatchNorm2d(16),
    nn.ReLU(),
    conv(16, 32), # 7/2 = 3.5, rounded up to 4; 32 filters chosen
    nn.BatchNorm2d(32),
    nn.ReLU(),
    conv(32, 16), # fm size 4/2 = 2; the filter count is deliberately brought back down to 16
    nn.BatchNorm2d(16),
    nn.ReLU(),
    conv(16, 10), # fm size 1; down to 10 filters because we want 10 classes, giving a (10,1,1) output, but the loss function wants a vector
    nn.BatchNorm2d(10),
    Flatten()     # remove the (1,1) grid, leaving a vector of 10 values
)
From a model to a Learner

learn = Learner(data, model, loss_func = nn.CrossEntropyLoss(), metrics=accuracy)
How to print the Learner's internals

print(learn.summary())
# note the shapes and parameter counts, and which layers are trainable
======================================================================
Layer (type)         Output Shape         Param #    Trainable 
======================================================================
Conv2d               [128, 8, 14, 14]     80         True      
______________________________________________________________________
BatchNorm2d          [128, 8, 14, 14]     16         True      
______________________________________________________________________
ReLU                 [128, 8, 14, 14]     0          False     
______________________________________________________________________
Conv2d               [128, 16, 7, 7]      1168       True      
______________________________________________________________________
BatchNorm2d          [128, 16, 7, 7]      32         True      
______________________________________________________________________
ReLU                 [128, 16, 7, 7]      0          False     
______________________________________________________________________
Conv2d               [128, 32, 4, 4]      4640       True      
______________________________________________________________________
BatchNorm2d          [128, 32, 4, 4]      64         True      
______________________________________________________________________
ReLU                 [128, 32, 4, 4]      0          False     
______________________________________________________________________
Conv2d               [128, 16, 2, 2]      4624       True      
______________________________________________________________________
BatchNorm2d          [128, 16, 2, 2]      32         True      
______________________________________________________________________
ReLU                 [128, 16, 2, 2]      0          False     
______________________________________________________________________
Conv2d               [128, 10, 1, 1]      1450       True      
______________________________________________________________________
BatchNorm2d          [128, 10, 1, 1]      20         True      
______________________________________________________________________
Flatten              [128, 10]            0          False     
______________________________________________________________________

Total params: 12126
Total trainable params: 12126
Total non-trainable params: 0
How to run the model to get predictions

xb = xb.cuda() # move the batch onto the GPU
model(xb).shape # run a forward pass to get predictions
torch.Size([128, 10])
Find a learning rate and plot the curve

learn.lr_find(end_lr=100)
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
learn.recorder.plot()

output_61_0

Train the model (picking a good max learning rate)

learn.fit_one_cycle(3, max_lr=0.1)

Total time: 00:30

epoch train_loss valid_loss accuracy
1 0.223167 0.217859 0.930500
2 0.136179 0.078651 0.976400
3 0.072080 0.038664 0.988600

Refactor

How to fold BatchNorm and ReLU into the conv helper

def conv2(ni,nf): return conv_layer(ni,nf,stride=2) # fastai's conv_layer bundles Conv2d, BatchNorm and ReLU
After the refactor, the model definition is much leaner

model = nn.Sequential(
    conv2(1, 8),   # 14; now we only need to care about the in/out channel counts
    conv2(8, 16),  # 7
    conv2(16, 32), # 4
    conv2(32, 16), # 2
    conv2(16, 10), # 1
    Flatten()      # remove (1,1) grid
)
learn = Learner(data, model, loss_func = nn.CrossEntropyLoss(), metrics=accuracy)
After 10 epochs the results are strong; accuracy wobbles along the way, but the loss keeps falling

learn.fit_one_cycle(10, max_lr=0.1)

Total time: 01:12

epoch train_loss valid_loss accuracy
1 0.228332 0.206325 0.937500
2 0.189966 0.192558 0.940800
3 0.156765 0.092810 0.969100
4 0.135871 0.083914 0.973300
5 0.108844 0.071582 0.978000
6 0.105887 0.128586 0.960200
7 0.080699 0.052754 0.983200
8 0.066007 0.037588 0.988600
9 0.047513 0.030255 0.990200
10 0.044705 0.028373 0.991600

How to write a ResNet by hand

A naive way to make the model bigger

  • After every stride-2 conv2 layer, add a stride-1 conv layer; it leaves the feature map size unchanged, so in principle the depth can grow without limit (a sketch follows)
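A one-line sketch of such a stride-1 layer (our own helper name, mirroring the conv helper above):

def conv1(ni, nf): return nn.Conv2d(ni, nf, kernel_size=3, stride=1, padding=1)
# stride 1 with padding 1 means a 3x3 conv leaves the feature map size untouched,
# so conv2 -> conv1 pairs can be stacked arbitrarily deep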
The drawback of naively growing the model

  • image.png
  • Comparing plain 20- vs 56-layer models, the shallower one surprisingly reaches a lower loss, which is backwards. Why?
Kaiming He's solution

  • image.png
  • The new design guarantees the 56-layer model should do at least as well as the 20-layer one
  • Everyone should try taking interesting pre-ResNet papers and dropping the res-block design into them, to see how much it helps
Why does the res-block work so well?

  • image.png

Resnet-ish

Writing a ResBlock by hand

class ResBlock(nn.Module):
    def __init__(self, nf):
        super().__init__()
        self.conv1 = conv_layer(nf,nf)  # stride 1, so the feature map size is unchanged
        self.conv2 = conv_layer(nf,nf)

    def forward(self, x): return x + self.conv2(self.conv1(x))  # the identity skip connection
fastai's res_block function only needs the filter count

help(res_block)
Help on function res_block in module fastai.layers:

res_block(nf, dense:bool=False, norm_type:Union[fastai.layers.NormType, NoneType]=<NormType.Batch: 1>, bottle:bool=False, **kwargs)
    Resnet block of `nf` features.
Use res_block to triple the depth of the model

model = nn.Sequential(
    conv2(1, 8),
    res_block(8), # leaves the feature map size unchanged
    conv2(8, 16),
    res_block(16),
    conv2(16, 32),
    res_block(32),
    conv2(32, 16),
    res_block(16),
    conv2(16, 10),
    Flatten()
)
Why refactor often?

  • Fewer chances to make mistakes
  • Leaner, cleaner code
Merging a conv2 layer and a res_block into one helper

def conv_and_res(ni,nf): return nn.Sequential(conv2(ni, nf), res_block(nf))
The rest of the workflow is unchanged

model = nn.Sequential(
    conv_and_res(1, 8),
    conv_and_res(8, 16),
    conv_and_res(16, 32),
    conv_and_res(32, 16),
    conv2(16, 10),
    Flatten()
)
learn = Learner(data, model, loss_func = nn.CrossEntropyLoss(), metrics=accuracy)
learn.lr_find(end_lr=100)
learn.recorder.plot()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

output_90_1

learn.fit_one_cycle(12, max_lr=0.05)

Total time: 02:00

epoch train_loss valid_loss accuracy
1 0.248807 0.121582 0.972800
2 0.120927 0.360583 0.890300
3 0.104013 0.074916 0.977800
4 0.081181 0.065717 0.980000
5 0.068514 0.096448 0.967200
6 0.061274 0.087955 0.971800
7 0.051673 0.033911 0.989400
8 0.048090 0.033234 0.988800
9 0.039095 0.024638 0.992400
10 0.023670 0.021215 0.993400
11 0.019128 0.016159 0.994500
12 0.021365 0.016120 0.995200
How ResNet stacks up on MNIST

  • image.png
  • Twelve epochs of training already match the state of the art from a few years ago
  • Because ResNet is simple and effective it is heavily favoured: the mainstream libraries have spent a lot of effort optimizing it for speed, while freshly published designs tend to train more slowly
How to combine DenseBlocks and ResBlocks?

  • image.png
  • What is x.orig? -> the original input
  • This is what fastai's res_block design looks like
How DenseNets work, and their trade-offs

  • image.png
  • They use a lot of memory
  • But have few parameters, so they need fewer images
  • Well suited to image segmentation
print(learn.summary())
======================================================================
Layer (type)         Output Shape         Param #    Trainable 
======================================================================
Conv2d               [128, 8, 14, 14]     72         True      
______________________________________________________________________
ReLU                 [128, 8, 14, 14]     0          False     
______________________________________________________________________
BatchNorm2d          [128, 8, 14, 14]     16         True      
______________________________________________________________________
Conv2d               [128, 8, 14, 14]     576        True      
______________________________________________________________________
ReLU                 [128, 8, 14, 14]     0          False     
______________________________________________________________________
BatchNorm2d          [128, 8, 14, 14]     16         True      
______________________________________________________________________
Conv2d               [128, 8, 14, 14]     576        True      
______________________________________________________________________
ReLU                 [128, 8, 14, 14]     0          False     
______________________________________________________________________
BatchNorm2d          [128, 8, 14, 14]     16         True      
______________________________________________________________________
MergeLayer           [128, 8, 14, 14]     0          False     
______________________________________________________________________
Conv2d               [128, 16, 7, 7]      1152       True      
______________________________________________________________________
ReLU                 [128, 16, 7, 7]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 16, 7, 7]      32         True      
______________________________________________________________________
Conv2d               [128, 16, 7, 7]      2304       True      
______________________________________________________________________
ReLU                 [128, 16, 7, 7]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 16, 7, 7]      32         True      
______________________________________________________________________
Conv2d               [128, 16, 7, 7]      2304       True      
______________________________________________________________________
ReLU                 [128, 16, 7, 7]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 16, 7, 7]      32         True      
______________________________________________________________________
MergeLayer           [128, 16, 7, 7]      0          False     
______________________________________________________________________
Conv2d               [128, 32, 4, 4]      4608       True      
______________________________________________________________________
ReLU                 [128, 32, 4, 4]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 32, 4, 4]      64         True      
______________________________________________________________________
Conv2d               [128, 32, 4, 4]      9216       True      
______________________________________________________________________
ReLU                 [128, 32, 4, 4]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 32, 4, 4]      64         True      
______________________________________________________________________
Conv2d               [128, 32, 4, 4]      9216       True      
______________________________________________________________________
ReLU                 [128, 32, 4, 4]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 32, 4, 4]      64         True      
______________________________________________________________________
MergeLayer           [128, 32, 4, 4]      0          False     
______________________________________________________________________
Conv2d               [128, 16, 2, 2]      4608       True      
______________________________________________________________________
ReLU                 [128, 16, 2, 2]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 16, 2, 2]      32         True      
______________________________________________________________________
Conv2d               [128, 16, 2, 2]      2304       True      
______________________________________________________________________
ReLU                 [128, 16, 2, 2]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 16, 2, 2]      32         True      
______________________________________________________________________
Conv2d               [128, 16, 2, 2]      2304       True      
______________________________________________________________________
ReLU                 [128, 16, 2, 2]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 16, 2, 2]      32         True      
______________________________________________________________________
MergeLayer           [128, 16, 2, 2]      0          False     
______________________________________________________________________
Conv2d               [128, 10, 1, 1]      1440       True      
______________________________________________________________________
ReLU                 [128, 10, 1, 1]      0          False     
______________________________________________________________________
BatchNorm2d          [128, 10, 1, 1]      20         True      
______________________________________________________________________
Flatten              [128, 10]            0          False     
______________________________________________________________________

Total params: 41132
Total trainable params: 41132
Total non-trainable params: 0

Thanks for sharing. I also run the course notebooks on Kaggle and train my own models on GCP. The videos are on Bilibili with Chinese and English subtitles too, in case a VPN is inconvenient.


Lesson 7 U-net story

30:52-48:32

Jeremy's updated U-net leaves the field behind on Camvid

The basic workflow for image segmentation

What makes image segmentation hard?

  • Every single pixel has to be assigned to an object; imagine the difficulty of doing that yourself
  • To decide that a pixel belongs to a pedestrian rather than a cyclist, the model has to genuinely tell the two apart
  • What kind of model can do that?
Turn a pretrained ResNet-34 into a U-net

  • We don't pass pretrained=False, and the default is True, so the pretrained weights are used (a sketch of the call follows)
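As a hedged sketch of the call (assuming data, metrics and wd are already defined as earlier in the lesson):

learn = unet_learner(data, models.resnet34, metrics=metrics, wd=wd)
# pretrained defaults to True, so the resnet34 encoder starts from ImageNet weights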

How to understand the structure of a U-net

The first block on the left side

  • The input is the original image at 572x572
  • The conv layers within one block don't shrink the feature map by striding, but they add no padding either, so each 3x3 conv trims a pixel off every edge (572 -> 570 -> 568)
Each block down the left side halves the size

Each block up the right side doubles the size. How?

  • In effect, by using a stride of 1/2 rather than 2
  • The operation is called a deconvolution, or transpose convolution
The classic way to do a deconvolution

  • Most of the inserted cells are zeros, so much of the computation is wasted
  • Different filter positions also see different amounts of real information, which is not sensible either
A better deconvolution (sketched below)

  1. First do a nearest-neighbour interpolation (no wasted computation on zeros, and no uneven information)
  2. Then apply an ordinary stride-1 conv
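In plain PyTorch the idea looks roughly like this (the channel counts are our own example values):

import torch.nn as nn

up = nn.Sequential(
    nn.Upsample(scale_factor=2, mode='nearest'),              # step 1: nearest-neighbour upsample
    nn.Conv2d(64, 32, kernel_size=3, stride=1, padding=1),    # step 2: an ordinary conv on top
)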
An alternative: bilinear interpolation

fastai's deconvolution method

  • To be covered properly in part 2:
  1. pixel shuffle, a.k.a. subpixel convolution (sketched below)
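A taste of pixel shuffle (again with example channel counts of ours): a 1x1 conv grows the channels 4x, then nn.PixelShuffle rearranges (b, c*4, h, w) into (b, c, 2h, 2w):

import torch.nn as nn

shuf = nn.Sequential(nn.Conv2d(64, 64 * 4, kernel_size=1),  # grow the channels 4x
                     nn.PixelShuffle(2))                     # trade them for 2x resolution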
downsampling convolutions + deconvolutions + ? = U-net

  • Add skip connections
  • Skip between activations at the same resolution level
  • Not an add, but a concat
  • Carried all the way out, a skip connection ties the input image itself to the final output feature maps, and that is what gives the model its edge at segmentation (a minimal sketch follows)
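A minimal sketch of a concat-style skip (our own module, far simpler than fastai's UnetBlock; it assumes the two activations line up spatially after the upsample):

import torch, torch.nn as nn

class ConcatSkip(nn.Module):
    "Join an encoder activation with the upsampled decoder activation by concatenation."
    def __init__(self, up_c, skip_c, out_c):
        super().__init__()
        self.up   = nn.Upsample(scale_factor=2, mode='nearest')
        self.conv = nn.Conv2d(up_c + skip_c, out_c, kernel_size=3, padding=1)
    def forward(self, up_act, skip_act):
        cat = torch.cat([self.up(up_act), skip_act], dim=1)  # concat, not add
        return torch.relu(self.conv(cat))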
Unet source code

What a UnetBlock looks like

How U-net was discovered
  • It was first published in a medical imaging venue, so the deep learning community didn't know about it
  • Someone used it in a Kaggle competition, which is where Jeremy first noticed it
  • It kept being used and winning, and so came to the attention of the people who watch for these things

Pretrained GAN

46:00-76:00

What are image-to-image generation tasks good for?

  • Low resolution to high resolution
  • Black-and-white to colour
  • Filling in missing regions
  • Simple line art to a master's painting
Required libraries

import fastai
from fastai.vision import *
from fastai.callbacks import *
from fastai.vision.gan import *
Download the data and set up the folder paths

path = untar_data(URLs.PETS)
path_hr = path/'images'
path_lr = path/'crappy'

Crappified data

Introducing the crappify function

  • Open the image
  • Shrink it to 96x96 (the low-resolution version), resampling bilinearly and keeping RGB
  • Stamp a random number onto the image, with the JPEG quality drawn at random from 10-70 (anywhere from awful to quite clean)
  • The position of the stamped number is random too

This is where you get to be creative:

  • Black-and-white to colour
  • Old photos to restored ones

Prepare the input data by crappifying images.

from crappify import *
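The course ships this as crappify.py; here is a hedged sketch of what such a crappifier might look like (our own class name, written as a callable taking (filename, index) because fastai's parallel, used below, calls it that way):

from PIL import Image, ImageDraw
import random

class Crappifier:
    "A sketch of a crappifier: shrink, stamp a number, save at a random JPEG quality."
    def __init__(self, path_lr, path_hr):
        self.path_lr, self.path_hr = path_lr, path_hr
    def __call__(self, fn, i):
        dest = self.path_lr/fn.relative_to(self.path_hr)
        dest.parent.mkdir(parents=True, exist_ok=True)
        img = Image.open(fn).convert('RGB')
        img = img.resize((96, 96), resample=Image.BILINEAR)     # low resolution
        q = random.randint(10, 70)                              # anywhere from awful to decent
        ImageDraw.Draw(img).text((random.randint(0, 80), random.randint(0, 80)),
                                 str(q), fill=(255, 255, 255))  # stamp the number somewhere random
        img.save(dest, quality=q)                               # save at that JPEG quality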

An aside: why does the concat happen before the two conv layers?

  • So that there is more interaction between the downsampling and the upsampling paths
An aside: how can the downsampling and upsampling activations match in size for the concat?

  • A concat is confined to one resolution level; on moving to the next block, a new concat starts from scratch
How to crappify the images faster

Uncomment the first time you run this notebook.

#il = ImageList.from_folder(path_hr)
#parallel(crappifier(path_lr, path_hr), il.items)

For gradual resizing we can change the commented line here.

bs,size=32, 128
# bs,size = 24,160
#bs,size = 8,256
arch = models.resnet34

Pre-train generator

Create the DataBunch

Now let's pretrain the generator.

  • src starts from the images in the crappy folder, then splits them into training and validation sets
  • get_data labels each crappy image with its original image as the target
  • then applies the transforms and builds the DataBunch
arch = models.resnet34
src = ImageImageList.from_folder(path_lr).random_split_by_pct(0.1, seed=42)
def get_data(bs,size):
    data = (src.label_from_func(lambda x: path_hr/x.name)
           .transform(get_transforms(max_zoom=2.), size=size, tfm_y=True)
           .databunch(bs=bs).normalize(imagenet_stats, do_y=True))

    data.c = 3
    return data
data_gen = get_data(bs,size)
Show a batch (crappy image and original)

data_gen.show_batch(4)

wd = 1e-3
y_range = (-3.,3.)
loss_gen = MSELossFlat()
Why use pretrained models?

  • Inputs: low-resolution images with junk stamped on them
  • Labels: high-resolution, clean images
  • Goal: learn to remove the junk and sharpen the image
  • So the model needs to already recognize everything in the picture, the junk included
Topics part 2 will explain

  • norm_type, self_attention, y_range?
def create_gen_learner():
    return unet_learner(data_gen, arch, wd=wd, blur=True, norm_type=NormType.Weight,
                         self_attention=True, y_range=y_range, loss_func=loss_gen)
Create the Learner and train

learn_gen = create_gen_learner()
learn_gen.fit_one_cycle(2, pct_start=0.8)

Total time: 01:35

epoch train_loss valid_loss
1 0.061653 0.053493
2 0.051248 0.047272
Train the whole model (unfrozen)

learn_gen.unfreeze()
learn_gen.fit_one_cycle(3, slice(1e-6,1e-3))

Total time: 02:24

epoch train_loss valid_loss
1 0.050429 0.046088
2 0.049056 0.043954
3 0.045437 0.043146
How did the training go?

  • The watermark removal worked
  • But the resolution boost did not
learn_gen.show_results(rows=4)

learn_gen.save('gen-pre2')
Why didn't the resolution improve?

  • MSE loss: it can barely distinguish a de-watermarked low-resolution image from the original
  • Differences in sharpness and texture simply don't show up in an MSE
  • We need a better loss function
  • GANs are one solution
How does a GAN help sharpen the generated images?

Why is GAN training painful?

  • The pain is at the start; once it gets going, progress is fairly quick
  • Why is the start painful?
    • Because at the beginning both the generator and the critic know nothing, so neither can help the other improve
    • Like two people who have just gone blind trying to help each other walk
How does fastai's version of a GAN solve this?

  • Supply a pretrained generator and a pretrained critic, so training starts from a strong position
  • This is (quite possibly) a fast.ai first
Save the generated images to a new folder

  • Training the critic (a binary classifier) needs original images and generated images to compare
  • The originals exist already; the generated ones have to be produced
learn_gen.load('gen-pre2'); # load the trained generator
name_gen = 'image_gen'
path_gen = path/name_gen # set up the destination path
# shutil.rmtree(path_gen)
path_gen.mkdir(exist_ok=True) # create the folder

Part 2 offers many more chances to write source code like this yourself (as below):

def save_preds(dl):
    i=0
    names = dl.dataset.items # the file paths

    for b in dl: # one batch at a time
        preds = learn_gen.pred_batch(batch=b, reconstruct=True) # generate the images
        for o in preds:
            o.save(path_gen/names[i].name) # save each one under its original file name
            i += 1
save_preds(data_gen.fix_dl) # fix_dl: the training set in fixed order, without shuffling or augmentation

Look at an image in the newly created folder

PIL.Image.open(path_gen.ls()[0])

output_69_0

Train critic

How to free GPU memory without restarting the kernel

learn_gen=None
gc.collect() # this works, though nvidia-smi won't show the memory as freed, because of how PyTorch caches it
3755

Pretrain the critic on crappy vs not crappy.

How to build the critic's DataBunch

def get_crit_data(classes, bs, size):
    src = ImageList.from_folder(path, include=classes).random_split_by_pct(0.1, seed=42)
    # path is the top-level PETS path; include=classes keeps only those subfolders
    ll = src.label_from_folder(classes=classes) # the classes are the subfolder names (images, image_gen)
    data = (ll.transform(get_transforms(max_zoom=2.), size=size)
           .databunch(bs=bs).normalize(imagenet_stats))
    data.c = 3
    return data
data_crit = get_crit_data([name_gen, 'images'], bs=bs, size=size)
data_crit.show_batch(rows=3, ds_type=DatasetType.Train, imgsize=3) # note imgsize=3 keeps the display small

Why BCE loss wrapped in AdaptiveLoss?

loss_critic = AdaptiveLoss(nn.BCEWithLogitsLoss())
# AdaptiveLoss expands the 0/1 target to the critic's output shape before applying the wrapped BCE (see the sketch below)
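Roughly what that wrapper does, as a hedged sketch (not the fastai source):

import torch.nn as nn

class AdaptiveLossSketch(nn.Module):
    "Expand the scalar 0/1 target to the output's shape, then apply the wrapped loss."
    def __init__(self, crit):
        super().__init__()
        self.crit = crit
    def forward(self, output, target):
        return self.crit(output, target[:, None].expand_as(output).float())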
How to create the critic's Learner

  • The architecture needs spectral normalization
  • ResNet doesn't have it built in yet; it may be added in the future
  • For now we use the gan_critic() architecture, which includes spectral norm

def create_critic_learner(data, metrics):
    return Learner(data, gan_critic(), metrics=metrics, loss_func=loss_critic, wd=wd)
learn_critic = create_critic_learner(data_crit, accuracy_thresh_expand)
# accuracy_thresh_expand is an accuracy metric adapted for the GAN critic
learn_critic.fit_one_cycle(6, 1e-3) 

Total time: 09:40

epoch train_loss valid_loss accuracy_thresh_expand
1 0.678256 0.687312 0.531083
2 0.434768 0.366180 0.851823
3 0.186435 0.128874 0.955214
4 0.120681 0.072901 0.980228
5 0.099568 0.107304 0.962564
6 0.071958 0.078094 0.976239
learn_critic.save('critic-pre2')

GAN

Free the memory again

Now we'll combine those pretrained models in a GAN.

learn_crit=None
learn_gen=None
gc.collect()
15794
Create the DataBunch, critic and generator

data_crit = get_crit_data(['crappy', 'images'], bs=bs, size=size)
learn_crit = create_critic_learner(data_crit, metrics=None).load('critic-pre2')
learn_gen = create_gen_learner().load('gen-pre2')
How fastai streamlines building a GAN

To define a GAN Learner, we just have to specify the learner objects for the generator and the critic. The switcher is a callback that decides when to switch from discriminator to generator and vice versa. Here we do as many iterations of the discriminator as needed to get its loss back < 0.5 then one iteration of the generator.

The loss of the critic is given by learn_crit.loss_func. We take the average of this loss function on the batch of real predictions (target 1) and the batch of fake predictions (target 0).

The loss of the generator is a weighted sum (weights in weights_gen) of learn_crit.loss_func applied to the batch of fakes (passed through the critic to become predictions) with a target of 1, and learn_gen.loss_func applied to the output (the batch of fakes) and the target (the corresponding batch of superres images).

switcher = partial(AdaptiveGANSwitcher, critic_thresh=0.65)
learn = GANLearner.from_learners(learn_gen, learn_crit,
                                 weights_gen=(1.,50.), # the MSE pixel loss is scaled up 50x,
                                 # the critic loss keeps a scale of 1
                                 show_img=False, switcher=switcher,
                                 # the first Adam beta (momentum) is set to 0, because GANs don't like momentum
                                 opt_func=partial(optim.Adam, betas=(0.,0.99)), wd=wd)
learn.callback_fns.append(partial(GANDiscriminativeLR, mult_lr=5.))
lr = 1e-4
learn.fit(40,lr)

Total time: 1:05:41

epoch train_loss gen_loss disc_loss
1 2.071352 2.025429 4.047686
2 1.996251 1.850199 3.652173
3 2.001999 2.035176 3.612669
4 1.921844 1.931835 3.600355
5 1.987216 1.961323 3.606629
6 2.022372 2.102732 3.609494
7 1.900056 2.059208 3.581742
8 1.942305 1.965547 3.538015
9 1.954079 2.006257 3.593008
10 1.984677 1.771790 3.617556
11 2.040979 2.079904 3.575464
12 2.009052 1.739175 3.626755
13 2.014115 1.204614 3.582353
14 2.042148 1.747239 3.608723
15 2.113957 1.831483 3.684338
16 1.979398 1.923163 3.600483
17 1.996756 1.760739 3.635300
18 1.976695 1.982629 3.575843
19 2.088960 1.822936 3.617471
20 1.949941 1.996513 3.594223
21 2.079416 1.918284 3.588732
22 2.055047 1.869254 3.602390
23 1.860164 1.917518 3.557776
24 1.945440 2.033273 3.535242
25 2.026493 1.804196 3.558001
26 1.875208 1.797288 3.511697
27 1.972286 1.798044 3.570746
28 1.950635 1.951106 3.525849
29 2.013820 1.937439 3.592216
30 1.959477 1.959566 3.561970
31 2.012466 2.110288 3.539897
32 1.982466 1.905378 3.559940
33 1.957023 2.207354 3.540873
34 2.049188 1.942845 3.638360
35 1.913136 1.891638 3.581291
36 2.037127 1.808180 3.572567
37 2.006383 2.048738 3.553226
38 2.000312 1.657985 3.594805
39 1.973937 1.891186 3.533843
40 2.002513 1.853988 3.554688
learn.save('gan-1c')
learn.data=get_data(16,192)
learn.fit(10,lr/2)

Total time: 43:07

epoch train_loss gen_loss disc_loss
1 2.578580 2.415008 4.716179
2 2.620808 2.487282 4.729377
3 2.596190 2.579693 4.796489
4 2.701113 2.522197 4.821410
5 2.545030 2.401921 4.710739
6 2.638539 2.548171 4.776103
7 2.551988 2.513859 4.644952
8 2.629724 2.490307 4.701890
9 2.552170 2.487726 4.728183
10 2.597136 2.478334 4.649708
learn.show_results(rows=16)

learn.save('gan-1c')

Lesson 7 WGAN

A short video segment (about 2 minutes) on WGAN

The three magic lines

%reload_ext autoreload
%autoreload 2
%matplotlib inline
Required libraries

from fastai.vision import *
from fastai.vision.gan import *
Download the data (a sample, via Kaggle)

LSUN bedroom data

For this lesson, we’ll be using the bedrooms from the LSUN dataset. The full dataset is a bit too large so we’ll use a sample from kaggle.

path = untar_data(URLs.LSUN_BEDROOMS)
How to build the DataBunch

We then grab all the images in the folder with the data block API. We don’t create a validation set here for reasons we’ll explain later. It consists of random noise of size 100 by default (can be changed below) as inputs and the images of bedrooms as targets. That’s why we do tfm_y=True in the transforms, then apply the normalization to the ys and not the xs.

def get_data(bs, size):
    return (GANItemList.from_folder(path, noise_sz=100) # random noise as inputs, images as targets
               .no_split()
               .label_from_func(noop) # noop returns its argument unchanged, so each item is labelled with its own image
               .transform(tfms=[[crop_pad(size=size, row_pct=(0,1), col_pct=(0,1))], []],
                          size=size,
                          tfm_y=True) # apply the transforms to y, not x
               .databunch(bs=bs)
               .normalize(stats = [torch.tensor([0.5,0.5,0.5]), torch.tensor([0.5,0.5,0.5])],
                          do_x=False, do_y=True)) # normalize y, not x
Start training at a small size

We'll begin with a small size and use gradual resizing.

data = get_data(128, 64)
data.show_batch(rows=5)

Models

How to understand what a GAN does

GAN stands for Generative Adversarial Nets; they were invented by Ian Goodfellow. The concept is that we will train two models at the same time: a generator and a critic. The generator will try to make new images similar to the ones in our dataset, and the critic will try to tell real images apart from the ones the generator makes. The generator returns images, the critic a single number (usually 0 for fake images and 1 for real ones).

We train them against each other in the sense that at each step (more or less), we:

  1. Freeze the generator and train the critic for one step by:
  • getting one batch of real images (let's call that real)
  • generating one batch of fake images (let's call that fake)
  • having the critic evaluate each batch and computing a loss function from that; the important part is that it rewards positively the detection of real images and penalizes the fake ones
  • updating the weights of the critic with the gradients of this loss
  2. Freeze the critic and train the generator for one step by:
  • generating one batch of fake images
  • evaluating the critic on it
  • returning a loss that rewards positively the critic thinking those are real images; in other words, the generator is rewarded for fooling the critic
  • updating the weights of the generator with the gradients of this loss (a schematic sketch follows)
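To make the alternation concrete, here is a self-contained schematic in plain PyTorch (our own names and a vanilla BCE loss, not fastai's GANLearner internals):

import torch, torch.nn as nn

def gan_steps(generator, critic, real, opt_g, opt_c, noise_sz=100):
    "One critic step then one generator step, in miniature."
    bs = real.size(0)
    bce = nn.BCEWithLogitsLoss()
    ones, zeros = torch.ones(bs, 1), torch.zeros(bs, 1)
    # 1) generator frozen: teach the critic to score real high and fake low
    fake = generator(torch.randn(bs, noise_sz)).detach()
    loss_c = bce(critic(real), ones) + bce(critic(fake), zeros)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()
    # 2) only the generator's optimizer steps: teach it to fool the critic
    fake = generator(torch.randn(bs, noise_sz))
    loss_g = bce(critic(fake), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()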

Here, we'll use the Wasserstein GAN.

We create a generator and a critic that we pass to gan_learner. The noise_size is the size of the random vector from which our generator creates images.

How to create a basic generator and critic

generator = basic_generator(in_size=64, n_channels=3, n_extra_layers=1)
critic    = basic_critic   (in_size=64, n_channels=3, n_extra_layers=1)
How to build the WGAN Learner

learn = GANLearner.wgan(data, generator, critic, switch_eval=False,
                        opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
learn.fit(30,2e-4)

Total time: 1:54:23

epoch train_loss gen_loss disc_loss
1 -0.842719 0.542895 -1.086206
2 -0.799776 0.539448 -1.067940
3 -0.738768 0.538581 -1.015152
4 -0.718174 0.484403 -0.943485
5 -0.570070 0.428915 -0.777247
6 -0.545130 0.413026 -0.749381
7 -0.541453 0.389443 -0.719322
8 -0.469548 0.356602 -0.642670
9 -0.434924 0.329100 -0.598782
10 -0.416448 0.301526 -0.558442
11 -0.389224 0.292355 -0.532662
12 -0.361795 0.266539 -0.494872
13 -0.363674 0.245725 -0.475951
14 -0.318343 0.227446 -0.432148
15 -0.309221 0.203628 -0.417945
16 -0.300667 0.213194 -0.401034
17 -0.282622 0.187128 -0.381643
18 -0.283902 0.156653 -0.374541
19 -0.267852 0.159612 -0.346919
20 -0.257258 0.163018 -0.344198
21 -0.242090 0.159207 -0.323443
22 -0.255733 0.129341 -0.322228
23 -0.235854 0.143768 -0.305106
24 -0.220397 0.115682 -0.289971
25 -0.233782 0.135361 -0.294088
26 -0.202050 0.142435 -0.279994
27 -0.196104 0.119580 -0.265333
28 -0.204124 0.119595 -0.266063
29 -0.204096 0.131431 -0.264097
30 -0.183655 0.128817 -0.254156

Look at the model's results

learn.gan_trainer.switch(gen_mode=True)
learn.show_results(ds_type=DatasetType.Train, rows=16, figsize=(8,8))