Have you tried looking at the updated version on Walk with fastai? It has a vision example: Lesson 3 - Cross-Validation | walkwithfastai
Let’s rewrite this a bit to make it more fastai-like:
from sklearn.model_selection import train_test_split, StratifiedKFold
from fastai.vision.all import *

train_df, test_df = train_test_split(full_df, test_size=0.1)
folds = 4
skf = StratifiedKFold(n_splits=folds, shuffle=True)
val_pct = []
test_pct = []
batch_size = 32

for train_index, val_index in skf.split(train_df.index, train_df['label']):
    train_block = DataBlock(
        blocks=(ImageBlock, CategoryBlock),
        get_x=get_x,
        get_y=get_y,
        splitter=IndexSplitter(val_index),  # use this fold's validation indices
        item_tfms=[
            Resize(384),
            FlipItem(p=0.4),
            RandomCrop(300)
        ],
        batch_tfms=[Normalize.from_stats(*imagenet_stats)]
    )
    dls = train_block.dataloaders(train_df, bs=batch_size)
    test_dl = dls.test_dl(test_df, bs=batch_size)
    # swap the item transforms so we don't need a second DataBlock for the test set
    test_dl.after_item = Pipeline([Resize(384), ToTensor()])
    # train the model (using your `train` helper from earlier)
    learn = train(dls, resnet101, epochs=10, freeze_epochs=7)
    _, val = learn.validate()
    _, test = learn.validate(dl=test_dl)
    print('done, appending results..\n')
    val_pct.append(val)
    test_pct.append(test)
So what did we do differently? We don't need to write out a second DataBlock. Since item_tfms are applied lazily, we can replace our test_dl's item transforms with whatever we want (in this case we just want to Resize and apply ToTensor()). If you want that last RandomCrop to make your images (300,300), you can safely leave it in, since fastai will by default apply a RandomCrop as a center crop on the validation set, similar to what Resize does (though note you've now introduced test-time disparity, since you're not exactly recreating the image size you trained on).
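Just as a sketch (reusing dls, test_df, and batch_size from the loop above), here's what those two options look like side by side:

# Option 1: override the item transforms on the test DataLoader entirely,
# so test images are only resized to 384 and converted to tensors
test_dl = dls.test_dl(test_df, bs=batch_size)
test_dl.after_item = Pipeline([Resize(384), ToTensor()])

# Option 2: keep the training item_tfms; on a non-training split fastai applies
# RandomCrop as a center crop, so the test images come out at (300, 300)
test_dl_cropped = dls.test_dl(test_df, bs=batch_size)
print(test_dl_cropped.after_item)  # inspect the item pipeline that will actually run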
Along with this, learn.validate can accept a dl param, so we can pass that test_dl in.
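Once the loop finishes you can summarise the per-fold numbers however you like, for example (assuming val_pct and test_pct hold the metric values collected above):

import numpy as np

# mean and spread of the tracked metric across the 4 folds
print(f'cross-validation metric: {np.mean(val_pct):.4f} (+/- {np.std(val_pct):.4f})')
print(f'held-out test metric:    {np.mean(test_pct):.4f} (+/- {np.std(test_pct):.4f})')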
Hope this helps! You did a great job IMO; this is just my own preference for writing functionality like this.