When using ImageDataBunch.from_name_re(), fastai creates the categories from the name of each image. We only have to pass a regex string like this:
file_parse = r'/([^/]+)_\d+\.(png|jpg|jpeg)$'
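To see what this pattern captures, here is a minimal sketch using only Python's standard `re` module. The helper `label_from_path` and the two example paths are hypothetical, made up for illustration; this is not fastai's own code.

```python
import re

# Same pattern as in the post: the capture group before "_<digits>.<ext>" is the label.
file_parse = r'/([^/]+)_\d+\.(png|jpg|jpeg)$'

def label_from_path(path, pattern=file_parse):
    """Return the class name captured by group 1, or None if the path does not match."""
    match = re.search(pattern, path)
    return match.group(1) if match else None

# Hypothetical file paths, used only for illustration.
examples = [
    '/data/images/apple_pie_12.jpg',       # class encoded in the file name
    '/data/images/apple_pie/1005649.jpg',  # class encoded in the folder name
]
labels = [label_from_path(p) for p in examples]
print(labels)
```

The pattern only yields a label when the class is part of the file name. A path where the class is the parent folder does not match at all, so it is worth checking how the images are actually laid out before relying on a name-based regex.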
Is there a way to do this when using ImageDataBunch.from_folder()?
np.random.seed(42)
data = ImageDataBunch.from_folder('/content/Food-101/images', train='/content/Food-101/images/train', test='/content/Food-101/images/test', valid_pct=0.2, ds_tfms=get_transforms(), size=224)
data.normalize(imagenet_stats)
I ask because when I train for a few epochs, this is what fastai gives me:
epoch  train_loss  valid_loss  error_rate  time
0      0.000000    0.000000    0.000000    21:05
1      0.000000    0.000000    0.000000    20:43
2      0.000000    0.000000    0.000000    20:38
3      0.000000    0.000000    0.000000    19:38
4      0.000000    0.000000    0.000000    20:01
I believe this is because my file_parse variable isn’t used anywhere.
We use the same data block as before, but add add_test_folder() to include the test data.
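from fastai.vision import *  # fastai v1 API

# path: dataset root; pattern: a label regex such as file_parse above
data = (ImageList.from_folder(path)
        .split_by_folder()        # expects 'train' and 'valid' subfolders by default
        .label_from_re(pattern)   # the class label is the regex's first capture group
        .add_test_folder()        # adds the unlabeled images in 'test'
        .databunch())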
In fastai, the test set has no labels, because it holds the data you want predictions for. If you want to see how well your model does on data you already have labels for, treat that data as the validation set instead.
If it really is a validation set, .from_folder() expects the folder to be named ‘valid’. If it has a different name, pass the folder names explicitly:
.from_folder(path, train=train_name, valid=valid_name)
You do not always want to carve the validation set out of the training data. Since you already have a separate labeled set for checking how your model is doing, you don’t need to hold out part of the training set (for example with valid_pct) as a validation set.
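To make the two ways of getting a validation set concrete, here is a plain-Python sketch. It is not fastai's implementation; the helpers `split_by_parent_folder` and `split_randomly` and the file layout are hypothetical and only illustrate the idea.

```python
import random
from pathlib import PurePosixPath

def split_by_parent_folder(paths, train='train', valid='valid'):
    """Assign each file to train or valid according to the folder names in its path."""
    train_files = [p for p in paths if train in PurePosixPath(p).parts]
    valid_files = [p for p in paths if valid in PurePosixPath(p).parts]
    return train_files, valid_files

def split_randomly(paths, valid_pct=0.2, seed=42):
    """Hold out a random fraction of all files as the validation set."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

# Hypothetical layout: 8 files under 'train' and 2 labeled files under 'test'.
paths = ([f'images/train/pizza_{i}.jpg' for i in range(8)]
         + [f'images/test/pizza_{i}.jpg' for i in range(8, 10)])

# Folder-based split: the labeled 'test' folder is used as the validation set.
folder_train, folder_valid = split_by_parent_folder(paths, train='train', valid='test')

# Random split: 20% of all files are held out, regardless of folder.
rand_train, rand_valid = split_randomly(paths, valid_pct=0.2)
```

With the folder-based split, the training folder stays intact and the separately labeled folder does the job of the validation set. The random split instead mixes all files and holds out a fraction of them.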