Validating a model on new data

Hi! Seriously impressed with FastAI so far.

The problem I’m working on involves two sets of images, which I’ve put into two directories named after their class labels. So far I’ve trained a classifier to distinguish between them, like so (a slight modification of the pets example in Lesson 1):

from fastai.vision import *  # ImageList, cnn_learner, get_transforms, etc.

tfms = get_transforms()  # default augmentations
data = (ImageList.from_folder(path)  # path = root folder with one subfolder per class
    .split_by_rand_pct()  # hold out a random 20% as the validation set
    .label_from_folder()  # label each image by its folder name
    .transform(tfms, size=224)
    .databunch(bs=bs))  # bs = batch size (e.g. 64)
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(4)
learn.save('in-play-stage1')

Then, I can go through the usual post-hoc analysis:

interp = ClassificationInterpretation.from_learner(learn)  # runs on the validation set
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
interp.plot_top_losses(9, figsize=(15,11))

So far, so good; I have a classifier that does exactly what I want! What I’d like to do next is validate how the newly trained model performs on a different but similar dataset, ideally using those same very convenient confusion matrix and top-losses functions. (The new dataset is divided into the same two classes and has the same directory structure, but covers a set of events distinct from, though fundamentally similar to, the one the model was trained on.) I expect the classifier will perform well on this new dataset too, but I need to check.

I’ve tried every suggestion I could find, but nothing has executed without throwing exceptions, including manually passing PIL image objects to learn.predict().

How can I do these things? Thanks!

Got an answer! Sharing because I’ve seen at least one other similar question recently.
from glob import glob

imgs = glob("data/images/class1/*.jpg")  # images from the new dataset
img = open_image(imgs[0])  # open_image returns a fastai Image, not a PIL image
prediction = learn2.predict(img)  # learn2: a Learner with the trained weights loaded, e.g. via learn.load('in-play-stage1')
prediction
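
For reference, predict in fastai v1 returns a (predicted category, class index, probabilities) tuple, so the result can be unpacked like this (a small sketch using the same img as above):

pred_class, pred_idx, probs = learn2.predict(img)
print(pred_class)  # the predicted label, e.g. class1
print(probs)       # per-class probabilities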

@ActionNeil see my post here for running full datasets through the model (so you can use interp etc. like you want): Calculating the Accuracy for test set :slight_smile:
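
For readers who can’t follow the link, here’s a minimal sketch of that full-dataset idea in fastai v1 (my own reconstruction, not the linked post verbatim). It assumes the original images have been moved under path/'train' and the new dataset copied to path/'valid', keeping the same class subfolders:

data_new = (ImageList.from_folder(path)
    .split_by_folder(train='train', valid='valid')  # the new data becomes the validation set
    .label_from_folder()
    .transform(tfms, size=224)
    .databunch(bs=bs))

learn.data = data_new  # point the already-trained learner at the new DataBunch
learn.validate()       # loss and error_rate on the new data
interp = ClassificationInterpretation.from_learner(learn)  # now reflects the new data
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)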
