Picking the right level for a project for Part 1 questions

Hi all,

First post on this forum. I’ve binge-watched the 7 lessons, and I’m very curious about this technology. I’m an engineer by background and know my way around programming and industrial automation, so I hope to upgrade my knowledge hugely.
While working through the Part 1 course, I would like to recognize parallel edges of regions. Ultimately I would like to work towards determining the rotation and position of the shape too. Since I have no data and need something that touches on my background to play around with, I decided to write a notebook that creates the data in the first place and saves the relevant info to a labels file.

My data is a rectangular-ish region, with a label “has_parallel_sides” when the longest sides are parallel to within a certain threshold (2 degrees). I can create as many images as I like and have experimented with 4000 images and 16000 images.
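To make the labelling rule concrete, here is a minimal sketch of how such a label could be computed from the four corner points. This is not the code from my notebook; the function names and the corner-list representation are assumptions for illustration, and it simply takes the two longest sides without checking that they are opposite each other.

```python
import math

def side_vectors(corners):
    """Return the edge vectors of a polygon given as a list of (x, y) corners."""
    n = len(corners)
    return [
        (corners[(i + 1) % n][0] - corners[i][0],
         corners[(i + 1) % n][1] - corners[i][1])
        for i in range(n)
    ]

def angle_between_lines(v, w):
    """Smallest angle in degrees between two undirected lines (0 to 90)."""
    a = math.degrees(math.atan2(v[1], v[0]))
    b = math.degrees(math.atan2(w[1], w[0]))
    diff = abs(a - b) % 180.0
    return min(diff, 180.0 - diff)

def has_parallel_sides(corners, threshold_deg=2.0):
    """True when the two longest sides are parallel to within threshold_deg."""
    # Sort sides by length, longest first (hypothetical rule: take the top two).
    sides = sorted(side_vectors(corners), key=lambda v: math.hypot(*v), reverse=True)
    return angle_between_lines(sides[0], sides[1]) <= threshold_deg

rectangle = [(0, 0), (4, 0), (4, 1), (0, 1)]
skewed = [(0, 0), (4, 0), (4, 1), (0, 2)]
print(has_parallel_sides(rectangle), has_parallel_sides(skewed))
```

Treating the sides as undirected lines matters here: opposite sides of a rectangle point in opposite directions, so a plain difference of angles would give 180 degrees instead of 0.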

The biggest pitfall would be doing everything at once, so I would like to focus first on predicting whether the longest sides are parallel.
Setting things up by picking pieces from the various lessons, I get some unexpected results when training the model. I think I’m doing something wrong. My learning rate finder shows this graph:

And then, training the model, the losses vary wildly.

I have not used a pre-trained model since I thought that these shapes have nothing to do with cats and dogs, cars and buses. Did I make a wrong assumption?
Any pointers (to lessons or terminology) would be appreciated.

For those interested, the notebook for creating the data and my experimenting notebook are in a branch on my fork of course-v3: https://github.com/luminize/course-v3/commits/part1-project
creating the data:
my notebook:


Indeed, it might be worth trying a pretrained model, perhaps unfrozen. Pretrained models have already learnt to identify simple features such as edges and lines in their earlier layers, so it is best to take advantage of that in your application. The later layers can probably be retrained, as they are looking for more complicated features you don’t need.

Thanks, maybe you can clarify for me: a pretrained model, is that the same as using models.resnet34? I’m looking through the cats and dogs example, and where exactly is the pretrained model itself that I can reuse?

Yep, most of the models over here will work. A model is pretrained when the pretrained argument is set to True, which it is by default.

You can also specifically pass in imagenet_stats into the learner to use it.

I am not sure this is true… where is this written/mentioned?

Here’s an example. Apologies, it is with the creation of the image databunch.

data = ImageDataBunch.from_folder(path, valid_pct=0.1, size=sz).normalize(imagenet_stats)

See the .normalize where I pass in the imagenet_stats

I know what imagenet_stats does and how it works, but imagenet_stats is only a set of numbers describing the statistics of the ImageNet dataset. How does that lead to loading a pretrained model? The loading of a pretrained model is determined by the pretrained=True argument.
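To illustrate the point, here is a rough sketch of what normalizing with these statistics amounts to. The numbers are the per-channel ImageNet mean and standard deviation that fastai exposes as imagenet_stats; the helper normalize_pixel is hypothetical and only mimics the arithmetic on a single RGB pixel, it is not how the library implements it.

```python
# Per-channel (mean, std) of ImageNet in RGB order, for pixel values scaled to 0..1.
imagenet_stats = ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

def normalize_pixel(rgb, stats=imagenet_stats):
    """Subtract the channel mean and divide by the channel std (hypothetical helper)."""
    mean, std = stats
    return [(c - m) / s for c, m, s in zip(rgb, mean, std)]

# A pixel equal to the ImageNet mean maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))
```

Nothing in this step touches model weights. It only rescales the inputs so they match the distribution the pretrained weights were trained on.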

You are correct, my apologies. Setting pretrained=True looks to load the ImageNet weights, from what I’ve read in the docs here, as you said, correct?


Yep… glad I could clear things up for you…

@ilovescience @muellerzr thanks for the help.
I’ve been looking at my notebook and I am instantiating a learner as cnn_learner with resnet34 as architecture. That should be a pre-trained model, right?


I’m having very big losses and I get a very strange recorder plot

Trying to get the heat graph from the losses gets me the error TypeError: int() argument must be a string, a bytes-like object or a number, not 'MultiCategory'
Am I doing something wrong with my data?

Looks like I did something wrong when creating the databunch from ImageList.from_csv.
I saved all the images with the label in the filename as in lesson1 and dropped the labels csv file. Working from lesson1 I now get more sane results, somewhere around 0.40 accuracy.
I now load the databunch like so:
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(do_flip=False, max_warp=0.), size=128, bs=bs//4).normalize(imagenet_stats)
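For anyone following along, this is roughly how a regex pattern like pat turns a file name into a label. The pattern and file names below are hypothetical, not the ones from my notebook, and the helper only imitates the idea of taking the first capture group as the label.

```python
import re

# Hypothetical pattern: the label is everything between the last "/" and the
# trailing "_<number>.png".
pat = re.compile(r'/([^/]+)_\d+\.png$')

def label_from_name(fname):
    """Return the label encoded in a file name, or raise if the pattern fails."""
    match = pat.search(fname)
    if match is None:
        raise ValueError(f"no label found in {fname!r}")
    return match.group(1)

# Hypothetical file names for the two classes.
for name in ['images/parallel_0001.png', 'images/not_parallel_0042.png']:
    print(name, '->', label_from_name(name))
```

Checking the pattern on a handful of file names like this before building the databunch is a cheap way to confirm that each image gets exactly one label.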