Resources detailing how to train a new model from scratch

Hello everyone -

I’m wondering if anyone has pointers to tutorials or other resources on training a new model from scratch. I’m working with a corpus of 320x256 grayscale images. Each file is the raw output from a camera that allocates 16 bits per pixel in a single layer. While I could scale the 16 bits down into an 8-bit image, I’d likely lose a lot of information in the process.

I’ve built other CNN-based classifiers in the past, though they’ve always dealt with color RGB images and have been based on a previously trained model. Starting from the beginning is something entirely new for me, and I’d appreciate any tips people might have!

Hopefully some folks can weigh in here - but just FYI we’ll be covering this topic later in the course.

Thanks, Jeremy!

Do you have any estimates as to which lesson we might cover this material in?

In the 2nd half of the course - not sure beyond that.

Interesting. The first idea that comes to mind is to split the 16 bits into two 8-bit channels, perhaps?
Are you building a classifier?
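To make the two-channel idea concrete, here is a minimal NumPy sketch. The function names `split_16bit` and `merge_16bit` are made up for illustration, and the frame is synthetic data assumed to be stored as a (256, 320) `uint16` array. It only shows that the split is lossless, not that a CNN would learn well from it.

```python
import numpy as np

def split_16bit(frame):
    """Split a uint16 image of shape (H, W) into uint8 channels of shape (H, W, 2)."""
    frame = np.asarray(frame, dtype=np.uint16)
    high = (frame >> 8).astype(np.uint8)    # most significant byte
    low = (frame & 0xFF).astype(np.uint8)   # least significant byte
    return np.stack([high, low], axis=-1)

def merge_16bit(channels):
    """Inverse of split_16bit: rebuild the uint16 image from its two bytes."""
    high = channels[..., 0].astype(np.uint16)
    low = channels[..., 1].astype(np.uint16)
    return (high << 8) | low

# Synthetic stand-in for one camera frame (assumed layout: 256 rows x 320 columns).
frame = (np.arange(256 * 320) % 65536).astype(np.uint16).reshape(256, 320)

channels = split_16bit(frame)
restored = merge_16bit(channels)
```

One caveat: the low-byte channel wraps around every 256 counts, so it tends to look like noise, which may make it less useful as a model input than a single float channel.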

We use floats as inputs to our models, so the 16-bit data should work fine. Have a look at the threads about the iceberg competition to see some examples of how people are handling non-3-channel data at the moment.

Ultimately it will be a classifier, yes, though as a preceding step I was hoping to extract a region of interest (i.e., a CNN outputting a bounding box around the ROI) that would then in turn be classified.

Hi Jeremy -

To visualize the data I’d been collecting from our sensors, I had been scaling it in much the same way as is done in the “Kaggle Iceberg Challenge Starter Kit” thread, except that I was outputting a 1-channel grayscale image. To make it easier to use existing models that expect 3-channel RGB images, I suppose that I could set each of the RGB channels to the same scaled value? This wouldn’t add any new information, but would allow me to reuse an existing model to start from.
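A minimal sketch of copying one scaled channel into all three RGB slots, using NumPy only. The helper name `gray_to_three_channels` and the synthetic `gray` array are illustrative assumptions, not anything from the course library.

```python
import numpy as np

def gray_to_three_channels(gray):
    """Copy a single-channel (H, W) image into three identical channels (H, W, 3)."""
    gray = np.asarray(gray, dtype=np.float32)
    return np.repeat(gray[..., np.newaxis], 3, axis=-1)

# Synthetic stand-in for an already-scaled 256 x 320 grayscale frame.
gray = np.linspace(0.0, 255.0, 256 * 320, dtype=np.float32).reshape(256, 320)
rgb = gray_to_three_channels(gray)
```

The result stays floating point, so no precision is thrown away at this step.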

I am concerned that if I’m squashing 16 bits of info (though I believe it’s actually only 14 bits output by the sensor) down into 8 bits, that I’ll be losing some granularity, though perhaps this is a “good enough” starting point…
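The granularity concern can be checked with a quick count. The sketch below assumes the sensor really emits 14-bit values (0 to 16383) and scales them linearly to the 0-255 range. It then compares how many distinct levels survive when the result is rounded to `uint8` versus kept as `float32`.

```python
import numpy as np

# Every possible 14-bit sensor value (assumption: the range is 0..16383).
raw = np.arange(2**14, dtype=np.uint16)

# Linear scaling into the 0..255 range, kept as floating point.
scaled = raw.astype(np.float32) * (255.0 / (2**14 - 1))

# The same scaling, rounded and stored as 8-bit integers.
as_uint8 = np.round(scaled).astype(np.uint8)

levels_uint8 = np.unique(as_uint8).size
levels_float = np.unique(scaled).size
print(levels_uint8, levels_float)
```

Under these assumptions, rounding to `uint8` leaves 256 distinct levels, while the float version keeps all 16384.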

Yes that’s a common approach for reusing a 3-channel model with one channel. There are others, but we won’t cover them until part 2.

There’s no reason to squash down to 8 bit. What makes you think that might be needed? We use floats as our inputs.

Great, I’ll start there then.

I must have missed something… I thought that for the models we had been using thus far (i.e., the dogs and cats) the inputs were 3-channel RGB images, which consisted of 8-bit values? Can you point me towards one that takes floats as input?

If you feed in floats between 0 and 255, all the models we’ve seen should work fine with no code changes. Let me know if you find this not to be the case - I haven’t tried it, but I know that it’s all floats internally.

Ah, ok! I can continue doing the scaling I’d been using in the past then, but instead of converting the final output to a uint8, I’ll just leave it as floating point. That’s great, thanks!
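A minimal sketch of that kind of scaling, assuming a simple linear min/max stretch (the exact scaling used in the thread is not shown). The function name `scale_to_float` and its `lo`/`hi` parameters are hypothetical.

```python
import numpy as np

def scale_to_float(frame, lo=None, hi=None):
    """Linearly map raw sensor counts to floats in [0, 255] without rounding.

    lo and hi are the raw values mapped to 0 and 255; they default to the
    frame's own minimum and maximum.
    """
    frame = np.asarray(frame, dtype=np.float32)
    lo = float(frame.min()) if lo is None else float(lo)
    hi = float(frame.max()) if hi is None else float(hi)
    if hi <= lo:
        # Flat frame or bad bounds: avoid dividing by zero.
        return np.zeros_like(frame)
    return (frame - lo) / (hi - lo) * 255.0

# Tiny synthetic frame of 14-bit counts, scaled against the full sensor range.
frame = np.array([[0, 5000], [9000, 16383]], dtype=np.uint16)
scaled = scale_to_float(frame, lo=0, hi=16383)
```

Passing fixed `lo`/`hi` bounds keeps the scaling consistent across images, whereas the per-frame default stretches each image to its own range. Which one suits a given dataset is a judgment call.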

Hi Cory,
Here is an example of how to take grayscale images and save them with 3 channels.

@yinterian just curious, would it also be OK/effective to use OpenCV’s function to convert grayscale to RGB, like below?
img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)

Anything that works. (-:

Sorry, to clarify: I see you used cv2.COLOR_GRAY2BGR (instead of RGB). Is that better than just converting to RGB directly?

I am not sure. I remember just looking around on the web to try to find a solution.

BGR is the format OpenCV uses internally. Doing a conversion to RGB would also work.
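Why the two conversions are interchangeable for a grayscale source can be sketched with NumPy alone, with no OpenCV calls involved. BGR and RGB differ only in channel order, and as far as I understand both conversions simply copy the gray value into every channel, so reversing the order changes nothing. The array names below are illustrative.

```python
import numpy as np

# Small synthetic grayscale image.
gray = np.arange(12, dtype=np.float32).reshape(3, 4)

# Three identical channels, labelled "RGB" by convention.
rgb = np.stack([gray, gray, gray], axis=-1)

# Reversing the channel axis gives the "BGR" ordering of the same pixels.
bgr = rgb[..., ::-1]

same = np.array_equal(rgb, bgr)
```

The order only starts to matter once the channels hold different values, as in a real color image.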

I have a feeling, however (unsubstantiated, as I haven’t tested it), that OpenCV will expect the NumPy array to be of type uint8, as opposed to the floats that Jeremy and I were discussing above.