I am trying to train a regression model on images. In the source code, in the vision/learner.py file, I saw this particular note about regression-based CNNs:
However, I think I might be close to getting my model to train anyhow.
First I created a toy dataset recycled from before to see if I can just get it to run. I created a DataBunch from a csv that looks like the following where I just took a particular class and made the label a float:
I noticed the create_cnn function accepts a keyword argument of custom_head. So, matching the dimensions of the flattened last layer of resnet34, I added just one more Dense layer with a single output. Then, after hunting down where the loss function is set on the learner, I overrode it with L1_loss because this is a regression problem. My code is like so:
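In plain PyTorch terms, the idea is roughly the sketch below. It is not the fastai call itself: the head layout, the batch size, and the random tensor standing in for the resnet34 body output (512 channels, 7x7 for a 224x224 input) are assumptions for illustration only.

```python
import torch
from torch import nn

# Assumption: a random tensor stands in for the output of the resnet34
# convolutional body (512 feature maps of size 7x7 for a 224x224 input).
features = torch.randn(4, 512, 7, 7)

# Hypothetical regression head: pool, flatten, then a single dense unit.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),  # (4, 512, 7, 7) -> (4, 512, 1, 1)
    nn.Flatten(),             # (4, 512, 1, 1) -> (4, 512)
    nn.Linear(512, 1),        # (4, 512)       -> (4, 1)
)

preds = head(features)

# Regression targets must be floats with the same shape as the predictions.
targets = torch.rand(4, 1)

# L1 (mean absolute error) loss in place of a classification loss.
loss = nn.L1Loss()(preds, targets)
```

The point is only that the final layer has one output and that the loss compares float predictions to float targets.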
I then go to run a cycle on the learner, but end up with the following error:
Is there a preferred place in the code to fix this? If it's just a matter of adding an if statement somewhere to do a tensor type conversion, plus another keyword argument on learner instantiation to signal regression (which would then add the custom head and the new loss function), I'd be more than happy to have a go at implementing it in the next day or so. I just didn't want to hack away without first checking that this is the preferred fix, only to have a pull request rejected.
Upon further digging, it looks like the data bunch is controlling all of this.
I’m going to try and hack around with implementing a bare bones ImageRegressionDataset as a counterpart to the ImageClassificationDataset and put the appropriate default loss / type casting in there.
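A bare-bones version of that idea, written against plain torch.utils.data rather than the fastai dataset classes, might look like the sketch below. The class body here is hypothetical and only illustrates the type casting; it is not the library's implementation.

```python
import torch
from torch.utils.data import Dataset


class ImageRegressionDataset(Dataset):
    """Hypothetical sketch: pairs image tensors with float32 targets."""

    def __init__(self, images, targets):
        self.images = images
        # Cast labels to float32 so a regression loss such as L1 receives
        # floats instead of the integer class indices used for classification.
        self.targets = torch.as_tensor(targets, dtype=torch.float32)

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, i):
        # Return the target with shape (1,) to match a single-output head.
        return self.images[i], self.targets[i].unsqueeze(0)


# Toy usage: three random "images" whose labels were stored as integers.
images = torch.randn(3, 3, 8, 8)
ds = ImageRegressionDataset(images, [0, 1, 2])
x, y = ds[1]
```

With the labels cast once in the dataset, nothing in the training loop should need a separate type conversion.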
Yeah you’re on the right track. You’ll need an ImageRegressionDataset to get your y’s the right type.
If you have or can find a good example of an image regression problem and public dataset (preferably with some academic or competition baseline), I’d be happy to create an example notebook and add the necessary functions to the library.
Sounds good, I’ll try and hunt around a bit more for an open dataset with a baseline.
I’m gathering a fairly small dataset for a pet project that I hope to have deployed in a small web app by next week.
Will keep you updated, thanks!
Alright, I found a number of different academic datasets and benchmarks.
The first one is related to satellite imagery of the sun, trying to predict the flux of solar flares from images at previous time stamps. Dataset and benchmarks can be found here: https://i4ds.github.io/SDOBenchmark/#current-state
There are also a couple of datasets for predicting quantities such as the yaw and pitch of people's heads or body parts from images. The three datasets related to this, along with appropriate benchmarks, can be found in this paper: https://arxiv.org/pdf/1803.08450.pdf
With the three datasets being available at the following:
I can also think of a couple of other projects going on at my work right now that are trying to do similar things. For example: based on images taken from a camera mounted on a robot, can we predict the power required to drive over particular land features or rocks, and use that to decide whether we are OK with driving in that direction?