Isn’t it a clear sign that I am overfitting? My error is wiggling between 0.085 and 0.089, even though train error dropped from 0.34 to 0.28 and valid error from 0.2558 to 0.231.
Ok… I did a complete refactor to include the user interface from the FileDeleter we learned about tonight. It’s now a super clean interface for finding duplicate or near-duplicate images, as well as garbage images, using intermediate representations of a pretrained network.
Hi, I will be working with mammograms (the CBIS-DDSM dataset). For now I have extracted a subset of x-ray tiles in order to classify them as healthy or malignant tissue. I have converted the x-rays to 16-bit PNG using pydicom and created a modified open_image in order to read the 16-bit PNG files. Furthermore, to work with pretrained networks, I have created a small function to convert the input layer of a ResNet to accept 1-channel input.
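The 16-bit reading part can be sketched roughly like this — `load_png16` is a hypothetical name for illustration; the modified open_image presumably does something similar before handing the array to the rest of the pipeline. The key point is not letting the loader quantize down to 8 bits:

```python
# Sketch: read a 16-bit grayscale PNG into a float array in [0, 1],
# preserving the full 16-bit dynamic range of the x-ray.
import numpy as np
from PIL import Image

def load_png16(path):
    """Open a 16-bit grayscale PNG and scale intensities to [0, 1]."""
    img = Image.open(path)
    arr = np.asarray(img, dtype=np.float32)   # keeps the full 16-bit range
    return arr / 65535.0                      # uint16 max -> [0, 1]
```

A quick round-trip check (writing a uint16 array out with Pillow and reading it back) confirms no precision is lost in the conversion.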
I plan to use this dataset for data augmentation and segmentation throughout the course and combine it with some of the wonderful ideas that have come up in the first lessons. There are lots of challenges with this dataset :)
Hey guys, I decided to work with the Architectural Heritage Elements image dataset, and I think I have achieved something better than SOTA for this dataset. The authors claim 93.19%, whereas I achieve 97.155% with ResNet50 after some fine-tuning. I think there is a lot of further potential in fine-tuning.
Link to the paper I am currently comparing my results to; not sure if there are any further papers improving on their results.
The paper introduces the dataset and baseline accuracies. I am working with 128x128 images.
Hey, I went through your notebook but didn’t quite get how you passed in the 3 crops. Did you pass in all three crops and then take the most frequent result, or concatenate the images in some manner? It seems to me like you still passed in one image at a time. (Please also point to where in the code you passed in the different crops.)
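For reference, one common way to combine several crops — not necessarily what the notebook in question does — is to run each crop through the model separately and average the softmax outputs, rather than voting or concatenating:

```python
# Sketch: combine crop predictions by averaging class probabilities.
# Generic illustration only; the tiny stand-in model is hypothetical.
import torch

def predict_with_crops(model, crops):
    """crops: tensor of shape (n_crops, C, H, W); returns averaged probs."""
    with torch.no_grad():
        logits = model(crops)               # (n_crops, n_classes)
        probs = torch.softmax(logits, dim=1)
    return probs.mean(dim=0)                # (n_classes,)

# Tiny stand-in model just to make the sketch runnable.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 4))
avg = predict_with_crops(model.eval(), torch.rand(3, 3, 8, 8))
```

Averaging probabilities tends to be smoother than majority voting, since a crop that is merely uncertain doesn't cast a full wrong vote.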
I believe there is a typo here. You said today that ‘when train loss > val loss, it means you have not fitted enough’. I think what you mean is: ‘Val loss lower than train loss means you are underfitting.’
After listening again to the Part 2 v2 lectures, I realize that what I’m trying to do may be done better with the U-Net architecture. It gives per-pixel classification and uses multiple levels of detail to generate the result. The ground truth would be trivial: all pixels are the same class (the artist of that painting). I’ll get back to this project later.
This is counter-intuitive. When we say val loss > train loss, it means my model did well on training (low train loss) but performed worse on validation (high val loss); it learned the training data well but is not generalizing to the validation set, so it is “overfitting”.
On the other hand, when we say val loss < train loss, it means I am doing well on validation but not so well on training, so I have scope for improvement; I am “underfitting”.
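Written out as code, the rule of thumb being debated is just a comparison — though it's worth stressing this is only a rough heuristic (a small gap with val loss slightly above train loss is normal and often accompanies a well-fit model):

```python
# The loss-comparison heuristic from the discussion, made explicit.
# Rough rule of thumb only, not a definitive diagnosis.
def diagnose(train_loss, val_loss):
    if val_loss > train_loss:
        return "possible overfitting (a small gap here is normal)"
    return "possible underfitting -- likely room to train more"
```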
Am I missing something? (Sorry to tag you directly, @jeremy)
Many apologies - not enough sleep, and I didn’t notice I’d typed the opposite of what I meant! I’ve fixed my post now, and removed most of the replies from people I confused in the process, so as to avoid confusing people even more…