@lesscomfortable - Hi Francisco. Jeremy mentioned on Mon. that you’re going to put together a guide on downloading data from Google Images. I haven’t been able to find it - did I miss it, or have you not finished it yet? Don’t mean to pressure you! Thanks
Hey @ricknta! Haven’t finished yet, will be ready today or tomorrow before midday. Will post it in this thread’s header!
Look at the post at the very top, there’s a link at the bottom with some description of how to download data off Google. I personally used this repository https://github.com/hardikvasa/google-images-download.
Thanks @dreambeats I did see that but wanted to make sure I wasn’t overlooking Francisco’s guide.
Hi folks, I’ve created a small (approx 50 images per class) dataset of galaxies according to their high-level morphology (spiral, barred, elliptical, irregular). However, the best I can do with them using the approach we learned in lesson 1 is about a 35% error rate with resnet50.
Doing the same for bears (grizzlies vs polars) gets me 0% on resnet34 after 3 cycles!
Is the difference down to the existing learned behaviour in the model? Would a larger dataset improve matters?
FWIW, I know there’s much prior art for classifying galaxies with ML that I’ve yet to understand, including a Kaggle competition and a great writeup from the winner. I’m looking forward to revisiting the problem properly once we learn multi-label classification later in the course.
Just keen to understand for now why transfer learning from resnet using a simple training set for one category of object has such different results from another.
Edit: Here’s my worked notebook.
Can anyone help me use custom data for classification? Jeremy used the URLs constant to load image data. What if we want to download some other dataset from the web and use it for classification? How could we do that?
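For anyone with the same question: one way to get an arbitrary dataset onto disk is to download and unpack the archive yourself, then point the library at the resulting folder. Here is a minimal sketch in plain Python, assuming the dataset is published as a .tgz/.tar.gz archive. The function name `download_and_extract` and the example URL are made up for illustration; this is not how fastai does it internally.

```python
import tarfile
import urllib.request
from pathlib import Path


def download_and_extract(url, dest_dir):
    """Download a .tgz/.tar.gz archive from `url` and unpack it under `dest_dir`.

    Only use this with archives you trust: extracting an archive writes
    whatever files it contains to disk.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / Path(url).name      # keep the archive's own file name
    if not archive.exists():             # skip the download if we already have it
        urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(str(archive), "r:*") as tar:  # "r:*" auto-detects compression
        tar.extractall(str(dest))
    return dest


# Hypothetical usage (the URL below is a placeholder, not a real dataset):
# path = download_and_extract("https://example.com/my_dataset.tgz", "data")
```

The returned folder can then be passed as `path` to whatever loader you use, provided the archive unpacks into one subfolder per class.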
Please share your notebooks so we can help you resolve this. The ‘gist it’ extension is perhaps the easiest way.
When I use
data = ImageDataBunch.from_folder(path), does it create the validation and test sets, or should they be created before I use the method?
I would be interested as well.
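On the validation set question: as far as I can tell from the fastai v1 docs, ImageDataBunch.from_folder looks for train and valid subfolders under path by default (the folder names are arguments), and newer 1.0.x releases also seem to accept a valid_pct argument, so please check your installed version. If your images only sit in one folder per class, you can make the split yourself first. Below is a rough sketch in plain Python; `split_train_valid`, the 20% default and the folder layout are my own assumptions, not part of fastai.

```python
import random
import shutil
from pathlib import Path


def split_train_valid(src_dir, dest_dir, valid_pct=0.2, seed=42):
    """Copy src_dir/<class>/* into dest_dir/train/<class> and dest_dir/valid/<class>.

    Returns how many files ended up in each split.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    rng = random.Random(seed)            # fixed seed so the split is reproducible
    counts = {"train": 0, "valid": 0}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_pct)   # first n_valid shuffled files go to valid
        for i, f in enumerate(files):
            split = "valid" if i < n_valid else "train"
            target = dest / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(str(f), str(target / f.name))
            counts[split] += 1
    return counts
```

Splitting per class keeps the class balance roughly the same in both folders, which matters with only around 50 images per class.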
Heh, having trouble installing gist-it, but will tackle that on GCP install forum.
In the meantime, here’s my notebook for classifying galaxies vs bears.
Repository is up! Please give feedback on problems you face.
I’m using GCP and lesson 1’s resnet-50 code’s batch size caused an “out of memory” error.
No problem, I changed the “bs” value, restarted, reran the necessary cells, but then got a new error:
ValueError: Expected more than 1 value per channel when training, got input size [1, 4096]
Actually, I’ve also had this problem with v0.7. My guess is that the total number of data samples modulo the batch size ends up with 1 remaining item to process at the end. OK, I’ll change my batch size again, but why can’t the library code be smarter about not ending up with 1 at the end?
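If the guess about the leftover item is right, it is easy to check before training. Here is a small sketch of that arithmetic; the helper names and the sample count of 1601 are made up for illustration.

```python
def last_batch_size(n_samples, batch_size):
    """Size of the final batch when n_samples are cut into batches of batch_size."""
    remainder = n_samples % batch_size
    return remainder if remainder else batch_size


def safe_batch_sizes(n_samples, candidates):
    """Keep only the candidate batch sizes whose final batch has more than one item."""
    return [bs for bs in candidates if last_batch_size(n_samples, bs) > 1]


# Hypothetical dataset of 1601 training images:
print(last_batch_size(1601, 32))                  # 1, a lone item in the last batch
print(safe_batch_sizes(1601, [64, 48, 32, 16]))   # [48]
```

Batch norm cannot compute statistics over a single item in training mode, which is why a final batch of size 1 triggers the error. If you build a PyTorch DataLoader yourself, passing drop_last=True discards the incomplete final batch instead.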
I faced the same issue when I tried to run the cell after changing to num_workers=0 in ImageDataBunch. I restarted the kernel and then it worked fine.
Thanks for the question and answer. I am getting the same error, and at first I suspected it was due to the batch size. I guess it should not stop and turn red, maybe green.
I had an issue with PyTorch not using the GPU due to an older driver version. I think I updated to .410 (not sure about this, but it is the latest from Nvidia), and it started using the GPU.
While running my notebook, I ran into the following error:
RuntimeError Traceback (most recent call last)
----> 1 learn.fit_one_cycle(5)
RuntimeError: CUDA error: out of memory
I am using n1-highmem-8 in GCP.
I have tried restarting the notebook kernel and also restarted my GCP instance, but the problem still exists.
Hey, I have fastai version:
'1.0.12', PyTorch version:
nvcc -V gives
release 9.0, V9.0.176
Is there any way to solve this?