I was running the lesson 2 download notebook, and when I reached the lines using the ImageCleaner, I ran the statement ‘ImageCleaner(ds, idxs)’ and got the following error.
It’s easy enough to fix as we defined path earlier, so I added path=path, but I didn’t know if this was something that needed to be fixed so thought I’d report it. Cheers to anyone who can help.
I'm using the path I defined earlier in the notebook. I'm building a dataset of birds:
folder = 'amkestrel'
file = 'urlsamkestrel.txt'
path = Path('Raptors')
dest = path/folder
It seems that with path = Path('data/bears') it should work just as well, or at least throw a meaningful error. But it freezes for me… Are you also working in Colab, by the way?
So, if it’s the path, where is your ‘Raptors’ path relative to your dataset?
The dataset is images in multiple folders, for example folder = 'amkestrel',
where the structure is
Raptors/amkestrel
Raptors/redtail
Raptors/osprey
etc…
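To double-check a layout like this, a small sketch along the following lines lists each class folder under the root and counts the files in it. It uses only pathlib; the temporary directory and the placeholder file names are made up for the demonstration, so point root at your own Raptors folder instead.

```python
from pathlib import Path
import tempfile

def count_per_class(root):
    """Return {class_folder_name: number_of_files} for each subfolder of root."""
    root = Path(root)
    return {
        d.name: sum(1 for f in d.iterdir() if f.is_file())
        for d in sorted(root.iterdir())
        if d.is_dir()
    }

# Demo on a throwaway directory that mimics the structure above
# (the file names and counts are placeholders, not real data).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Raptors"
    for cls, n in [("amkestrel", 3), ("redtail", 2), ("osprey", 1)]:
        (root / cls).mkdir(parents=True)
        for i in range(n):
            (root / cls / f"{i:04d}.jpg").touch()
    counts = count_per_class(root)
    print(counts)
```

If a class folder is missing or unexpectedly empty, it shows up here before you get as far as the cleaner.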
I'm not sure if I'm doing it right, but it worked up until I wanted to improve my results and use the clean-up tool.
I was able to get to:
Where the most confused were some difficult images, errors, or baby birds.
NOTE: maybe don't listen to me. It worked to get past the cleaner step, but I'm having errors in the next step, so my workaround is temporary and not the right solution.
It was about 1000 images.
The first time it hung.
I stopped and started the kernel again.
I was working on a cheaper GPU, then switched to a faster one.
Warning: the clean-up can take time. I wasn't sure if I could exit and restart somewhere again.
The same goes for the next step, the duplicates.
It took longer than I thought to actually click through all the batches. It would be nice to have an index of where you are in the top losses, and to be able to restart the clean-up if you have to leave and shut down the GPU.
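As far as this thread goes, the cleaner itself does not offer that kind of bookmark, but one could track the position by hand. The sketch below is only an illustration: load_position, save_position, batches_from and the progress.json file are hypothetical helpers, not part of fastai. It stores how many items have been reviewed and, after a restart, continues from that point.

```python
import json
import tempfile
from pathlib import Path

def load_position(progress_file):
    """Return how many items were already reviewed (0 if nothing saved yet)."""
    p = Path(progress_file)
    if not p.exists():
        return 0
    return json.loads(p.read_text())["reviewed"]

def save_position(progress_file, reviewed):
    """Persist the number of reviewed items."""
    Path(progress_file).write_text(json.dumps({"reviewed": reviewed}))

def batches_from(idxs, start, batch_size):
    """Yield (new_position, batch) pairs, beginning at index `start`."""
    for i in range(start, len(idxs), batch_size):
        batch = idxs[i:i + batch_size]
        yield i + len(batch), batch

# Demo with made-up indices standing in for the top-loss indices.
idxs = list(range(10))
with tempfile.TemporaryDirectory() as tmp:
    progress = Path(tmp) / "progress.json"

    # First session: review two batches of 3, then "shut down".
    for n, (pos, batch) in enumerate(batches_from(idxs, load_position(progress), 3)):
        save_position(progress, pos)
        if n == 1:
            break

    # Second session: pick up where the first one stopped.
    resumed_at = load_position(progress)
    rest = [batch for _, batch in batches_from(idxs, resumed_at, 3)]
    print(resumed_at, rest)
```

The remaining indices could then be passed to the cleaner in place of the full list, assuming it accepts a shortened idxs, which I have not verified.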
That’s what I also end up doing: restarting the kernel. Maybe I should wait for a bit longer, but so far I never managed to get the result.
How much time (at least the order of magnitude) did it take for you when it worked?
Well, so far it’s taking 16 minutes without any sign of progress. And I don’t even know what it is spending all its time on…
And Colab GPU apparently has 20GB of memory, although I’m not certain of that.
@AlexeyRB I am working in Paperspace Gradient, trying a model on fruits, and was able to recreate your problem. I ran “ImageCleaner(ds, idxs, Path('data/fruits'))” and it hung for at least 5 minutes and caused weird errors. I had the filesystem open in another window, and when I tried to change folders I got “server error: forbidden”. Also, the top right corner of my notebook was displaying a red “Not Connected” message, a yellow “Forbidden”, and a white “Not Trusted”.
No idea what this means, but since we defined the path earlier, we can just use the line
“ImageCleaner(ds, idxs, path=path)” and it works fine for me. It must be something about the Path constructor you used. Let me know if that helps.
I would be surprised if it works, since path is defined exactly as Path(‘data/bears’). But I’ll try. Will take a few minutes to run the whole notebook again after restarting…
Apparently, ipywidgets like ImageCleaner do not work properly in Colab. I gathered this from searching around the forums for other problems I’m having with ImageCleaner.
For future reference, you may want to try searching the Platform: Colab thread for any issues you have.
That can be… but then no-one working with Colab should be able to run this line. I’ll check that tomorrow. Hope that’s not true: would be a pity to move to a different platform just because of these kinds of issues.
It’s true. No one using Colab seems to be able to run that line. If you go to that thread, open the search box, check “Search this topic,” and then search for “ImageCleaner”, you’ll see.
Hey, sorry for the false lead, that’s really frustrating. I hadn’t looked back at the declaration earlier in the notebook so I didn’t see that your way was already used earlier or it would’ve been obvious it wouldn’t work.
Very weird that it broke everything for me when I had no problems the first time. I rebooted the notebook, ran it again with the original code (path=path) and it worked fine. Paused for about 60 seconds before loading, 1000 images each for 4 classifications. Very strange. Please post here if you find a solution.
I’m able to see the images by simple indexing, ds.x[idxs[0]]. But with a large dataset, comparing images visually becomes cumbersome.
Please share how you retrieved the absolute paths.
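One possible way, sketched under assumptions: if the dataset exposes its file list as a sequence of Path objects (I believe ds.x.items does this in fastai v1, but check it in your own notebook), the indices can be mapped to absolute paths with pathlib. The items and idxs below are made-up stand-ins so the snippet runs on its own.

```python
from pathlib import Path

# Stand-ins for the notebook objects:
# `items` plays the role of the dataset's file list (assumed: ds.x.items),
# `idxs` plays the role of the indices returned for the top losses.
items = [
    Path("data/bears/teddys/0001.jpg"),
    Path("data/bears/grizzly/0002.jpg"),
    Path("data/bears/black/0003.jpg"),
]
idxs = [2, 0]

# resolve() turns a relative path into an absolute one,
# relative to the current working directory.
abs_paths = [items[i].resolve() for i in idxs]

for p in abs_paths:
    print(p)
```

With the paths as text, the files can be compared or filtered without looking at every image.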