I can download individual images, but when I run the code below it stalls. It has been running for at least 5 minutes with no output, and restarting the machine didn’t help. How do I get it working, and is there a way to see its progress? I’m running this on a Paperspace Gradient PyTorch
searches = 'forest','bird'
path = Path('bird_or_not')
from time import sleep
for o in searches:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    download_images(dest, urls=search_images(f'{o} photo'))
    sleep(10)  # Pause between searches to avoid over-loading server
    download_images(dest, urls=search_images(f'{o} sun photo'))
    sleep(10)
    download_images(dest, urls=search_images(f'{o} shade photo'))
    sleep(10)
    resize_images(path/o, max_size=400, dest=path/o)
Unfortunately, there’s no “easy” way to get progress bars in this case. Normally, you create a progress bar at the point where the iteration happens, which in your case is where each URL is downloaded. But that loop lives inside the download_images function, so you can’t reach it from your own code.
So, to get progress bars, you would need to change download_images a tiny bit. Luckily, this is not that complicated!
You only need to import fastprogress and wrap the iterator inside download_images in a progress bar.
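To show the idea in isolation, here is a minimal sketch that uses only the standard library. The names with_progress, fake_download and download_all are hypothetical stand-ins made up for illustration. This is not fastai’s download_images and it does no real downloading; it only shows what “wrapping the iterator” means.

```python
import sys
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def with_progress(items: Iterable[T], out=sys.stderr) -> Iterator[T]:
    """Hypothetical wrapper: yield each item, then report 'done/total'."""
    items = list(items)  # materialise so the total is known up front
    total = len(items)
    for done, item in enumerate(items, start=1):
        yield item
        # Runs once the caller has finished with `item` and asks for the next one
        print(f"\r{done}/{total}", end="", file=out, flush=True)
    print(file=out)  # finish the progress line

def fake_download(url: str) -> str:
    """Hypothetical stand-in for fetching one URL; no network access."""
    return url.rsplit("/", 1)[-1]

def download_all(urls: List[str]) -> List[str]:
    # The only change from a plain loop: iterate over with_progress(urls)
    return [fake_download(u) for u in with_progress(urls)]

names = download_all(["https://example.com/a.jpg", "https://example.com/b.jpg"])
```

If fastprogress is installed, its progress_bar should be usable in place of with_progress here, e.g. for u in progress_bar(urls), which is the kind of change the modified download_images needs.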
Here’s a full working example. If you don’t understand everything, feel free to just use the code. That’s totally fine when starting out!