Hi, I’m having issues when attempting to download images. I’m using Google Cloud Platform. Using the provided code, I downloaded .csv files for three different labels:
urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el=>JSON.parse(el.textContent).ou);
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));
But the instructions in the notebook are a little vague. I’m assuming that after downloading those, I need to convert them to .txt files, right? I ask because the file variables are named things like urls_grizzly.txt. Is there some best practice for converting these files to .txt? When I try to download the images using the provided code, I get the following error:
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-11-e85756baeaa4> in <module>
----> 1 download_images(path/file, dest, max_pics=200)
/opt/anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in download_images(urls, dest, max_pics, max_workers, timeout)
192 def download_images(urls:Collection[str], dest:PathOrStr, max_pics:int=1000, max_workers:int=8, timeout=4):
193 "Download images listed in text file `urls` to path `dest`, at most `max_pics`"
--> 194 urls = open(urls).read().strip().split("\n")[:max_pics]
195 dest = Path(dest)
196 dest.mkdir(exist_ok=True)
/opt/anaconda3/lib/python3.7/codecs.py in decode(self, input, final)
320 # decode input (taking the buffer into account)
321 data = self.buffer + input
--> 322 (result, consumed) = self._buffer_decode(data, self.errors, final)
323 # keep undecoded input until the next call
324 self.buffer = data[consumed:]
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
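One thing I noticed: byte 0xff at position 0 is what a UTF-16 byte order mark (FF FE) starts with, so the file may have been saved as UTF-16 rather than UTF-8 somewhere along the way. Below is a minimal sketch for re-saving the URL list as UTF-8, assuming that is the cause. The function name `reencode_to_utf8` is my own and not part of fastai, and I have not confirmed this is the actual problem.

```python
import codecs
from pathlib import Path


def reencode_to_utf8(src, dst):
    """Rewrite the URL list at `src` as a UTF-8 text file at `dst`.

    Hypothetical helper (not part of fastai). The source encoding is
    guessed from the byte order mark; with no BOM, UTF-8 is assumed.
    """
    raw = Path(src).read_bytes()
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        text = raw.decode("utf-16")
    elif raw.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips a leading UTF-8 BOM.
        text = raw.decode("utf-8-sig")
    else:
        text = raw.decode("utf-8")

    # Keep one URL per line, dropping blank lines and stray "\r" characters.
    urls = [line.strip() for line in text.splitlines() if line.strip()]
    Path(dst).write_text("\n".join(urls) + "\n", encoding="utf-8")
    return len(urls)
```

With something like this, I would call `reencode_to_utf8(path/'urls_grizzly.csv', path/'urls_grizzly.txt')` (the .csv name is just an example) and then point download_images at the resulting .txt file.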
Any help would be greatly appreciated.