I wanted to attach an RTF document detailing an issue I am having with download_images in Lesson 2 of the current course, but this interface doesn’t allow that, so I am copying and pasting the document contents here. Unfortunately this has lost all the highlighting I used to link it all together, sorry.
I am at a dead stop on Lesson 2, in my Copy of 02_production.ipynb notebook, at the cell that calls fastai.vision.utils.download_images. Here is the code:
print("Downloading " + str(len(valid_urls)) + " URLs")
download_images(dest, urls=results.attrgot('contentUrl'))
I have added defensive checks before this call to ensure that the results list is not None, has a size greater than zero, and that every URL it contains has been validated with an HTTP status of 200, so I am very confident in the data I am passing in.
When I execute the cell I get the following error, for which there appear to be no forum entries, so I am lost.
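A minimal sketch of one more check that could go in front of the call, assuming the list handed to download_images may differ from the list that was validated (the print counts valid_urls, while the call passes results.attrgot('contentUrl')). The helper names find_bad_urls and clean_urls and the sample list are hypothetical; nothing here is part of fastai.

```python
def find_bad_urls(urls):
    """Return the indices of entries that are None or not non-empty strings."""
    return [i for i, u in enumerate(urls) if not isinstance(u, str) or not u.strip()]

def clean_urls(urls):
    """Keep only non-empty string entries, preserving their order."""
    return [u for u in urls if isinstance(u, str) and u.strip()]

# Hypothetical sample standing in for results.attrgot('contentUrl')
sample = [
    "http://wallsdesk.com/wp-content/uploads/2017/01/Grizzly-Bear-Computer-Wallpaper.jpg",
    None,  # e.g. a search result that has no 'contentUrl' field
    "",
]

bad = find_bad_urls(sample)   # positions that would break url.split("?")
good = clean_urls(sample)     # what could safely be passed as urls=
print("bad indices:", bad, "usable URLs:", len(good))
```

If bad comes back non-empty on the real data, passing the cleaned list (or valid_urls itself) as urls= would be the obvious thing to try. As far as I understand, attrgot falls back to None for items that lack the requested key, which would produce exactly this kind of entry.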
Cell execution output:
http://wallsdesk.com/wp-content/uploads/2017/01/Grizzly-Bear-Computer-Wallpaper.jpg
Downloading 19 URLs
AttributeError Traceback (most recent call last)
in <cell line: 13>()
37 else:
38 print("Downloading " + str(len(valid_urls)) + " URLs")
---> 39 download_images(dest, urls=results.attrgot('contentUrl'))
40
11 frames
/usr/local/lib/python3.10/dist-packages/fastai/vision/utils.py in _download_image_inner(dest, inp, timeout, preserve_filename)
27 def _download_image_inner(dest, inp, timeout=4, preserve_filename=False):
28 i,url = inp
---> 29 url = url.split("?")[0]
30 url_path = Path(url)
31 suffix = url_path.suffix if url_path.suffix else '.jpg'
AttributeError: 'NoneType' object has no attribute 'split'
Here is the inner code for that execution:
def download_images(dest, url_file=None, urls=None, max_pics=1000, n_workers=8, timeout=4, preserve_filename=False):
    "Download images listed in text file `url_file` to path `dest`, at most `max_pics`"
    if urls is None: urls = url_file.read_text().strip().split("\n")[:max_pics]
    dest = Path(dest)
    dest.mkdir(exist_ok=True)
    parallel(partial(_download_image_inner, dest, timeout=timeout, preserve_filename=preserve_filename),
             list(enumerate(urls)), n_workers=n_workers, threadpool=True)
Calls =>
def _download_image_inner(dest, inp, timeout=4, preserve_filename=False):
    i,url = inp
    url = url.split("?")[0]
    url_path = Path(url)
    suffix = url_path.suffix if url_path.suffix else '.jpg'
    name = _get_downloaded_image_filename(dest, url_path.stem, suffix) if preserve_filename else str(uuid.uuid4())
    try: download_url(url, dest/f"{name}{suffix}", show_progress=False, timeout=timeout)
    except Exception as e: f"Couldn't download {url}."
The offending line is url = url.split("?")[0], which matches the cell output trace. Clearly url is None at that point.
The line
28 i,url = inp
seems to be the problem, and inp is passed into this function from this call:
parallel(partial(_download_image_inner, dest, timeout=timeout, preserve_filename=preserve_filename),list(enumerate(urls)), n_workers=n_workers, threadpool=True)
The called function expects inp to be passed in as a non-null value. However, the caller appears to be passing nothing, which I assume is the issue. Can someone confirm this?
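One way to test this assumption without fastai is a small stand-alone sketch that mimics only the first two lines of _download_image_inner. The function strip_query and the example.com URL are hypothetical stand-ins, not library code. It suggests that each inp produced by enumerate(urls) is an (index, url) pair, so the unpacking itself works, and that the error would only appear when a url inside the list is None.

```python
def strip_query(inp):
    # Mirrors "i,url = inp" followed by "url = url.split('?')[0]"
    i, url = inp
    return url.split("?")[0]

# Hypothetical list: one ordinary URL and one None entry
urls = ["http://example.com/bear.jpg?size=large", None]

outcomes = []
for pair in enumerate(urls):          # yields (0, url), (1, None)
    try:
        outcomes.append(strip_query(pair))
    except AttributeError as err:     # raised when url is None
        outcomes.append(f"item {pair[0]} failed: {err}")

for line in outcomes:
    print(line)
```

If this matches what happens in the notebook, the thing to inspect would be the contents of the list passed as urls= rather than the parallel call.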
This code is inside an auto-generated file, marked
AUTOGENERATED! DO NOT EDIT! File to edit: …/…/nbs/09b_vision.utils.ipynb.
and I have no idea how this is being manufactured.
Can someone assist? This is completely stopping me from finishing Lesson 2.
Thank you
Regards Jon (biguls)