URLs.some_url is just a URL string, and untar_data needs a URL to download the data and then extract it. Since you already have the data, it’s just one line of code to extract it:
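For reference, a minimal sketch of that extraction using Python's standard `tarfile` module (the archive and directory names here are made up for the demo; in practice you'd point `tarfile.open` at the `.tgz` you already downloaded):

```python
import tarfile
import tempfile
from pathlib import Path

# Demo setup: build a tiny .tgz so this example is self-contained.
# In practice you would already have the archive on disk.
tmp = Path(tempfile.mkdtemp())
(tmp / 'data.txt').write_text('hello')
archive = tmp / 'images.tgz'
with tarfile.open(archive, 'w:gz') as tar:
    tar.add(tmp / 'data.txt', arcname='data.txt')

# The one-liner: extract an already-downloaded .tgz archive.
dest = tmp / 'extracted'
with tarfile.open(archive, 'r:gz') as tar:
    tar.extractall(path=dest)

print((dest / 'data.txt').exists())  # -> True
```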
If I’m understanding you correctly, I can pass a URL (as a string) to untar_data() and it will download and extract; the argument does not need to be of the class URLs. When I tried that previously, though, I received the error “not a zipped file” even though it was a .tgz file. Any idea why this would happen? I used URLs taken from the fastai website.
It sounds like a workaround is to download the data and then use tarfile.open(), as you suggest. But shouldn’t it be possible to use untar_data to both download and extract? I understood the docs to be saying that the argument needed to be the fastai class URLs.
Just came across this thread while looking for a solution myself.
(This may not be particularly useful since it has been a long time since your question.)
Here’s an example of how one could use untar_data when the dataset comes from an external source, using fastai’s download_data helper to fetch the flowers dataset:
```python
url = 'http://download.tensorflow.org/example_images/flower_photos.tgz'
path = download_data(url)
path.as_posix()
# '/home/gg/.fastai/archive/flower_photos.tgz'

data = untar_data(path.as_posix())  # or pass str(path)
data
# Path('/home/gg/.fastai/data/flower_photos')
```