Cannot download https://s3.amazonaws.com/fast-ai-imageclas/oxford-iiit-pet

This is probably user error, but I cannot download the Oxford-IIIT Pet dataset from this URL: https://s3.amazonaws.com/fast-ai-imageclas/oxford-iiit-pet.

I was able to download the dataset earlier today.

The error I get is:

~/SageMaker/envs/fastai/lib/python3.6/gzip.py in read(self, size)
    480                 break
    481             if buf == b"":
--> 482                 raise EOFError("Compressed file ended before the "
    483                                "end-of-stream marker was reached")
    484 

EOFError: Compressed file ended before the end-of-stream marker was reached

Likewise, I am unable to wget the link directly: https://s3.amazonaws.com/fast-ai-imageclas/oxford-iiit-pet

Does anyone have any pointers?

You first have to delete the partially downloaded dataset. In v1.0.11 I found that datasets are downloaded to /home/<user>/.fastai/data/ (I'm not sure, so please check this path). Delete the partially downloaded Oxford dataset there and rerun the cell; it will download again.
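
In case it helps, here is a minimal sketch of that cleanup in Python. The helper name remove_partial_download is hypothetical (it is not part of fastai), and the data folder and dataset name you pass in are assumptions you should verify on your own machine first. It deletes the entry called `name` plus any sibling file such as `name.tgz`, using only the standard library.

```python
import shutil
from pathlib import Path


def remove_partial_download(data_dir, name):
    """Delete `name` and any `name.<ext>` entry directly under `data_dir`.

    Hypothetical helper, not a fastai function. Returns the removed paths.
    """
    data_dir = Path(data_dir).expanduser().resolve()
    removed = []
    if not data_dir.is_dir():
        # Nothing to clean up if the data folder does not exist.
        return removed
    for entry in sorted(data_dir.iterdir()):
        if entry.name == name or entry.name.startswith(name + "."):
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)  # extracted (possibly incomplete) folder
            else:
                entry.unlink()  # archive file or symlink
            removed.append(entry)
    return removed
```

Assuming the location mentioned above is right for your install, the call would look like remove_partial_download("~/.fastai/data", "oxford-iiit-pet"). After that, rerunning the download cell should fetch a fresh copy.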

That works perfectly, thank you!

It would be great if there was some sort of help message suggesting this. This is tricky to debug for new users :slight_smile:

Yeah, you can use the PosixPath method resolve() to get the absolute path of where the files were downloaded. Then go to that directory and delete it, and rerun the cell. I had this same issue with v1.0.11.
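
A rough sketch of the resolve() step, using only the standard library. The function name absolute_location is made up for illustration, and the sketch assumes you already have the download path as a string or Path object (for example, whatever your notebook's download call returned).

```python
from pathlib import Path


def absolute_location(path):
    """Return the absolute, symlink-resolved form of `path`.

    Hypothetical helper: expands a leading "~" and then calls
    Path.resolve(), which makes the path absolute.
    """
    return Path(path).expanduser().resolve()
```

Printing absolute_location(path) in the notebook shows the exact folder to delete before rerunning the cell.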