In the first lesson, we’re encouraged to try a classification problem on another dataset to get a feel for how the library works. I was interested in the Google Landmarks dataset – I was thinking I could train a model on it and then see whether it could recognize photos from my recent vacation. It’s absolutely gigantic, though: 500GB just for the training set.
This seems like a common problem for people who are learning, so I’m curious what you all would recommend. Could I train on a random subsample of the dataset? Downscale the images first? Or do I just need to bite the bullet and sign up for a pricey Paperspace plan?
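For reference, this is roughly the kind of subsampling I had in mind – just a sketch with stdlib Python (the `subsample` helper and the file naming are made up, not anything from the dataset or the library):

```python
import random

def subsample(paths, fraction, seed=42):
    """Return a reproducible random subset of the image paths."""
    rng = random.Random(seed)  # fixed seed so the subset is stable across runs
    k = max(1, int(len(paths) * fraction))
    return rng.sample(paths, k)

# Hypothetical file listing standing in for the full Landmarks training set.
all_images = [f"train/{i:06d}.jpg" for i in range(1000)]

subset = subsample(all_images, 0.05)  # keep ~5% of the images
print(len(subset))
```

I’d then copy (or symlink) just that subset somewhere and point the dataloader at it – but I’m not sure whether a 5% slice still leaves enough examples per landmark class to train anything useful, which is part of what I’m asking.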