Tips for building large image datasets

Nice projects, and thanks for the details. I’ve been thinking about building a dataset for artist/style classification, in the hope that weights pretrained on it will produce more interesting embeddings for style transfer. I’ll give your method a try. You mentioned that searching by date range reduces duplicate photos. The google_images_download CLI supports site-specific searches, so I’ll probably just target Wikimedia. Thanks for the great suggestions here.
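
Here is a rough sketch of how I might combine the two ideas: split a date span into consecutive windows and build one set of arguments per window, all restricted to a single site. The helper name `build_queries`, the 30-day window, the example keyword, and the `commons.wikimedia.org` domain are my own assumptions. The argument names (`keywords`, `specific_site`, `time_range`, `limit`) are the ones I remember from the google_images_download README, so double-check them against the version you have installed.

```python
import json
from datetime import date, timedelta


def build_queries(keyword, site, start, end, window_days=30, limit=100):
    """Split [start, end] into consecutive windows and return one
    argument dict per window (hypothetical helper, not part of the library)."""
    queries = []
    cursor = start
    while cursor <= end:
        # Each window covers window_days days, clipped to the overall end date.
        window_end = min(cursor + timedelta(days=window_days - 1), end)
        queries.append({
            "keywords": keyword,
            "specific_site": site,
            "limit": limit,
            # Assumed format: a JSON string with MM/DD/YYYY dates.
            "time_range": json.dumps({
                "time_min": cursor.strftime("%m/%d/%Y"),
                "time_max": window_end.strftime("%m/%d/%Y"),
            }),
        })
        cursor = window_end + timedelta(days=1)
    return queries


if __name__ == "__main__":
    for args in build_queries("claude monet painting", "commons.wikimedia.org",
                              date(2018, 1, 1), date(2018, 3, 1)):
        # Each dict could then be handed to the downloader, e.g.
        # google_images_download.googleimagesdownload().download(args)
        print(args["time_range"])
```

Nothing here touches the network; it only prepares the arguments, so the window size and site are easy to adjust before running any downloads.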