The images are of 5 types of flowers (tulips, sunflowers, etc.), sorted into 5 different folders, one per type.
All tulip pictures are in the tulip folder, all rose pictures are in the rose folder, and so on.
What I want to know is whether there is a way to randomly split this group of pictures into train (80%) and test (20%) data sets.
(To just get it running I split them manually and arbitrarily, but I want to know if there’s a way to do it in code.)
I know how to split data when it is structured, but I can’t find any resources on doing this for images.
The first thing I usually do is turn the folders of images into a structured index, so they can be managed more easily as a dataframe. Essentially, put all the images in a single folder, with a CSV file labelling each image with the folder it was previously in. There are many ways to do this, but here is one that has been posted on this forum. You can then split this index into whatever sets you need using the structured-data techniques you already know.
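
To make the idea concrete, here is a minimal sketch using only the Python standard library. It is not the forum post mentioned above, just one possible approach under some assumptions: the images live in a layout like `flowers/tulip/xxx.jpg`, the function names (`build_index`, `split_index`, `write_csv`) and the extension list are made up for illustration, and the index stores each image's path rather than copying everything into a single folder. The split is done per class, so each flower type ends up roughly 80/20.

```python
import csv
import random
from pathlib import Path

# Assumption: which file extensions count as images. Adjust to your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def build_index(root):
    """Return a list of (image_path, label) pairs, where the label is
    the name of the sub-folder the image was found in."""
    rows = []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_EXTENSIONS:
                rows.append((str(image_path), class_dir.name))
    return rows


def split_index(rows, test_fraction=0.2, seed=0):
    """Randomly split rows into (train, test), class by class, so every
    label keeps roughly the same train/test proportion."""
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    by_label = {}
    for path, label in rows:
        by_label.setdefault(label, []).append((path, label))

    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def write_csv(rows, csv_path):
    """Write the (path, label) rows to a CSV file with a header line."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "label"])
        writer.writerows(rows)
```

Usage would look something like `rows = build_index("flowers")`, then `train, test = split_index(rows)`, then `write_csv(train, "train.csv")` and `write_csv(test, "test.csv")`. The resulting CSV files can be loaded into a dataframe, and if you prefer to split with the tools you already use for structured data, you can write the full index to one CSV and split that instead.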