Understanding how batches are generated from directories

pdoerschuk · March 23, 2017, 7:59pm

I’m new to deep learning and am trying to understand how batches are generated from directories. I’ve read through the image modules in Keras, but some parts are hard to follow. I’m making several assumptions (some based on what I think would be desirable) and would appreciate knowing whether they are correct:

1. Each batch includes an equal number of examples from each class, even if the number of examples in the different classes varies
2. Examples in each batch are selected in the order in which they appear in the class directory (perhaps this is not the case if shuffle is true?)
3. If no shuffle, the generator starts over with the first example in the class folder after it has exhausted all the examples in the class folder
4. One randomly transformed version (rotated, etc.) of each selected image is added to the batch
5. Batches continue to be generated until all examples have been included (e.g., if batch size is 10, class 1 has 100 examples and class 2 has 50 examples, 20 batches will be created, each with 5 examples from class 1 and 5 examples from class 2; class 2 examples will be used twice as often as class 1)
6. Each epoch trains each batch for one iteration, until all training examples have been covered
7. Batches/examples are not stored but are generated on the fly, so with each epoch batches are generated again and may be different due to different transformations applied in item 4
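
To make the batch-count example above concrete, here is a minimal pure-Python sketch of the scheme I am assuming. It does not call Keras and is not taken from its source; `balanced_batches`, the class names and the file names are all hypothetical, and whether Keras actually behaves this way is exactly what I am asking.

```python
from itertools import cycle, islice


def balanced_batches(class_files, batch_size):
    """Yield batches with an equal share from each class (assumed scheme, not Keras)."""
    per_class = batch_size // len(class_files)
    # cycle() restarts a class from its first file once it is exhausted.
    iterators = {name: cycle(files) for name, files in class_files.items()}
    # One pass ends when the largest class has been fully covered (ceiling division).
    largest = max(len(files) for files in class_files.values())
    n_batches = -(-largest // per_class)
    for _ in range(n_batches):
        batch = []
        for name in class_files:
            # Take the next per_class files from this class, in directory order.
            batch.extend(islice(iterators[name], per_class))
        yield batch


# Hypothetical directory contents: 100 files in class1, 50 in class2.
class_files = {
    "class1": [f"class1/img{i}.jpg" for i in range(100)],
    "class2": [f"class2/img{i}.jpg" for i in range(50)],
}
batches = list(balanced_batches(class_files, batch_size=10))
print(len(batches))  # 20 batches of 10, with 5 files from each class
```

Under these assumptions every class 1 file appears once per pass and every class 2 file appears twice.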
Thanks very much for your help!!