I have a structured dataset of around 100 GB, and I am using a DNN for classification. Because the dataset is so large, I cannot load all of it into memory for training, so I'll be reading the data in batches to train the model.
Now, the input to the network should be normalized, and for that I need the training set mean and SD of each feature. I have read many articles on normalization, and all of them assume the data fits in memory and simply compute the mean and SD per feature. For most real-world datasets, that is not the case.
So, how should one go about normalizing input features with the training set mean and SD while loading the data in batches and training the model?
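
For context, this is the kind of batch-wise accumulation I had in mind (a minimal sketch in NumPy, merging per-batch statistics as in Chan et al.'s parallel variance formula; `load_batches()` is a hypothetical generator that yields one 2-D array of training rows at a time):

```python
import numpy as np

def streaming_mean_std(batches):
    """Compute per-feature mean and std over an iterable of 2-D batches
    without holding the full dataset in memory."""
    n = 0
    mean = None
    m2 = None  # running sum of squared deviations from the mean
    for batch in batches:
        batch = np.asarray(batch, dtype=np.float64)
        b_n = batch.shape[0]
        b_mean = batch.mean(axis=0)
        b_m2 = ((batch - b_mean) ** 2).sum(axis=0)
        if mean is None:
            n, mean, m2 = b_n, b_mean, b_m2
        else:
            # merge the batch statistics into the running statistics
            delta = b_mean - mean
            total = n + b_n
            mean = mean + delta * (b_n / total)
            m2 = m2 + b_m2 + delta ** 2 * (n * b_n / total)
            n = total
    std = np.sqrt(m2 / n)  # population std; divide by (n - 1) for sample std
    return mean, std

# mean, std = streaming_mean_std(load_batches())
# during training: x_normalized = (x_batch - mean) / (std + 1e-8)
```

Is a first pass like this over the training data the usual approach, or is there a better way that avoids the extra pass?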