How to achieve this difficult task

I have split each training sample into groups, each of size X. Because of GPU memory limits, I can currently pass only one randomly selected group at a time to the model. But my loss could be better if I computed it over the averaged predictions of all the groups, rather than over the prediction of a single group each time. Assume there are N such groups, each of size X.

Pseudo logic:
(model(g1) + model(g2) + ... + model(gN)) / N -> loss
where g1, g2, ..., gN belong to a single training sample.

You’ll have to define a custom model and a custom fit function. Once you do that, you can run the model on each group separately, collect the outputs, and compute the derivatives of the combined loss with respect to each parameter you care about.
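For what it's worth, here is a minimal PyTorch sketch of one way such a custom step could look. It is not taken from the course. It relies on the chain rule: the gradient of loss(mean of predictions) with respect to the parameters is the sum, over groups, of each group's backward pass seeded with (dL/d avg) / N. So only one group's activations need to sit in memory at a time, at the cost of two forward passes per group. The names `GroupModel` and `averaged_group_step` are made up for illustration, and the sketch assumes the model gives the same output in both passes (no dropout or other randomness).

```python
import torch
import torch.nn as nn

class GroupModel(nn.Module):
    """Hypothetical toy model: maps one group (X, 4) to one prediction (1, 2)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 2))

    def forward(self, g):
        # Pool over the X items of the group to get a single prediction.
        return self.net(g).mean(dim=0, keepdim=True)

def averaged_group_step(model, groups, target, loss_fn, optimizer):
    """One optimizer step on loss_fn(mean_i model(g_i), target)."""
    n = len(groups)

    # Pass 1: forward every group without building a graph,
    # so no activations are kept around.
    with torch.no_grad():
        avg_pred = sum(model(g) for g in groups) / n

    # Gradient of the loss with respect to the averaged prediction only.
    avg_pred.requires_grad_(True)
    loss = loss_fn(avg_pred, target)
    (grad_avg,) = torch.autograd.grad(loss, avg_pred)

    # Pass 2: redo each group with grad enabled and backpropagate
    # grad_avg / n through it; parameter gradients accumulate in .grad.
    optimizer.zero_grad()
    for g in groups:
        model(g).backward(grad_avg / n)
    optimizer.step()
    return loss.item()

# Example usage on random data: N = 3 groups of size X = 5.
torch.manual_seed(0)
model = GroupModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
groups = [torch.randn(5, 4) for _ in range(3)]
target = torch.randn(1, 2)
loss_value = averaged_group_step(model, groups, target,
                                 nn.functional.mse_loss, optimizer)
```

If all N groups do fit in memory together, the simpler route is to average the outputs directly and call `backward()` once on the loss. The two-pass version above should give the same gradients while holding only one group's graph at a time.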
I don’t know how important it is for you to be able to do this, but if you’re relatively new to deep learning and/or still trying to grasp the concepts from Part 1 of the course, I’d recommend waiting until you have a handle on the broader concepts.
Otherwise, if you wish to learn how to do this sooner, head over to lessons 8 and 9 from Part 2. They have a very good explanation of how to do all of this.
You can also consider waiting for this year’s course, to get introduced to the latest techniques rather than outdated concepts.
Hope this helps. All the best!
Stay safe