As you can see, after some training the validation loss starts to increase (like in the picture) or stops decreasing. When the validation loss is at its minimum, training further won't help and might even hurt your model.
The test set is there to estimate how well your model would work in practice (well, as close as we can get to knowing that).
Shuffling data between the validation and training sets during training defeats their purpose (you won't know when to stop training).
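To make the "when to stop" point concrete, here's a minimal early-stopping sketch: stop once the validation loss hasn't improved for `patience` epochs. The function name and the loss sequence are made up for illustration; in a real loop the losses would come from evaluating your model each epoch.

```python
# Hypothetical early-stopping logic: track the best validation loss
# seen so far and stop after `patience` epochs without improvement.
def early_stop_epoch(val_losses, patience=2):
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            since_best = 0
        else:
            since_best += 1
        if since_best >= patience:
            # Stop here; the best weights were `patience` epochs ago.
            return epoch
    return len(val_losses) - 1

# Validation loss dips, then rises -- like in the picture.
losses = [0.9, 0.7, 0.6, 0.65, 0.72, 0.8]
print(early_stop_epoch(losses, patience=2))  # stops at epoch 4
```

In practice you'd also keep a copy of the model weights from the best epoch and restore them when you stop.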
What you could try:
If the order of the data doesn't matter, you can set aside some fixed percentage of your data as the validation set. Of course it wouldn't change while training. It might or might not help, but it probably won't help a lot.
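A sketch of that fixed split, assuming a simple list-like dataset: shuffle once with a fixed seed before training, then never move examples between the two sets. The function name and fraction are illustrative, not from any particular library.

```python
import random

def train_val_split(data, val_frac=0.2, seed=0):
    """Hold out `val_frac` of the data once, before training starts."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)  # shuffle once, with a fixed seed
    n_val = int(len(data) * val_frac)
    val = [data[i] for i in idx[:n_val]]
    train = [data[i] for i in idx[n_val:]]
    return train, val

train, val = train_val_split(list(range(100)), val_frac=0.2)
print(len(train), len(val))  # 80 20
```

Because the seed is fixed, the split is the same every run, so no example ever leaks from validation into training.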