I don’t think that adding more data can cause overfitting. Overfitting comes from weak generalization: low training loss stops reflecting low test loss, and the model performs poorly on unseen data. More data usually helps with that, not the other way around.
I think you need to look at your training/validation curves, and double-check how you split your data into training and validation subsets to make sure you aren’t doing any data snooping (e.g. fitting preprocessing on the full dataset before splitting, which leaks validation statistics into training).
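To illustrate the splitting point, here is a minimal sketch (assuming scikit-learn and a generic feature matrix `X` with labels `y`; the random data is just a placeholder): split first, then fit any preprocessing on the training split only.

```python
# Minimal sketch: avoid data snooping by splitting BEFORE preprocessing.
# X and y are placeholder data standing in for your real dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))          # placeholder features
y = rng.integers(0, 2, size=200)       # placeholder binary labels

# Split first, so no statistic from the validation set can leak into training
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)

print(X_train_s.shape, X_val_s.shape)  # (160, 5) (40, 5)
```

Fitting the scaler (or any feature selection, imputation, etc.) on the full dataset before splitting is one of the most common forms of snooping, and it makes validation loss look better than it really is.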