I think this can be done, and it would make for a really cool blog post.
The easiest, slightly naive way to do it would be to take multiple small steps away from the solution in various directions of the weight space and evaluate the cost at each one. Picturing the weight space in 3D, this would be like projecting a grid onto it from the z-direction and taking a measurement at each intersection within some distance of the solution.
We could sum the squares of the differences in cost relative to the solution, or sum their absolute values, and compare that number against the gap between train cost and validation cost.
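Here is a rough sketch of what that measurement could look like, just to make the idea concrete. It is not a tested recipe: `spikiness`, `loss_fn`, `radius`, `n_directions` and `n_steps` are all names I made up, and I swapped the full grid for random unit directions, since a real grid only stays tractable with a handful of weights.

```python
import numpy as np

def spikiness(loss_fn, w_star, radius=0.1, n_directions=20, n_steps=5,
              norm="l2", seed=0):
    """Probe the cost around a solution and summarize how much it changes.

    loss_fn is a hypothetical callable: flat weight vector -> scalar cost.
    w_star is the solution (flattened weights) found by training.
    """
    rng = np.random.default_rng(seed)
    w_star = np.asarray(w_star, dtype=float)
    base_cost = loss_fn(w_star)

    diffs = []
    for _ in range(n_directions):
        # Random unit direction (assumption: stands in for the grid lines).
        direction = rng.standard_normal(w_star.shape)
        direction /= np.linalg.norm(direction)
        # Small steps away from the solution, up to `radius`.
        for step in np.linspace(radius / n_steps, radius, n_steps):
            diffs.append(loss_fn(w_star + step * direction) - base_cost)

    diffs = np.array(diffs)
    if norm == "l2":
        return float(np.sum(diffs ** 2))   # sum of squared differences
    return float(np.sum(np.abs(diffs)))    # sum of absolute differences


# Toy check on two bowls: a steep one and a shallow one.
steep = lambda w: 10.0 * np.sum(w ** 2)
shallow = lambda w: 1.0 * np.sum(w ** 2)
w0 = np.zeros(3)
print(spikiness(steep, w0), spikiness(shallow, w0))
```

With a real network, `loss_fn` would have to load the flattened weights back into the model and evaluate the cost on a fixed batch or dataset. The step size probably also needs to be scaled to the magnitude of the weights, otherwise the numbers are hard to compare between models.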
We could then do two things:
- See if there are certain training methods that tend to end up in less spiky areas.
- More interestingly, see if indeed less spiky areas generalize better.
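For the second point, once we have one spikiness number plus train and validation cost per trained model, a first check could be a plain correlation. This is only a sketch under that assumption: `spikiness_vs_gap` is a hypothetical helper, and Pearson correlation is my choice here, not necessarily the right statistic.

```python
import numpy as np

def spikiness_vs_gap(spikiness_scores, train_costs, val_costs):
    """Pearson correlation between spikiness and the generalization gap.

    Each argument holds one value per trained model (hypothetical inputs).
    The gap is taken as validation cost minus train cost.
    """
    scores = np.asarray(spikiness_scores, dtype=float)
    gaps = np.asarray(val_costs, dtype=float) - np.asarray(train_costs, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; take the off-diagonal.
    return float(np.corrcoef(scores, gaps)[0, 1])
```

A value near 1 would suggest that spikier solutions come with a larger gap. It takes a reasonable number of runs before a number like this means much.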
I wonder what would be a good dataset to experiment with for this. I would be inclined to use MNIST, but I am not sure whether it is too simple. BTW, I have started playing around with Fashion-MNIST, which I think is generally perceived to be harder, but I do not know much about it.
Either way, I would be interested in working on this. Sounds like a really cool idea to explore.