Hi @fyber, there might be a misunderstanding here: the architecture of a model (i.e. the layers, convolutions, etc.) in Keras is stored as a JSON file, while the weights are stored as HDF5 (.h5), see here. The weights are what the model has learned. You need both to run VGG16.
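To illustrate, here is a minimal sketch of the two-file setup using the Keras VGG16 application model (the file names are just examples, and the exact imports can differ slightly between Keras versions):

```python
from keras.applications.vgg16 import VGG16
from keras.models import model_from_json

model = VGG16(weights='imagenet')

# Architecture (layers, convolutions, ...) -> JSON
with open('vgg16_architecture.json', 'w') as f:
    f.write(model.to_json())

# Learned weights -> HDF5
model.save_weights('vgg16_weights.h5')

# To use the model later you need BOTH files:
with open('vgg16_architecture.json') as f:
    restored = model_from_json(f.read())
restored.load_weights('vgg16_weights.h5')
```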
The VGG net used here has been trained on ImageNet data. The 1000 classes include dogs and cats, but are by no means specific to just those two classes. Moreover, there is food in the ImageNet data, so VGG will be good at recognizing that as well.
The big takeaway of the lessons on convolutional models, however, is that even though a model like VGG16 may not have been trained explicitly on your kind of data, it can still be used to classify it. This is called transfer learning, and the adaptation of the model to a new task is finetuning: adding a few layers on top that are specific to your classification problem, while leaving the VGG weights untouched (or training only a few of them). A rough sketch of this is shown below.
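Something like the following, assuming the Keras VGG16 application model (the layer sizes, the 10 food classes, and the Keras 2-style `Model(inputs=..., outputs=...)` call are illustrative assumptions, not the course's exact notebook code):

```python
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense

# Convolutional base with ImageNet weights, without the 1000-class top
base = VGG16(weights='imagenet', include_top=False,
             input_shape=(224, 224, 3))

# Freeze the VGG weights so only the new top layers learn
for layer in base.layers:
    layer.trainable = False

# New classification head, here for a hypothetical 10 food classes
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

model = Model(inputs=base.input, outputs=predictions)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(...) on your own food images then trains only the new head.
```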
The reason this works is visualized very well in the papers by Matt Zeiler and Jason Yosinski (see lesson 3, I believe). In short: the features learned by the model may be general enough (edges, gradients, patterns) to be applicable to your food dataset, even though the pictures look quite different. Your custom top layers merely pick the right mixture of those features relevant for detecting the different food classes.