Hierarchical nature of ImageNet

Do any pretrained models capture the hierarchical nature of WordNet and ImageNet? For example, does VGG somehow encode hierarchies such as a bunk bed being a type of bed?

I find this question interesting. I don’t think VGG considers the hierarchical nature of the dataset, but there are many applications where the data are hierarchically structured, for example species in taxonomic classification. It would be interesting to learn about DL approaches for hierarchically structured data. I’ve been googling for papers on this but haven’t found anything; maybe I haven’t hit on the right keywords.

I don’t believe most CNNs would learn the hierarchies of the classes. However, I do believe that they would learn concepts/features that are shared across multiple classes.

For example, if the CNN finds a mattress in the image, it may increase the probabilities of mattress-related classes such as bunk bed, king size bed, etc.

Whether that information can be considered hierarchical is up for debate.

If you could create a loss function that included some metric of distance between two classes in a hierarchy, the network would probably learn some of the hierarchical nature of the data. It would need to have an easy derivative though…
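
To make that idea concrete, here is a minimal sketch of one possible form such a loss could take: the expected tree distance between the predicted class and the true class under the softmax output. Everything here is an assumption for illustration: the toy hierarchy, the class list, and the function names are made up, and this is not taken from any paper or library. The derivative with respect to logit k works out to p_k * (d_k - loss), so it is at least easy to differentiate.

```python
import math

# Hypothetical toy hierarchy (child -> parent); not taken from WordNet.
PARENT = {
    "bunk bed": "bed",
    "king size bed": "bed",
    "bed": "furniture",
    "armchair": "chair",
    "chair": "furniture",
    "furniture": None,
}
# Leaf classes the (hypothetical) classifier predicts, in logit order.
CLASSES = ["bunk bed", "king size bed", "armchair"]


def ancestors(node):
    """Return the path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a, b):
    """Number of edges between a and b through their lowest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {n: i for i, n in enumerate(path_b)}
    for i, n in enumerate(path_a):
        if n in depth_in_b:
            return i + depth_in_b[n]
    raise ValueError("nodes are not in the same tree")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_loss(logits, true_class):
    """Expected tree distance between the predicted and the true class."""
    probs = softmax(logits)
    dists = [tree_distance(true_class, c) for c in CLASSES]
    return sum(p * d for p, d in zip(probs, dists))


def hierarchical_loss_grad(logits, true_class):
    """Analytic gradient w.r.t. the logits: p_k * (d_k - loss)."""
    probs = softmax(logits)
    dists = [tree_distance(true_class, c) for c in CLASSES]
    loss = sum(p * d for p, d in zip(probs, dists))
    return [p * (d - loss) for p, d in zip(probs, dists)]
```

With this toy tree, a bunk bed image that the network mistakes for a king size bed is penalised less than one it mistakes for an armchair, because the two beds are only two edges apart while the armchair is four edges away. Whether training on a loss like this actually helps is an open question in this thread, not something the sketch demonstrates.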

This paper builds a model that is evaluated with (and does well on) just such a hierarchical loss function: http://papers.nips.cc/paper/5204-devise-a-deep-visual-semantic-embedding-model . We’ll be studying it in part 2.

I’m not aware of any papers that explicitly use a hierarchical loss function, however. I see no reason it shouldn’t work - would be an interesting project!