Unsupervised feature learning with deep learning

Deep neural networks are now the state of the art for most supervised learning problems.
Consider the problem of feature learning: in the supervised setting, deep learning eliminates the need for hand-crafted features like SIFT. In the unsupervised case, however, neural architectures still cannot catch up with SIFT + Fisher vectors (reference: ICML 2017, "Unsupervised Learning by Predicting Noise"). Any insights on how to eliminate feature engineering in the unsupervised setting?

I don’t see why you’d need to use unsupervised learning to create features, when things like ImageNet activations and fine-tuning work so well?

Is unsupervised fine-tuning possible? With features like SIFT (+ Fisher vector encoding), we did not need labels. The ICML 2017 paper I referenced is an attempt to do something similar with deep learning.

I was talking about creating the features, not using them.

(You can use ImageNet features pretty much anywhere you use SIFT features.)