What are CNNs not good for?

james_goldfarb · August 16, 2017, 5:50pm

We have seen many examples of excellent CNN performance for segmentation, image classification, style transfer, etc.
I wanted to start this thread to explore projects where CNNs fail or are not a good choice.

My current understanding is that a CNN is a great choice for computer vision tasks.
So, can an end-to-end CNN perform a Fourier Transform?
An FT is close to a vision task. My idea is that frequencies are features/channels.

I’ve searched and not found any references for Neural Networks performing FTs.

The universal approximation theorem states that a feedforward NN can approximate any continuous function on a bounded domain to arbitrary accuracy, given enough hidden units.
Is a feed forward NN a better choice?
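
One way to make the question concrete: the discrete FT is a linear map, so in principle a single fully connected layer with no activation and no bias can represent it exactly. Below is a minimal NumPy sketch of that observation, not a trained network. The helper names dft_matrix and dft_as_real_layer are made up for illustration, and stacking the real and imaginary parts into one real output vector is an assumed encoding.

```python
import numpy as np

def dft_matrix(n):
    """Return the n x n complex DFT matrix W with W[k, m] = exp(-2j*pi*k*m/n)."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n)

def dft_as_real_layer(n):
    """Weights of a bias-free linear layer (hypothetical encoding) that maps a
    real signal of length n to [Re(X), Im(X)] stacked into a length-2n vector."""
    w = dft_matrix(n)
    return np.vstack([w.real, w.imag])  # shape (2n, n)

rng = np.random.default_rng(0)
x = rng.standard_normal(16)

weights = dft_as_real_layer(16)
y = weights @ x  # "forward pass" of the linear layer

# Compare against NumPy's FFT, which uses the same sign convention
x_fft = np.fft.fft(x)
max_err = np.max(np.abs(y - np.concatenate([x_fft.real, x_fft.imag])))
print(max_err)
```

So representing the FT is not the hard part; whether gradient descent finds these weights from data, and whether a convolutional layer with small local kernels can do the same, are separate questions.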

james_goldfarb · August 16, 2017, 5:54pm

Replying to my own post…
CNNs and NNs are probably not good choices for the FT.
But, if I had a NN or CNN that was trained for the FT, I could use transfer learning or other techniques on that CNN.
Then it becomes interesting.

So what do people think?

msp · August 16, 2017, 7:44pm

It sounds like you want to learn an approximate Fourier Transform; why would you do that when there are known efficient algorithms (FFT) that provide the exact answer?

I think that’s a great idea – you should preprocess your input images with FFT, and feed the result as input to your CNN.

james_goldfarb · August 16, 2017, 8:41pm

One answer to why is to do a better FT (or, at this point, I should say a better inverse FT).
In magnetic resonance imaging, we collect frequency domain data, then use an IFT to reconstruct an image.
A CNN (that has input of frequency data and output of spatial data) could be easily trained to be immune to imperfections in the input (frequency data).
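
As a rough illustration of the reconstruction step and of why imperfections matter, here is a NumPy sketch using a synthetic rectangle as a stand-in for an MRI image. The phantom, the array size, and the injected spike value are all assumptions, not real scanner data. A single corrupted frequency sample changes every pixel of the plain IFT reconstruction, which is the kind of artifact one would hope a trained network could suppress.

```python
import numpy as np

# Synthetic "phantom": a bright rectangle on a dark background (assumed data)
image = np.zeros((32, 32))
image[8:24, 12:20] = 1.0

# Stand-in for the acquired frequency-domain (k-space) data
kspace = np.fft.fft2(image)

# Standard reconstruction: inverse 2D FT of the frequency data
recon = np.fft.ifft2(kspace).real

# Corrupt one k-space sample (a "spike") and reconstruct again
bad_kspace = kspace.copy()
bad_kspace[5, 7] += 200.0
recon_bad = np.fft.ifft2(bad_kspace)

# ifft2 divides by the number of pixels, so the spike adds an error of
# magnitude 200 / (32 * 32) to every pixel of the reconstruction
err = np.abs(recon_bad - image)
print(err.min(), err.max())
```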

I’m well aware that one can do the IFT and use the resulting image as input to a CNN, then train the CNN to change/restore the MRI image like we would a photo.

But of course, that is too easy… If I have an end-to-end CNN (that I can train to perform an approximation to an IFT), I could train it to alter the IFT process to be better.

msp · August 16, 2017, 9:38pm

I see! In that case I would be very curious to hear the result.

machinethink · August 17, 2017, 8:10am

Note that convolutions (in CNNs) themselves are often implemented using FFTs.
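
For anyone who wants to check the idea, here is a small NumPy sketch of the convolution theorem in 2D: zero-pad, multiply the spectra, and transform back. It only illustrates the math and is not how any particular framework or cuDNN does it; the function names are hypothetical. Note also that CNN layers typically compute cross-correlation, which is the same operation with a flipped kernel.

```python
import numpy as np

def fft_conv2d_full(image, kernel):
    """Full 2D linear convolution via FFT (zero-padded to avoid wrap-around)."""
    shape = (image.shape[0] + kernel.shape[0] - 1,
             image.shape[1] + kernel.shape[1] - 1)
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    return np.fft.irfft2(spectrum, s=shape)

def direct_conv2d_full(image, kernel):
    """Reference implementation: sum of shifted, scaled copies of the image."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h + kh - 1, w + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + h, j:j + w] += kernel[i, j] * image
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 10))
kernel = rng.standard_normal((3, 3))

via_fft = fft_conv2d_full(image, kernel)
via_sum = direct_conv2d_full(image, kernel)
max_diff = np.max(np.abs(via_fft - via_sum))
print(via_fft.shape, max_diff)
```

For small kernels like 3x3 the direct sum is cheap, so the FFT route tends to pay off mainly for larger kernels.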

msp · August 17, 2017, 8:56am

I would not have guessed that, thanks for pointing it out.

I’ve since googled around, and found this interesting: [1601.06815] Very Efficient Training of Convolutional Neural Networks using Fast Fourier Transform and Overlap-and-Add

james_goldfarb · August 17, 2017, 1:47pm

Is this the case in TensorFlow and PyTorch?
If this were the case, I would think that periodicity would be assumed and there would be no issues (or different issues) at the borders (i.e. border_mode = valid or same).

In lesson 9, Jeremy uses “reflection” at the edges. I would think that this would be unnecessary if the convolutions were performed using an FFT.
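
A small 1D NumPy sketch of the border question, under the assumption that nothing is padded: multiplying same-length spectra gives a circular (periodic) convolution whose first outputs are contaminated by wrap-around from the far edge. Zero-padding both inputs to len(signal) + len(kernel) - 1 before the FFT recovers the ordinary linear convolution, so an FFT-based implementation that pads this way would still leave the border mode as a real choice.

```python
import numpy as np

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
kernel = np.array([1.0, 1.0, 1.0])

# No extra padding: the kernel is only padded to the signal length, so the
# product of spectra corresponds to a circular (periodic) convolution
n_circ = len(signal)
circular = np.fft.irfft(
    np.fft.rfft(signal, n_circ) * np.fft.rfft(kernel, n_circ), n_circ
)

# Zero-pad both inputs to the full output length: ordinary linear convolution
n_full = len(signal) + len(kernel) - 1
linear = np.fft.irfft(
    np.fft.rfft(signal, n_full) * np.fft.rfft(kernel, n_full), n_full
)

print(np.round(circular, 6))  # first two entries include wrapped-around values
print(np.round(linear, 6))    # matches np.convolve(signal, kernel)
```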

machinethink · August 17, 2017, 2:20pm

I don’t know about PyTorch, but TensorFlow uses CUDA and cuDNN under the hood if you’re running on the GPU (as do many other training packages).

Here is an (older) paper about how convolutions are implemented in cuDNN, which mentions FFTs (but does not actually appear to use them): https://arxiv.org/pdf/1410.0759.pdf

The cuDNN paper refers to this one about implementing convolutions using FFTs: https://arxiv.org/pdf/1312.5851.pdf

james_goldfarb · August 17, 2017, 3:04pm

When I first started, I was building TensorFlow from source. At that stage, there was an option to use cuDNN. Now it’s so easy to get up and running with pip or conda that I’m not really sure whether I am using cuDNN or not.

lateralplacket · August 19, 2017, 7:01pm

One reason that comes to mind you might want to do this is the “phase problem”: certain physical experiments such as X-ray crystallography essentially perform a Fourier transform on the object that is being studied, but lose phase information in the process (only intensity information remains). That means the inverse transform - reconstructing molecular shapes given X-ray diffraction patterns - is difficult. In the past, tricks such as “anomalous dispersion”, often involving additional experiments, were used to reconstruct the original “real-space” geometry.

A quick search does turn up somebody applying DL to the phase problem. I didn’t read the paper but I imagine they are doing something like an (inverse) FT in their net…

machinethink · August 20, 2017, 10:00am

I guess we’re way off topic now, but doesn’t the complex (or imaginary) part of the FFT give you the phase? (I should probably brush up on my Fourier analysis.)

lateralplacket · August 20, 2017, 2:15pm

That’s right, but that’s not what’s measured in a physical diffraction experiment. In the experiment what you normally have is not the full transform, but rather just the X-ray spot intensities (or the intensities of some other wave). That’s the squared magnitude of the transform (the square of the amplitude), which lacks the phase information.

Caveat: it’s a long time since I studied this too!

Some nice pictures to illustrate the problem:

http://www.ysbl.york.ac.uk/~cowtan/fourier/fourier.html

Looking at those examples (especially the duck/cat examples) you can see why the “inverse problem” (reconstructing molecular structure from X-ray diffraction pattern) is not trivial.
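
In the spirit of those pictures, here is a NumPy sketch of what goes wrong when only intensities are kept. Two random arrays stand in for the objects (an assumption; real diffraction data would look nothing like this). Combining the amplitudes of A with the phases of B gives something with exactly A's intensities that nevertheless looks like B, so the intensities alone say little about which object you had.

```python
import numpy as np

rng = np.random.default_rng(0)
obj_a = rng.standard_normal((32, 32))  # stand-in for object A
obj_b = rng.standard_normal((32, 32))  # stand-in for object B

f_a = np.fft.fft2(obj_a)
f_b = np.fft.fft2(obj_b)

# What a detector would record for A: squared magnitude, phases discarded
intensity_a = np.abs(f_a) ** 2

# Hybrid: amplitudes of A combined with phases of B
hybrid = np.fft.ifft2(np.abs(f_a) * np.exp(1j * np.angle(f_b))).real
intensity_hybrid = np.abs(np.fft.fft2(hybrid)) ** 2

def correlation(u, v):
    """Pearson correlation between two arrays, flattened."""
    return float(np.corrcoef(u.ravel(), v.ravel())[0, 1])

corr_with_a = correlation(hybrid, obj_a)
corr_with_b = correlation(hybrid, obj_b)
print(corr_with_a, corr_with_b)
```

The hybrid shares A's intensities but correlates much more strongly with B, which is why recovering the phases is the hard part.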

james_goldfarb · August 22, 2017, 6:31pm

I found a recent reference for the FT.

https://arxiv.org/abs/1704.08841

They use a fully connected network, followed by autoencoder/decoder.

saladi · November 7, 2018, 10:47pm

@lateralplacket, sorry for digging up this old thread, but I’m interested in taking a closer look at that work you had found applying DL to the phase problem in crystallography. I’ve tried searching this in a number of ways but all to no avail. If you happen to remember how/where you found this work, would you be able to let me know please?

lateralplacket · November 23, 2018, 3:04pm

I think I just did an internet search for “deep learning phase problem”, saladi!

I did that again just now and the top hit I see is:

Phase recovery and holographic image reconstruction using deep learning in neural networks (at nature.com)

It doesn’t look like they’re applying it to X-ray crystallography there, but they do mention they reckon you could apply it to the X-ray phase problem.

Here’s a direct application to XRD, in this case not the full phase problem (which would give the full 3D geometry of the crystal unit cell) but just determining the symmetry group of the crystal structure: