What are CNNs not good for?

(James W Goldfarb) #1

We have seen many examples of excellent CNN performance for segmentation, image classification, style transfer, etc.
I wanted to start this thread to explore projects where CNNs fail or are not a good choice.

My current understanding is that a CNN is a great choice for computer vision tasks.
So, can an end-to-end CNN perform a Fourier Transform?
An FT is close to a vision task. My idea is that frequencies are features/channels.

I’ve searched and not found any references for Neural Networks performing FTs.

The universal approximation theorem states that a feed-forward NN can approximate any continuous function on a bounded domain arbitrarily well, so in principle it should be able to approximate an FT.
Is a feed forward NN a better choice?
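
One observation that may help with this question: the discrete FT is a linear map, so a single fully connected layer with no bias and no nonlinearity can represent it exactly. Below is a minimal NumPy sketch with the weights fixed by hand rather than learned. The helper name `dft_matrix` is made up for illustration and does not come from any library.

```python
import numpy as np

def dft_matrix(n):
    """Return the n x n DFT matrix W with W[k, m] = exp(-2j*pi*k*m/n).

    Hypothetical helper for illustration; it plays the role of the
    weight matrix of a bias-free linear layer.
    """
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    return np.exp(-2j * np.pi * k * m / n)

rng = np.random.default_rng(0)
x = rng.standard_normal(16)

W = dft_matrix(16)
y_layer = W @ x          # "forward pass" of the linear layer
y_fft = np.fft.fft(x)    # reference result from the FFT

# Largest absolute difference between the two results.
max_err = np.max(np.abs(y_layer - y_fft))
print(max_err)
```

A real-valued network would have to carry the real and imaginary parts as separate channels, and whether gradient descent actually finds these weights from data is a separate question that this sketch does not address.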

(James W Goldfarb) #2

Replying to my own post…
CNNs and NNs are probably not good choices for the FT.
But, if I had a NN or CNN that was trained for the FT, I could use transfer learning or other techniques on that CNN.
Then it becomes interesting.

So what do people think?


It sounds like you want to learn an approximate Fourier Transform; why would you do that when there are known efficient algorithms (FFT) that provide the exact answer?

I think that’s a great idea – you should preprocess your input images with FFT, and feed the result as input to your CNN.

(James W Goldfarb) #4

One answer to "why" is to do a better FT (or at this point, I should say a better inverse FT).
In magnetic resonance imaging, we collect frequency domain data, then use an IFT to reconstruct an image.
A CNN (that has input of frequency data and output of spatial data) could be easily trained to be immune to imperfections in the input (frequency data).
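
To make the setup concrete, here is a toy sketch of reconstruction from frequency-domain data and of what one imperfection does to the result. Everything in it is an assumption for illustration: the "image" is a synthetic rectangle rather than real MRI data, and the imperfection is a single corrupted sample chosen arbitrarily.

```python
import numpy as np

# Toy "image": a bright rectangle on a dark background (not real MRI data).
image = np.zeros((32, 32))
image[10:22, 12:20] = 1.0

# Simulated frequency-domain (k-space) data.
kspace = np.fft.fft2(image)

# Clean reconstruction with the inverse FT.
recon = np.fft.ifft2(kspace).real

# Hypothetical imperfection: one corrupted k-space sample (a "spike").
corrupted = kspace.copy()
corrupted[3, 5] += 200.0
recon_bad = np.fft.ifft2(corrupted).real

# The clean reconstruction matches the image up to rounding error;
# the single bad sample spreads a ripple over the whole image.
clean_err = np.max(np.abs(recon - image))
bad_err = np.max(np.abs(recon_bad - image))
print(clean_err, bad_err)
```

Because one bad frequency sample affects every pixel, the error is not local in image space, which is part of the motivation for handling imperfections before or during the transform instead of after it.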

I’m well aware that one can do the IFT and use the resulting image as input to a CNN.
Then train the CNN to change/restore the MRI image like we would a photo.

But of course, that is too easy… If I have an end-to-end CNN (that I can train to perform an approximation to an IFT), I could train it to alter the IFT process to be better.


I see! In that case I would be very curious to hear the result :slight_smile:

(Matthijs) #6

Note that convolutions (in CNNs) themselves are often implemented using FFTs. :smiley:
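
The reason this works is the convolution theorem: circular convolution in the signal domain equals pointwise multiplication in the frequency domain. A small NumPy check, as a sketch only (this is not how any particular framework implements its layers):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
signal = rng.standard_normal(n)
kernel = rng.standard_normal(n)

# Direct circular convolution:
# out[i] = sum_j signal[j] * kernel[(i - j) mod n]
direct = np.array([
    sum(signal[j] * kernel[(i - j) % n] for j in range(n))
    for i in range(n)
])

# Same result via the convolution theorem: multiply the spectra,
# then transform back.
via_fft = np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)).real

max_diff = np.max(np.abs(direct - via_fft))
print(max_diff)
```

The direct sum costs O(n²) operations while the FFT route costs O(n log n), which is why it can pay off for large kernels.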


I would not have guessed that, thanks for pointing it out.

I’ve since googled around, and found this interesting: https://arxiv.org/abs/1601.06815

(James W Goldfarb) #8

Is this the case in tensorflow and pytorch?
If this were the case, I would think that periodicity would be assumed and there would be no issues (or different issues) at the borders (i.e. border_mode = valid or same).

In lesson 9, Jeremy uses “reflection” at the edges. I would think that this would be unnecessary if the convolutions were performed using an FFT.
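
On the periodicity point: an FFT-based implementation does not have to impose periodic borders. If both inputs are zero-padded to length `len(signal) + len(kernel) - 1`, the wrap-around lands on zeros and the result is ordinary linear convolution, with "same" and "valid" being crops of it. The sketch below checks this in 1-D with NumPy; the arrays are arbitrary examples.

```python
import numpy as np

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 0.0, -1.0])

full_len = len(signal) + len(kernel) - 1  # 7

# Zero-pad both inputs to full_len so the wrap-around lands on zeros.
via_fft = np.fft.irfft(
    np.fft.rfft(signal, n=full_len) * np.fft.rfft(kernel, n=full_len),
    n=full_len,
)

# Reference results from direct convolution.
direct_full = np.convolve(signal, kernel, mode="full")
direct_same = np.convolve(signal, kernel, mode="same")
direct_valid = np.convolve(signal, kernel, mode="valid")

# Without enough padding, the tail of the result wraps around
# and is added to the start (periodic borders).
wrapped = np.fft.ifft(
    np.fft.fft(signal) * np.fft.fft(kernel, n=len(signal))
).real

print(via_fft)
print(wrapped)
```

So the choice of border handling (zeros, reflection, and so on) is made when padding the input and stays relevant whichever way the convolution itself is computed.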

(Matthijs) #9

I don’t know about pytorch but TensorFlow uses CUDA and cuDNN under the hood if you’re running on the GPU (as do many other training packages).

Here is an (older) paper about how convolutions are implemented in cuDNN, which mentions FFTs (but does not actually appear to use them): https://arxiv.org/pdf/1410.0759.pdf

The cuDNN paper refers to this one about implementing convolutions using FFTs: https://arxiv.org/pdf/1312.5851.pdf

(James W Goldfarb) #10

When I first started, I was building tensorflow from source. At that stage, there was an option to use cuDNN. Now it’s so easy to get up and running with pip or conda, I’m not really sure if I am using cuDNN or not.

(lateralplacket) #11

One reason that comes to mind you might want to do this is the “phase problem”: certain physical experiments such as X-ray crystallography essentially perform a Fourier transform on the object that is being studied, but lose phase information in the process (only intensity information remains). That means the inverse transform - reconstructing molecular shapes given X-ray diffraction patterns - is difficult. In the past, tricks such as “anomalous dispersion”, often involving additional experiments, were used to reconstruct the original “real-space” geometry.

A quick search does turn up somebody applying DL to the phase problem. I didn’t read the paper but I imagine they are doing something like an (inverse) FT in their net…

(Matthijs) #12

I guess we’re way off topic now, but doesn’t the complex (or imaginary) part of the FFT give you the phase? (I should probably brush up on my Fourier analysis. :wink: )

(lateralplacket) #13

That’s right, but that’s not what’s measured in a physical diffraction experiment. In the experiment what you normally have is not the full complex transform, but rather just the X-ray spot intensities (or the intensities of some other wave). Each intensity is the squared magnitude of the transform (the square of the amplitude), which carries no phase information.
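
A small 1-D sketch of why losing the phase matters, assuming a random array stands in for the "object" (this is a toy, not a model of a real diffraction experiment): two different objects can produce identical intensities, and inverting the magnitude alone does not return the object.

```python
import numpy as np

rng = np.random.default_rng(0)
obj_a = rng.random(16)
obj_b = np.roll(obj_a, 5)  # same values, circularly shifted by 5

# "Measured" intensities: squared magnitude of the transform.
intensity_a = np.abs(np.fft.fft(obj_a)) ** 2
intensity_b = np.abs(np.fft.fft(obj_b)) ** 2

# The shift only changes the phase, so the intensities agree
# even though the objects differ.
same_intensity = np.allclose(intensity_a, intensity_b)
same_object = np.allclose(obj_a, obj_b)

# Keeping the magnitude but discarding the phase and then inverting
# does not give the original object back.
magnitude_only = np.fft.ifft(np.abs(np.fft.fft(obj_a))).real
recovered = np.allclose(magnitude_only, obj_a)

print(same_intensity, same_object, recovered)
```

A shift is the simplest ambiguity; the point is only that the map from object to intensities is many-to-one, so some extra information or prior is needed to invert it.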

Caveat: it’s a long time since I studied this too!

Some nice pictures to illustrate the problem:


Looking at those examples (especially the duck/cat examples) you can see why the “inverse problem” (reconstructing molecular structure from X-ray diffraction pattern) is not trivial.

(James W Goldfarb) #14

I found a recent reference for the FT.


They use a fully connected network, followed by an autoencoder/decoder.