What is the difference between 2D and 3D convolutions?

One way to think about it is via the "movement" of the convolutional filter. In an image, the filter gets moved horizontally and vertically (in x and y) across the image, so in 2 dimensions, hence Conv2D. That holds no matter whether it is a greyscale image (1 channel), a color image (3 channels), or a medical or satellite image with 4 or more channels. The only thing that changes is that the Conv2D filter needs a matching number of in-channels in its third dimension; the filter spans all channels at once and does not slide along them.
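To make the "two directions of movement" concrete, here is a minimal pure-Python sketch of a single-filter 2D convolution without padding or stride. The names `conv2d_single_filter`, `ones_image` and `ones_kernel` are made up for this illustration and are not from any library; like most deep learning frameworks, it actually computes a cross-correlation (no kernel flip).

```python
def conv2d_single_filter(image, kernel):
    """Naive 2D convolution, no padding, stride 1 (illustrative only).

    image:  nested list indexed [channel][y][x]
    kernel: nested list indexed [channel][ky][kx]
    Returns one 2D feature map indexed [y][x].
    """
    channels = len(image)
    if len(kernel) != channels:
        raise ValueError("kernel in-channels must match image channels")
    height, width = len(image[0]), len(image[0][0])
    k_h, k_w = len(kernel[0]), len(kernel[0][0])

    out = []
    # The filter moves along y and x only: two sliding loops, hence "2D".
    for y in range(height - k_h + 1):
        row = []
        for x in range(width - k_w + 1):
            total = 0.0
            # Channels are summed over at each position, never slid over.
            for c in range(channels):
                for ky in range(k_h):
                    for kx in range(k_w):
                        total += image[c][y + ky][x + kx] * kernel[c][ky][kx]
            row.append(total)
        out.append(row)
    return out


def ones_image(channels, height, width):
    # Hypothetical helper: an all-ones image, just to have test data.
    return [[[1.0] * width for _ in range(height)] for _ in range(channels)]


def ones_kernel(channels, size):
    # Hypothetical helper: an all-ones square kernel with `channels` in-channels.
    return [[[1.0] * size for _ in range(size)] for _ in range(channels)]


grey_out = conv2d_single_filter(ones_image(1, 5, 5), ones_kernel(1, 3))
rgb_out = conv2d_single_filter(ones_image(3, 5, 5), ones_kernel(3, 3))

# Both outputs are 3x3 maps: the channel count changes the kernel depth,
# not the number of directions the filter moves in.
print(len(grey_out), len(grey_out[0]))
print(len(rgb_out), len(rgb_out[0]))
```

Going from 1 to 3 channels only changes how many values get summed at each position (9 versus 27 here); the output is a 2D map in both cases.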

In a 3D conv, the third dimension is the depth, and the filter gets moved along that dimension too: a 3x3x3 filter, for example, slides in x, y and z across the volume. The input in that case has three spatial dimensions plus a channel dimension, for example x, y, z positions with reflectivity as the channel for some lidar data.
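The 3D case can be sketched the same way, again in plain Python with no padding and stride 1. The function name `conv3d_single_filter` and the 4x5x6 single-channel volume are assumptions for illustration (think of a small voxel grid whose one channel holds reflectivity); this is not any library's implementation.

```python
def conv3d_single_filter(volume, kernel):
    """Naive 3D convolution, no padding, stride 1 (illustrative only).

    volume: nested list indexed [channel][z][y][x]
    kernel: nested list indexed [channel][kz][ky][kx]
    Returns one 3D feature map indexed [z][y][x].
    """
    channels = len(volume)
    if len(kernel) != channels:
        raise ValueError("kernel in-channels must match volume channels")
    depth, height, width = len(volume[0]), len(volume[0][0]), len(volume[0][0][0])
    k_d, k_h, k_w = len(kernel[0]), len(kernel[0][0]), len(kernel[0][0][0])

    out = []
    # Three sliding loops (z, y, x), hence "3D".
    for z in range(depth - k_d + 1):
        plane = []
        for y in range(height - k_h + 1):
            row = []
            for x in range(width - k_w + 1):
                total = 0.0
                # As in the 2D case, channels are summed over, not slid over.
                for c in range(channels):
                    for kz in range(k_d):
                        for ky in range(k_h):
                            for kx in range(k_w):
                                total += (volume[c][z + kz][y + ky][x + kx]
                                          * kernel[c][kz][ky][kx])
                row.append(total)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical data: 1 channel (e.g. reflectivity), depth 4, height 5, width 6.
volume = [[[[1.0] * 6 for _ in range(5)] for _ in range(4)]]
# A 3x3x3 all-ones filter with 1 in-channel.
kernel = [[[[1.0] * 3 for _ in range(3)] for _ in range(3)]]

vol_out = conv3d_single_filter(volume, kernel)

# The output is itself a volume: (4-3+1) x (5-3+1) x (6-3+1) = 2 x 3 x 4.
print(len(vol_out), len(vol_out[0]), len(vol_out[0][0]))
```

The only structural difference from the 2D sketch is the extra sliding loop over z, which is why the output keeps a depth axis instead of collapsing to a flat map.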
