I’m a (neuro-)radiology resident and researcher. I’m a huge fan of fast.ai, and doing v2 of the course back in 2018 (?) was a real eye-opener for me.
Currently I’m planning a deep learning image classification project with a dataset of CT brain scans. The thing is that the slice thickness of the scans is 5 mm, which means the voxels aren’t isotropic (in the x and y directions the voxel size is 1 mm, so one voxel is 1 x 1 x 5 mm). A brain scan is therefore more a stack of 2D images than a true 3D image volume. I reckon that a CNN with 3D convolutions would probably work nonetheless, but I want to try a 2D (or 2.5D) approach.
How do I feed the brain scans as a collection of axial slices (2D images) to the network? I want to make a classification on a per-patient basis, not on a per-slice basis. Is it sufficient to stack the slices in the third dimension (the “RGB channel dimension”, so to speak)? Conceptually this doesn’t quite fit, because in the channel dimension the network usually sees different representations of the same spatial location (like the RGB colors, or different contrasts in the case of MRI sequences), whereas here each channel would be a different anatomical level. But I guess it is still something the network can learn, isn’t it?
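To make the “slices as channels” idea concrete, here is a minimal sketch of what I have in mind. All names are hypothetical, and a real pipeline would of course load the DICOM slices, sort them by position, and normalise the HU values first; the point is just that each patient becomes one fixed-size multi-channel 2D image, so the label stays per patient:

```python
import numpy as np

def slices_to_channels(volume: np.ndarray, n_channels: int = 32) -> np.ndarray:
    """Turn a (slices, H, W) CT volume into a fixed-size (C, H, W) array
    by center-cropping or zero-padding along the slice axis, so every
    patient yields one multi-channel 2D image for a per-patient label."""
    n = volume.shape[0]
    if n >= n_channels:
        # center-crop excess slices
        start = (n - n_channels) // 2
        return volume[start:start + n_channels]
    # zero-pad missing slices symmetrically
    pad = n_channels - n
    before, after = pad // 2, pad - pad // 2
    return np.pad(volume, ((before, after), (0, 0), (0, 0)))

# Toy volume: 30 axial slices of 64 x 64 (a real CT would be ~512 x 512)
vol = np.random.rand(30, 64, 64).astype(np.float32)
img = slices_to_channels(vol, n_channels=32)
print(img.shape)  # (32, 64, 64)
```

The resulting array could then go into an ordinary 2D CNN whose first conv layer has `in_channels=32`; the fixed channel count is needed because scan lengths vary between patients.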
What is the best practice here?