I am working on a project to study ML-supported diagnosis based on flow cytometry. Here’s a description of how these instruments work (from http://flowbook.denovosoftware.com/):
‘flow cytometry is the measurement of cells in a flow system, which delivers the cells singly past a point of measurement… Typically, light scatter at two different angles and from one to six or more fluorescences will be measured.’
In simple terms, a number of scatter plots are used to diagnose a disease. One way to serve these up as input to a convolutional net is as a stack of grayscale images ‘glued together’ along the channel axis, the way color channels are combined in an RGB image (if there’s a better way to do this, I’m absolutely open to suggestions!).
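To make the idea concrete, here is a minimal sketch of how the stacking could work. The `scatter_stack` helper is hypothetical (not from my actual code): it renders each 2-D scatter as a density image via `np.histogram2d` and stacks the planes along the last axis.

```python
import numpy as np

def scatter_stack(channel_pairs, bins=200):
    """Render each 2-D scatter (x, y) as a bins x bins grayscale
    density image and stack them along the last axis, the way
    color channels are combined in an RGB image.

    channel_pairs: list of (x, y) coordinate arrays, one per plot.
    Returns an array of shape (bins, bins, len(channel_pairs)).
    """
    planes = []
    for x, y in channel_pairs:
        hist, _, _ = np.histogram2d(x, y, bins=bins)
        if hist.max() > 0:
            hist = hist / hist.max()  # scale counts into [0, 1]
        planes.append(hist.astype(np.float32))
    return np.stack(planes, axis=-1)

# Example: 6 synthetic scatter plots from random data
rng = np.random.default_rng(0)
pairs = [(rng.normal(size=5000), rng.normal(size=5000)) for _ in range(6)]
sample = scatter_stack(pairs)
print(sample.shape)  # (200, 200, 6)
```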
Here’s what I’ve got so far: I create 6 scatter plots per sample, store them in a numpy array of shape (200, 200, 6), and write it to a file (xxx.npy). In order to get the array processed by the fastai `ImageClassifierData`, I have hacked dataset.py, changing
`im = cv2.imread(str(fn), flags).astype(np.float32)/255` in the `open_image` function to
`im = np.load(fn)`.
So far, so good, but when I start training the network (learn = ConvLearner.pretrained(arch, data, precompute=True)) I get the following error:
C:\MyPy\fastai\flow_cytometry\fastai\transforms.py in __call__(self, x, y)
    163
    164     def __call__(self, x, y=None):
--> 165         x = (x-self.m)/self.s
    166         if self.tfm_y==TfmType.PIXEL and y is not None: y = (y-self.m)/self.s
    167         return x,y
ValueError: operands could not be broadcast together with shapes (200,200,6) (3,)
I’ve gone through transforms.py and it crashes when it tries to normalize the data, because the imagenet stats have only three values (one per RGB channel, I assume; this is way over my head at the moment). So it seems that pretrained models require images with 3 color channels?
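If I understand NumPy broadcasting right, the error comes down to this: a trailing axis of size 6 can’t be paired with a stats vector of size 3, but a 6-element stats vector broadcasts fine. A small demo (the mean/std values are illustrative; computing per-channel stats over your own training set would be the fix on the normalization side, e.g. via something like fastai 0.7’s `tfms_from_stats` with custom stats, if I recall the API correctly; the pretrained first conv layer expecting 3 input channels is a separate problem):

```python
import numpy as np

x = np.random.rand(200, 200, 6).astype(np.float32)
imagenet_mean = np.array([0.485, 0.456, 0.406])  # 3 values, one per RGB channel

try:
    _ = x - imagenet_mean  # trailing axes (6,) vs (3,) do not align
except ValueError as e:
    print(e)  # operands could not be broadcast together ...

# Per-channel stats computed from the data itself broadcast fine:
mean6 = x.mean(axis=(0, 1))        # shape (6,)
std6 = x.std(axis=(0, 1)) + 1e-7   # avoid division by zero
normed = (x - mean6) / std6
print(normed.shape)  # (200, 200, 6)
```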
I know my issue might seem exotic, but wouldn’t a similar capability also be needed for things like brain scans, with many images stacked in a lot of layers?