Training with JPEG, PNG and BMP Images

Ghiya6548 · November 19, 2018, 7:31am

Hi all,

I have a medical dataset that contains images in all three formats, in different proportions. A quick Google search tells me that JPEG is a lossy compressed format and PNG is a lossless one. How should I deal with this for training? Are there any known consequences?

nissan · November 19, 2018, 8:24am

Is the image dataset so large that it would be unreasonable to preprocess the images into one common format before bringing them into the DataBunch?
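
For reference, a minimal sketch of that kind of preprocessing, assuming Pillow is installed and that 8-bit RGB is an acceptable target (for 16-bit or grayscale medical images that conversion would discard information, so adjust `mode`). The function name `convert_to_png` and the output naming scheme are made up for illustration. Note that re-saving a JPEG as PNG does not recover detail that the JPEG already lost; it only avoids losing more.

```python
from pathlib import Path

from PIL import Image

# File extensions this sketch treats as images (an assumption; extend as needed).
SUPPORTED = {".jpg", ".jpeg", ".png", ".bmp"}


def convert_to_png(src_dir, dst_dir, mode="RGB"):
    """Decode every supported image in src_dir and re-save it as PNG in dst_dir.

    Returns the list of files written. The original extension is kept in the
    new file name so that a.jpg and a.png do not overwrite each other.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        suffix = path.suffix.lower()
        if suffix not in SUPPORTED:
            continue  # skip labels, notes and other non-image files
        out = dst / f"{path.stem}_{suffix[1:]}.png"
        with Image.open(path) as img:
            img.convert(mode).save(out, format="PNG")
        written.append(out)
    return written
```

Calling `convert_to_png("raw_images", "png_images")` (both folder names are placeholders) would leave the originals untouched and write the converted copies to the second folder.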

Ghiya6548 · November 19, 2018, 8:55am

Hi Nissan! The dataset has approximately 3-4k images, so it should be possible to preprocess them. But my question is: which image format is preferred for training, and why? What impact do different image formats have on a CNN? What is the usual workflow for handling such a dataset? And please let me know of any references on preprocessing!

digitalspecialists · November 19, 2018, 9:50am

No, it won’t have any effect unless the images are severely over-compressed. You can read more here if you wish: https://arxiv.org/abs/1604.04004

… though it does make me wonder about the potential of a ‘severe compression augmentation’ method.
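
Such an augmentation could be sketched roughly as follows, assuming Pillow. This is only an illustration of the idea, not an established method or a library API: the function names `jpeg_compress` and `random_jpeg_augment`, the quality range, and the probability `p` are all made up here. It re-encodes an image as a low-quality JPEG in memory and decodes it again, so the model sees compression artifacts during training.

```python
import io
import random

from PIL import Image


def jpeg_compress(img, quality):
    """Round-trip a PIL image through an in-memory JPEG at the given quality."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    out = Image.open(buf)
    out.load()  # force decoding now, while the buffer is still around
    return out


def random_jpeg_augment(img, min_quality=5, max_quality=40, p=0.5, rng=random):
    """With probability p, return a heavily JPEG-compressed copy of img.

    Otherwise the input image is returned unchanged. The quality bounds are
    arbitrary illustrative values, not tuned recommendations.
    """
    if rng.random() >= p:
        return img
    return jpeg_compress(img, rng.randint(min_quality, max_quality))
```

Whether this actually helps would need to be checked on a validation set; for medical images in particular, heavy artifacts might hide the very details the model needs.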

Ghiya6548 · November 19, 2018, 10:49am

Thanks @digitalspecialists. What kind of preprocessing is needed? And the ideal format would be PNG, right, since it is lossless?

digitalspecialists · November 19, 2018, 10:56am

You should be able to read in a folder full of mixed JPG and PNG files, as far as I know.
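
If you want to check that on your own data first, here is a small sketch, assuming Pillow, that walks a folder and counts what is actually in it by decoded format and pixel mode rather than by file extension. The name `summarize_formats` is made up for this example. It can also reveal whether grayscale and RGB images are mixed, which tends to matter more to a model than the container format.

```python
from collections import Counter
from pathlib import Path

from PIL import Image


def summarize_formats(folder):
    """Count files under folder by (format, mode) as reported by Pillow.

    Files that Pillow cannot open are counted under ("unreadable", None).
    """
    counts = Counter()
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        try:
            with Image.open(path) as img:
                counts[(img.format, img.mode)] += 1
        except OSError:
            # Pillow raises an OSError subclass for files it cannot identify.
            counts[("unreadable", None)] += 1
    return counts
```

Printing `summarize_formats("path/to/images")` (a placeholder path) gives a quick inventory before training.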

Ghiya6548 · November 19, 2018, 10:58am

Understood. Thanks.