Sorry if this is a dumb question, but I’m still getting started with my AI journey.
I just finished Lesson 3 / Chapter 4 of the book, MNIST Basics, which was a doozy. I was wondering, though: when we initially load our MNIST images into tensors of floats, why do we divide them by 255?
I understand that dividing by 255 restricts the range of the resulting floating-point values to between 0 and 1, but why? The book just says: “Generally when images are floats, the pixel values are expected to be between 0 and 1, so we will also divide by 255 here:”
If I don’t divide by 255, it screws up training later on, so it must be important. Can someone elaborate?
I might be wrong, but I think it has to do with normalization. When calculating gradients, some gradients can be very large and overpower other values. Also, floating-point numbers lose precision when they get too big. That’s why we start with values between 0 and 1.
Let me know if I am wrong.
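To make that concrete, here is a toy sketch (my own example, not from the book) of how the gradient of a squared-error loss scales with the size of the input:

```python
# For a one-parameter linear model with loss (w*x - t)**2, the gradient
# with respect to w is 2*(w*x - t)*x, so it grows with the input x.
def grad_w(w, x, t):
    return 2 * (w * x - t) * x

w, t = 0.5, 1.0
raw = grad_w(w, 255.0, t)   # raw pixel value -> huge gradient
scaled = grad_w(w, 1.0, t)  # normalized pixel -> modest gradient
print(raw, scaled)  # 64515.0 -1.0
```

Feeding in raw 0–255 pixels makes the gradient tens of thousands of times larger than with normalized inputs, which is exactly the kind of imbalance that destabilizes training.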
Traditional gray-scale (grey-scale) pictures use 8 bits to represent black through white; RGB uses 24 bits, and RGB with an alpha (transparency) channel uses 32 bits. Those 8 bits represent a whole number between 0 and 255. We call this an integer, and integer arithmetic truncates: 1 + 2 = 3, but 3 / 2 = 1, not 1.5. Machine learning works with floating-point numbers, so 1.1 * 1.1 = 1.21 and 1.0 / 9.0 = 0.111111…. It is therefore necessary to convert those 8 bits into a floating-point number, and floating-point arithmetic is most accurate when values sit in a small range around 1. Dividing by 255.0 makes the calculation sit in that narrow range, which is often called normalization. It is a bit like Goldilocks: values that are neither too small nor too large make the mathematics work better.
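A minimal sketch of that conversion (assuming NumPy, which the imaging libraries use under the hood for 8-bit pixel data):

```python
import numpy as np

# 8-bit grayscale pixel values, stored as unsigned integers.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# Integer division truncates, as described above.
assert 3 // 2 == 1

# Cast to float, then divide by 255.0 to normalize into [0.0, 1.0].
normalized = pixels.astype(np.float32) / 255.0
print(normalized.min(), normalized.max())  # 0.0 1.0
```

Note the cast to float *before* (or via) the division: dividing the `uint8` array by an integer 255 with integer semantics would collapse almost everything to 0.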
Digital information is stored in units called bytes (1 byte = 8 bits). Each bit can be either 0 or 1, so 8 bits give 2^8 = 256 possible values, i.e. the integers 0 through 255. Hence we divide the pixel values by 255 to convert them into floats between 0 and 1.
Hope this helps.
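A quick sanity check of the byte arithmetic above (just illustrative, using NumPy's integer type info):

```python
import numpy as np

# 8 bits -> 2**8 = 256 distinct values.
assert 2 ** 8 == 256

# uint8 is the dtype image libraries use for 8-bit pixels: range 0..255.
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255
```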