Hi Anand,
Let’s take a simple example with tabular data. Say you have a dataset describing people, where each person is described by age, height in cm, and weight in kg. Then your dataset could look something like:
[[30, 170, 80],
[80, 160, 65],
[25, 190, 90],
[25, 180, 80]]
Here, 30, 80, 25, 25 are the ages; 170, 160, 190, 180 are the heights; and 80, 65, 90, 80 are the weights.
The “features” are the pieces of information you have about each data point. In this case, you initially know 3 things about each data point (age, height, and weight), so you have 3 features in your input.
Now let’s say you have a weight matrix like:
[[1,0],
[0,1],
[0,1]]
After running the initial data through this weight matrix (i.e., matrix-multiplying them), you will get data of shape (4, 2), because multiplying a (4, 3) matrix with a (3, 2) matrix gives a (4, 2) matrix:
[[30,250],
[80,225],
[25,280],
[25,260]]
So, at this stage, you only know 2 things about each data point. This means you have 2 features.
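If you want to check the numbers yourself, here’s a small sketch using NumPy (assuming you have it installed) that does exactly this multiplication:

```python
import numpy as np

# Each row is one person: [age, height_cm, weight_kg]
X = np.array([[30, 170, 80],
              [80, 160, 65],
              [25, 190, 90],
              [25, 180, 80]])

# Weight matrix mapping 3 input features to 2 output features
W = np.array([[1, 0],
              [0, 1],
              [0, 1]])

out = X @ W  # matrix multiplication: (4, 3) @ (3, 2) -> (4, 2)
print(out.shape)  # (4, 2)
print(out)
# [[ 30 250]
#  [ 80 225]
#  [ 25 280]
#  [ 25 260]]
```

The first output column just copies the age, and the second sums height and weight, which is exactly what the columns of this particular weight matrix do.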
Note: It might seem bad that we “know less” now, since we only know 2 things about each data point instead of 3, but the magic of neural nets is that they learn better features! I.e., the 2 new features can be more informative than the initial 3.
Hope that helps!