Unable to understand the jargon "Feature"

Hi Team,
In chapter 13 of the book, the term "feature" is explained as follows:

Jargon: Channels and Features
These two terms are largely used interchangeably and refer to the size of the second axis of a weight matrix, which is the number of activations per grid cell after a convolution.

What exactly is a feature? Kindly help me understand the above explanation.


Hi Anand,

Let’s take a simple example with tabular data. Say you have a dataset describing people, and you describe each person by age, height in cm, and weight in kg. Then your dataset could look something like this:

[[30, 170, 80],
 [80, 160, 65],
 [25, 190, 90],
 [25, 180, 80]]

Here, the first column (30, 80, 25, 25) holds the ages, the second column (170, 160, 190, 180) the heights, and the third column (80, 65, 90, 80) the weights.

The “features” are what information you have about each data point. In this case, you initially know 3 things about each data point (age, height and weight), so you have 3 features in your input.

Now let’s say you have a weight matrix of shape (3, 2), i.e., 3 rows and 2 columns.


After running the initial data through this weight matrix (i.e., matrix-multiplying the data by it), you will have data of shape (4, 2), because multiplying a (4, 3) matrix by a (3, 2) matrix gives a (4, 2) matrix.


So, at this stage, you only know 2 things about each data point. This means you have 2 features.
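To make the shapes concrete, here is a minimal NumPy sketch of that step. The dataset is the one from above; the weight values are random placeholders I made up (only the (3, 2) shape comes from the example), and the names `data`, `weights`, and `out` are just for illustration.

```python
import numpy as np

# The dataset from above: 4 people, 3 features each (age, height in cm, weight in kg).
data = np.array([[30, 170, 80],
                 [80, 160, 65],
                 [25, 190, 90],
                 [25, 180, 80]])

# Placeholder weight matrix of shape (3, 2). The values are random and purely
# illustrative; in a real network they would be learned during training.
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 2))

# Matrix multiplication: (4, 3) @ (3, 2) gives (4, 2).
out = data @ weights

print(data.shape)     # 4 data points, 3 features
print(weights.shape)  # 3 input features, 2 output features
print(out.shape)      # 4 data points, 2 features
```

The number of rows (data points) stays at 4, while the number of columns (features) changes from 3 to 2.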

Note: It might look bad that we “know less” now, since we only know 2 things about each data point instead of 3, but the magic of neural nets is that they learn better features! I.e., the 2 things might be more informative than the initial 3 things.

Hope that helps!


Thanks a lot Umer. I was able to understand the term feature: it is basically information about our input data, which can be transformed by passing it through the weight matrix.
Your explanation is clear. Just one follow-up question: what does "size of the second axis of a weight matrix" mean? I got confused by this sentence.


Sure: “axis” is just another word for a dimension of a matrix. E.g., if you have, say, a 13x9 matrix, then you have two axes. The first axis has size 13 and the second axis has size 9.

So, the size of the second axis would be 9.

In the example in my previous post, you can see that the weight matrix has shape 3x2. So the “size of the second axis of weight matrix” is 2.

This makes sense because, however many data points you have, after running them through the weight matrix the result will have shape (n_data_points, size of the 2nd axis of the weight matrix).
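If it helps, here is a small NumPy sketch of the same idea. The matrices are filled with zeros and ones because only the shapes matter here, and the names (`m`, `weights`, `n_features_out`) are made up for illustration. One thing to watch: NumPy counts axes from 0, so the "second axis" is index 1.

```python
import numpy as np

# A 13x9 matrix has two axes: the first has size 13, the second has size 9.
m = np.zeros((13, 9))
print(m.shape[0])  # size of the first axis
print(m.shape[1])  # size of the second axis

# A weight matrix of shape (3, 2), as in the earlier example (values are placeholders).
weights = np.zeros((3, 2))
n_features_out = weights.shape[1]  # size of the second axis of the weight matrix

# However many data points go in, the output always has n_features_out columns.
for n_data_points in (1, 4, 100):
    data = np.ones((n_data_points, 3))
    out = data @ weights
    print(data.shape, "->", out.shape)
```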


Got it, now it makes perfect sense because of the context you gave and it helped me understand the whole thing. Thank You Umer.