In Video 2 and 3 we are discussing orthonormality, orthogonality a lot. I don’t think I understand them in context of matrices. Can someone point to a resource which explains it in terms of matrices?

I’m not sure which video you are referring to, but an orthogonal/orthonormal matrix is just a square matrix whose columns (equivalently, rows) form a set of orthonormal vectors, e.g. the matrix with 1 on its diagonal and 0 everywhere else.

orthonormal matrix: the dot product of any two distinct columns is 0, which means they are linearly independent (and the norm of each column is 1).
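A quick NumPy sketch of checking both conditions (my own example, not from the videos), using a 2×2 rotation matrix, which is a classic orthonormal matrix:

```python
import numpy as np

# A 2x2 rotation matrix: its columns are unit length and mutually perpendicular.
theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

dot = np.dot(Q[:, 0], Q[:, 1])     # dot product of distinct columns: ~0.0
norms = np.linalg.norm(Q, axis=0)  # norm of each column: ~1.0
print(dot, norms)
```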

A square matrix (n×n) with real entries whose columns are orthonormal is called orthogonal. These matrices have lots of good properties.

Let’s say we have a 3x3 orthogonal matrix.

You can think of the columns as the x, y, z directions of a 3D coordinate system, and we try to project our data into the space spanned by these x, y, z independent directions.

(In the standard Euclidean basis these directions are [1,0,0], [0,1,0], [0,0,1].)
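To make the “project onto a basis” idea concrete (a small sketch of my own): the standard basis vectors are just the columns of the identity matrix, and projecting a point onto an orthonormal basis Q is the matrix product Q^T x.

```python
import numpy as np

# The standard Euclidean basis vectors are the columns of the identity matrix.
E = np.eye(3)                 # columns: [1,0,0], [0,1,0], [0,0,1]

# Projecting a data point onto an orthonormal basis Q is just Q.T @ x;
# with the standard basis, the coordinates come back unchanged.
x = np.array([2.0, -1.0, 3.0])
coords = E.T @ x
print(coords)                 # same as x
```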

Learning an orthogonal matrix is much easier than learning a matrix with correlated columns.

In the above example, if the x, y, z directions are not independent of each other, then any update to the x vector also affects the y vector, and so on (think of the case where updates to x and y flip-flop in opposite directions).

If x, y and z are independent, then an update to any one vector won’t affect the others.
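One concrete way to see the “updates don’t interfere” point (a sketch using NumPy least squares rather than gradient descent): when the columns are orthonormal, each weight can be found independently as a dot product with one column, matching the generic solve.

```python
import numpy as np

# With orthonormal columns, least squares decouples: each weight is just a
# dot product with one column, so solving for one doesn't disturb the others.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(5, 3)))    # 5x3 with orthonormal columns
y = rng.normal(size=5)

w_direct = Q.T @ y                              # per-column dot products
w_lstsq = np.linalg.lstsq(Q, y, rcond=None)[0]  # generic least-squares solve
print(np.allclose(w_direct, w_lstsq))           # True
```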

That is also the idea behind orthogonal initialization of weights in deep learning.
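A minimal sketch of such an initialization, assuming the common QR-of-a-Gaussian recipe (the helper name is mine; frameworks ship their own versions, e.g. `torch.nn.init.orthogonal_`):

```python
import numpy as np

def orthogonal_init(shape, seed=None):
    # Hypothetical helper: QR-decompose a random Gaussian matrix and keep
    # the orthogonal Q factor as the weight matrix.
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.normal(size=shape))
    # Flip column signs by sign(diag(r)) so the result isn't sign-biased.
    q = q * np.sign(np.diag(r))
    return q

W = orthogonal_init((4, 4), seed=0)
print(np.allclose(W.T @ W, np.eye(4)))  # True: columns are orthonormal
```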

Here is the wikipedia page: https://en.wikipedia.org/wiki/Orthogonal_matrix

I’m confused about the following:

- Which of these is true, or the better definition of an ortho* matrix?
  - Rows need to be orthogonal/orthonormal with other rows, and columns with other columns; rows need not be orthogonal/orthonormal with columns, and vice versa.
  - Rows need to be orthogonal/orthonormal with both rows and columns. Same for columns.

- Are these correct?
  - An orthogonal matrix means that each row’s dot product with every other row is 0.
  - An orthonormal matrix means that the dot product of each row with itself is 1 and with every other row is 0 (which also makes it orthogonal).

- If we say 2 matrices are ortho*, do we mean each of them is ortho* individually, or that they are ortho* to each other? If the latter, how is that mathematically defined?

I will try to make it simple.

Let’s say there is a matrix A.

We call A an **orthogonal matrix** if A·A^T = I (equivalently, A^T·A = I).

Here I is the Identity Matrix.

Now, an orthogonal matrix has some nice properties. One is that its rows form an **orthonormal basis**, which just means they are mutually perpendicular (and hence linearly independent) and of unit length.
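Here is a quick check of both the definition and the row property (my own example, using a permutation matrix, which is a simple non-identity orthogonal matrix):

```python
import numpy as np

# A permutation matrix is a simple orthogonal matrix.
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])

print(np.allclose(A @ A.T, np.eye(3)))  # True: A·A^T = I
print(np.linalg.norm(A, axis=1))        # each row has unit length
print(A[0] @ A[1], A[0] @ A[2])         # 0.0 0.0: rows mutually perpendicular
```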