the SVD: alternative SVD

Alas, alack. To judge from the books i own, almost no one other than a mathematician writes the SVD the way i did. It’s a damned shame. (i own one exception, Belsley et al.) 

 

What do non-mathematicians write? they use u1 and v1 instead of u and v, and \Sigma  instead of the w matrix, writing

 

X = u_1\ \Sigma\ v_1^T 

 

instead of

 

X = u\ w\ v^T 

 

where 

 

w=\left(\begin{array}{cc}\Sigma&O\\O&O\end{array}\right)

 

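and in both cases \Sigma is square and diagonal, with positive diagonal elements, and u1 and v1 have, of course, orthonormal columns. Incidentally, u1 is the same shape as X whenever every singular value is nonzero (X at least as tall as it is wide, with full column rank).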

 

It is important that the size of \Sigma is dictated by the number of nonzero singular values. i may change the size of \Sigma, but i don’t change the size of the w matrix which contains it.
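
A minimal NumPy sketch of the two ways of writing it, assuming a made-up 5×3 matrix of rank 2 and an ad hoc tolerance for deciding which singular values count as nonzero (neither comes from the books mentioned). `numpy.linalg.svd` with `full_matrices=True` returns the square u and v; \Sigma and w are then assembled by hand.

```python
import numpy as np

# Hypothetical 5x3 example of rank 2: the third column is the sum of the first two.
X = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 0., 2.],
              [0., 3., 3.]])
m, n = X.shape

# Full SVD: u is m x m, v is n x n, s holds min(m, n) singular values (descending).
u, s, vT = np.linalg.svd(X, full_matrices=True)
v = vT.T

# Count the nonzero singular values; the relative tolerance is an assumption.
tol = 1e-10 * s[0]
r = int(np.sum(s > tol))

# Sigma is r x r; w is m x n with Sigma in its upper-left corner and zeros elsewhere.
Sigma = np.diag(s[:r])
w = np.zeros((m, n))
w[:r, :r] = Sigma

# The cut-down factors keep only the first r columns of u and v.
u1, v1 = u[:, :r], v[:, :r]

full_ok = np.allclose(X, u @ w @ v.T)            # X = u w v^T
reduced_ok = np.allclose(X, u1 @ Sigma @ v1.T)   # X = u1 Sigma v1^T
print(r, Sigma.shape, w.shape, full_ok, reduced_ok)
```

In this sketch \Sigma is 2×2 while w stays 5×3: setting another singular value to zero would shrink \Sigma but leave the shape of w alone.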

 

What have they lost? 

  • invertibility of u and v
  • orthogonality of u and v
  • zeroes on the diagonal of w
  • the legitimacy of changing a nonzero element  of \Sigma to zero.

what have they gained?

 

  • X and u1 are somehow related, since they are the same size
  • they don’t keep all those extra zeroes in  \Sigma  

Why do they do this?

i don’t know.

i do know that this is the way “numerical recipes” presented it, and i wouldn’t be surprised if most of the world followed their lead. If so, kudos for popularizing the SVD, but i wish they had preserved orthogonality of u and v.

the exception was Belsley et al. Published in 1980, it predates the first edition of “numerical recipes” in some programming language, possibly the first. (yes, i’m quibbling.)

maybe i should summarize the distinction a couple of ways:

  • a matrix is orthogonal iff it is square and its columns are orthonormal
  • unlike u and v, u1 and v1 are not rotations
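
The two points above can be checked numerically. This is only a sketch, assuming a random 5×3 matrix (so that u1 is 5×3) and NumPy's `numpy.linalg.svd`; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))   # hypothetical tall matrix, full column rank

u, s, vT = np.linalg.svd(X, full_matrices=True)   # u is 5 x 5
u1 = u[:, :3]                                      # first 3 columns: 5 x 3

# The columns of u1 are orthonormal ...
cols_orthonormal = np.allclose(u1.T @ u1, np.eye(3))
# ... but u1 is not square, so u1 u1^T is a rank-3 projection, not the identity.
is_identity = np.allclose(u1 @ u1.T, np.eye(5))
projector_rank = np.linalg.matrix_rank(u1 @ u1.T)

# The full u is square with orthonormal columns, hence orthogonal and invertible.
u_orthogonal = np.allclose(u.T @ u, np.eye(5)) and np.allclose(u @ u.T, np.eye(5))

print(cols_orthonormal, is_identity, projector_rank, u_orthogonal)
```

So u1^T is a left inverse of u1 but not a right inverse, which is the sense in which invertibility has been lost.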

of course, computers were slower and smaller back then, and it may very well be that carrying around all those extra zeroes was the deciding factor.

finally, all the statistics packages available to people may do it this way, without recourse to the full SVD. Users may not have much choice.

how to get around the limitations of your package? (i do intend to illustrate the following, when we get to examples.)

if your SVD routine only gives you u1 and v1, you can get all of u and v from the eigenvectors of XX^T and X^T X, respectively. (but you might have to change the order of the eigenvalues and of the columns of u and v.)
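
A hedged sketch of that workaround in NumPy, assuming a random 5×3 matrix with distinct singular values and pretending the package returns only the cut-down factors (`full_matrices=False`). `numpy.linalg.eigh` returns eigenvalues in ascending order, so the reordering mentioned above is needed; the recovered columns agree with u1 only up to sign.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))   # hypothetical data

# Pretend this is all the package gives us: u1 is 5 x 3, s has 3 entries.
u1, s, v1T = np.linalg.svd(X, full_matrices=False)

# Eigendecomposition of the symmetric matrix X X^T (5 x 5).
evals, evecs = np.linalg.eigh(X @ X.T)

# eigh sorts ascending; singular values are conventionally descending.
order = np.argsort(evals)[::-1]
evals, u = evals[order], evecs[:, order]

# The nonzero eigenvalues of X X^T are the squared singular values of X.
evals_match = np.allclose(evals[:3], s**2)

# u is square and orthogonal.
u_orthogonal = np.allclose(u.T @ u, np.eye(5))

# Its leading columns equal the columns of u1 up to a sign:
# each column-wise dot product is +1 or -1.
dots = np.sum(u[:, :3] * u1, axis=0)
same_up_to_sign = np.allclose(np.abs(dots), 1.0)

print(evals_match, u_orthogonal, same_up_to_sign)
```

The same steps applied to X^T X give v. The remaining two columns of u here span the part of the space that u1 leaves out; any orthonormal basis of it would do.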

 

One Response to “the SVD: alternative SVD”

  1. rip94550 Says:

    if you need to get all of u and v (i.e. square, orthogonal matrices rather than rectangular orthonormal matrices), in addition to lining up the common eigenvalues with corresponding columns, you may have to change some signs of the columns of v to correspond to the signs of u, or vice versa.

    that is, if u and v are computed independently using eigendecomposition, any columns can be multiplied by -1 without affecting orthonormality of the vectors; but if the signs don’t correspond, the nonzero elements of w need not be positive. if one isn’t, change the sign of the appropriate column of v. rinse, lather, and repeat: i.e. repeat until all nonzero elements of w are positive.

    (in an eigendecomposition, changing the sign of a column of P effects the corresponding change in P^{-1} . but if we use eigendecomposition to get u and v independently, well then, they may be independent! in the SVD, u and v are not independent: the SVD has chosen the signs to make w non-negative.)

    i haven’t actually tried this, but it makes perfect sense to me.
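
A small NumPy sketch of the sign fix described in this comment, assuming a random 5×3 matrix whose nonzero singular values are distinct (with repeated singular values the independently computed eigenvectors need not pair up, and this simple fix would not be enough). The helper name `eigvecs_descending` is made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 3))   # hypothetical data

def eigvecs_descending(A):
    """Eigenvectors of a symmetric matrix, ordered by descending eigenvalue."""
    evals, evecs = np.linalg.eigh(A)
    return evecs[:, np.argsort(evals)[::-1]]

# u and v computed independently of each other.
u = eigvecs_descending(X @ X.T)   # 5 x 5
v = eigvecs_descending(X.T @ X)   # 3 x 3

# w = u^T X v is diagonal up to rounding, but its entries may be negative.
w = u.T @ X @ v
negative = np.diag(w) < 0

# Flip the sign of each column of v whose diagonal entry of w is negative.
v[:, negative] *= -1
w = u.T @ X @ v

# Compare against the singular values from a direct SVD.
s = np.linalg.svd(X, compute_uv=False)
expected_w = np.zeros((5, 3))
expected_w[:3, :3] = np.diag(s)

w_ok = np.allclose(w, expected_w)
reconstructs = np.allclose(X, u @ w @ v.T)
print(w_ok, reconstructs)
```

Because every sign is inspected at once, one pass over the columns suffices here; no repetition is needed.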

