## the SVD: alternative SVD

Alas, alack. To judge from the books i own, almost no one other than a mathematician writes the SVD the way i did. It’s a damned shame. (i own one exception, Belsley et al.)

What do non-mathematicians write? they use u1 and v1 instead of u and v, and $\Sigma$ instead of the w matrix, writing

$X = u_1\ \Sigma\ v_1^T$

instead of

$X = u\ w\ v^T$

where

$w=\left(\begin{array}{cc}\Sigma&O\\O&O\end{array}\right)$

and in both cases $\Sigma$ is square and diagonal, with positive diagonal elements, and u1 and v1 are, of course, orthonormal. Incidentally, u1 is the same shape as X.

It is important that the size of $\Sigma$ is dictated by the number of nonzero singular values. i may change the size of $\Sigma$, but i don’t change the size of the w matrix which contains it.
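The shapes can be checked directly. Here is a minimal numpy sketch (mine, not from the text) comparing the full SVD, where w is padded with zeroes to the shape of X, with the cut-down version, where u1 takes the shape of X and $\Sigma$ stays small:

```python
# comparing the full SVD (orthogonal u, v; padded w) with the
# cut-down "economy" SVD (u1, sigma, v1)
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])          # 3x2, so u is 3x3 and u1 is 3x2

# full SVD: u is 3x3, v is 2x2, and w must be padded with a zero row
u, s, vT = np.linalg.svd(X, full_matrices=True)
w = np.zeros(X.shape)               # w is the same shape as X
w[:len(s), :len(s)] = np.diag(s)    # sigma sits in the top-left corner
assert np.allclose(X, u @ w @ vT)

# economy SVD: u1 is 3x2, the same shape as X, and sigma is 2x2
u1, s1, v1T = np.linalg.svd(X, full_matrices=False)
assert u1.shape == X.shape
assert np.allclose(X, u1 @ np.diag(s1) @ v1T)
```

(numpy, like most packages, returns $v^T$ rather than v, and returns the singular values as a vector rather than a matrix.)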

What have they lost?

• invertibility of u and v
• orthogonality of u and v
• zeroes on the diagonal of w
• the legitimacy of changing a nonzero element of $\Sigma$ to zero.
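The first two losses are easy to exhibit numerically. A sketch (my example, not the author's): the columns of u1 are orthonormal, so $u_1^T u_1 = I$, but u1 is not square, hence not invertible and not orthogonal, and $u_1 u_1^T \ne I$:

```python
# u is orthogonal both ways; u1 only has orthonormal columns
import numpy as np

X = np.random.default_rng(0).standard_normal((4, 2))
u, _, _ = np.linalg.svd(X, full_matrices=True)    # full: 4x4
u1, _, _ = np.linalg.svd(X, full_matrices=False)  # economy: 4x2

assert np.allclose(u.T @ u, np.eye(4))        # u is orthogonal
assert np.allclose(u @ u.T, np.eye(4))        # ... in both orders
assert np.allclose(u1.T @ u1, np.eye(2))      # u1: orthonormal columns
assert not np.allclose(u1 @ u1.T, np.eye(4))  # but u1 is not orthogonal
```

($u_1 u_1^T$ is in fact the projection onto the column space of X, not the identity.)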

what have they gained?

• X and u1 are somehow related, since they are the same size
• they don’t keep all those extra zeroes in the w matrix

Why do they do this?

i don’t know.

i do know that this is the way “numerical recipes” presented it, and i wouldn’t be surprised if most of the world followed their lead. If so, kudos for popularizing the SVD, but i wish they had preserved orthogonality of u and v.

the exception was Belsley et al. Published in 1980, it predates the first edition of “numerical recipes”, in whatever programming language that first edition appeared. (yes, i’m quibbling.)

maybe i should summarize the distinction a couple of ways:

• a matrix is orthogonal iff it is square and orthonormal
• unlike u and v, u1 and v1 are not rotations

of course, computers were slower and smaller back then, and it may very well be that carrying around all those extra zeroes was the deciding factor.

finally, all the statistics packages available to people may do it this way, without recourse to the full SVD. Users may not have much choice.

how to get around the limitations of your package? (i do intend to illustrate the following, when we get to examples.)

if your SVD routine only gives you u1 and v1, you can get all of u and v from the eigenvectors of $XX^T$ and $X^T X$, respectively. (but you might have to change the order of the eigenvalues and of the columns of u and v.)
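Here is a sketch of that workaround in numpy (my code, not the author's): the eigenvectors of $XX^T$ give u, those of $X^T X$ give v, and — as warned — the eigenvalues come back in ascending order, so we must reorder them and the columns to match the SVD's descending convention:

```python
# recovering full u and v from eigendecompositions of X X^T and X^T X
import numpy as np

X = np.random.default_rng(1).standard_normal((4, 2))

evals_u, u = np.linalg.eigh(X @ X.T)   # u is 4x4
evals_v, v = np.linalg.eigh(X.T @ X)   # v is 2x2

# eigh sorts eigenvalues ascending; the SVD wants them descending,
# so reorder the eigenvalues and the columns of u and v
u = u[:, np.argsort(evals_u)[::-1]]
v = v[:, np.argsort(evals_v)[::-1]]
s = np.sqrt(np.sort(evals_v)[::-1].clip(min=0))   # singular values

assert np.allclose(u.T @ u, np.eye(4))            # u is orthogonal
assert np.allclose(v.T @ v, np.eye(2))            # so is v
assert np.allclose(s, np.linalg.svd(X, compute_uv=False))
```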

(in an eigendecomposition, changing the sign of a column of P effects the corresponding change in $P^{-1}$ . but if we use eigendecomposition to get u and v independently, well then, they may be independent! in the SVD, u and v are not independent: the SVD has chosen the signs to make w non-negative.)
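Continuing the sketch above (again my code, under the same assumptions): with u and v obtained independently, the "diagonal" $u^T X v$ may come out with negative entries; flipping the offending columns of v restores the SVD's sign convention:

```python
# fixing the arbitrary signs left by independent eigendecompositions
import numpy as np

X = np.random.default_rng(2).standard_normal((4, 2))

evals_u, u = np.linalg.eigh(X @ X.T)
evals_v, v = np.linalg.eigh(X.T @ X)
u = u[:, np.argsort(evals_u)[::-1]]    # reorder descending, as before
v = v[:, np.argsort(evals_v)[::-1]]

w = u.T @ X @ v                        # 4x2, diagonal up to signs
signs = np.where(np.diag(w) < 0, -1.0, 1.0)
v = v * signs                          # flip columns of v where needed

w = u.T @ X @ v                        # recompute: now non-negative
assert np.all(np.diag(w) >= 0)
assert np.allclose(X, u @ w @ v.T)     # and we still recover X
```

(flipping columns of v rather than u is a choice; either one works, so long as the pair stays consistent.)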