angle and axis of rotation 2: the two correct answers

Back in what has turned out to be my most popular post, axis and angle of rotation, I showed how to switch between the angle and axis of rotation and the rotation matrix.

Given the axis of rotation as a unit vector (a, b, c), and the angle of rotation, we construct the matrix N

N = \left(\begin{array}{lll} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0\end{array}\right)

and then we construct

A = I + N sin \theta + N^2 \left(1-cos \theta \right)

and that’s a rotation. To be specific, it is an attitude matrix for a rotation of coordinate axes, about the axis (a, b, c) through the angle \theta\ .

Conversely, given a rotation matrix, we find the axis of rotation as the eigenvector whose eigenvalue is 1, and then we use the formula for the trace of a rotation matrix

trace[A] = 1 + 2\ cos \theta

to get the angle \theta\ .

The problem is that the eigenvector and the angle are each determined only to within a sign. There are 4 possible pairings, only 2 of which are correct. Given a candidate pair, we construct the rotation matrix from it, and compare it to the given rotation matrix.

I now know how to uniquely determine the two correct pairs. We need no longer test the possibilities.

The key was a sidebar comment on p. 74 in Kuipers’ “Quaternions and Rotation Sequences”. What he said was that given an arbitrary orthogonal 3×3 matrix A = a_{ij}\ , the axis of rotation was given by

i \left(a_{23} - a_{32}\right) +  j \left(a_{31} - a_{13}\right) +  k \left(a_{12} - a_{21}\right)

where i, j, k are the usual unit coordinate-axis vectors. That is, the components are

\left(\left(a_{23} - a_{32}\right),\   \left(a_{31} - a_{13}\right),\ \left(a_{12} - a_{21}\right)\right)\ .

I struggled with that for a while, silly me, before I realized that instead of trying to prove it for an arbitrary orthogonal 3×3, I should just use the magic formula for the rotation matrix in terms of the axis and angle.

Once I realized that, the obvious calculation is A - A^T\ . Let

A = I + N sin \theta + N^2 \left(1-cos \theta \right)

and we get

A = \left(\begin{array}{lll} \left(-b^2-c^2\right) (1-\cos (\theta ))+1 & a b (1-\cos (\theta ))+c \sin (\theta ) & a c   (1-\cos (\theta ))-b \sin (\theta ) \\ a b (1-\cos (\theta ))-c \sin (\theta ) & \left(-a^2-c^2\right) (1-\cos (\theta ))+1 & b c   (1-\cos (\theta ))+a \sin (\theta ) \\ a c (1-\cos (\theta ))+b \sin (\theta ) & b c (1-\cos (\theta ))-a \sin (\theta ) &   \left(-a^2-b^2\right) (1-\cos (\theta ))+1\end{array}\right)

and then A - A^T\ is

A - A^T = \left(\begin{array}{lll} 0 & 2 c \sin (\theta ) & -2 b \sin (\theta ) \\ -2 c \sin (\theta ) & 0 & 2 a \sin (\theta ) \\ 2 b \sin (\theta ) & -2 a \sin (\theta ) & 0\end{array}\right)

Just look at those off-diagonal terms! They each involve one of a, b, c, and a common factor of sin \theta\ . Your first reaction might be that we don’t know sin \theta\ , but that doesn’t matter. We can write the vector 2\ sin \theta\ (a,\ b,\ c) from the off-diagonal terms, and then we can get (a, b, c) just by making a unit vector from it. And once we have any one of a, b, c, we can get sin \theta\ .

So Kuipers was right, and we can get the axis of rotation from the skew part of the rotation matrix. What he never said was that we could get the sine of the angle, too. And that, combined with the cosine, determines the quadrant.

Given a rotation matrix (representing a rotation of the coordinate axes), there are two ways to proceed. One is to enhance the existing algorithm:

  • Get the rotation axis (a, b, c) as the unit eigenvector with eigenvalue 1,
  • Get cos \theta\ from the trace of the rotation matrix,
  • Get sin \theta\ from a (or b or c) and the appropriate off-diagonal term in A - A^T\ .

The other way to go is to forget about the eigenvector calculation.

  • Get the vector 2\ sin \theta\ (a, b, c) from the off-diagonal terms,
  • Get the vector (a, b, c) as a unit vector,
  • Get cos \theta\ from the trace of the rotation matrix.
  • Get sin \theta\ from a (or b or c) and the appropriate off-diagonal term in A - A^T\ ,

In either case, having both the cosine and the sine of an angle determines it (mod 2\ \pi\ , of course, but that’s not the issue). We know which quadrant we’re in.

Oh, we get one or the other of the two correct pairs of axis and angle by the two possible choices of the sign of (a, b, c). Pick one sign for the vector, get sin \theta\ ; pick the other sign for the vector, get -sin \theta\ . These are the two correct pairs.

Let me emphasize that we work with two matrices: the rotation matrix A and the skew matrix A - A^T\ .

Let’s do it both ways. Here’s the attitude matrix to get to Mars’ coordinate frame. (It rotates the solar system coordinate axes to Mars coordinate axes.)

A = \left(\begin{array}{lll} 0.90956 & -0.414415 & -0.0310051 \\ 0.414851 & 0.909845 & 0.00899314 \\ 0.0244829 & -0.0210423 & 0.999479\end{array}\right)

I ask Mathematica® for the eigendecomposition, and select the eigenvector with eigenvalue 1. It is:

\{-0.0361149,\ -0.0667194,\ 0.997118\}

and that is a unit vector, whose components we call a, b, c, respectively.

We compute that the trace of the rotation matrix is 2.81888 and that’s 1 + 2\ cos \theta\ , so we get the cosine of the angle:

\cos \left(\theta \right)\to 0.909442

Next we compute A - A^T\ :

\left(\begin{array}{lll} 0. & -0.829266 & -0.055488 \\ 0.829266 & 0. & 0.0300354 \\ 0.055488 & -0.0300354 & 0.\end{array}\right)

We know that the (2,3) location is 2\ a\ sin \theta\ , so we have…

0.0300354=2 a \sin (\theta )

Plug in a…

0.0300354=-0.0722298 \sin (\theta )

which gives us

\{\sin (\theta )\to -0.415831\}

Now, we want the angle whose tangent is \frac{\sin \theta}{\cos \theta}\ , with the correct signs of the sine and cosine. I think that most software, nowadays, has a two-argument arctangent function which does exactly that, and Mathematica® certainly does. I get that the angle is (in radians)

\theta = -0.428857

And that is what we found originally: a valid answer was the eigenvector returned by Mathematica® and the negative angle.

Let’s try it the other way. From the general skew matrix

\left(\begin{array}{lll} 0 & 2 c \sin (\theta ) & -2 b \sin (\theta ) \\ -2 c \sin (\theta ) & 0 & 2 a \sin (\theta ) \\ 2 b \sin (\theta ) & -2 a \sin (\theta ) & 0\end{array}\right)

and the specific skew matrix

\left(\begin{array}{lll} 0. & -0.829266 & -0.055488 \\ 0.829266 & 0. & 0.0300354 \\ 0.055488 & -0.0300354 & 0.\end{array}\right)

we construct the vector

2\ sin \theta\ (a,\ b,\ c) = \{0.0300354,\ 0.055488,\ -0.829266\}

which happens to be the negative of the eigenvector we had found. Make a unit vector out of it, which gives us values for (a, b, c):

(a,\ b,\ c) = \{0.0361149,\ 0.0667194,\ -0.997118\}

That’s pedagogically very nice: we just happen to have made the other sign choice for the axis.

As before, we know that the (2,3) location is 2\ a\ sin \theta\ , so we have…

0.0300354=2 a\ \sin (\theta )

Plug in the value of a…

0.0300354=0.0722298 \sin (\theta )

which gives us

\{\sin (\theta )\to 0.415831\}

which, as it must be, is the negative of the previous value for the sine.

The cos \theta is still the same, so we find the new two-argument arctangent:

\theta = 0.428857

This time we got the positive angle. Of course, because we changed the sign on the rotation axis.

We have both of the two correct answers, and only the two correct answers. Having one, of course, we could write the other. I didn’t work the problem twice in order to get two solutions; I worked it twice to illustrate the two possible algorithms. I was very glad that they just happened to give the two different correct answers.

We no longer have to pick one of two unit vectors from column A and one of two angles from column B, and test pairs to see if they reproduce the rotation matrix.

As for the two algorithms, take your pick. My preference is to get the eigenvector, but Mathematica® makes that almost trivial for me to do. If it’s easier for you to get (a, b, c) from A - A^T\ , go for it.

rotating coordinate systems: examples 2 & 3

2: released in rotating frame (linear motion, inertial frame)

(See the previous post, example 1, for notation.)

Suppose we are holding an object fixed on the merry-go-round at a distance R on the x-axis: it is stationary in the rotating frame. Now suppose that the surface is frictionless. We release the object at t= 0. What is it’s motion?

At t = 0 we have initial values of \rho and \nu:

\rho0 = \{R,\ 0,\ 0\}

\nu0 = \{0,\ 0,\ 0\}

We apply the transition matrix (setting t = 0; note that all axes coincide) to get initial r and v from initial \rho and \nu:

r0 = T\ \rho0 = \{R,\ 0,\ 0\}

v0 = T\ \nu0 - \omega\ N\ T\ \rho0 = \{0,\ R \omega ,\ 0\}

(Yes, wrt the fixed frame, the initial position r0 is on the x-axis, and the initial velocity v0 is tangential, i.e. having only a y-component.)

We know that the acceleration a is zero, so v is constant, and r is linear in time:

a = 0

v = v0 = \{0,\ R \omega ,\ 0\}

r = r0 + v0\ t = \{R,\ R t \omega ,\ 0\}

To see what it looks like in the rotating frame, I want \rho\ , so I apply the attitude matrix to r (because A = T^{-1}\ ):

\rho = A\ r = \left(\begin{array}{lll} \cos (t \omega ) & \sin (t \omega ) & 0 \\ -\sin (t \omega ) & \cos (t \omega ) & 0 \\ 0 & 0 & 1\end{array}\right) \left(\begin{array}{l} R \\ R t \omega  \\ 0\end{array}\right)

=

\left(\begin{array}{l} R \cos (t \omega )+R t \omega  \sin (t \omega ) \\ R t \omega  \cos (t \omega )-R \sin (t \omega ) \\ 0\end{array}\right)

Draw it! Choose numerical values: R\to 1, \omega \to 1.

Viewed from the rotating frame, it appears to spiral out.

3: linear motion in rotating frame

Suppose that a person starts (t=0) at the center of the merry-go-round and walks straight out along his (new, rotating) x-axis at constant speed \nu0. Then the new components of his position \rho are…

\{t\ \nu0,\ 0,\ 0\}

and \nu and \alpha are…

\nu = \{\nu0,\ 0,\ 0\}

and

\alpha = \{0,\ 0,\ 0\}

We compute the old components a.

a = T\ \alpha - 2\ \omega\ N\ T\ \nu + \omega^2\ N^2\ T\ \rho

The three terms are

T\ \alpha = \{0,0,0\}

- 2\ \omega\ N\ T\ \nu = \{-2\ \nu0\ \omega\  \sin (t \omega ),\ 2\ \nu0\ \omega\ \cos (t \omega ),\ 0\}

\omega^2\ N^2\ T\ \rho = \left\{-t\ \nu0\ \omega ^2\ \cos (t \omega ),\ -t\ \nu0\ \omega ^2\ \sin (t \omega ),\ 0\right\}

and their sum is

a = \left(\begin{array}{l} -\nu0\ \omega\  (t\ \omega\  \cos (t \omega )+2 \sin   (t \omega )) \\ \nu0\ \omega  (2 \cos (t \omega )-t\ \omega\  \sin   (t \omega )) \\ 0\end{array}\right)

Now, what are the components of a in the rotating frame? it want to see the radial and tangential components of a, hence implicitly the radial and tangential components of force exerted by our walker as he moves outward in a straight line at constant speed on the merry-go-round.

The new components of a are found by applying A to a:

A\ a = \left(\begin{array}{lll} \cos (t \omega ) & \sin (t \omega ) & 0 \\ -\sin (t \omega ) & \cos (t \omega ) & 0 \\ 0 & 0 & 1\end{array}\right) \left(\begin{array}{l} -\nu0\ \omega\  (t\ \omega\  \cos (t \omega )+2 \sin   (t \omega )) \\ \nu0\ \omega  (2 \cos (t \omega )-t\ \omega\  \sin   (t \omega )) \\ 0\end{array}\right)

=

\left(\begin{array}{l} -t\ \nu0\ \omega ^2 \\ 2\ \nu0\ \omega  \\ 0\end{array}\right)

We have a component -t\ \nu0\ \omega ^2 along the new x-axis (i.e. radial): the negative sign says it points inward. It grows with time, i.e. as he moves outward. (It reflects only his increasing distance from the center, because he is walking out at constant speed \nu = \nu0\ .)

We also have a constant positive acceleration 2\ \nu0\ \omega in the tangential direction: as he moves out, he must increase his tangential speed wrt the fixed frame.

I say again that I really like having unambiguous definitions of, for example, a and \alpha\ , and then being able to confidently ask for the new components of a.

rotating coordinate systems: example 1

conventions and setup

As far as possible, I am going to stay with my notation. r and \rho are the old and new (fixed and rotating) components of the position vector; v and \nu are derivatives wrt time of r and \rho respectively; a and \alpha are derivatives wrt time of v and \nu respectively. (But R is a convenient scalar value, and will no longer denote the position vector whose components are r and \rho\ .)

v = \dot{r}

\nu = \dot{\rho}

a = \dot{v}

\alpha = \dot{\nu}

The rotating frame is the same in all these problems, so get its matrices early (hence not often). The z-axis is our axis of rotation.

The attitude matrix for a CCW rotation of the axes (about the z-axis) is…

A = \left(\begin{array}{lll} \cos (t \omega ) & \sin (t \omega ) & 0 \\ -\sin (t \omega ) & \cos (t \omega ) & 0 \\ 0 & 0 & 1\end{array}\right)

The transition matrix is…

T = \left(\begin{array}{lll} \cos (t \omega ) & -\sin (t \omega ) & 0 \\ \sin (t \omega ) & \cos (t \omega ) & 0 \\ 0 & 0 & 1\end{array}\right)

And N is the derivative of A, at zero, divided by \omega\ (to “get it” from a unit vector):

N = \frac{1}{\omega}\ \frac{\partial A}{\partial t}\ \left(0\right) = \left(\begin{array}{lll} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0\end{array}\right)

I like to calculate N from first principles, in order to be certain I have the right signs. Of course, my notation for those has not changed: T is the transition matrix, A is the attitude matrix: T = A^T\ , and because A is orthogonal, A^{-1} = A^T \left(= T\right)\ .

fixed in the rotating frame

We imagine something held fixed in a rotating coordinate system. We might think of an object on a merry-go-round. If it’s at a distance R from the center, we put it on the new (rotating) x axis (or, equivalently, we put the new x-axis thru it!). The position vector has new components \rho, and we differentiate to get \nu and again to get \alpha\ :…

\rho = \{R,\ 0,\ 0\}

\nu = \{0,\ 0,\ 0\}

\alpha = \{0,\ 0,\ 0\}

Now recall

v = T\ \nu - \omega\ N\ T\ \rho\ .

(I note that I have an appalling tendency to write the utterly false –
falsev = T\ \nu - \omega\ N\ T\ \nu\ . false)

but since \nu is zero, T\ \nu is zero…

T\ \nu = \{0,\ 0,\ 0\}

so the other term is v…

v = \{-R \omega  \sin (t \omega ),\ R \omega  \cos (t \omega ),\ 0\}

Next, we compute a

a = T\ \alpha - 2\ \omega\ N\ T\ \nu + \omega^2\ N^2\ T\ \rho

Since \nu and \alpha are zero, T\ \nu and T\ \alpha are zero, so the only nonzero term is the last(and i’ve marked it “a”)… but we could compute them all, and add them up:

T\ \alpha = \{0,\ 0,\ 0\}

2\ \omega\ N\ T\ \nu = \{0,\ 0,\ 0\}

a = \left\{-R \omega ^2 \cos (t \omega ),-R \omega ^2 \sin (t \omega ),0\right\}

Let me write out the pieces of that nonzero term. We have

N = \left(\begin{array}{lll} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0\end{array}\right)

Then

\omega^2\ N^2\ T\ \rho

=
\left(\begin{array}{lll} -\omega ^2 & 0 & 0 \\ 0 & -\omega ^2 & 0 \\ 0 & 0 & 0\end{array}\right) \left(\begin{array}{lll} \cos (t \omega ) & -\sin (t \omega ) & 0 \\ \sin (t \omega ) & \cos (t \omega ) & 0 \\ 0 & 0 & 1\end{array}\right) \left(\begin{array}{c} R \\ 0 \\ 0\end{array}\right)

So, we have the acceleration a. Of course, those are old components.

What are the new components of a? (Strictly speaking, what are the new components of the vector whose old components are a?) They are

T^{-1}\ a = T^T\ a = A\ a

=

\left(\begin{array}{lll} \cos (t \omega ) & \sin (t \omega ) & 0 \\ -\sin (t \omega ) & \cos (t \omega ) & 0 \\ 0 & 0 & 1\end{array}\right) \left(\begin{array}{l} -R \omega ^2 \cos (t \omega ) \\ -R \omega ^2 \sin (t \omega ) \\ 0\end{array}\right)

= \left(\begin{array}{l} -R \omega ^2 \\ 0 \\ 0\end{array}\right)

The acceleration is along the negative x-axis, i.e. radial. Good. In fact, perfect.

If we assume that our fixed frame is inertial, then a radial physical force is required to hold that mass in circular motion at constant speed.

Did it confuse you that I found the new components of a? Well, a in an inertial frame tells me what force is required, but it’s in the rotating frame that the answer takes a simple form.

Using polar coordinates, instead, is a different way of coming at that. I think I prefer my way: compute the new, rotating, components of a.

It’s so nice to be clear about a and \alpha\ , and the new components of a.

We could check the cross product form – and I have – but I’ll leave it to you. For the \omega vector, just take scalar \omega times the unit vector (0, 0, 1) along the z-axis.

rotating coordinate systems: background

I owe you derivations of three assertions. We will need a fourth one, too.

  1. matrix multiplication by N is equivalent to some vector cross product
  2. the transition matrix is T =  1 - \sin (\theta )\ N + (1-\cos (\theta ))\ N^2
  3. \dot{T} = - \omega\ N\ T\
  4. N^3 = -N

matrix multiplication by N

Let’s take two arbitrary vectors:

a = \{a_1,\ a_2,\ a_3\}

b = \{b_1,\ b_2,\ b_3\}

Replace the vector a by a skew matrix N_a\ , using the same sign convention as for constructing N from a unit vector.

N_a = \left(\begin{array}{lll} 0 & a_3 & -a_2 \\ -a_3 & 0 & a_1 \\ a_2 & -a_1 & 0\end{array}\right)

I hope you’re ok that N_a is what I would get if I made a unit vector \hat{a} from a, made N from \hat{a} in the usual way, and then multiplied N by the magnitude |a|: i.e. what we have is N_a = |a|\ N\ , with N from the unit vector \hat{a}\ .

Now compute the matrix product

N_a\ b = \{a_3\ b_2-a_2\ b_3,\ a_1\ b_3-a_3\   b_1,\ a_2\ b_1-a_1\  b_2\}

and the cross product

a \times b = \{a_2\ b_3-a_3\ b_2,\ a_3\ b_1-a_1\ b_3,\ a_1\ b_2-a_2\  b_1\}

One is the negative of the other:

a \times b = -N_a\ b = - |a|\ N\ b

so the minus sign in the matrix form

v = T\ \nu - \omega\ N\ r

becomes a plus sign when we write the cross product form

v = T\ \nu + \omega \times r\ .

Now compute the matrix product

N_a\ N_a\ b = \left(\begin{array}{l} -b_1\ a_2^2+a_1\ b_2\  a_2-a_3^2\   b_1+a_1\ a_3\ b_3 \\ -b_2\ a_1^2+a_2\ b_1\ a_1-a_3^2\  b_2+a_2\ a_3\ b_3 \\ -b_3\ a_1^2+a_3\ b_1\ a_1+a_2\ a_3\   b_2-a_2^2\  b_3\end{array}\right)

and the cross product

a \times \left(a \times b\right) = \left(\begin{array}{l} -b_1\  a_2^2+a_1\ b_2\ a_2-a_3^2\   b_1+a_1\ a_3\ b_3 \\ -b_2\ a_1^2+a_2\ b_1\  a_1-a_3^2\   b_2+a_2\ a_3\ b_3 \\ -b_3\ a_1^2+a_3\ b_1\ a_1+a_2\ a_3\    b_2-a_2^2\ b_3\end{array}\right)

They are the same:

a \times \left(a \times b\right) = N_a^2\ b = |a|^2 N^2 b

(That’s not surprising: two negatives give a positive.) Again, the minus sign in the matrix form

a = T\ \alpha - 2\ \omega\ N\ T\ \nu + \omega^2\ N^2\ r

becomes a plus sign in the cross product form

a = T\ \alpha + 2\ \omega \times \left(T\ \nu \right) + \omega \times \left(\omega \times r\right)

but the N^2 term has the same sign as the double cross product. It is interesting that although the cross product is not associative, it corresponds to an associative matrix product. (Maybe I should remind us that \omega \times \omega = 0\ , so the cross product in the other order is zero: \left(\omega \times \omega\right)\times r = 0\ .

show that N^3 = -N

(We are speaking of the N used to construct the attitude matrix for a rotation, not of the N_a which corresponds to a cross product a \times\ .)

There are a couple of ways to see that. We could just compute N^3, or we could guess why it’s true and confirm the guess. And my guess is that it comes from the characteristic polynomial of N.

So what is the characteristic polynomial of N? Write N as before:

N = \left(\begin{array}{lll} 0 & \text{a3} & -\text{a2} \\ -\text{a3} & 0 & \text{a1} \\ \text{a2} & -\text{a1} & 0\end{array}\right)

The characteristic equation for N (for any N) is

\text{Det} \left(N - \lambda\ I\right) = 0\ .

For this N we get

\lambda ^3+\text{a1}^2 \lambda +\text{a2}^2 \lambda +\text{a3}^2 \lambda

which simplifies to

\lambda ^3+\lambda =0

because (a1, a2, a3) is a unit vector.

That’s the characteristic equation for N; we could solve it for \lambda to find the eigenvalues, if that’s what we wanted. For our purpose, it is more important that the Cayley-Hamilton Theorem says that any matrix satisfies its own characteristic equation, so we have

N^3+N = 0

i.e.

N^3 = -N\ ,

and we will also need

N^4 = -N^2\ .

Alternatively, we could simply compute N^3 for an arbitrary unit vector \hat{a}\ . It’s not a bad idea anyway, especially since I knew before I worked out the algebra that the answer was supposed to be N^3 = -N. Algebra is perilous for me when I know the answer I want to get.

I’ll spare you the explicit computation: it works.

the transition matrix

We know the following. If we describe the rotation axis by a unit vector \hat{a} = \{a1, a2, a3\}

and if we let N be defined as

N = \left(\begin{array}{lll} 0 & \text{a3} & -\text{a2} \\ -\text{a3} & 0 & \text{a1} \\ \text{a2} & -\text{a1} & 0\end{array}\right)

then the attitude matrix for the rotation is given by

A = 1 + \sin (\theta )\ N + (1-\cos (\theta ))\ N^2

I claim that the transition matrix \left(T = A^T = A^{-1}\right) is

T = 1 - \sin (\theta )\ N + (1-\cos (\theta ))\ N^2

First, all I did was change the sign of \theta\ : physically that’s the inverse rotation, hence also the transpose. It changed the sign of the sine but not of the cosine.

Second, we could get the same answer by reasoning instead that transposing A amounts to transposing N, but N is skew symmetric: N^T = -N and N^{2T} = N^2\ .

Third, we can confirm that A and the presumed T are inverses by direct computation.

A\ T = \left((1-\cos (\theta )) N^2-\sin (\theta ) N+I\right) \left((1-\cos (\theta )) N^2+\sin (\theta ) N+I\right)

which expands to

\cos ^2(\theta ) N^4-2 \cos (\theta ) N^4+N^4-\sin ^2(\theta ) N^2-2 \cos   (\theta ) N^2+2 N^2+I

and since N^4 = -N^2\ that simplifies to the identity matrix:

-\cos ^2(\theta ) N^2-\sin ^2(\theta ) N^2+N^2+I = I

the derivative of T

Let \theta = \omega\ t\ (we are rotating at constant angular speed \omega about a constant axis of rotation); then T is

T = 1 - \sin (\omega\ t )\ N + (1-\cos (\omega\ t ))\ N^2

and its derivative is, by direct computation,

\dot{T} = -N \omega\ \cos[\omega\ t]+N^2\ \omega\ \sin[\omega\ t]

We want to show that

 -N \omega\ \cos[\omega\ t]+N^2\ \omega\ \sin[\omega\ t] = - \omega\ N\ T\ .

Where did that RHS come from? From the cross product formula: if the matrix form and the cross product form are the same (and they are!) that’s what \dot{T} must be. So we verify the conjecture by computing the RHS (and using N^3 = -N).

 - \omega\ N\ T = -N \omega  \left((1-\cos (\omega\ t)) N^2-\sin (\omega\ t)   N+I\right)

=

  -\omega  N^3+\omega  \cos (\omega\ t) N^3+\omega  \sin (\omega\ t) N^2-\omega  N

=

  N^2 \omega  \sin (\omega\ t)-N \omega  \cos (\omega\ t)

and that is the LHS.

backstage

When I began working on this, I completely forgot to use the transition matrix; I used the attitude matrix, even while calling it the transition matrix! Still, I think all that did was mess up a sign.

Once I was finally looking at the two equations

v = T\ \nu + \omega \times r

and

v = T \nu + \dot{T}\ \rho

it was easy to conjecture that \dot{T} \approx \omega\ N\ T\ . Attempting to confirm the equality immediately led me to N^3= -N; and the signs worked out beautifully when I remembered to compute and use the transition matrix. Oh, along the way I wrote the utterly false – false, I say! – v = T\ \nu more times than I care to count.

So the order in which I did things was

  • the derivative to within a sign: \dot{T} = \pm\ \omega\ N\ T\
  • N^3 = -N
  • the transition matrix
  • confirm that one multiplication by N corresponds to the negative cross product; and two multiplications has the same sign as the cross product

Not the order in which they needed to be presented.

If it bothers you that this derivation and the resulting matrix forms are new to me, bear in mind that the cross product forms are tried-and-true; you can find them all over the place. They were the touchstone against which my matrix forms were tested.

Next, I hope to work some examples.

rotating coordinate systems: equations

velocity

There are three key equations for rotational mechanics. Let me refer to them as “the equations”. Goldstein writes a general equation for “some arbitrary vector G”:

\left(\frac{\text{dG}}{\text{dt}}\right)_{\text{space}}=\left(\frac{\text{dG}}{\text{dt}}\right)_{\text{rotating}} + \omega \times G

a specific equation for velocity:

v_s=v_r + \omega \times r

(that’s the first equation with G = position vector r) and an equation for acceleration:

a_s=a_r + 2\ \omega \times v_r +\omega \times (\omega \times r)

Let’s look at all that. I will derive these 3, and a general second-derivative equation, but I will have to return and derive some auxiliary facts.

I’m going to take a very special case. Some things should pop off the page.

Let’s suppose we have two coordinate systems, with a common origin, and one of them (new) is rotating CCW wrt the other (old). Their relationship can be described by a transition matrix T which is a rotation. I will also assume that the axis of rotation is fixed, and that the speed of rotation is constant.

  • a common origin
  • new is CCW rotation of axes wrt old
  • fixed axis of rotation
  • constant speed of rotation

It may be that the equations are valid under less restrictive assumptions; conversely, it is certainly true that we can write more general equations for non-constant axis of rotation and/or non-constant speed of rotation.

Let me not beat around the bush. The equations are used for gravity problems near the surface of the earth, with two coordinate systems: an “inertial frame” = “old” at the center of the earth with z axis along the axis of rotation; and a rotating frame on the surface of the earth, with its z-axis pointing “up”. I do not yet see why the equations should be applicable. What bothers me is that the displacement vector between the two origins is a function of time.

Later. For now, the simple – clear – case: the one that hammers home what’s going on, for me anyway.

If we have a vector R, let r and \rho be its old and new components. Then I can write a matrix-vector equation for the components:

r = T\ \rho

Differentiate wrt time.

\dot{r}=\rho\ \dot{T}+T \dot{\rho }

What the heck is \dot{T} ? For now, please trust me: just as the attitude matrix A of a rotation can be written

A = 1 + \sin (\theta )\ N + (1-\cos (\theta ))\ N^2

the transition matrix T can be written

T =  1 - \sin (\theta )\ N + (1-\cos (\theta ))\ N^2

where we change one sign and N is a skew matrix constructed from a unit vector along the axis of rotation.

Now let \theta = \omega\ t\ , compute the derivative of T, and we get (really!)

\dot{T} = - \omega\ N\ T\ .

I will prove all that, starting with that equation for A which we discussed here. In addition, I will prove that - \omega\ N\ T\ corresponds to a vector cross product.

Granting my formula for \dot{T}\ , we have

old & new components:

r = T\ \rho

differentiate:

\dot{r} = T\ \dot{\rho} + \dot{T}\ \rho

substitute for \dot{T}\ :

\dot{r} = T\ \dot{\rho} - \omega\ N\ T\ \rho

define v = \dot{r} and \nu = \dot{\rho}\ :

v = T\ \nu  - \omega\ N\ T\ \rho

and finally (the “large equation” because I don’t know how to color it) T\ \rho = r\ gives me a form I find useful:

v = T\ \nu  - \omega\ N\ r

and I claim that turns out to be (note the sign change)

v = T\ \nu + \omega \times r

(I have changed notation: I started with \omega a scalar; for the cross product form, I need \omega to be the angular velocity vector; in addition, I am using r for what I called the position vector R, because r is almost universally used.)

Let’s look at what we got. First, I do get the usual cross product formula, even though I sometimes, but not always, prefer the N matrix to the cross product. The crucial starting point was that r and \rho are components of a common vector. We compute the time derivatives of those components, calling them v and \nu\ .

The key implication of the large equation is that, by contrast, v and \nu\ are not the components of a common vector. (If they were, we would have v = T\ \nu\ !)

In fact, I can’t even be sure that either set of components v or \nu\ is the derivative of the position vector R.

In practice we may assume that the vector V whose old components are v is the derivative of the vector R. That is, we usually assume that the old basis vectors are constant.

We usually call the old frame “inertial”, but the derivation is perfectly valid for two coordinate systems both rotating wrt a third, with the new one rotating faster than the old. In that case, the equation

v = T\ \nu - \omega\ N\ r

is still valid, even though both coordinate systems are rotating; it still describes the relationship between the derivatives of components, but we didn’t really get the derivative of the vector R: our basis vectors for the two rotating systems are functions of time.

I don’t want to get too far afield here. It is crucial that v and \nu\ are the components of distinct vectors, called the velocities wrt the old and new coordinate systems; in fact, usually called the velocities wrt the space and body, or wrt fixed and rotating, coordinate systems.

Finally, the T\ \nu term in

v = T\ \nu - \omega\ N\ r

merely says that to write equations for the components of vectors, we’d better be using components wrt one coordinate system: \nu\ are new components, and must be transformed to old, T\ \nu\ . In fact, if we go back to

v = T\ \nu - \omega\ N\ T\ \rho

the T\ \rho term tells us the same thing: \rho are new components and must be transformed to old.

I find it enlightening to have components r and \rho, v and \nu\ , where

v = \dot{r}

and

\nu = \dot{\rho}\

I know exactly what “the velocity wrt the body axes” is: it is a vector whose old components are T\ \nu\ and whose new components are \nu = \dot{\rho}\ . Similarly, “the velocity wrt the space axes” is a vector whose old components are v = \dot{r} and whose new components are  T^{-1}\ v = T^T\ v = A\ v\ (where A is the attitude matrix corresponding to the transition matrix T.)

If I have to construct vectors from components, well, I know how to do that.

I think I should warn you that sooner or later, in practice, you’re going to substitute v for T\ \nu\ . Check for that, if ever something’s gone wrong in your calculations. I do.

acceleration

Let’s take another derivative.

start with:

v = T\ \nu - \omega\ N\ r

differentiate:

\dot{v} = T\ \dot{\nu} + \dot{T}\ \nu - \omega\ N\ \dot{r}

let \dot{r}\rightarrow v\ :

\dot{v} = T\ \dot{\nu} + \dot{T}\ \nu - \omega\ N\ v

let \dot{T}\rightarrow - \omega\ N\ T\ :

\dot{v} = T\ \dot{\nu} - \omega\ N\ T\ \nu - \omega\ N\ v

let v \rightarrow T\ \nu - \omega\ N\ r \ :

\dot{v} = T\ \dot{\nu} - \omega\ N\ T\ \nu - \omega\ N\ \left(  T\ \nu - \omega\ N\ r \right)

collect terms:

\dot{v} = T\ \dot{\nu} - 2\ \omega\ N\ T\ \nu + \omega^2\ N^2\ r

define \alpha = \dot{\nu}\ \text{and } a = \dot{v}\ :

a = T\ \alpha - 2\ \omega\ N\ T\ \nu + \omega^2\ N^2\ r

and that, I claim, is equivalent to (note the sign change, one not two)

a = T\ \alpha + 2\ \omega \times \left(T\ \nu\right) + \omega \times \left(\omega \times r\right)

We have the same kind of result. a is the derivative of the old components v, hence a is the second derivative of the old components r of the vector R.

\alpha is the second derivative of \rho\ , hence it is new components, and we convert to the old coordinate system: T\  \alpha\ . \nu is still new components, so we convert to old: T\  \nu\ .

general, quickly

There was nothing special about our initial vector being the position vector. What was special about it was that we had two sets of components for one vector. If G is any vector, and g and \gamma are its components wrt old and new coordinate systems with T the transition matrix, then we proceed exactly as before. (This time I left out all the commentary in order to emphasize just how short the derivations are.)

g = T\ \gamma

\dot{g} = T\ \dot{\gamma} +\dot{T}\ \gamma

\dot{g} = T\ \dot{\gamma} - \omega\ N\ T\ \gamma

\dot{g} = T\ \dot{\gamma} - \omega\ N\ g

\ddot{g} = T\ \ddot{\gamma} +\dot{T}\ \dot{\gamma}- \omega\ N\ \dot{g}

\ddot{g} = T\ \ddot{\gamma} - \omega\ N\ T\ \dot{\gamma}- \omega\ N\ \left(T\ \dot{\gamma} - \omega\ N\ T\ \gamma\right)

\ddot{g} = T\ \ddot{\gamma} - 2\ \omega\ N\ T\ \dot{\gamma} + \omega^2\ N^2\ T\ \gamma

So, we have derived all three of goldstein’s equations (for v, a, and \dot{g}\ ), and one more, for \ddot{g}\ . The equations for v and a, of course, are the special case G = R, the position vector. And the general equations could be written

\dot{g} = T\ \dot{\gamma} + \omega \times g

\ddot{g} = T\ \ddot{\gamma} + 2\ \omega \times \left(T\ \dot{\gamma}\right) + \omega \times \left(\omega \times g\right)

Next, I owe you some supporting detail.

axis and angle of rotation

edit: I have solved the problem of the sign ambiguity. see 28 Sept 2008.

I was going to call this “rotations 2″, but I decided to put the key computations in the name.

from rotation matrix to axis and angle of rotation

Having gotten the rotation (attitude) matrix for mars coordinates here, can we find its axis of rotation? (not of mars, of the attitude matrix!)

Sure, that’s just the eigenvector with eigenvalue 1!

Every 3D rotation has an eigenvalue of 1: there is a line in space which is left fixed under the rotation. Any vector on that line is an eigenvector. It has eigenvalue 1 because the line is not being stretched or compressed: nothing has been done to it. The other two eigenvalues are complex, and so are their eigenvectors, because there are no other subspaces which are left fixed by the rotation.

When I ask Mathematica® to find the eigenvalues and eigenvectors of the mars rotation matrix, the eigenvector with eigenvalue 1 is:

\left(\begin{array}{lll} -0.0361149 & -0.0667194 & 0.997118\end{array}\right)

That’s very nearly the z-axis, as it should be.

There’s just one little problem. The negative of that eigenvector – the negative z-axis – is every bit as good an answer. Any multiple of it is every bit as good an answer. We know the rotation axis, but we don’t know the direction of it. What we really know is the line in space which is left fixed by the rotation.

Call the answer we got “the positive eigenvector” (because its largest component is positive).

What about the angle? Well, there is some coordinate system in which any given rotation is a rotation exactly about the z-axis, and the trace of a (3D) rotation matrix is 2 cos \theta + 1. And the trace is invariant under similarity transformations (because it’s the sum of the eigenvalues, and they are invariant under similarity transformations).

Too convoluted? The trace of any rotation matrix is invariant under similarity transform because it’s the sum of the eigenvalues; therefore it has the value 2 cos \theta + 1 which we would obtain by taking the axis of rotation as a coordinate axis.

So we compute the trace of the mars rotation matrix, set it equal to 2 cos \theta + 1,

2.81888 = 2 \cos (\theta )+1

and solve for the angle…

\{\{\theta \to -0.428857\},\{\theta \to 0.428857\}\}

Here we have another problem. There are two possible answers, but only one of them is right – and I know of no simple way to tell which it is. edit: yes I do. 28 Sept 2008. We have the original rotation, so we can see what it does to a test vector; we have two candidates for axis-angle, and we could work them both out to see which is right.

We don’t actually have 4 distinct answers for axis and angle, but only 2 distinct answers: taking one eigenvector and one angle is the same as taking the other eigenvector and the other angle.

In this case, we know that the negative angle is correct for the eigenvector we chose: we know the mars coordinate system is rotated about something close to the z-axis thru approximately 335° = -25°. If we choose the other unit eigenvector, then the positive angle would be correct.

Let me be clear about that. The attitude matrix for mars’ coordinates is

\left(\begin{array}{lll} 0.90956 & -0.414415 & -0.0310051 \\ 0.414851 & 0.909845 & 0.00899314 \\ 0.0244829 & -0.0210423 & 0.999479\end{array}\right)

Either from its construction or by inspection, it is very nearly a rotation thru a small negative angle about the +z-axis, and the “positive eigenvector”

\left(\begin{array}{lll} -0.0361149 & -0.0667194 & 0.997118\end{array}\right)

is nearly the z-axis; so we should go with the negative angle, too. Here it is in radians… -0.428857 and in degrees… -24.5717.

But in general, we won’t be able to say, it’s almost a rotation about one of the axes. What then? Well, what if we could compute a rotation matrix from a given axis and angle? That would be a wonderful in general, and in this case it would let us construct the 2 possible answers and see which one matches the original. (In fact, it will let me construct the 4 possible answers and confirm that only 2 are distinct, but one would rarely do that.)

Before we do that, however, we should ask if we need it. We do, after all, have the original rotation matrix, for any computational needs. Do we need consistent signs for the axis and angle, or would it suffice to say that the axis of rotation is given by any vector parallel or anti-parallel to

\left(\begin{array}{lll} -0.0361149 & -0.0667194 & 0.997118\end{array}\right)

and that the rotation is very nearly 25°, sign unknown? Maybe it would suffice, maybe not.

Let’s suppose that, for some reason, we need to know the sense of the rotation; we have to know whether the angle is 25° or -25° with the positive eigenvector.

from axis and angle to rotation matrix

Can we construct a matrix for a rotation given the axis of rotation and the angle?

No problem. The following recipe shows up in many different forms. Here’s one. Write the axis of rotation as a unit vector. (that’s what I forget when I rush into this calculation: to make it a unit vector.)

hang on: what kind of rotation? because I wrote my rotation matrices for CCW rotations of coordinate axes, i.e. for CW rotations of vectors, I assume that I have been given a vector for the axis of rotation, and I have been given an angle which is to rotate test vectors CW.

if our given axis is not a unit vector, “make it so”, suppose our unit vector is (a, b, c).

here’s what we do. We recast that vector as a specific skew-symmetric matrix. That is, we define N

N = \left(\begin{array}{lll} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0\end{array}\right)

Then there is a magical formula: the rotation matrix R is given by

R = I + sin \theta\ N + (1-cos \theta)  (N^2),

where I is a 3×3 identity matrix.

Let me illustrate that for a rotation about the z axis (0,0,1). N becomes…

N = \left(\begin{array}{lll} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0\end{array}\right)

We get R

R = \left(\begin{array}{lll} \cos (\theta ) & \sin (\theta ) & 0 \\ -\sin (\theta ) & \cos (\theta ) & 0 \\ 0 & 0 & 1\end{array}\right)
=

\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{array}\right) + \left(\begin{array}{lll} 0 & \sin (\theta ) & 0 \\ -\sin (\theta ) & 0 & 0 \\ 0 & 0 & 0\end{array}\right)

+ \left(\begin{array}{lll} \cos (\theta )-1 & 0 & 0 \\ 0 & \cos (\theta )-1 & 0 \\ 0 & 0 & 0\end{array}\right)

which is my CCW rotation of the coordinate axes.

As usual, there is a sign convention: where are the negative signs in N? If you see a version of this formula with all the signs switched in N, you would end up with

\left(\begin{array}{lll} \cos (\theta ) & -\sin (\theta ) & 0 \\ \sin (\theta ) & \cos (\theta ) & 0 \\ 0 & 0 & 1\end{array}\right)

for your rotation about the z-axis. And if you wanted a positive angle to correspond to a CCW rotation of a vector (instead of to a CCW rotation of the axes), that rotation matrix would be right.

I did confirm privately that I recover my answers for Ry and Rx, rotations about the y and x axes, too.

selecting the answer from the candidates

Now that we know how to get from axis-angle to a rotation matrix, we can test our two candidates for the mars rotation.

I want to create two matrices using the positive eigenvector and the two angle solutions. First I want to make sure that the positive eigenvector

\left(\begin{array}{lll} -0.0361149 & -0.0667194 & 0.997118\end{array}\right)

is a unit vector. (yes, it is.)

I create N:

N = \left(\begin{array}{lll} 0 & 0.997118 & 0.0667194 \\ -0.997118 & 0 & -0.0361149 \\ -0.0667194 & 0.0361149 & 0\end{array}\right)

Then I get the rotation matrix using the negative angle -0.428857:

R =\left(\begin{array}{lll} 0.90956 & -0.414415 & -0.0310051 \\ 0.414851 & 0.909845 & 0.00899314 \\ 0.0244829 & -0.0210423 & 0.999479\end{array}\right)

=

\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{array}\right) + \left(\begin{array}{lll} 0 & -0.414633 & -0.027744 \\ 0.414633 & 0 & 0.0150177 \\ 0.027744 & -0.0150177 & 0\end{array}\right)

+ \left(\begin{array}{lll} -0.0904401 & 0.000218206 & -0.00326108 \\ 0.000218206 & -0.0901551 & -0.00602457 \\ -0.00326108 & -0.00602457 & -0.000521231\end{array}\right)

And yes, that is the mars rotation matrix.

Next, try the positive angle (with the same, “positive”, eigenvector). we set N

N = \left(\begin{array}{lll} 0 & 0.997118 & 0.0667194 \\ -0.997118 & 0 & -0.0361149 \\ -0.0667194 & 0.0361149 & 0\end{array}\right)

We get the rotation matrix

\left(\begin{array}{lll} 0.90956 & 0.414851 & 0.0244829 \\ -0.414415 & 0.909845 & -0.0210423 \\ -0.0310051 & 0.00899314 & 0.999479\end{array}\right)

=

\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{array}\right) + \left(\begin{array}{lll} 0 & 0.414633 & 0.027744 \\ -0.414633 & 0 & -0.0150177 \\ -0.027744 & 0.0150177 & 0\end{array}\right)

+ \left(\begin{array}{lll} -0.0904401 & 0.000218206 & -0.00326108 \\ 0.000218206 & -0.0901551 & -0.00602457 \\ -0.00326108 & -0.00602457 & -0.000521231\end{array}\right)

and that is not the mars rotation matrix. We have confirmed that the positive eigenvector and the negative angle are the axis-angle representation of the mars rotation matrix.

In addition, I have privately confirmed that if we take the negative eigenvector, and the two angle solutions, we get the same two rotation matrices; this time the negative eigenvector and positive angle reproduce the mars rotation matrix.

So, we now know two magical things:

  • how to extract the axis and angle of rotation from a 3D rotation matrix;
  • how to build a 3D rotation matrix given the axis and angle of rotation.

The only problem is that we get extra candidates for the answers for the axis and angle of rotation. If it matters, we eliminate the wrong answer the same way we dealt with extra square roots in high school: plug each solution back in to the original equations. In this case, reconstruct two distinct rotation matrices from the possible answers, and see which one matches the original.

casually: comments and connections

I am finding it all too easy to get hung up here on precise details. To heck with it. Let me try to be interesting rather than exact.

I have chosen this for my sign convention for the matrix N corresponding to a unit vector (a, b, c):

N = \left(\begin{array}{lll} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0\end{array}\right)

because I want positive angles to be CW rotations of vectors. I suspect that most presentations of this formula use the opposite convention, so that positive angles are CCW rotations of vectors. (at the risk of boring you to death: it isn’t enough to say that a positive angle is, for example, a CW rotation; you must say which is rotated CW: a vector or a coordinate axis.)

There are versions of the formula

R = I + sin \theta\ N + (1-cos \theta)  (N^2),

using a unit vector and dot and cross products. How does a cross product get in there? That’s what the matrix N does, although my sign makes it correspond to the negative cross-product.

if I apply the matrix N to a vector (x,y,z), I get…

\left(\begin{array}{lll} c y-b z & a z-c x & b x-a y\end{array}\right)

whereas if I take the cross product of (a,b,c) with the vector (x,y,z), i.e. (a,b,c) x (x,y,z), I get

\left(\begin{array}{lll} b z-c y & c x-a z & a y-b x\end{array}\right)

That is, I get the negative answer. OK, I did that by changing the convention for the sign of a rotation matrix. The skew symmetric matrix N can be viewed as a stand-in for the cross product.

Now, i’m going to show you another piece of magic. It’s called the matrix exponential, and for a matrix A it’s defined by a power series that corresponds to the power for the exponential of a real or complex variable:

exp(A) = I + A + \frac{A^2}{2} + \frac{A^3}{3!} + \cdot\ \cdot\ \cdot

It is very well behaved: that series converges for any square matrix A. We will talk more about this real soon, when we get back to quantum mechanics, but what I want to show you is that the rotation matrix is the matrix exponential of N \theta. That is, the formula

R = I + sin \theta\ N + (1-cos \theta)  (N^2)

is the summation of the infinite series exp( N \theta).

Now, we had four possible choices for (N, \theta) but they lead to two possible rotation matrices. First, the positive eigenvector and the positive angle, I ask for MatrixExp[N \theta] and get:

\left(\begin{array}{lll} 0.90956 & 0.414851 & 0.0244829 \\ -0.414415 & 0.909845 & -0.0210423 \\ -0.0310051 & 0.00899314 & 0.999479\end{array}\right)

and that is not the right answer.

then, using the positive eigenvector and the negative angle, I ask for MatrixExp[N \theta] and get:

\left(\begin{array}{lll} 0.90956 & -0.414415 & -0.0310051 \\ 0.414851 & 0.909845 & 0.00899314 \\ 0.0244829 & -0.0210423 & 0.999479\end{array}\right)

and that is, indeed, the mars rotation matrix.

There’s another connection, the inverse operation to the matrix exponential. Take the rotation Rx(\theta) about the z-axis, differentiate wrt \theta, and set \theta to zero. (find the derivative of Rx(\theta) at 0.) we get

dRx = \left(\begin{array}{lll} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0\end{array}\right)

the derivative of Ry(\theta) at 0:

dRy = \left(\begin{array}{lll} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0\end{array}\right)

And the derivative of Rz(\theta) at 0:

dRz = \left(\begin{array}{lll} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0\end{array}\right)

The set of skew-symmetric 3×3 matrices is a vector space, and those matrices are a basis. Physicists multiply them by d\theta and call them infinitesimal rotations. OK – so long as we realize they’re not rotations.

Now, take our unit vector (a, b, c) and compute

a dRx + b dRy + c dRz

and that gives us

\left(\begin{array}{lll} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0\end{array}\right)

which is our skew matrix N.

The signs are consistent because I got the infinitesimal rotations from my rotation matrices. This gives me a way of finding N rather than looking it up in my notes.

Let me emphasize that connection.

  • for rotation matrices about the coordinate axes…
  • I can construct a set of skew-symmetric matrices by computing derivatives at 0,
  • and from them I can construct a matrix N for an axis of rotation;
  • then the matrix exponential of N \theta gives me a rotation about N thru \theta.

The matrix exponential appears to be inverse to taking derivatives at 0.

yup.

And to put it in more complicated terms, namely lie groups and lie algebras:

  • our rotations are elements of a lie group;
  • the skew symmetric matrices are elements of its lie algebra;
  • the lie algebra is the tangent space at the origin of the lie group;
  • the matrix exponential maps from the lie algebra to the lie group.

But we can go a long way just treating the matrix exponential as a useful operator, without knowing about lie groups and lie algebras. when we come to study them, we will already be comfortable with the matrix exponential. (And one reason for studying matrix lie groups before lie groups in general is the existence of the matrix exponential.)

rotations 1

here, have some rotations

Let us suppose that someone hands us these three rotation matrices about the z, y, and x axes respectively.

Rz(\theta) = \left(\begin{array}{lll} \cos (\theta ) & \sin (\theta ) & 0 \\ -\sin (\theta ) & \cos (\theta ) & 0 \\ 0 & 0 & 1\end{array}\right)

Ry(\theta) = \left(\begin{array}{lll} \cos (\theta ) & 0 & -\sin (\theta ) \\ 0 & 1 & 0 \\ \sin (\theta ) & 0 & \cos (\theta )\end{array}\right)

Rx(\theta) = \left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & \cos (\theta ) & \sin (\theta ) \\ 0 & -\sin (\theta ) & \cos (\theta )\end{array}\right)

around the z-axis

Now, just what are they? Better, what do they do to vectors?

Answer: viewed as active transformations (“alibi”), they do CW rotations of vectors. The keys: active transformation, CW.

Alternatively, viewed as passive transformations (“alias”), they are attitude matrices for CCW rotations of coordinate axes. There are three keys to that answer: passive transformation, CCW, and attitude matrix. (in more general cases, one of which I will look at below, they are inverse transition matrices instead of attitude matrices.)

We are looking specifically at rotations. (how do I know that? They are orthogonal matrices with determinant +1.)

In a sense, the active transformation is more fundamental when i’m simply given a matrix. How do I know which is the old coordinate system and which is the new? “Old or new coordinate system” only makes sense for the passive interpretation, but we always have the active interpretation: applied to an old vector, the matrix gives us a new one. In this case, Rz rotates a vector CW about the z axis. That’s what it does. Its input is the old vector, its output is the new vector.

If you want, remember that a matrix has an implicit coordinate system – it’s the representation of a linear operator wrt some basis, under the active interpretation – and if we’re given the matrix then we’re given its implicit coordinate system. When we investigate a matrix by applying it to a vector, the matrix and the vector are in one coordinate system, and we are about to compute something new.

I hate the feeling that i’m beating a dead horse, but…. it’s one thing to figure out what a matrix (such as Rz) does, when I have no additional information. It’s something else entirely if someone hands me a matrix (such as Rz) and says that it’s the transition matrix.

Let’s just get on with it. We apply Rz to some vector. Here’s a choice I like. Consider Rz…

\left(\begin{array}{lll} \cos (\theta ) & \sin (\theta ) & 0 \\ -\sin (\theta ) & \cos (\theta ) & 0 \\ 0 & 0 & 1\end{array}\right)

we get…for a +45° rotation… applied to a +45° vector…

\left(\begin{array}{l} \sqrt{2} \\ 0 \\ 0\end{array}\right)\ = \left(\begin{array}{lll} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1\end{array}\right)\ \left(\begin{array}{l} 1 \\ 1 \\ 0\end{array}\right)

The result of the rotation – the result of applying Rz to a 45° vector – is unambiguously a vector which lies on the x-axis. But we have two choices for “the x-axis”. Viewed as an active transformation, the rotation maps that 45° vector to the stationary x-axis, our given vector has been rotated 45° CW to lie on the old x-axis. Alternatively, viewed as a passive transformation, the x-axis has been rotated 45° CCW to lie under our stationary 45° vector. We either have old vector (1,1,0) and new vector \left(\sqrt{2},\ 0,\ 0\right)\ , or we have old components (1,1,0) and new components \left(\sqrt{2},\ 0,\ 0\right)\ .

Under the passive interpretation I wrote a transformation, a rotation, from old to new, so Rz is the inverse transition matrix, i.e. the attitude matrix. The rows of an attitude matrix are the (old components of) the new basis vectors, so the first row of Rz(45°) is the old components of the new x-axis:

\left\{\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}},0\right\}

Those are both positive, so the new x-axis is in the first quadrant. As we said, the (x and y) coordinate axes have been rotated CCW.

Rz is either the attitude matrix (inverse transition matrix) for a CCW rotation of the x and y axes (passive transformation), or it is a CW rotation of vectors about the z-axis (active transformation).

I emphasize that there is nothing sacred about my choice of signs in Rz. since my recent and current applications were to planetary orbits and airplane coordinates, my choice gives me positive angles for the customary CCW rotations of coordinate axes to get planetary and airplane coordinates. If someone says that a rotation about the z axis is given by

\left(\begin{array}{lll} \cos (\theta ) & -\sin (\theta ) & 0 \\ \sin (\theta ) & \cos (\theta ) & 0 \\ 0 & 0 & 1\end{array}\right)

… well, you know enough to show that this represents either a CCW rotation of a vector, or a CW rotation of the x and y axes.

be careful

A positive angle in my Rz specifies a CCW rotation of the coordinate axes or a CW rotation of a vector. That can be confusing because we often denote the coordinate axes by unit vectors, but rotating a basis vector is not the same as rotating the coordinate axis it lies on.

I think I was confused by that when I was an undergraduate. It seems to me that when I was first dealing with rotations, I used to try to figure out the effect of a rotation matrix by applying it to a unit vector on the x-axis. No problem so far. But I think I thought that the image of the \hat{i} vector would be the new x-axis. It’s hard to be sure what I was really doing all that long ago, but what i’m sure of is that more often than not, I was off by the sign of the angle.

But here’s the thing. What a rotation does to the unit \hat{i} vector is not what it does to the x-axis. The active interpretation rotates the unit vector one way, while the passive interpretation rotates the x-axis the other way.

Even today, that seems a little strange, but there it is.

around the y or x

Similarly, for Ry(, a rotation about the y axis…

\left(\begin{array}{lll} \cos (\theta ) & 0 & -\sin (\theta ) \\ 0 & 1 & 0 \\ \sin (\theta ) & 0 & \cos (\theta )\end{array}\right)

we get…for a +45° rotation… applied to a +45° vector in the xz-plane…

\left(\begin{array}{l} 0 \\ 0 \\ \sqrt{2}\end{array}\right)\ = \left(\begin{array}{lll} \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ 0 & 1 & 0 \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}}\end{array}\right) \left(\begin{array}{l} 1 \\ 0 \\ 1\end{array}\right)

maps our test vector to the z-axis (active) or rotates our z-axis to lie under the vector (passive), i.e. we have done a CCW rotation of the x and z coordinate axes about the y-axis. Ry is the attitude matrix, and our new z-axis is given by the 3rd row of Ry:

\left\{\frac{1}{\sqrt{2}},0,\frac{1}{\sqrt{2}}\right\}

(If you can visualize that rotation, it’s the z-axis that is rotated into the first quadrant of the xz-plane; that’s why the signs are positive in the third row of Ry instead of in the first.)

Finally, for Rx, a rotation about the x-axis…

\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & \cos (\theta ) & \sin (\theta ) \\ 0 & -\sin (\theta ) & \cos (\theta )\end{array}\right)

we get… from a 45° rotation applied to a 45° vector in the yz-plane…

\left(\begin{array}{l} 0 \\ \sqrt{2} \\ 0\end{array}\right)\ = \left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{array}\right) \left(\begin{array}{l} 0 \\ 1 \\ 1\end{array}\right)

which lies on the positive y-axis, so we have rotated the y-axis CCW. Once again, Rx is an attitude matrix, and the new y-axis is given by the 2nd row:

\left\{0,\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}}\right\}

not a rotation

Let’s take a simple example which is not a rotation: leave x alone, double y. that is, I want a matrix whose effect on any input vector is to double the y-component. That matrix is

L = \left(\begin{array}{ll} 1 & 0 \\ 0 & 2\end{array}\right)

Applied to an arbitrary vector (a,b)… we get

\left(\begin{array}{l} a \\ 2 b\end{array}\right)

Or, applying L to the basis vectors, we get its columns (1,0) and (0,2). Those are the active interpretation. What does that say about the passive interpretation?

Looking at it directly, L must be the inverse transition matrix, because L maps old to new. Then the transition matrix P = L^{-1} is

P = \left(\begin{array}{ll} 1 & 0 \\ 0 & \frac{1}{2}\end{array}\right)

Then the attitude matrix A = P^T is the transpose transition, so (although it’s trivial for a diagonal matrix)

 A = P^T = \left(\begin{array}{ll} 1 & 0 \\ 0 & \frac{1}{2}\end{array}\right)

which says what we know: the y component has doubled, so the new y basis vector – which is the second row of A – has half the length.

Instead, we could have taken that as our starting point: the y component has doubled, so the new y basis vector has half the length. The attitude matrix must be

A = \left(\begin{array}{ll} 1 & 0 \\ 0 & \frac{1}{2}\end{array}\right)

The transition matrix P = A^T is the transpose attitude (again, trivial in this case):

P = A^T = \left(\begin{array}{ll} 1 & 0 \\ 0 & \frac{1}{2}\end{array}\right)

and then the inverse transpose matrix is…

P^{-1} = \left(\begin{array}{ll} 1 & 0 \\ 0 & 2\end{array}\right)

and that is, as it should be, our original matrix L.

mars

We discussed planetary and solar system coordinate systems just a few days ago(here), and I said I’d show you this. Bates et al gives me orbital elements for mars; what I need are

  • \Omega = 49.322°,
  • i = 1.85°,
  • “longitude of perihelion” \Pi = 335.497°,

and they tell me that

\Pi = \Omega + \omega, so \omega = \Pi - \Omega\ .

(and the orbital elements are in bates et al. but not in prussing; the books are not substitutes for each other.)

Now, I want a rotation about z thru \Omega\ :

R1 = \left(\begin{array}{lll} 0.651807 & 0.758385 & 0 \\ -0.758385 & 0.651807 & 0 \\ 0 & 0 & 1\end{array}\right)

Then a rotation about x thru i:

R2 = \left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 0.999479 & 0.032283 \\ 0 & -0.032283 & 0.999479\end{array}\right)

Then a rotation about z thru \omega = \Pi - \Omega\ :

R3 = \left(\begin{array}{lll} 0.278572 & -0.960415 & 0 \\ 0.960415 & 0.278572 & 0 \\ 0 & 0 & 1\end{array}\right)

So the combined rotation is…

Rm = R3 R2 R1 = \left(\begin{array}{lll} 0.90956 & -0.414415 & -0.0310051 \\ 0.414851 & 0.909845 & 0.00899314 \\ 0.0244829 & -0.0210423 & 0.999479\end{array}\right)

(note that the order is R1, then R2, then R3.)

That’s the attitude matrix relating mars to the solar system, with the solar system as old coordinates. The third row is mars’ z-axis (wrt the solar system). it is very nearly (0,0,1), as it should be: the angle of inclination was less than 2°. Let’s draw mars’ x and y axes in red.

That makes sense. If we neglect the rotation about x thru 1.85°, then we have two rotations about z thru a combined angle of \Omega + \omega = \Pi = 335°, i.e. -25°: the new x-axis was rotated CCW almost all the way around.

Next time, we’ll see how to find the axis of rotation and angle of rotation for the final rotation matrix for mars, and, conversely, how to find the rotation matrix given the axis and angle.