Quaternions and Rotations 1

What I’m heading for is to compute the following eight transformations (shown by arrows) among representations of a 3D rotation.

Let me begin by talking about rotations generally (see rotations 1 for more detail).

Which way are we going?

First, let’s remember that there must be at least one convention. If we say that we are going to “do a rotation of 45° CCW about the z axis” — we have not actually specified the rotation.

Are we rotating the coordinate system or are we rotating a vector? Believe me, it matters.

My convention, and Kuipers (see my bibliographies page), — since we are almost always rotating the coordinate system — to write the rotation matrix as

Rz(\theta) = \left(\begin{array}{ccc} \cos (\theta ) & \sin (\theta ) & 0 \\ -\sin (\theta ) & \cos (\theta ) & 0 \\ 0 & 0 & 1\end{array}\right)

What I have done is to use a positive angle for a CCW rotation of the coordinate system. (This is Kuipers, too, e.g. p. 90.) The other two rotation matrices, using the same convention, are

Rx(\theta) = \left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & \cos (\theta ) & \sin (\theta ) \\ 0 & -\sin (\theta ) & \cos (\theta )\end{array}\right)

Ry(\theta) = \left(\begin{array}{ccc} \cos (\theta ) & 0 & -\sin (\theta ) \\ 0 & 1 & 0 \\ \sin (\theta ) & 0 & \cos (\theta )\end{array}\right)

In particular, if I apply a rotation of 45° about the z-axis to the vector (1,1,0), I get

(\sqrt{2}, 0, 0)

i.e. a vector of length Sqrt[2] lying on the x-axis. The passive interpretation is that I have left the vector alone and rotated the coordinate system by 45° CCW, and the given vector lies on the new x-axis. The active interpretation is that I rotated the vector 45° CW, and the new vector lies on the old x-axis.

But a CW rotation is usually spoken of as a negative angle. Using the axis and angle of rotation, I would have to say that I have rotated the vector thru -45° about the z-axis.

For more detail about all that, look here.

If your primary concern is the rotation of vectors, you may wish to use the rotation matrix

Rz(-\theta) = \left(\begin{array}{ccc} \cos (\theta ) & -\sin (\theta ) & 0 \\ \sin (\theta ) & \cos (\theta ) & 0 \\ 0 & 0 & 1\end{array}\right)

(and similarly for the other two axes) so that a positive angle corresponds to a CCW rotation of a vector.

Perhaps more importantly, if you wish to use the following drawing and description, then then you need to use that most recent rotation matrix. (I used \phi\ instead of \theta\ just because it was readily available in Adobe Illustrator in the Verdana font.)

That is, if you want a positive angle \phi\ to effect the CCW rotation of V to W (about the axis A), then you need the other convention instead of mine.

Everything I did in my prior axis/angle posts used my convention. (Here’s the first, and here’s the second.) And I will continue to use my convention, but I think I will explicitly show you all the details of the other convention, too, in a separate post.

So, a positive angle is a CCW rotation of the coordinate system, hence a CW rotation of a vector. Note that the reason for emphasizing the convention is so that I don’t have to remember to add “of the coordinate system”.

Having reminded you that a full description of a rotation must specifiy not only angle and axis but also whether a positive angle corresponds to a CW or a CCW rotation of the coordinate system (or, conversely, of a vector), let me make another point.

An infinite number of valid solutions

Consider a rotation thru +270° about the z-axis:

Rz(270{}^{\circ}) = \left(\begin{array}{ccc} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1\end{array}\right)

And consider a rotation thru -90 about the z-axis:

Rz(-90{}^{\circ}) = \left(\begin{array}{ccc} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1\end{array}\right)

Gee, those two rotations, viewed as physical actions, are very different motions, but they have exactly the same rotation matrix.

Ho hum. We know that. There are an infinite number of different angles that lead to any given rotation matrix.

What that means is that the rotation matrix does not contain the information about “how we got there”. It is futile and foolish to expect the rotation matrix to tell you whether the angle was “really” positive or “really” greater than 180°, or any of the infinitely many other possibilities. The matrix of trig functions just doesn’t have any distinquished solution.

True, to get any answer at all, we must set default choices. And be prepared to over-ride them.

One way to do that, for example, is to default the answer to an angle between (-180°, 180°]. For most of my purposes, I expect that will work just fine. Another way to do that, not quite so obvious, is to call inverse trig functions — which have their own default restrictions.

It may well be that there are additional difficulties when we come to look at Euler angle decompositions of rotations… but the non-uniqueness of the angles is a problem even when we have a single rotation about one of the coordinate axes… i.e. a single Euler angle rotation.

Do not be in a hurry attribute to an Euler angle decomposition a problem which may have a much simpler cause.

If we don’t like the angle we get, we need to change our answer to the question: Well, which angle do we want the computer to choose?

The computer doesnt choose the angle — we do. If we know enough to say, “That’s the wrong angle!”, then we knew enough to have told the computer, “Pick that one instead.”

Using quaternions to rotate a vector

With that major caveat out of the way, let me start by showing you the magic of quaternions for rotations. Then I will back up and present all four representations. (I introduced quaternions here.)

To be specific, let me show you how to use quaternions to perform that same 45° rotation of that same vector. We had a vector:

V = {1,1,0}.

We have a rotation of 45° about the z-axis. Written as an angle and axis of rotation, that’s

Next we create a pure quaternion from the vector to be rotated:

We have a pure quaternion whose imaginary part is {\sqrt{2},\ 0,\ 0\ }… that is, the same answer we got before.

Getting a composite rotation using quaternions

There’s another related piece of magic. We just saw how to rotate a vector using a quaternion. Suppose I have two rotation matrices, and I want to find the composite rotation.

Suppose for example that the first is a rotation thru 90° about the z-axis:

m1 = Rz(90{}^{\circ}) = \left(\begin{array}{ccc} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1\end{array}\right)

The second is a rotation thru 90° about the (new) x-axis:

m2 = Rx(90{}^{\circ}) = \left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0\end{array}\right)

We apply them to our vector, getting

m2 m1 V = {1,0,1}.

We could compute the angle-axis for the composite rotation (I’ll remind you how to do that, later):

… which tell us that the composite is a rotation thru 120° about the axis (1, 1, 1).

We could use quaternions instead. First, get the angle-axis representations from the two given rotation matrices. (Or I could have written these directly, since I did in fact specify the rotations by angle and axis. But instead, calling my functions, I get to see that my code work correctly because it computes the given angles/axes.)

… which agrees that the combination of the two rotations is a rotation about (1,1,1) thru 2 Pi/3.

You might note that I used m1 and m2 for the initial matrices, a1 and a2 for the corresponding angle/axis representations, q1 and q2 for the corresponding quaternion representations… and a3 and q3 for the combined rotation. I did not explicitly compute the combined rotation matrix, but you should guess that I would have called it m3.

Now let’s start talking about the four representations.

Rotation matrix and angle/axis of rotation

We have already discussed the transformations between a rotation matrix and its angle/axis representation: from angle/axis to rotation matrix here, and from rotation matrix to angle/axis here.

Let me summarize them.

If we have a rotation (as usual, of the coordinate system) thru an angle \theta\ about an axis of rotation A, then we

  • construct a unit vector r from A, r = (a, b, c);

  • construct a skew-symmetric matrix N from the components a, b, c;

  • construct the rotation matrix R from N.

To be specific,

N = \left(\begin{array}{ccc} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0\end{array}\right)

and

R = I + sin\ \theta\ N + (1-cos\ \theta)\  N^2\

There are no subtleties in that direction: we started with an arbitrary angle, we compute its sine and cosine.

Oh, one last note. Once I’m writing a small subroutine to do this for me, I might as well let it compute the unit vector: I take any angle and any vector as my input, and my Mathematica code will automatically scale the vector to a unit vector.

What about the other way? Given a rotation matrix M, how do we get the axis and angle of rotation?

The axis of rotation is straight-forward: it’s that unique eigenvector, associated with eigenvalue 1, of the rotation matrix.

The angle of rotation was a little more difficult. Once we know the eigenvector (a, b, c), we have the rotation matrix R with unknown \theta\ , which is equal to the original matrix M… the trace of M (and R) is equal to 1 + 2\ cos\ \theta\ .

(Right? The given rotation matrix can be written in terms of its angle and axis, and once we have the axis, we’re just looking for the angle which makes R = M.)

Then R – R^T is very simple, and equal to M – M^T:

R - R^T = \left(\begin{array}{ccc} 0 & 2 c \sin (\theta ) & -2 b \sin (\theta ) \\ -2 c \sin (\theta ) & 0 & 2 a \sin (\theta ) \\ 2 b \sin (\theta ) & -2 a \sin (\theta ) & 0\end{array}\right)

Computationally, it is possible for two of a, b, c to be zero, so we need to be ready to compute Sin \theta\ using any one of the terms 2c\ Sin \theta\ , -2b\ Sin \theta\ , or 2a\ Sin \theta\ .

(As I said in the second angle/axis post, instead of getting the eigenvector, we could in fact construct the vector (2 a\ Sin \theta\ , 2 b\ Sin \theta\ , 2 c\ Sin \theta\ )\ from the off-diagonal terms, and then make it a unit vector to get a, b, c. By calling for the eigenvector with eigenvalue 1, instead, I get confirmation that the input was a rotation matrix. And I’m beginning to think that I am unreasonably fond of the eigenvector, and should drop the calculation in favor of using the off-diagonal terms.)

Having both the sine and the cosine, we want an angle. Well, there are an infinite number of them, all differing by a multiple of 360°. As I said at the beginning, pick one — and if you know enough to decide it’s not what you want, you should know enough to make it right, after you see the default answer.

My default is to pick an angle in the half-open interval (-180°, 180°], i.e. (-\pi, \pi]\ . It happens to be what Mathematica’s two-argument arc tangent function returns, but I had a different reason in mind. (Later.)

I’ve actually shown you a couple of those. We had a couple of rotation matrices m1, m2… and their product m2.m1, for which we computed angle-axis representations a1, a2m a3 respectively:

Let’s recompute the matrices, from their angle & axis representations, and compare the new computations to the originals. Each of the following three images shows the new computation, followed by the corresponding matrix.

Here’s a1 converted to a matrix, followed by the original m1.

Now a2 converted to a matrix, followed by m2…

Finally, a3 converted to a matrix, and compared with m3…

We see that the new computations agreed with the original matrices. Those are hardly a thorough set of tests, but we’ll see some more down the road.

Now that we can move between rotation matrices and angle/axis, let’s move between angle/axis and quaternions.

Quaternions and angle/axis of rotation

Let me start by saying that, in a very real sense, the quaternion representation is a tooled-up version of the angle/axis representation. While the angle/axis representation carries a lot of intuition, we can’t compute with it, not directly. The quaternion representation replaces the angle by the cosine of the half-angle, and changes the size of the unit vector representing the axis. A slight change, and suddenly we can do meaningful computations.

As usual, if we have the angle, going to the trig is unambiguous. Given any rotation angle \theta\ and a unit vector axis of rotation (a, b, c), we construct the unit quaternion

( cos (\theta/2), a\ sin (\theta/2), b\ sin (\theta/2), c\ sin (\theta/2) )\ .

The half angle \theta/2\ is crucial, as is the unit vector (a, b, c).

As before, once I decided to code this up, it made sense to let Mathematica compute the unit vector (a, b, c) by scaling whatever vector I chose to give it.

For example, we found a3:

OK, let’s go backwards, from quaternion to axis/angle. As usual, given trig functions and seeking an angle, we are going to have to choose one angle out of the infinite number of possibilities. And, as usual, if we know enough to say it’s the wrong angle, then we should change it to the correct angle.

Given a unit quaternion

(A, B, C, D)

we write its polar form

(cos\ \alpha, b\ sin\ \alpha, c\ sin\ \alpha, d\ sin\ \alpha)\

and let \alpha =\ arc cos A. Mathematica will return \alpha\ between 0 and Pi, and it’s the half-angle, so we double it. But if the result is larger than Pi, I would prefer a negative angle, so I would subtract 2 pi. That’s my default choice.

And I may have said too much too fast. 2\ \alpha\ is the rotation angle, but I force it to be in the half-open interval (\pi,-\pi]\ .

Having gotten \alpha\ , I get sin\ \alpha\ , and then I divide B, C, D by sin\ \alpha\ to get b, c, d.

Ha! You know, that is just what I did – and I’m going to change it… instead make a unit vector out of (B, C, D) if possible… so I never try dividing by zero. Writing these posts is very, very good for me.

So. We can move between the quaternion representation and the angle/axis representation.

From Euler angle sequence to matrix or quaternion

We have taken care of four of the eight transformations. Two of the remaining four are straight-forward.

To specify an Euler angle sequence of, say, XYZ, with angles \phi, \theta, \psi\ , is to specify that the rotation matrix M can be written as the product

M = Rz(\psi)\ Ry(\theta)\ Rx(\phi)\ ,

where the convention I’m used to is that “XYZ” means rotate about the X-axis first.

Note that an Euler angle sequence gives us the rotation order and the angles, rather than giving us the three rotation matrices Rz(\psi), Ry(\theta), and Rx(\phi)\ — which would lose information, namely the angles.

Therefore, getting from an Euler angle sequence to the rotation matrix is just to multiply three matrices. (Some of them could be identity matrices.)

Similarly, getting from an Euler angle sequence to the corresponding quaternion is just to multiply three quaternions.

Just to be clear…. instead of the matrices Rz(\psi)\ Ry(\theta)\ Rx(\phi)\ , we would have three quaternions (one for each axis, look at their imaginary parts)

qz = Quaternion[\frac{\psi}{2}\ , 0, 0, 1]

qy = Quaternion[\frac{\theta}{2}\ , 0, 1, 0]

qx = Quaternion[\frac{\phi}{2}\ , 1, 0, 0]

and the quaternion Q corresponding to the rotation matrix M is:

Q = qx qy qz.

From matrix or quaternion to Euler angle representation

That leaves two transformations: from a rotation matrix, or from a quaternion, to an Euler angle sequence.

The challenge is to find a specified Euler angle sequence for a given matrix. That is, we have a rotation matrix and a list of three coordinate rotations, and we want to find three angles that make it work; given M and “XYZ” find the angles for each of Rz(\psi)\ Ry(\theta)\ Rx(\phi)\ . As usual, the angles are not unique. But we would end up trying to solve 9 equations in 3 unknowns.

On the other hand, if I were to try to find an Euler angle sequence for a quaternion, then I would have 3 equations in 3 unknowns. Sounds simpler.

(The set of equations depends on the specific Euler angle sequence, and there are 12 possible sequences. So far I have coded up the two most common in my experience.)

That is actually what prompted all of these posts about quaternions: my hope, apparently realized, to find any Euler angle sequence for a given rotation matrix by finding, instead, the Euler angle sequence for its quaternion representation.

Think about it: if I find the 7th transformation I don’t have to find the eighth.

That is, I’m going to construct the red arrow (a function from matrix to Euler angle sequence) by composing the three blue arrows (functions). Yes, I have yet to describe the last blue arrow, from quaternion to Euler angle sequence. Next technical post, I expect.

But not this day.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: