CORDIC (COordinate Rotation DIgital Computer)
Simplified Discussion



I freely admit this page borrows very heavily from an article by Pitts Jarvis written in the early '90's in Dr. Dobb's Journal.

I had been interested in CORDIC for a long time, collecting various papers and articles without gaining much true understanding. Either they were solely mathematical with no direction on how to implement algorithms in hardware, or they did cover hardware but always had key logic somehow missing.

Finally I ran across Jarvis's paper online early this Millennium. Suddenly in one place was as much information as I needed in order to implement CORDIC in the FSA™ instruction set.

(Ironically, I soon found a hard copy of same in my files. Had the info all along. Did I even look at it when I copied it? Maybe it's simply that the timing wasn't right, then.)

Anyway, this is my attempt to pick the highest points from CORDIC theory and lay them out shortly and simply — crappy graphics and all.

Finally, here is the source link to Mr. Jarvis's original paper. It exists other places around the Net as well.

— Bob Loy



Many of the really useful transcendental functions used in engineering and other high tech disciplines come from, or are related to, trigonometry.

You remember about points on a graph, and angles, and vectors:


           ^
         y |     _ (x,y)
           |     /|
           |    /
           |   /
           |  /
           | /
           |/)a
           --------->
                  x


CORDIC algorithms compute trig functions by rotating the coordinate system — like twirling the graph paper underneath the graph.

The basic equation to rotate a vector [x,y] through an angle (a) is:

           ---   ---     ---                 ---
           |       |     |                     |
           |   x   |     |  x cos a - y sin a  |
           |       |     |                     |
        Ra |       |  =  |                     |
           |       |     |                     |
           |   y   |     |  x sin a + y cos a  |
           |       |     |                     |
           ---   ---     ---                 ---

If you understand matrix algebra, and I barely remember it myself, you may already be thinking that there are a lot of multiplications with complex numbers here — and CORDIC is supposed to be fast! Not to mention the ultimate purpose of the CORDIC algorithm is to compute transcendentals, and here we already have transcendentals in the formula. Obviously a lot of simplification is due.

One simplification is to lay the beginning vector along the X axis by letting x = 1 and y = 0,

           ^
           |
           |
           |
           |
           |
           ======>--->
               (1,0)

yielding small, easily represented numbers.

Another is to pick the angle of rotation such that the multiplications will be in base two and so can be accomplished by shifting. Now we're getting down to the ones & zeros we know and love!

Of course, there's always some devilish details which fight back against elegant simplicity. CORDIC computations require a few stored constants to work accurately
(I can't tell you how disappointed I was when I first found this out. I was hoping CORDIC could spin out functions from first principles in pure hardware. Oh, well), but the constants are few, tracing back to that matrix equation above.

One more simplification comes from factoring out the cosine; this has the effect of turning the leftover sines into tangents. Thus the CORDIC algorithm works through an array of arctangents corresponding to the successive angles the "graph paper" will be rotated by, with one arctan value stored for each bit of accuracy to be computed.

The factored out cosine also becomes a constant, but it is dealt with very elegantly as we will soon see.

The algorithm itself comes down to three variables, x, y & z. The idea is to load three registers with the variables, crunch them awhile with shifts and adds (or subs) and find a finished transcendental in one, or sometimes two of the regs.

The numerical crunching is all about spinning the coordinates underneath the vector, beginning with a large jump, then ever smaller with each shift, adding or subtracting depending on whether the current value of one of the registers (such as z) is positive or negative.

It is almost as if the X axis and the final value are connected by springs and the X axis oscillates with smaller and smaller values until it finally damps out to the correct one.

On paper, the computation is often represented for the variables as

         [x, y, z] --> [some value, some value, some value]

where on the left side are the start parameters and on the right the results. For example, loading our flat vector x & y, along with angle a, in the regs would be written as

         [1, 0, a] --> [some value, some value, 0]

The actual crunching typically reduces 'a' to zero, and when that is reached the other registers have also changed — for the better.

Here's the description for computing sines and cosines, (both at once!):

         [K, 0, a] --> [cos a, sin a, 0]

Wait a minute, what's that K doing in there? Remember the cosine constant? From factoring the matrix on paper one would think it'd be waiting around until the very end, to hit the algorithm results with a nasty, time-consuming multiplication. But it turns out sticking it in the x register at the beginning works just fine.

CORDIC can be used to compute all manner of functions such as multiply or divide, through circular or hyperbolic trig functions, through logs and roots to ... it depends on the values loaded initially into the three registers as well as which reg is given the pos / neg test. Here's an example,

         [K, K, a] --> [e^a, e^a, 0]

where the final result gives two equal variables, each being 'e' (2.7182818284...) raised to the angle power.

There have been many ways developed to generate transcendental functions with computing machinery. CORDIC, the method that made the scientific calculator a reality, remains one of the fastest and most general approaches, particularly if there is no hardware multiplier available.




Jarvis even provided a long C language program detailing over a dozen CORDIC functions. Here's his code for generating cosine and sine:


/* C code from Pitts Jarvis */
/* n=29 lets us represent numbers in the interval [-4, 4) in 32 bit long. */
#define fractionBits 29
#define Delta(n, Z) (Z >= 0) ? (n) : -(n)
unsigned X, Y, Z;

Circular(x, y, z) long x, y, z;
{
  int i;

  X = x; Y = y; Z = z;
  for (i = 0; i <= fractionBits; ++i)
  {
    x = X >> i;
    y = Y >> i;
    z = atan[i];
    X -= Delta(y, Z);
    Y += Delta(x, Z);
    Z -= Delta(z, Z);
  }
}

Just below is the screen shot of the FSA™ Simulator from the home page. Since I was working within a sixteen bit word, my "fractionBits" is 13 rather than 29.

The top word on the rightmost memory display is the 45 degree (pi/4 radians) input value, followed by the cosine constant from the formula, followed by the arctan splits. Notice that the first arctan value is also pi/4 radians. See also that the 'z' variable (the green register above P1) has indeed been reduced to zero.


Screen shot of the FSA™ Simulator thirteen cycles into a CORDIC computation of sine & cosine. Input: 45 degrees. Output: cosine is at top of P1 at level C3, sine is just below it. Both values are .70703125 in a fixed point format with thirteen bits to the right of the binary point.



Quick Test Input

I later extended the CORDIC simulation shown as the screen shot above to allow whole number degrees as input by using the following radian conversions for angles of 1, 2, 4, 8, 16, 32 & 64 degrees:
0000000010001111
0000000100011101
0000001000111011
0000010001110111
0000100011101111
0001000111011111
0010001110111111
Of course, whole number degrees don't provide much resolution, but I mainly was looking for an easy way to test the results. Finer accuracy can come later.

The data is in a 3.13 fixed point format, eg, the bottom value would be read as: 001.0001110111111. I used Von Neumann rounding on each value, a fast, (no carries necessary) central tendency rounding method achieved simply by forcing a computation's LSB to one.

Once I convert the degree input into binary, I shift it to the right one bit at a time while stepping thru the radian array. If the degree's LSB tests as a zero, the current radian value is skipped, otherwise it is added to a cumulative total. Crude, but fast and doesn't take up much space.

— Bob Loy


GenAPro Home