comp.dsp | Fast Cube Root Using C33

I need a fast cube root routine.  I started with the obvious
http://metamerist.com/cbrt/cbrt.htm
What is key is getting the initial estimate using the Kahan bit hack
but that only works with IEEE floating point and not with TI floating
point.  I can't believe that I am the first to do this since the C33
is ancient.  The obvious solution is to convert the TI float to the
IEEE float, apply the hack, and then convert back to TI float but
there must be a better way.   Actually doing another iteration is
probably faster.  The problem is that sign of the mantissa is at bit
23.

Does anyboy know of a link?  Thanks.

Peter Nachtwey

Reply by ●February 27, 20092009-02-27

On Feb 26, 10:04 pm, pnachtwey <pnacht...@gmail.com> wrote:
> I need a fast cube root routine.  I started with the obvioushttp://metamerist.com/cbrt/cbrt.htm
> What is key is getting the initial estimate using the Kahan bit hack
> but that only works with IEEE floating point and not with TI floating
> point.  I can't believe that I am the first to do this since the C33
> is ancient.  The obvious solution is to convert the TI float to the
> IEEE float, apply the hack, and then convert back to TI float but
> there must be a better way.   Actually doing another iteration is
> probably faster.  The problem is that sign of the mantissa is at bit
> 23.
>
> Does anyboy know of a link?  Thanks.
>
> Peter Nachtwey

Did you evaluate the Turkowski algorithm at the page you linked? It
uses a slightly more complicated seeding algorithm for the initial
guess for Newton's algorithm, but it's still relatively simple (a
quadratic). I don't know what TI's floating-point format is, but I'd
guess it might be more expedient to evaluate one quadratic polynomial
than to perform the bit manipulations needed to convert back and forth
between formats.

Jason

Reply by pnachtwey ●February 27, 20092009-02-27

On Feb 26, 8:19&#4294967295;pm, cincy...@gmail.com wrote:
> On Feb 26, 10:04 pm, pnachtwey <pnacht...@gmail.com> wrote:
>
> > I need a fast cube root routine. &#4294967295;I started with the obvioushttp://metamerist.com/cbrt/cbrt.htm
> > What is key is getting the initial estimate using the Kahan bit hack
> > but that only works with IEEE floating point and not with TI floating
> > point. &#4294967295;I can't believe that I am the first to do this since the C33
> > is ancient. &#4294967295;The obvious solution is to convert the TI float to the
> > IEEE float, apply the hack, and then convert back to TI float but
> > there must be a better way. &#4294967295; Actually doing another iteration is
> > probably faster. &#4294967295;The problem is that sign of the mantissa is at bit
> > 23.
>
> > Does anyboy know of a link? &#4294967295;Thanks.
>
> > Peter Nachtwey
>
> Did you evaluate the Turkowski algorithm at the page you linked? It
> uses a slightly more complicated seeding algorithm for the initial
> guess for Newton's algorithm, but it's still relatively simple (a
> quadratic).
I didn't consider that one because of the mod operator and too many
divides.  The polynomial isn't a problem because all the instructions
are 1 clock cycle and just a few multiply and adds but there is no
divide or mod instruction on a TI C33 so they are expensive.  The ldexp
()  and frexp() functions aren't that cheap either.

I was hoping someone knows a variation of this that works on a TI C33.
	unsigned int* p = (unsigned int *) &f;
	*p = *p/3 + 709921077;
This is mean and lean.
I think that even the generic nth root modified to work on the TI C33
and an extra iteration is a better option.  I just need to avoid
divides.

Peter Nachtwey

Reply by Glen Herrmannsfeldt ●February 27, 20092009-02-27

pnachtwey wrote:

>>On Feb 26, 10:04 pm, pnachtwey <pnacht...@gmail.com> wrote:

>>>I need a fast cube root routine.  
(big snip)

> I didn't consider that one because of the mod operator and too many
> divides.  The polynomial isn't a problem because all the instructions
> are 1 clock cycle and just a few multiply and adds but there is no
> divide or mod instruction on a TI C33 so they are expensive.  The ldexp
> ()  and frexp() functions aren't that cheap either.

I didn't look at the specific routine, but the most important
part should be dividing the unbiased exponent by three.
(That can be done with a fixed point multiply by something
close to 1/3.)   If the C33 floating point is like some of the
other TI DSPs it is a little unusual, but it shouldn't be so hard
to get the exponent out.  Maybe extract the exponent bits then
use a lookup table to find a multiplier.  Multiply by the
appropriate table entry and then start Newton-Raphson.

The better the initial guess the faster it converges.

> I was hoping someone knows a variation of this that works on a TI C33.
> 	unsigned int* p = (unsigned int *) &f;
> 	*p = *p/3 + 709921077;
> This is mean and lean.
> I think that even the generic nth root modified to work on the TI C33
> and an extra iteration is a better option.  I just need to avoid
> divides.

-- glen

Reply by Jerry Avins ●February 27, 20092009-02-27

pnachtwey wrote:
> On Feb 26, 8:19 pm, cincy...@gmail.com wrote:
>> On Feb 26, 10:04 pm, pnachtwey <pnacht...@gmail.com> wrote:
>>
>>> I need a fast cube root routine.  I started with the obvioushttp://metamerist.com/cbrt/cbrt.htm
>>> What is key is getting the initial estimate using the Kahan bit hack
>>> but that only works with IEEE floating point and not with TI floating
>>> point.  I can't believe that I am the first to do this since the C33
>>> is ancient.  The obvious solution is to convert the TI float to the
>>> IEEE float, apply the hack, and then convert back to TI float but
>>> there must be a better way.   Actually doing another iteration is
>>> probably faster.  The problem is that sign of the mantissa is at bit
>>> 23.
>>> Does anyboy know of a link?  Thanks.
>>> Peter Nachtwey
>> Did you evaluate the Turkowski algorithm at the page you linked? It
>> uses a slightly more complicated seeding algorithm for the initial
>> guess for Newton's algorithm, but it's still relatively simple (a
>> quadratic).
> I didn't consider that one because of the mod operator and too many
> divides.  The polynomial isn't a problem because all the instructions
> are 1 clock cycle and just a few multiply and adds but there is no
> divide or mod instruction on a TI C33 so they are expensive.  The ldexp
> ()  and frexp() functions aren't that cheap either.
> 
> I was hoping someone knows a variation of this that works on a TI C33.
> 	unsigned int* p = (unsigned int *) &f;
> 	*p = *p/3 + 709921077;
> This is mean and lean.
> I think that even the generic nth root modified to work on the TI C33
> and an extra iteration is a better option.  I just need to avoid
> divides.

I can check what I don't remember, but not now (it's after midnight). 
Isn't the 'C33 float a signed exponent rather than offset? If so, that's 
even easier than IEEE. Just divide the exponent part by three, subtract 
1, and set the mantissa to (almost) 1.

Jerry
-- 
Engineering is the art of making what you want from things you can get.
&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;

Reply by Tim Wescott ●February 27, 20092009-02-27

On Thu, 26 Feb 2009 19:04:01 -0800, pnachtwey wrote:

> I need a fast cube root routine.  I started with the obvious
> http://metamerist.com/cbrt/cbrt.htm
> What is key is getting the initial estimate using the Kahan bit hack but
> that only works with IEEE floating point and not with TI floating point.
>  I can't believe that I am the first to do this since the C33 is
> ancient.  The obvious solution is to convert the TI float to the IEEE
> float, apply the hack, and then convert back to TI float but there must
> be a better way.   Actually doing another iteration is probably faster. 
> The problem is that sign of the mantissa is at bit 23.
> 
> Does anyboy know of a link?  Thanks.
> 
> Peter Nachtwey

Does anyone here remember how to take the square root of a decimal number 
via "long division"?  My dad taught me how when I was considerably 
shorter than I am now, possibly because he was appalled at his brother 
teaching me how to do it on a calculator using Newton's method.

For those of you under 80, it's basically a method for churning out 
decimals of the square root of a number in a method very akin to long 
division, except that at each step instead of subtracting (cumulative 
result) * (guess) you subtract (2 * cumulative result + guess) * (guess) 
-- you're basically trying to complete the equation (cumulative result + 
guess) ^ 2 = (cumulative result)^2 + 2 * (cumulative result)(guess) + 
(guess)^2.

It turns out that -- like long division -- it's really easy to do on a 
computer in binary because when you're only multiplying by 0 or 1 you 
don't have to waste a lot of time guessing.

It also turns out that if you're a _real_ math geek you can work out how 
to do this with cube roots, and, like square roots and division the 
guesswork is held to a minimum because you're just dealing with ones and 
zeros.

If the group shows sufficient interest I'll write it up and post it.  I 
figured it out when I was in high school then for got it.  Peter's post 
got me interested enough to figure it out again (thanks Peter!), but I'm 
not sure if it's faster than Newton's method unless you're doing the work 
on an FPGA.

Each step starts with you knowing the starting value - answer^3, and 
needing to calculate 3*answer^2 and 3*answer.  You then have to subtract 
those two numbers from your residual to know if you emit a '1' or a '0'.

I _suspect_ that it's not as quick as Newton's method, but it's late and 
I'm not (by gawd) going to check!

-- 
http://www.wescottdesign.com

Reply by Tim Wescott ●February 27, 20092009-02-27

On Thu, 26 Feb 2009 19:04:01 -0800, pnachtwey wrote:

> I need a fast cube root routine.  I started with the obvious
> http://metamerist.com/cbrt/cbrt.htm
> What is key is getting the initial estimate using the Kahan bit hack but
> that only works with IEEE floating point and not with TI floating point.
>  I can't believe that I am the first to do this since the C33 is
> ancient.  The obvious solution is to convert the TI float to the IEEE
> float, apply the hack, and then convert back to TI float but there must
> be a better way.   Actually doing another iteration is probably faster. 
> The problem is that sign of the mantissa is at bit 23.
> 
> Does anyboy know of a link?  Thanks.
> 
> Peter Nachtwey

Dunno the details (I started doing DSP long after the C33 went obsolete, 
and I'm too cool* to use a floating point DSP anyway).  So I can't help 
you directly, but I _can_ jump in and kibbitz.

Have you considered extracting the exponent, doing a quick table look-up 
on the resulting mantissa bits, and using whatever the table coughs up as 
a seed for N-R?

* or perhaps my employers are too cheap.  Hmm.

-- 
http://www.wescottdesign.com

Reply by Glen Herrmannsfeldt ●February 27, 20092009-02-27

Tim Wescott wrote:
(snip)

> Does anyone here remember how to take the square root of a decimal number 
> via "long division"?  My dad taught me how when I was considerably 
> shorter than I am now, possibly because he was appalled at his brother 
> teaching me how to do it on a calculator using Newton's method.

I have a book that tells how to do it on an abacus.

I believe it is the same or similar algorithm, but I never
used the pencil and paper method.  I once did sqrt(2) to
six places after the decimal point on an abacus, though.

For sqrt, you divide the abacus in half, one part starts with the
number whose square root is desired, and decreases during the
calculation.  The other increases toward the desired square root.

There is also a cube root algorithm that divides the abacus in thirds.
Again the original number decreases and the 2/3 and 1/3 power
increase, finally resulting in the cube root.  For more places you
need two abacii (abacuses) together, but I only have one.

-- glen

Reply by Jerry Avins ●February 27, 20092009-02-27

Tim Wescott wrote:
> On Thu, 26 Feb 2009 19:04:01 -0800, pnachtwey wrote:
> 
>> I need a fast cube root routine.  I started with the obvious
>> http://metamerist.com/cbrt/cbrt.htm
>> What is key is getting the initial estimate using the Kahan bit hack but
>> that only works with IEEE floating point and not with TI floating point.
>>  I can't believe that I am the first to do this since the C33 is
>> ancient.  The obvious solution is to convert the TI float to the IEEE
>> float, apply the hack, and then convert back to TI float but there must
>> be a better way.   Actually doing another iteration is probably faster. 
>> The problem is that sign of the mantissa is at bit 23.
>>
>> Does anyboy know of a link?  Thanks.
>>
>> Peter Nachtwey
> 
> Does anyone here remember how to take the square root of a decimal number 
> via "long division"?  My dad taught me how when I was considerably 
> shorter than I am now, possibly because he was appalled at his brother 
> teaching me how to do it on a calculator using Newton's method.
> 
> For those of you under 80, it's basically a method for churning out 
> decimals of the square root of a number in a method very akin to long 
> division, except that at each step instead of subtracting (cumulative 
> result) * (guess) you subtract (2 * cumulative result + guess) * (guess) 
> -- you're basically trying to complete the equation (cumulative result + 
> guess) ^ 2 = (cumulative result)^2 + 2 * (cumulative result)(guess) + 
> (guess)^2.
> 
> It turns out that -- like long division -- it's really easy to do on a 
> computer in binary because when you're only multiplying by 0 or 1 you 
> don't have to waste a lot of time guessing.
> 
> It also turns out that if you're a _real_ math geek you can work out how 
> to do this with cube roots, and, like square roots and division the 
> guesswork is held to a minimum because you're just dealing with ones and 
> zeros.
> 
> If the group shows sufficient interest I'll write it up and post it.  I 
> figured it out when I was in high school then for got it.  Peter's post 
> got me interested enough to figure it out again (thanks Peter!), but I'm 
> not sure if it's faster than Newton's method unless you're doing the work 
> on an FPGA.
> 
> Each step starts with you knowing the starting value - answer^3, and 
> needing to calculate 3*answer^2 and 3*answer.  You then have to subtract 
> those two numbers from your residual to know if you emit a '1' or a '0'.
> 
> I _suspect_ that it's not as quick as Newton's method, but it's late and 
> I'm not (by gawd) going to check!

There is in general a fundamental difference between paper-and-pencil 
algorithms and those run on a computer with a good divide, and usually, 
the computer version needs fewer steps.

With paper and pencil, one wants the next digit that is just not too 
large. With machinery that can divide, one wants the best approximation 
without regard to the number of digits produced.

Jerry
-- 
Engineering is the art of making what you want from things you can get.
&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;

Reply by Vladimir Vassilevsky ●February 27, 20092009-02-27

Tim Wescott wrote:

> On Thu, 26 Feb 2009 19:04:01 -0800, pnachtwey wrote:
> 
> 
>>I need a fast cube root routine. 

[...]

> 
> Does anyone here remember how to take the square root of a decimal number 
> via "long division"? 

[...]

That algorithm computes one bit per iteration. Simple, but very slow. 
Newton-Raphson efficiently doubles the number of known bits at each step.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Previous12 3 4 Next

Fast Cube Root Using C33

Sign in

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group