[vtk-developers] floating-point vs. integer performance

Fri Jun 15 10:29:38 EDT 2001

On Fri, 15 Jun 2001, Volpe, Christopher R (CRD) wrote:

>
> |> Some of you might find this interesting:
> |>
> |> I added some templates to vtkImageReslice so that it can do all
> |> transformation & interpolation operations with either fixed-point
> |> or floating-point arithmetic, with a method to switch between
> |> them at run-time.
> |>
> |> At first I was overjoyed, because on my PIII machine the fixed-point
> |> code ran twice as fast (in some special cases as much as four times
> |> as fast) as the original floating-point version.
>
> Have you tried this on a PII as well as a PIII?

No, but I could try and get back to you.  I wouldn't expect any
difference.

> I've done some experiments with integer arithmetic
> vs. floating point arithmetic when I was toying with the idea of representing points and doing
> transformations in fixed point. (This was mainly because the results were to be used as indices into
> a 3D volume, and in CPU time, the compiler-generated x86 processor state change to trunc mode to do
> an integer cast is like watching continental drift.

I don't use the floor() function; it is pitifully slow on x86 processors.
You can just do the float->int conversion in the current rounding mode,
followed by an 'if' statement that adds or subtracts 1 as necessary.
Much faster.
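
To make the convert-then-correct idea concrete, here is a minimal sketch.
It is not the vtkImageReslice code: the name FloorViaCurrentMode is
hypothetical, and std::lrint (C99/C++11) is assumed here as a portable
stand-in for a float->int conversion that uses the current rounding mode
instead of forcing truncation.

```cpp
#include <cassert>
#include <climits>
#include <cmath>

// Hypothetical helper, for illustration only.
// Computes floor(x) without calling floor() and without switching the
// FPU into truncation mode. Assumes x fits in a long.
inline long FloorViaCurrentMode(double x)
{
  assert(x > static_cast<double>(LONG_MIN) &&
         x < static_cast<double>(LONG_MAX));

  // Convert using whatever rounding mode is currently active.
  long i = std::lrint(x);

  // Whatever the mode, i is either floor(x) or ceil(x).
  // If the conversion went up, step back down by one.
  if (static_cast<double>(i) > x)
  {
    i--;
  }
  return i;
}
```

Only the "subtract 1" branch is needed for floor; the "add 1" case would
come up when computing a ceiling the same way.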

> Using inline assembler to hard code a round
> instruction speeds this up so that it is merely agonizingly slow, but I digress...) I've found that
> floating point beats integer arithmetic by a small margin on a PII. I think my test was something
> simple like multiplying two volatile variables and storing results in a tight loop. Not the best of
> tests, I'm sure.

Ah, but there are several tricks you can play with fixed-point to speed
things up.  For example, an integer multiplied by a fixed-point number
is simply  (int)*(fixed),  with no need to convert the int to a fixed-point
real before the multiply.  And truncation is just (fixed)>>(radix), while
rounding is ((fixed+(1<<(radix-1)))>>(radix)).
This (vs. float->int conversions) is probably where I get most of the
benefit.  I haven't profiled the various operations to see which ones
make the biggest difference.
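
As a rough illustration of those three tricks, here is a small sketch.
The type name Fixed, the 16-bit RADIX, and all function names are
assumptions chosen for the example, not what vtkImageReslice uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-point format: 32-bit signed, RADIX fractional bits.
typedef std::int32_t Fixed;
const int RADIX = 16;

// Convert a real value to fixed-point (the cast truncates toward zero).
inline Fixed ToFixed(double x)
{
  return static_cast<Fixed>(x * (1 << RADIX));
}

// int * fixed: the product is already in fixed-point format, so the
// int never has to be converted first. Overflow is not checked.
inline Fixed MulIntFixed(int n, Fixed f)
{
  return n * f;
}

// Truncation: drop the fractional bits. With an arithmetic right shift
// (the behaviour of mainstream compilers for signed values) this rounds
// toward negative infinity, i.e. it acts like floor().
inline int FixedTruncate(Fixed f)
{
  return f >> RADIX;
}

// Rounding: add one half (1 << (RADIX-1)) and then drop the fraction.
inline int FixedRound(Fixed f)
{
  assert(RADIX > 0 && RADIX < 31);
  return (f + (1 << (RADIX - 1))) >> RADIX;
}
```

For example, MulIntFixed(3, ToFixed(2.5)) holds 7.5 in fixed-point;
FixedTruncate gives 7 for it and FixedRound gives 8, with no
float->int conversion after the initial ToFixed.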

> |> But then I compiled it on an R10000 IRIX computer, and the
> |> fixed-point
> |> math was 20% slower than the floating-point math.  On an R12000 IRIX,
> |> the fixed-point math was 50% slower.
>
> I don't know about Irix, but a while back, on Solaris, an integer multiplication would generate a
> FUNCTION CALL (_imul(), I believe) to do the operation, unless you specified some obscure compiler
> flag to generate SPARC v8 architecture instructions, or something along those lines.

That is almost beyond comprehension.  I'll check the SGI compiler
man pages to be sure it is not the same.

 - David