memcpy speed
David Gobbi
dgobbi at irus.rri.on.ca
Mon Mar 27 12:50:10 EST 2000
Hi Sebastien,
I'm certainly not suggesting that all 'for' loops be rewritten
into 'while' or 'do' loops... I think that 'for' loops are
superior programming style, and the other forms should be used
very sparingly -- only when they provide a fairly significant
performance boost. Hopefully, gcc will soon have optimizations that
eliminate the performance differences between the various forms.
As for using *ptr++ to go through a matrix instead of mat[i][j],
I think that is a very bad idea in almost all cases... it makes
the code much harder to understand. Unrolling the inner
loop into 4 statements provides a higher performance boost
and is much, much easier on the eyes.
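
To illustrate what I mean by unrolling, here is a minimal sketch for a 4x4 matrix copy. The function names are made up for this example and are not taken from VTK, and I have not benchmarked this exact code; it only shows that the unrolled form keeps the mat[i][j] notation.

```c
/* Hypothetical helper (not a VTK function): copy a 4x4 matrix with the
   inner loop unrolled by hand into 4 statements. */
void copy_matrix4x4_unrolled(double dst[4][4], double src[4][4])
{
    int i;
    for (i = 0; i < 4; i++)
    {
        dst[i][0] = src[i][0];
        dst[i][1] = src[i][1];
        dst[i][2] = src[i][2];
        dst[i][3] = src[i][3];
    }
}

/* Reference version with the ordinary nested loop, for comparison. */
void copy_matrix4x4_nested(double dst[4][4], double src[4][4])
{
    int i, j;
    for (i = 0; i < 4; i++)
    {
        for (j = 0; j < 4; j++)
        {
            dst[i][j] = src[i][j];
        }
    }
}
```

Both functions produce the same result; whether the unrolled one is actually faster depends on the compiler and the architecture.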
- David
--
David Gobbi, MSc dgobbi at irus.rri.on.ca
Advanced Imaging Research Group
Robarts Research Institute, University of Western Ontario
On Mon, 27 Mar 2000, Sebastien Barre wrote:
> At 13:06 26/03/00 -0500, David Gobbi wrote:
>
> >Well, I decided to run a benchmark of memcpy() versus
> >loops of the form
> >
> >j = count;
> >while (--j >= 0)
> >{
> > *cp1++ = *cp2++;
> >}
> >
> >and loops of the form
> >
> >for (j = 0; j < count; j++)
> >{
> > *cp1++ = *cp2++;
> >}
> >
> >
> >Depending on the architecture and the data type, you need to copy
> >at least 32 bytes or memcpy is much slower than copying
> >the data in a loop. There is often a factor of >5 improvement
> >in looping over using memcpy!
>
> I've also observed that "feature" sometimes. It depends on the
> architecture, but it might be worth having a front-end for memcpy that will
> use either memcpy or that kind of loop (choice made at compilation time
> depending on a #define, for example).
>
> >Also, with gcc, the 'j = count; while (--j >= 0)' form of looping is
> >around 15% to 30% faster than 'for (j = 0; j < count; j++)' form
> >for copying less than 16 bytes.
>
> Oh yes, that's a common optimization trick :) (actually I thought it was true
> for more than 16 bytes also, I guess compilers get better and better at
> that game).
>
> BTW, I'm quite sure that there are a lot of places in VTK where optimization
> might be done by replacing loop indices like mat[i][j] = ... by a pointer
> notation like *ptr = ... (not **ptr, but really *ptr).
>
> --
> Sebastien BARRE
> IRCOM-SIC, UMR-CNRS 6615 - Université de Poitiers
> Bât. SP2MI, Bvd 3 - Téléport 2, BP 179 F-86960 Futuroscope Cedex
> Tel. : +33 (0)5 49 49 65 95 / 65 83, Fax : +33 (0)5 49 49 65 70
> http://www-sic.univ-poitiers.fr/barre/ or http://www.hds.utc.fr/~barre/
>
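
P.S. The compile-time front-end for memcpy suggested in the quoted message could look roughly like the sketch below. The names small_copy, SMALL_COPY_USE_MEMCPY and SMALL_COPY_THRESHOLD are hypothetical, invented for this example, and the 32-byte threshold is only the rough figure from the benchmark above; it would need to be tuned per architecture and data type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tuning knob: copies shorter than this use the loop. */
#ifndef SMALL_COPY_THRESHOLD
#define SMALL_COPY_THRESHOLD 32
#endif

/* Hypothetical front-end for memcpy. Compile with -DSMALL_COPY_USE_MEMCPY
   to disable the loop and always call memcpy. */
void *small_copy(void *dst, const void *src, size_t count)
{
#ifndef SMALL_COPY_USE_MEMCPY
    if (count < SMALL_COPY_THRESHOLD)
    {
        unsigned char *cp1 = (unsigned char *)dst;
        const unsigned char *cp2 = (const unsigned char *)src;
        ptrdiff_t j = (ptrdiff_t)count;  /* signed, so --j >= 0 terminates */
        while (--j >= 0)
        {
            *cp1++ = *cp2++;
        }
        return dst;
    }
#endif
    return memcpy(dst, src, count);
}
```

Like memcpy, this assumes the source and destination do not overlap. The choice is made entirely by the preprocessor, so there is no runtime cost beyond the size comparison.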