memcpy speed

Sebastien Barre barre at sic.sp2mi.univ-poitiers.fr
Wed Mar 29 04:24:36 EST 2000


Hi David

David Gobbi wrote:

 > I think that 'for' loops are
 > superior programming style,

Well, that's not obvious to me. It's a bit shorter, but I don't see much of 
a difference. I mean, in C there are plenty of ways to write dirty 'for' 
loops too.

 > I'm certainly not suggesting that all 'for' loops are optimized
 > into 'while' or 'do' loops...

Ooops, sorry, I was not suggesting you said that.

But as you stated, compilers improve every day: what I was suggesting was 
true some years ago and is no longer true :) I *do* remember that 5 years 
ago, when I was writing a matrix library, I ran a lot of tests with both 
Windows (Turbo C++) and Linux (gcc): replacing 'for' with 'while' very often 
made a difference.
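
For illustration only, here is a minimal sketch of the kind of rewrite meant 
above: the same sum written once with 'for' and once with 'while'. The names 
sum_for and sum_while are hypothetical; this is not the code of the matrix 
library mentioned, and a current compiler will likely treat both loops the 
same way.

```cpp
#include <cstddef>

// Hypothetical example: sum n doubles with an index-based 'for' loop.
double sum_for(const double* data, std::size_t n)
{
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// The same sum as a 'while' loop that counts n down and advances the pointer.
// Elements are still visited front to back, so the result is identical.
double sum_while(const double* data, std::size_t n)
{
    double total = 0.0;
    while (n-- > 0) {
        total += *data++;
    }
    return total;
}
```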

 > As for using *ptr++ to go through a matrix instead of mat[i][j],
 > I think that is a very bad idea in almost all cases... it makes
 > the code much harder to understand.

That's very true. In fact, I was only suggesting this for the "frozen" (i.e. 
very stable) components of a library's code: the small parts that are run 
very often and might benefit from profiling. These components need to "do 
what they have to do", but how they perform the task is not so important if 
you see them as black boxes (I'm speaking about small parts, such as 
inverting a matrix with a known algorithm). I was wondering whether vtkMath 
could be such a candidate; we could try it and see whether it has an impact 
on overall performance.
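
To make the two notations concrete, here is a minimal sketch of one small 
routine written both ways. The size kSize and the names scale_indexed and 
scale_pointer are assumptions made for the example; nothing here is taken 
from vtkMath.

```cpp
#include <cstddef>

// Hypothetical fixed matrix size, chosen only for this example.
const std::size_t kSize = 4;

// Index notation: mat[i][j], easy to read.
void scale_indexed(double mat[kSize][kSize], double factor)
{
    for (std::size_t i = 0; i < kSize; ++i) {
        for (std::size_t j = 0; j < kSize; ++j) {
            mat[i][j] *= factor;
        }
    }
}

// Pointer notation: walk each row with *ptr++ until the end of that row.
void scale_pointer(double mat[kSize][kSize], double factor)
{
    for (std::size_t i = 0; i < kSize; ++i) {
        double* ptr = mat[i];
        double* const end = ptr + kSize;
        while (ptr != end) {
            *ptr++ *= factor;
        }
    }
}
```

Seen as black boxes, both functions do the same thing; only the body differs.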

You are right: although not so "scary", pointer notation is a bit harder to 
understand. I guess the problem is that I'm used to the STL, the C++ 
Standard Template Library, whose power also relies on so-called 
"iterators", which are a kind of generalized pointer.

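As a small sketch of what "generalized pointers" means in practice: the 
function below is written once in pointer style and works unchanged on a raw 
array, a std::vector and a std::list. The name sum_range is hypothetical and 
chosen only for this example.

```cpp
#include <list>
#include <vector>

// Hypothetical helper: sums any range [first, last) whose iterators support
// !=, * and ++, which raw pointers and STL container iterators all do.
template <typename Iterator>
double sum_range(Iterator first, Iterator last)
{
    double total = 0.0;
    while (first != last) {
        total += *first++;
    }
    return total;
}
```

It can be called as sum_range(raw, raw + 3) on a plain array of three 
doubles, or as sum_range(vec.begin(), vec.end()) on a std::vector<double>.
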
 > Unrolling the inner
 > loop into 4 statements provides a higher performance boost

I trust you. Once again, not so long ago, using pointers could speed up 
your code by a factor of 2 or 3 in such situations. This has changed: I've 
coded an example with VC++ 6 and egcs on 10x10, 100x100 and 400x400 
matrices, and saw no difference (this also depends on the size of the cache).
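
For anyone who wants to repeat this kind of comparison, here is a minimal 
sketch, not the example mentioned above. It shows a plain loop, the same loop 
with its body unrolled into 4 statements, and a crude timer based on 
std::clock. The names sum_plain, sum_unrolled4 and time_sum are hypothetical, 
and the numbers you get will depend on the compiler, the optimization flags 
and the cache size.

```cpp
#include <cstddef>
#include <ctime>

// Plain loop over n doubles.
double sum_plain(const double* data, std::size_t n)
{
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Same loop with the body unrolled into 4 statements. A second loop handles
// the leftover elements when n is not a multiple of 4.
double sum_unrolled4(const double* data, std::size_t n)
{
    double total = 0.0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Crude timer: calls fn 'repeats' times and returns the CPU time in seconds.
// The volatile sink keeps the compiler from discarding the calls.
double time_sum(double (*fn)(const double*, std::size_t),
                const double* data, std::size_t n, int repeats)
{
    volatile double sink = 0.0;
    const std::clock_t start = std::clock();
    for (int r = 0; r < repeats; ++r) {
        sink = fn(data, n);
    }
    (void)sink;
    return static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_sum(sum_plain, data, n, repeats) and 
time_sum(sum_unrolled4, data, n, repeats) on buffers of 10*10, 100*100 and 
400*400 elements gives a rough comparison across the sizes discussed here.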

Sincerely,
--
Sebastien BARRE
IRCOM-SIC, UMR-CNRS 6615 - Université de Poitiers
Bât. SP2MI, Bvd 3 - Téléport 2, BP 179 F-86960 Futuroscope Cedex
Tel. : +33 (0)5 49 49 65 95 / 65 83, Fax : +33 (0)5 49 49 65 70
http://www-sic.univ-poitiers.fr/barre/ or  http://www.hds.utc.fr/~barre/



More information about the vtk-developers mailing list