[Insight-developers] transform internal types
M.Staring at lumc.nl
Fri May 28 08:41:18 EDT 2010
Hi Luis,
I see that my more fundamental arguments don't make much of an impression.
Regarding your more pragmatic questions:
> I agree with you in that it is a waste of resources.
> It is however an insignificant waste of resources.
>
> ...
>
> How many milliseconds are we talking about ?
>
> ...
>
> How many milliseconds does it take to compute the
> registration in the GPU ?
>
We just finished implementing it. For 3D images of size about 440^3, using the MSD metric, 2000 samples from the fixed image to compute its derivative, standard gradient descent, and a B-spline transform, each iteration follows this scheme:
1. convert the transform parameter vector mu_k to float
2. copy mu_k to the GPU
3. compute d MSD / d mu and take an optimisation step, resulting in mu_{k+1}
4. copy mu_{k+1} to the CPU
5. convert mu_{k+1} to double
A second scheme skips steps 1 and 5 (the casts). A sketch of the first scheme's loop body follows.
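For concreteness, here is a minimal CUDA sketch of one iteration of the first scheme. The kernel name ValueAndDerivativeStep, its trivial body, and the launch configuration are illustrative placeholders only, not our actual implementation:

#include <cuda_runtime.h>
#include <algorithm>
#include <vector>

// Hypothetical fused kernel for step 3: computes d MSD / d mu and applies
// one gradient-descent step, updating mu in place on the device.
__global__ void ValueAndDerivativeStep(float* mu, int m, float stepSize)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < m)
  {
    float grad = 0.0f; // placeholder for d MSD / d mu_i
    mu[i] -= stepSize * grad;
  }
}

void RunIteration(std::vector<double>& mu, float* d_mu, float stepSize)
{
  const int m = static_cast<int>(mu.size());

  // Step 1: convert the transform parameters from double to float.
  std::vector<float> muFloat(mu.begin(), mu.end());

  // Step 2: copy mu_k to the GPU.
  cudaMemcpy(d_mu, muFloat.data(), m * sizeof(float), cudaMemcpyHostToDevice);

  // Step 3: compute the derivative and take an optimisation step.
  ValueAndDerivativeStep<<<(m + 255) / 256, 256>>>(d_mu, m, stepSize);

  // Step 4: copy mu_{k+1} back to the CPU.
  cudaMemcpy(muFloat.data(), d_mu, m * sizeof(float), cudaMemcpyDeviceToHost);

  // Step 5: convert back to double. The second scheme stores the host
  // parameters as float, so steps 1 and 5 disappear.
  std::copy(muFloat.begin(), muFloat.end(), mu.begin());
}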
Experiments were performed on an Intel Xeon W3520 @ 2.66 GHz and an NVIDIA GeForce GTX 285, running 64-bit Windows 7.
For the first scheme, with M the size of mu, all timings in microseconds:
step   M1 = 100,000   M2 = 650,000
1+2         1096.69        3243.99
3           3899.09        6374.70
4+5         2315.11        6011.83
And the second scheme:
step   M1 = 100,000   M2 = 650,000
2            726.14        1834.73
3           3988.17        6262.58
4           2112.36        4952.77
Which means that (with cast the difference between the two schemes, and copy the second scheme's transfer time):

        milliseconds       percentage
         M1      M2        M1     M2
cast    0.6     2.5        8%    16%
iter    3.9     6.3       54%    41%
copy    2.8     6.8       39%    44%
total   7.4    15.6
So, for M2, casting back and forth to float takes 2.5 ms = 16% of the computation time, the GetValueAndDerivative() operation on the GPU takes 41%, and transferring the data back and forth is the bottleneck at 44%. Ignoring the data transfer, casting still accounts for 28% of the remaining time (2.5 / (2.5 + 6.3)).
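For reference, CUDA events are one common way to collect per-step timings like those above; the following is only an illustration of the measurement idea, not our benchmark code:

#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

int main()
{
  const int m = 100000; // M1 in the tables above
  std::vector<float> mu(m, 0.0f);
  float* d_mu = nullptr;
  cudaMalloc(&d_mu, m * sizeof(float));

  cudaEvent_t start, stop;
  cudaEventCreate(&start);
  cudaEventCreate(&stop);

  // Time step 2: the host-to-device copy of the parameter vector.
  cudaEventRecord(start);
  cudaMemcpy(d_mu, mu.data(), m * sizeof(float), cudaMemcpyHostToDevice);
  cudaEventRecord(stop);
  cudaEventSynchronize(stop);

  float ms = 0.0f;
  cudaEventElapsedTime(&ms, start, stop); // elapsed time in milliseconds
  std::printf("copy to GPU: %.2f microseconds\n", ms * 1000.0f);

  cudaEventDestroy(start);
  cudaEventDestroy(stop);
  cudaFree(d_mu);
  return 0;
}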
Lessons learned:
Casting is not negligible.
Per-iteration data transfer between CPU and GPU is costly and should be avoided where possible; see the sketch below.
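One direction the second lesson points in, assuming nothing on the host needs mu_k between iterations: keep the parameter vector resident on the device, so the per-iteration copies (steps 2 and 4) vanish from the loop. A sketch, reusing the hypothetical kernel from above:

#include <cuda_runtime.h>
#include <vector>

// Hypothetical fused kernel, as sketched earlier.
__global__ void ValueAndDerivativeStep(float* mu, int m, float stepSize);

std::vector<float> Optimise(std::vector<float> mu, int iterations, float stepSize)
{
  const int m = static_cast<int>(mu.size());
  float* d_mu = nullptr;
  cudaMalloc(&d_mu, m * sizeof(float));

  // One upload before the loop instead of one per iteration.
  cudaMemcpy(d_mu, mu.data(), m * sizeof(float), cudaMemcpyHostToDevice);

  for (int k = 0; k < iterations; ++k)
  {
    // Steps 1, 2, 4 and 5 of the measured schemes disappear from the loop.
    ValueAndDerivativeStep<<<(m + 255) / 256, 256>>>(d_mu, m, stepSize);
  }

  // One download after the loop.
  cudaMemcpy(mu.data(), d_mu, m * sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_mu);
  return mu;
}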
Cheers,
Marius