[Rtk-users] rtkfdk: CudaFDKWeightProjectionFilter runs twice

Simon Rit simon.rit at creatis.insa-lyon.fr
Sat Jun 6 09:45:47 EDT 2015


Hi Chao,
Thanks for the detailed report. It was a tough bug but I think I nailed it:
https://github.com/SimonRit/RTK/commit/9ff302899f247c10b8ab198df7b2d1aac6bbcad8
This occured only in the displaced detector situation I think. Let me know
if the fix does work for you. Thanks again,
Simon

On Tue, Jun 2, 2015 at 11:08 PM, Chao Wu <wuchao04 at gmail.com> wrote:

> Hi Simon,
>
> I found how to reproduce it.
> Instead of using proj.mha, if you use separate projection files (e.g.
> proj_0.tif ~ proj_7.tif), and add the -l flag you will see the problem. The
> --subsetsize is needed to show the issue (not necessarily =1 but should let
> the for loop in the FDK filter run several times). The issue is not in the
> first time the for loop runs, but only in sequential ones.
>
> Here's my output using separate tif files as projections: without -l it
> runs 8 times, with -l it runs 15 times = 1 + 7*2 (first time in the loop is
> normal, next ones weight filter run twice).
>
> D:\Chao\RTK>rtkfdk -p . -r proj_ -o fdk.mha -g g --hardware cuda
> --subsetsize 1
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
>
> D:\Chao\RTK>rtkfdk -p . -r proj_ -o fdk.mha -g g --hardware cuda
> --subsetsize 1 -l
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
> CudaFDKWeightProjectionFilter::GPUGenerateData()
>
> When I print out the GRAM activities by #define VERBOSE in
> itkCudaDataManager.h and add some other debug info in the code (before and
> after filter->Update(), and at the beginning and end of GPUGenerateData()
> of the filters) I have the following log showing the problem with -l flag
> on (I show below the log of the first 4 projections and hope it is still
> readable...)
> When I focus on the CudaDataManager that updates (copies) the projection
> data from CPU to GPU for the weight filter, I actually found two:
> 00000000036B7CC0 and 00000000036B80C0. The latter one only exists for
> projections other than the first projection and runs during the first
> weight filter update, and the former one exists for all projections and is
> the one running during the only weight filter update for the first
> projection and the one taking action when the weight filter reruns during
> the ramp filter update for the remaining projections.
> (Note that I have a small modification in the GRAM message: when a gpu
> buffer is freed for a new allocation, I log it as "Deallocate" instead of
> "Freed" to discriminate it from a pure free operation.)
>
> ===Projection 0:
>
> >>>>>>>> Start Weight Update
> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
> 0000002304620000->0000000009700040 : 4805568
> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
> 000000000384B3F0::Allocate Create GPU buffer of size 4805568 Bytes :
> 0000002304AC0000
> 00000000036B7CC0::UpdateGPUBuffer CPU->GPU data copy
> 0000000009BA0040->0000002304AC0000 : 4805568
> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
> End Weight Update <<<<<<<<
>
> >>>>>>>> Start Ramp Update
> --------VVVVVVVV CudaFFTConvolutionImageFilter GenerateData--------
> 000000000391D390::Allocate Create GPU buffer of size 9704448 Bytes :
> 0000002304F60000
> 000000000391C800::Allocate Create GPU buffer of size 31112 Bytes :
> 00000023058C0000
> 00000000036B85C0::UpdateGPUBuffer CPU->GPU data copy
> 00000000039A5660->00000023058C0000 : 31112
> 00000000038B3A40::Allocate Create GPU buffer of size 846660 Bytes :
> 00000023059C0000
> --------^^^^^^^^ CudaFFTConvolutionImageFilter GenerateData--------
> 000000000391C800::Freed GPU buffer of size 31112 Bytes : 00000023058C0000
> 000000000391D390::Freed GPU buffer of size 9704448 Bytes : 0000002304F60000
> End Ramp Update <<<<<<<<
>
> >>>>>>>> Start BP Update
> --------VVVVVVVV CudaFDKBackProjectionImageFilter GenerateData--------
> 0000000003832A80::Allocate Create GPU buffer of size 4290772992 Bytes :
> 0000002306420000
> 00000000036B79C0::UpdateGPUBuffer CPU->GPU data copy
> 00000029FFC10040->0000002306420000 : 4290772992
> --------^^^^^^^^ CudaFDKBackProjectionImageFilter GenerateData--------
> End BP Update <<<<<<<<
>
> ===Projection 1:
>
> >>>>>>>> Start Weight Update
> 000000000380B570::Deallocate GPU buffer of size 2402784 Bytes :
> 0000002303F20000
> 000000000380B570::Allocate Create GPU buffer of size 2418336 Bytes :
> 0000002303F20000
> 00000000036B75C0::UpdateGPUBuffer CPU->GPU data copy
> 000000000ABF0040->0000002303F20000 : 2418336
> 0000000003832530::Deallocate GPU buffer of size 4805568 Bytes :
> 0000002304180000
> 0000000003832530::Allocate Create GPU buffer of size 4836672 Bytes :
> 0000002304180000
> 0000000003832800::Deallocate GPU buffer of size 4805568 Bytes :
> 0000002304620000
> 0000000003832800::Allocate Create GPU buffer of size 4836672 Bytes :
> 0000002304620000
> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
> 0000002304620000->0000000009010040 : 4836672
> 000000000384B3F0::Freed GPU buffer of size 4805568 Bytes : 0000002304AC0000
> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
> 000000000384B170::Allocate Create GPU buffer of size 4836672 Bytes :
> 0000002305AC0000
> 00000000036B80C0::UpdateGPUBuffer CPU->GPU data copy
> 00000000094B0040->0000002305AC0000 : 4836672
> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
> End Weight Update <<<<<<<<
>
> >>>>>>>> Start Ramp Update
> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
> 0000002304620000->0000000009010040 : 4836672
> 000000000384B170::Freed GPU buffer of size 4836672 Bytes : 0000002305AC0000
> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
> 000000000391D110::Allocate Create GPU buffer of size 4836672 Bytes :
> 0000002305AC0000
> 00000000036B7CC0::UpdateGPUBuffer CPU->GPU data copy
> 0000000009950040->0000002305AC0000 : 4836672
> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
> --------VVVVVVVV CudaFFTConvolutionImageFilter GenerateData--------
> 00000000039BFED0::Allocate Create GPU buffer of size 9704448 Bytes :
> 0000002304AC0000
> 00000000039C01A0::Allocate Create GPU buffer of size 31112 Bytes :
> 0000002305F60000
> 00000000036B81C0::UpdateGPUBuffer CPU->GPU data copy
> 00000000039B8470->0000002305F60000 : 31112
> 00000000039C02E0::Allocate Create GPU buffer of size 1097208 Bytes :
> 0000002306060000
> 00000000038B3A40::Freed GPU buffer of size 846660 Bytes : 00000023059C0000
> --------^^^^^^^^ CudaFFTConvolutionImageFilter GenerateData--------
> 00000000039C01A0::Freed GPU buffer of size 31112 Bytes : 0000002305F60000
> 00000000039BFED0::Freed GPU buffer of size 9704448 Bytes : 0000002304AC0000
> End Ramp Update <<<<<<<<
>
> >>>>>>>> Start BP Update
> --------VVVVVVVV CudaFDKBackProjectionImageFilter GenerateData--------
> --------^^^^^^^^ CudaFDKBackProjectionImageFilter GenerateData--------
> End BP Update <<<<<<<<
>
> ===Projection 2:
>
> >>>>>>>> Start Weight Update
> 00000000036B75C0::UpdateGPUBuffer CPU->GPU data copy
> 000000000ABF0040->0000002303F20000 : 2418336
> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
> 0000002304620000->0000000009010040 : 4836672
> 000000000391D110::Freed GPU buffer of size 4836672 Bytes : 0000002305AC0000
> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
> 000000000391CBC0::Allocate Create GPU buffer of size 4836672 Bytes :
> 0000002304AC0000
> 00000000036B80C0::UpdateGPUBuffer CPU->GPU data copy
> 000000000AF50040->0000002304AC0000 : 4836672
> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
> End Weight Update <<<<<<<<
>
> >>>>>>>> Start Ramp Update
> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
> 0000002304620000->0000000009010040 : 4836672
> 000000000391CBC0::Freed GPU buffer of size 4836672 Bytes : 0000002304AC0000
> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
> 000000000391D020::Allocate Create GPU buffer of size 4836672 Bytes :
> 0000002304AC0000
> 00000000036B7CC0::UpdateGPUBuffer CPU->GPU data copy
> 00000000098E0040->0000002304AC0000 : 4836672
> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
> --------VVVVVVVV CudaFFTConvolutionImageFilter GenerateData--------
> 00000000039C0420::Allocate Create GPU buffer of size 9704448 Bytes :
> 0000002304F60000
> 00000000039C0100::Allocate Create GPU buffer of size 31112 Bytes :
> 0000002306180000
> 00000000036B81C0::UpdateGPUBuffer CPU->GPU data copy
> 000000000A408080->0000002306180000 : 31112
> 00000000039C0060::Allocate Create GPU buffer of size 1183044 Bytes :
> 00000023058C0000
> 00000000039C02E0::Freed GPU buffer of size 1097208 Bytes : 0000002306060000
> --------^^^^^^^^ CudaFFTConvolutionImageFilter GenerateData--------
> 00000000039C0100::Freed GPU buffer of size 31112 Bytes : 0000002306180000
> 00000000039C0420::Freed GPU buffer of size 9704448 Bytes : 0000002304F60000
> End Ramp Update <<<<<<<<
>
> >>>>>>>> Start BP Update
> --------VVVVVVVV CudaFDKBackProjectionImageFilter GenerateData--------
> --------^^^^^^^^ CudaFDKBackProjectionImageFilter GenerateData--------
> End BP Update <<<<<<<<
>
> ===Projection 3:
>
> >>>>>>>> Start Weight Update
> 00000000036B75C0::UpdateGPUBuffer CPU->GPU data copy
> 000000000ABF0040->0000002303F20000 : 2418336
> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
> 0000002304620000->0000000009010040 : 4836672
> 000000000391D020::Freed GPU buffer of size 4836672 Bytes : 0000002304AC0000
> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
> 000000000391D110::Allocate Create GPU buffer of size 4836672 Bytes :
> 0000002305A00000
> 00000000036B80C0::UpdateGPUBuffer CPU->GPU data copy
> 0000000013400040->0000002305A00000 : 4836672
> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
> End Weight Update <<<<<<<<
>
> >>>>>>>> Start Ramp Update
> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
> 0000002304620000->0000000009010040 : 4836672
> 000000000391D110::Freed GPU buffer of size 4836672 Bytes : 0000002305A00000
> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
> 00000000039C0420::Allocate Create GPU buffer of size 4836672 Bytes :
> 0000002305A00000
> 00000000036B7CC0::UpdateGPUBuffer CPU->GPU data copy
> 00000000098E0040->0000002305A00000 : 4836672
> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
> --------VVVVVVVV CudaFFTConvolutionImageFilter GenerateData--------
> 00000000039C0380::Allocate Create GPU buffer of size 9704448 Bytes :
> 0000002304AC0000
> 00000000039C0290::Allocate Create GPU buffer of size 31112 Bytes :
> 0000002305EA0000
> 00000000036B81C0::UpdateGPUBuffer CPU->GPU data copy
> 000000000A414080->0000002305EA0000 : 31112
> 00000000039C0560::Allocate Create GPU buffer of size 1081036 Bytes :
> 0000002305FA0000
> 00000000039C0060::Freed GPU buffer of size 1183044 Bytes : 00000023058C0000
> --------^^^^^^^^ CudaFFTConvolutionImageFilter GenerateData--------
> 00000000039C0290::Freed GPU buffer of size 31112 Bytes : 0000002305EA0000
> 00000000039C0380::Freed GPU buffer of size 9704448 Bytes : 0000002304AC0000
> End Ramp Update <<<<<<<<
>
> >>>>>>>> Start BP Update
> --------VVVVVVVV CudaFDKBackProjectionImageFilter GenerateData--------
> --------^^^^^^^^ CudaFDKBackProjectionImageFilter GenerateData--------
> End BP Update <<<<<<<<
>
>
> Regards,
> Chao
>
>
> 2015-06-02 17:43 GMT+02:00 Simon Rit <simon.rit at creatis.insa-lyon.fr>:
>
>> Hi Chao,
>> It looks bad but I couldn't reproduce the problem. What I did is add
>> messages in the GPUGenerateData and launch a very simple sequence of
>> command lines:
>>   rtksimulatedgeometry -n 8 -o g
>>   rtkprojectshepploganphantom -g g -o proj.mha
>>   rtkfdk -p . -r proj.mha -o fdk.mha -g g --hardware cuda
>> in which case I go only once in each GPUGenerateData. Can you give us
>> a command line example where you can see the problem?
>> Thanks,
>> Simon
>>
>> On Mon, Jun 1, 2015 at 6:53 PM, Chao Wu <wuchao04 at gmail.com> wrote:
>> > Hi all,
>> >
>> > When testing CUDA-based FDK I found the CudaFDKWeightProjectionFilter
>> seems
>> > to rerun unnecessarily. Details below:
>> >
>> > In FDKConeBeamReconstructionFilter<TInputImage, TOutputImage,
>> > TFFTPrecision>::GenerateData() the three sub-filters run separately for
>> > timing:
>> >
>> >     m_PreFilterProbe.Start();
>> >     m_WeightFilter->Update();
>> >     m_PreFilterProbe.Stop();
>> >
>> >     m_FilterProbe.Start();
>> >   m_RampFilter->Update();
>> > m_FilterProbe.Stop();
>> >
>> >     m_BackProjectionProbe.Start();
>> >     m_BackProjectionFilter->Update();
>> >     m_BackProjectionProbe.Stop();
>> >
>> > However with some debug procedure I found when executing
>> > m_RampFilter->Update() the m_WeightFilter is updated again. Maybe
>> there's
>> > something wrong with filter modified time or flags of CPU/GPU buffer?
>> >
>> > Furthermore, if I remove m_WeightFilter->Update() then both
>> m_WeightFilter
>> > and m_RampFilter will rerun during update of m_BackProjectionFilter.
>> Only if
>> > I remove both m_WeightFilter->Update() and m_RampFilter->Update() and
>> let
>> > m_BackProjectionFilter execute the whole minipipeline, all filters will
>> run
>> > nicely only once for each projection subset.
>> >
>> > The results are identical for all cases I tested.
>> >
>> > I did not check whether the same issue applies to the CPU or OpenCL
>> version.
>> >
>> > Regards,
>> > Chao
>> >
>> > _______________________________________________
>> > Rtk-users mailing list
>> > Rtk-users at public.kitware.com
>> > http://public.kitware.com/mailman/listinfo/rtk-users
>> >
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://public.kitware.com/pipermail/rtk-users/attachments/20150606/0096e982/attachment-0002.html>


More information about the Rtk-users mailing list