[Rtk-users] rtkfdk: CudaFDKWeightProjectionFilter runs twice
Chao Wu
wuchao04 at gmail.com
Tue Jun 9 04:27:10 EDT 2015
Thanks Simon, so far the program runs without any problem.
Regards, Chao
2015-06-06 15:45 GMT+02:00 Simon Rit <simon.rit at creatis.insa-lyon.fr>:
> Hi Chao,
> Thanks for the detailed report. It was a tough bug but I think I nailed it:
>
> https://github.com/SimonRit/RTK/commit/9ff302899f247c10b8ab198df7b2d1aac6bbcad8
> This occured only in the displaced detector situation I think. Let me know
> if the fix does work for you. Thanks again,
> Simon
>
> On Tue, Jun 2, 2015 at 11:08 PM, Chao Wu <wuchao04 at gmail.com> wrote:
>
>> Hi Simon,
>>
>> I found how to reproduce it.
>> Instead of using proj.mha, if you use separate projection files (e.g.
>> proj_0.tif ~ proj_7.tif), and add the -l flag you will see the problem. The
>> --subsetsize is needed to show the issue (not necessarily =1 but should let
>> the for loop in the FDK filter run several times). The issue is not in the
>> first time the for loop runs, but only in sequential ones.
>>
>> Here's my output using separate tif files as projections: without -l it
>> runs 8 times, with -l it runs 15 times = 1 + 7*2 (first time in the loop is
>> normal, next ones weight filter run twice).
>>
>> D:\Chao\RTK>rtkfdk -p . -r proj_ -o fdk.mha -g g --hardware cuda
>> --subsetsize 1
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>>
>> D:\Chao\RTK>rtkfdk -p . -r proj_ -o fdk.mha -g g --hardware cuda
>> --subsetsize 1 -l
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>> CudaFDKWeightProjectionFilter::GPUGenerateData()
>>
>> When I print out the GRAM activities by #define VERBOSE in
>> itkCudaDataManager.h and add some other debug info in the code (before and
>> after filter->Update(), and at the beginning and end of GPUGenerateData()
>> of the filters) I have the following log showing the problem with -l flag
>> on (I show below the log of the first 4 projections and hope it is still
>> readable...)
>> When I focus on the CudaDataManager that updates (copies) the projection
>> data from CPU to GPU for the weight filter, I actually found two:
>> 00000000036B7CC0 and 00000000036B80C0. The latter one only exists for
>> projections other than the first projection and runs during the first
>> weight filter update, and the former one exists for all projections and is
>> the one running during the only weight filter update for the first
>> projection and the one taking action when the weight filter reruns during
>> the ramp filter update for the remaining projections.
>> (Note that I have a small modification in the GRAM message: when a gpu
>> buffer is freed for a new allocation, I log it as "Deallocate" instead of
>> "Freed" to discriminate it from a pure free operation.)
>>
>> ===Projection 0:
>>
>> >>>>>>>> Start Weight Update
>> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
>> 0000002304620000->0000000009700040 : 4805568
>> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
>> 000000000384B3F0::Allocate Create GPU buffer of size 4805568 Bytes :
>> 0000002304AC0000
>> 00000000036B7CC0::UpdateGPUBuffer CPU->GPU data copy
>> 0000000009BA0040->0000002304AC0000 : 4805568
>> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
>> End Weight Update <<<<<<<<
>>
>> >>>>>>>> Start Ramp Update
>> --------VVVVVVVV CudaFFTConvolutionImageFilter GenerateData--------
>> 000000000391D390::Allocate Create GPU buffer of size 9704448 Bytes :
>> 0000002304F60000
>> 000000000391C800::Allocate Create GPU buffer of size 31112 Bytes :
>> 00000023058C0000
>> 00000000036B85C0::UpdateGPUBuffer CPU->GPU data copy
>> 00000000039A5660->00000023058C0000 : 31112
>> 00000000038B3A40::Allocate Create GPU buffer of size 846660 Bytes :
>> 00000023059C0000
>> --------^^^^^^^^ CudaFFTConvolutionImageFilter GenerateData--------
>> 000000000391C800::Freed GPU buffer of size 31112 Bytes : 00000023058C0000
>> 000000000391D390::Freed GPU buffer of size 9704448 Bytes :
>> 0000002304F60000
>> End Ramp Update <<<<<<<<
>>
>> >>>>>>>> Start BP Update
>> --------VVVVVVVV CudaFDKBackProjectionImageFilter GenerateData--------
>> 0000000003832A80::Allocate Create GPU buffer of size 4290772992 Bytes :
>> 0000002306420000
>> 00000000036B79C0::UpdateGPUBuffer CPU->GPU data copy
>> 00000029FFC10040->0000002306420000 : 4290772992
>> --------^^^^^^^^ CudaFDKBackProjectionImageFilter GenerateData--------
>> End BP Update <<<<<<<<
>>
>> ===Projection 1:
>>
>> >>>>>>>> Start Weight Update
>> 000000000380B570::Deallocate GPU buffer of size 2402784 Bytes :
>> 0000002303F20000
>> 000000000380B570::Allocate Create GPU buffer of size 2418336 Bytes :
>> 0000002303F20000
>> 00000000036B75C0::UpdateGPUBuffer CPU->GPU data copy
>> 000000000ABF0040->0000002303F20000 : 2418336
>> 0000000003832530::Deallocate GPU buffer of size 4805568 Bytes :
>> 0000002304180000
>> 0000000003832530::Allocate Create GPU buffer of size 4836672 Bytes :
>> 0000002304180000
>> 0000000003832800::Deallocate GPU buffer of size 4805568 Bytes :
>> 0000002304620000
>> 0000000003832800::Allocate Create GPU buffer of size 4836672 Bytes :
>> 0000002304620000
>> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
>> 0000002304620000->0000000009010040 : 4836672
>> 000000000384B3F0::Freed GPU buffer of size 4805568 Bytes :
>> 0000002304AC0000
>> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
>> 000000000384B170::Allocate Create GPU buffer of size 4836672 Bytes :
>> 0000002305AC0000
>> 00000000036B80C0::UpdateGPUBuffer CPU->GPU data copy
>> 00000000094B0040->0000002305AC0000 : 4836672
>> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
>> End Weight Update <<<<<<<<
>>
>> >>>>>>>> Start Ramp Update
>> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
>> 0000002304620000->0000000009010040 : 4836672
>> 000000000384B170::Freed GPU buffer of size 4836672 Bytes :
>> 0000002305AC0000
>> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
>> 000000000391D110::Allocate Create GPU buffer of size 4836672 Bytes :
>> 0000002305AC0000
>> 00000000036B7CC0::UpdateGPUBuffer CPU->GPU data copy
>> 0000000009950040->0000002305AC0000 : 4836672
>> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
>> --------VVVVVVVV CudaFFTConvolutionImageFilter GenerateData--------
>> 00000000039BFED0::Allocate Create GPU buffer of size 9704448 Bytes :
>> 0000002304AC0000
>> 00000000039C01A0::Allocate Create GPU buffer of size 31112 Bytes :
>> 0000002305F60000
>> 00000000036B81C0::UpdateGPUBuffer CPU->GPU data copy
>> 00000000039B8470->0000002305F60000 : 31112
>> 00000000039C02E0::Allocate Create GPU buffer of size 1097208 Bytes :
>> 0000002306060000
>> 00000000038B3A40::Freed GPU buffer of size 846660 Bytes : 00000023059C0000
>> --------^^^^^^^^ CudaFFTConvolutionImageFilter GenerateData--------
>> 00000000039C01A0::Freed GPU buffer of size 31112 Bytes : 0000002305F60000
>> 00000000039BFED0::Freed GPU buffer of size 9704448 Bytes :
>> 0000002304AC0000
>> End Ramp Update <<<<<<<<
>>
>> >>>>>>>> Start BP Update
>> --------VVVVVVVV CudaFDKBackProjectionImageFilter GenerateData--------
>> --------^^^^^^^^ CudaFDKBackProjectionImageFilter GenerateData--------
>> End BP Update <<<<<<<<
>>
>> ===Projection 2:
>>
>> >>>>>>>> Start Weight Update
>> 00000000036B75C0::UpdateGPUBuffer CPU->GPU data copy
>> 000000000ABF0040->0000002303F20000 : 2418336
>> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
>> 0000002304620000->0000000009010040 : 4836672
>> 000000000391D110::Freed GPU buffer of size 4836672 Bytes :
>> 0000002305AC0000
>> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
>> 000000000391CBC0::Allocate Create GPU buffer of size 4836672 Bytes :
>> 0000002304AC0000
>> 00000000036B80C0::UpdateGPUBuffer CPU->GPU data copy
>> 000000000AF50040->0000002304AC0000 : 4836672
>> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
>> End Weight Update <<<<<<<<
>>
>> >>>>>>>> Start Ramp Update
>> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
>> 0000002304620000->0000000009010040 : 4836672
>> 000000000391CBC0::Freed GPU buffer of size 4836672 Bytes :
>> 0000002304AC0000
>> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
>> 000000000391D020::Allocate Create GPU buffer of size 4836672 Bytes :
>> 0000002304AC0000
>> 00000000036B7CC0::UpdateGPUBuffer CPU->GPU data copy
>> 00000000098E0040->0000002304AC0000 : 4836672
>> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
>> --------VVVVVVVV CudaFFTConvolutionImageFilter GenerateData--------
>> 00000000039C0420::Allocate Create GPU buffer of size 9704448 Bytes :
>> 0000002304F60000
>> 00000000039C0100::Allocate Create GPU buffer of size 31112 Bytes :
>> 0000002306180000
>> 00000000036B81C0::UpdateGPUBuffer CPU->GPU data copy
>> 000000000A408080->0000002306180000 : 31112
>> 00000000039C0060::Allocate Create GPU buffer of size 1183044 Bytes :
>> 00000023058C0000
>> 00000000039C02E0::Freed GPU buffer of size 1097208 Bytes :
>> 0000002306060000
>> --------^^^^^^^^ CudaFFTConvolutionImageFilter GenerateData--------
>> 00000000039C0100::Freed GPU buffer of size 31112 Bytes : 0000002306180000
>> 00000000039C0420::Freed GPU buffer of size 9704448 Bytes :
>> 0000002304F60000
>> End Ramp Update <<<<<<<<
>>
>> >>>>>>>> Start BP Update
>> --------VVVVVVVV CudaFDKBackProjectionImageFilter GenerateData--------
>> --------^^^^^^^^ CudaFDKBackProjectionImageFilter GenerateData--------
>> End BP Update <<<<<<<<
>>
>> ===Projection 3:
>>
>> >>>>>>>> Start Weight Update
>> 00000000036B75C0::UpdateGPUBuffer CPU->GPU data copy
>> 000000000ABF0040->0000002303F20000 : 2418336
>> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
>> 0000002304620000->0000000009010040 : 4836672
>> 000000000391D020::Freed GPU buffer of size 4836672 Bytes :
>> 0000002304AC0000
>> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
>> 000000000391D110::Allocate Create GPU buffer of size 4836672 Bytes :
>> 0000002305A00000
>> 00000000036B80C0::UpdateGPUBuffer CPU->GPU data copy
>> 0000000013400040->0000002305A00000 : 4836672
>> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
>> End Weight Update <<<<<<<<
>>
>> >>>>>>>> Start Ramp Update
>> 00000000036B78C0::UpdateCPUBuffer GPU->CPU data copy
>> 0000002304620000->0000000009010040 : 4836672
>> 000000000391D110::Freed GPU buffer of size 4836672 Bytes :
>> 0000002305A00000
>> --------VVVVVVVV CudaFDKWeightProjectionFilter GenerateData--------
>> 00000000039C0420::Allocate Create GPU buffer of size 4836672 Bytes :
>> 0000002305A00000
>> 00000000036B7CC0::UpdateGPUBuffer CPU->GPU data copy
>> 00000000098E0040->0000002305A00000 : 4836672
>> --------^^^^^^^^ CudaFDKWeightProjectionFilter GenerateData--------
>> --------VVVVVVVV CudaFFTConvolutionImageFilter GenerateData--------
>> 00000000039C0380::Allocate Create GPU buffer of size 9704448 Bytes :
>> 0000002304AC0000
>> 00000000039C0290::Allocate Create GPU buffer of size 31112 Bytes :
>> 0000002305EA0000
>> 00000000036B81C0::UpdateGPUBuffer CPU->GPU data copy
>> 000000000A414080->0000002305EA0000 : 31112
>> 00000000039C0560::Allocate Create GPU buffer of size 1081036 Bytes :
>> 0000002305FA0000
>> 00000000039C0060::Freed GPU buffer of size 1183044 Bytes :
>> 00000023058C0000
>> --------^^^^^^^^ CudaFFTConvolutionImageFilter GenerateData--------
>> 00000000039C0290::Freed GPU buffer of size 31112 Bytes : 0000002305EA0000
>> 00000000039C0380::Freed GPU buffer of size 9704448 Bytes :
>> 0000002304AC0000
>> End Ramp Update <<<<<<<<
>>
>> >>>>>>>> Start BP Update
>> --------VVVVVVVV CudaFDKBackProjectionImageFilter GenerateData--------
>> --------^^^^^^^^ CudaFDKBackProjectionImageFilter GenerateData--------
>> End BP Update <<<<<<<<
>>
>>
>> Regards,
>> Chao
>>
>>
>> 2015-06-02 17:43 GMT+02:00 Simon Rit <simon.rit at creatis.insa-lyon.fr>:
>>
>>> Hi Chao,
>>> It looks bad but I couldn't reproduce the problem. What I did is add
>>> messages in the GPUGenerateData and launch a very simple sequence of
>>> command lines:
>>> rtksimulatedgeometry -n 8 -o g
>>> rtkprojectshepploganphantom -g g -o proj.mha
>>> rtkfdk -p . -r proj.mha -o fdk.mha -g g --hardware cuda
>>> in which case I go only once in each GPUGenerateData. Can you give us
>>> a command line example where you can see the problem?
>>> Thanks,
>>> Simon
>>>
>>> On Mon, Jun 1, 2015 at 6:53 PM, Chao Wu <wuchao04 at gmail.com> wrote:
>>> > Hi all,
>>> >
>>> > When testing CUDA-based FDK I found the CudaFDKWeightProjectionFilter
>>> seems
>>> > to rerun unnecessarily. Details below:
>>> >
>>> > In FDKConeBeamReconstructionFilter<TInputImage, TOutputImage,
>>> > TFFTPrecision>::GenerateData() the three sub-filters run separately for
>>> > timing:
>>> >
>>> > m_PreFilterProbe.Start();
>>> > m_WeightFilter->Update();
>>> > m_PreFilterProbe.Stop();
>>> >
>>> > m_FilterProbe.Start();
>>> > m_RampFilter->Update();
>>> > m_FilterProbe.Stop();
>>> >
>>> > m_BackProjectionProbe.Start();
>>> > m_BackProjectionFilter->Update();
>>> > m_BackProjectionProbe.Stop();
>>> >
>>> > However with some debug procedure I found when executing
>>> > m_RampFilter->Update() the m_WeightFilter is updated again. Maybe
>>> there's
>>> > something wrong with filter modified time or flags of CPU/GPU buffer?
>>> >
>>> > Furthermore, if I remove m_WeightFilter->Update() then both
>>> m_WeightFilter
>>> > and m_RampFilter will rerun during update of m_BackProjectionFilter.
>>> Only if
>>> > I remove both m_WeightFilter->Update() and m_RampFilter->Update() and
>>> let
>>> > m_BackProjectionFilter execute the whole minipipeline, all filters
>>> will run
>>> > nicely only once for each projection subset.
>>> >
>>> > The results are identical for all cases I tested.
>>> >
>>> > I did not check whether the same issue applies to the CPU or OpenCL
>>> version.
>>> >
>>> > Regards,
>>> > Chao
>>> >
>>> > _______________________________________________
>>> > Rtk-users mailing list
>>> > Rtk-users at public.kitware.com
>>> > http://public.kitware.com/mailman/listinfo/rtk-users
>>> >
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://public.kitware.com/pipermail/rtk-users/attachments/20150609/45a3fe55/attachment-0002.html>
More information about the Rtk-users
mailing list