<div dir="ltr"><div>Hi,</div><div>I think the problem is that you're allocating the memory yourself. The data manager performs the cudaMalloc automatically when you request the buffer pointer (<span style="font-size:9.5pt;font-family:Consolas;color:black">GetCudaDataManager()</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetGPUBufferPointer()</span>), so you don't have to do it. Try removing the cudaMalloc calls.</div><div>Cheers,</div><div>Simon<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Mar 3, 2019 at 5:38 PM Kabelitz, Gordian <<a href="mailto:Gordian.Kabelitz@medma.uni-heidelberg.de">Gordian.Kabelitz@medma.uni-heidelberg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div class="gmail-m_4220338164844420415WordSection1">
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:rgb(31,73,125)">Hi Simon,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:rgb(31,73,125)">I made a workaround by adding an explicit copy from device to host after the kernel is called. The function now looks like this:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">CUDA_gradient</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[3],<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[3],<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">)<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">{<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dev_Size
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">make_int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2]);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dev_Spacing
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">make_float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2]);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:green">// Output volume</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputMemorySize =
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">sizeof</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue" lang="DE">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE"> * dev_grad_x;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue" lang="DE">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE"> * dev_grad_y;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> * dev_grad_z;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&dev_grad_x,
outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&dev_grad_y,
outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&dev_grad_z,
outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(dev_grad_x, 2.f, outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(dev_grad_y, 3.f, outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(dev_grad_z, 4.f, outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:green">// Thread Block Dimensions</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dimBlock
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3(</span><span style="font-size:9.5pt;font-family:Consolas;color:black">10, 10, 10</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">)</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInX =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInY =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInZ =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dimGrid
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3(</span><span style="font-size:9.5pt;font-family:Consolas;color:black">blocksInX, blocksInY, blocksInZ</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">)</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">gradient_kernel</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> <<< dimGrid, dimBlock >>> (dev_in, dev_grad_x, dev_grad_y, dev_grad_z, dev_Size, dev_Spacing);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">CUDA_CHECK_ERROR</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemcpy</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
dev_grad_x, outputMemorySize, </span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(49,79,79)">cudaMemcpyDeviceToHost</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">CUDA_CHECK_ERROR</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemcpy</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
dev_grad_y, outputMemorySize, </span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(49,79,79)">cudaMemcpyDeviceToHost</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">CUDA_CHECK_ERROR</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemcpy</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
dev_grad_z, outputMemorySize, </span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(49,79,79)">cudaMemcpyDeviceToHost</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">CUDA_CHECK_ERROR</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">}<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:black">I am not sure whether explicitly copying the memory (which had to be allocated beforehand) violates the intended behavior of ITKCudaCommon.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:black">I still cannot work out why the implicit memory copying mechanism does not work.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:black">Have you looked into this problem, or are you missing any information?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:black">With best regards,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:black">Gordian</span><span style="font-size:11pt;font-family:"Calibri","sans-serif";color:rgb(31,73,125)"><u></u><u></u></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><b><span style="font-size:10pt;font-family:"Tahoma","sans-serif"" lang="DE">From:</span></b><span style="font-size:10pt;font-family:"Tahoma","sans-serif"" lang="DE"> Simon Rit [mailto:<a href="mailto:simon.rit@creatis.insa-lyon.fr" target="_blank">simon.rit@creatis.insa-lyon.fr</a>]
<br>
<b>Sent:</b> Wednesday, February 27, 2019 21:58<br>
<b>To:</b> Kabelitz, Gordian<br>
<b>Cc:</b> <a href="mailto:rtk-users@public.kitware.com" target="_blank">rtk-users@public.kitware.com</a><br>
<b>Subject:</b> Re: [Rtk-users] GPU kernel do not change output variables<u></u><u></u></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal">Hi,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Sounds like a challenge. When you say you set fixed numbers, did you check that you reach the point where you set this number? You can use cuprintf to check what's going on in the kernel.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">One thing wrong I noticed: you use size.Fill in a loop, which is a bit odd because it will Fill the size with the last value of the loop.<u></u><u></u></p>
</div>
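<div>
<p class="MsoNormal">To illustrate the Fill remark above, here is a minimal sketch. It uses std::array::fill as a stand-in for itk::Size::Fill, since both set every component to the same value; the type alias Size3 and the two helper functions are hypothetical names made up for this example, not part of ITK or RTK.</p>

```cpp
#include <array>
#include <cstddef>

// Stand-in for itk::Size<3> (an assumption for illustration only).
using Size3 = std::array<std::size_t, 3>;

// fill() inside the loop: every pass overwrites all three components,
// so only the value from the last iteration survives.
Size3 fillInLoop(const Size3 & source)
{
  Size3 size{};
  for (std::size_t i = 0; i < 3; ++i)
  {
    size.fill(source[i]);
  }
  return size;
}

// Per-component assignment keeps each dimension's own value.
Size3 assignPerComponent(const Size3 & source)
{
  Size3 size{};
  for (std::size_t i = 0; i < 3; ++i)
  {
    size[i] = source[i];
  }
  return size;
}
```

<p class="MsoNormal">With a hypothetical source size of 256 x 128 x 64, fillInLoop would yield 64 x 64 x 64, while assignPerComponent would keep 256 x 128 x 64. In ITK itself the loop is probably unnecessary, since the size returned by GetSize() can be copied in a single assignment.</p>
</div>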
<div>
<p class="MsoNormal">I hope this helps,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Simon<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal">On Wed, Feb 27, 2019 at 9:39 PM Kabelitz, Gordian <<a href="mailto:Gordian.Kabelitz@medma.uni-heidelberg.de" target="_blank">Gordian.Kabelitz@medma.uni-heidelberg.de</a>> wrote:<u></u><u></u></p>
</div>
<blockquote style="border-color:currentcolor currentcolor currentcolor rgb(204,204,204);border-style:none none none solid;border-width:medium medium medium 1pt;padding:0cm 0cm 0cm 6pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<p class="MsoNormal" style="margin-bottom:3.75pt;line-height:12pt">
Hi rtk-users,<u></u><u></u></p>
<p class="MsoNormal" style="margin-bottom:3.75pt;line-height:12pt">
I am facing an oddity which I cannot explain.<u></u><u></u></p>
<p class="MsoNormal" style="margin-bottom:3.75pt;line-height:12pt">
I want to implement a new gradient filter. The input is a CudaImage<float,3> and the output should be a CudaImage<CovariantVector<float,3>,3>. The filter runs without any CUDA errors, but the output (pout_(xyz)) has not changed at all. The kernel function
is reached, and the printout from there seems to be okay. I tried to explicitly copy the content of the GPUBuffer into the CPUBuffer, still without success. Even if I set fixed numbers in the kernel for the output image, nothing changes. I use CUDA 9.0, Visual
Studio 2015, ITK 5.0, RTK 2.0 as a remote module, CMake 3.13, and Windows 7 64-bit. The relevant code snippets are below.<u></u><u></u></p>
<p class="MsoNormal" style="margin-bottom:3.75pt;line-height:12pt">
Am I missing something obvious? Any recommendations are welcome.<u></u><u></u></p>
<p class="MsoNormal" style="margin-bottom:3.75pt;line-height:12pt">
With kind regards,<u></u><u></u></p>
<p class="MsoNormal" style="margin-bottom:3.75pt;line-height:12pt">
Gordian<u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal">The GPUGenerateData function:<u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">GPUGenerateData()</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> {</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> inputSize[3];</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputSize[3];</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> inputSpacing[3];</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputSpacing[3];</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">for</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
i = 0; i<3; i++)</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> {</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> inputSize[i] =
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetInput()->GetBufferedRegion().GetSize()[i];</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputSize[i] =
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetOutput()->GetBufferedRegion().GetSize()[i];</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> inputSpacing[i] =
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetInput()->GetSpacing()[i];</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputSpacing[i] =
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetOutput()->GetSpacing()[i];</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> ((inputSize[i] != outputSize[i]) || (inputSpacing[i] != outputSpacing[i]))</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> {</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> std::cerr
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan"><<</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">"The CUDA laplacian filter can only handle input and output regions of equal size and spacing"</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan"><<</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> std::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">endl</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">exit</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(1);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> }</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> }</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *pin = *(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetInput()->GetCudaDataManager()->GetGPUBufferPointer());</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> // This is a test area</span><u></u><u></u></p>
<p class="MsoNormal" style="margin-left:36pt;text-indent:36pt">
<span style="font-size:9.5pt;font-family:Consolas;color:blue">typename</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">InputImageType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkolivegreen">IndexType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
index;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> index.Fill(0);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">typename</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">InputImageType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkolivegreen">SizeType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
size;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">for</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
i = 0; i < 3; ++i)</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> size.Fill(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetInput()->GetLargestPossibleRegion().GetSize()[i]);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">typename</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">InputImageType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkolivegreen">RegionType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
region(index, size);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> // images for gradients</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue" lang="DE">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE"> grad_x
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan" lang="DE">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan" lang="DE">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue" lang="DE">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">,
3>::New();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE"> grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan" lang="DE">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">SetRegions(region);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">
</span><span style="font-size:9.5pt;font-family:Consolas;color:black">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">Allocate();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">FillBuffer(1);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_y
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
3>::New();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetRegions(region);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">Allocate();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_z
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
3>::New();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetRegions(region);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">Allocate();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *pout_x = *(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)(grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetCudaDataManager()</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetGPUBufferPointer());</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *pout_y = *(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)(grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetCudaDataManager()</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetGPUBufferPointer());</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *pout_z = *(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)(grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetCudaDataManager()</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetGPUBufferPointer());</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">CUDA_gradient</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(inputSize, inputSpacing, pin, pout_x, pout_y, pout_z); // after this line, none of the pout_(xyz)
images has changed.</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> // put the gradient images in a single covariant vector image</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> itk::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">ComposeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
3>, </span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CovariantVector</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
3>,3>>::New();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetInput1(grad_x);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetInput2(grad_y);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetInput3(grad_z);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">Update();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetOutput()->Graft(CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetOutput());</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> }</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="color:black">The CUDA kernel and its host wrapper function:</span><u></u><u></u></p>
<p class="MsoNormal"><span style="color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">__global__</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">gradient_kernel</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">CUDA_gradient</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[3],</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[3],</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">)</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">{</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dev_Size
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">make_int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2]);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dev_Spacing
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">make_float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2]);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:green">// Output volume</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputMemorySize =
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">sizeof</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
outputMemorySize);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
outputMemorySize);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
outputMemorySize);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
0, outputMemorySize);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
0, outputMemorySize);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
0, outputMemorySize);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">printf</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">"Device Variable Copying:\t</span><span style="font-size:9.5pt;font-family:Consolas;color:mediumseagreen">%s</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">\n"</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaGetErrorString</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaGetLastError</span><span style="font-size:9.5pt;font-family:Consolas;color:black">()));</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:green">// Thread Block Dimensions</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dimBlock
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3(</span><span style="font-size:9.5pt;font-family:Consolas;color:black">16, 4, 4</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">)</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInX =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInY =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInZ =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dimGrid
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3(</span><span style="font-size:9.5pt;font-family:Consolas;color:black">blocksInX, blocksInY, blocksInZ</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">)</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">gradient_kernel</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> <<< dimGrid, dimBlock >>> (dev_in, dev_out_x, dev_out_y, dev_out_z, dev_Size, dev_Spacing);</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaDeviceSynchronize</span><span style="font-size:9.5pt;font-family:Consolas;color:black">();</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">printf</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">"Kernel Execution:\t</span><span style="font-size:9.5pt;font-family:Consolas;color:mediumseagreen">%s</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">\n"</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaGetErrorString</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaGetLastError</span><span style="font-size:9.5pt;font-family:Consolas;color:black">()));</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">CUDA_CHECK_ERROR</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">}</span><u></u><u></u></p>
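<p class="MsoNormal"><span style="color:black">A possible reading of why the gradient images stay unchanged: dev_out_x, dev_out_y and dev_out_z are passed to CUDA_gradient by value, so the cudaMalloc calls only rebind the function's local copies and the kernel then writes into the fresh buffers, not the ones owned by the images. The host-only sketch below imitates that pattern in plain C++. The names fakeMalloc, writeAfterRealloc and writeInPlace are hypothetical stand-ins made up for this illustration, and std::malloc takes the place of cudaMalloc; it is not the RTK or CUDA code itself.</span></p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for cudaMalloc: stores a fresh allocation in *ptr.
static int fakeMalloc(void** ptr, std::size_t bytes)
{
  *ptr = std::malloc(bytes);
  return *ptr ? 0 : 1;
}

// Same shape as the wrapper above: `out` is a COPY of the caller's pointer.
// Returns the buffer that was actually written so the caller can free it.
static float* writeAfterRealloc(float* out, std::size_t n)
{
  // Rebinds only the local copy; the caller's pointer is untouched.
  if (fakeMalloc((void**)&out, n * sizeof(float)) != 0)
    return nullptr;
  for (std::size_t i = 0; i < n; ++i)
    out[i] = 42.0f; // lands in the new buffer, not in the caller's
  return out;
}

// Without the extra allocation, the writes reach the caller's buffer.
static void writeInPlace(float* out, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    out[i] = 42.0f;
}
```

<p class="MsoNormal"><span style="color:black">Calling writeAfterRealloc on a buffer filled with 1 leaves that buffer at 1, while writeInPlace overwrites it. In the sketch the extra buffer is handed back so it can be freed; in the wrapper above the cudaMalloc'ed buffers are never freed, so they would presumably also leak on every call.</span></p>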
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">__global__</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">gradient_kernel</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">)</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">{</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">unsigned</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> i = blockIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* blockDim.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> + threadIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">unsigned</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> j = blockIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* blockDim.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> + threadIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">unsigned</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> k = blockIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* blockDim.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> + threadIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">// Threads that fall outside the volume have nothing to compute.</span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (i >=
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
|| j >= </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
|| k >= </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">)</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">return</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">// Linear index of voxel (i, j, k) and of its +x, +y and +z neighbours; x varies fastest.</span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">// The neighbour indices are only dereferenced below when they lie inside the volume.</span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> id = (k *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ j) * </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ i;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> id_x = (k *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ j) * </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ i + 1;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> id_y = (k *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ j + 1)* </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ i;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> id_z = ((k + 1) *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ j) * </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ i;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">// Forward differences along x, y and z. The last voxel along an axis has no</span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">// forward neighbour, so that gradient component is set to 0.</span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (i == (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
- 1)) </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = 0;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">else</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id_x]
- </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id]) /
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (j == (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
- 1)) </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = 0;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">else</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id_y]
- </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id]) /
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (k == (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
- 1)) </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = 0;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">else</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id_z]
- </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id]) /
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">}</span><u></u><u></u></p>
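<p class="MsoNormal">A host-side reference can help confirm that the values copied back from the device are what the kernel is meant to produce. The sketch below is not part of RTK or of the code above. It assumes the same x-fastest memory layout and the same forward-difference scheme as gradient_kernel, and the names Size3, Spacing3 and gradient_reference are hypothetical, chosen only for illustration.</p>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for CUDA's int3 / float3 so the sketch builds without CUDA.
struct Size3 { int x, y, z; };
struct Spacing3 { float x, y, z; };

// CPU reference for the forward-difference gradient computed by gradient_kernel.
// Assumes x varies fastest, then y, then z, as in the kernel's index computation.
void gradient_reference(const std::vector<float>& in,
                        std::vector<float>& grad_x,
                        std::vector<float>& grad_y,
                        std::vector<float>& grad_z,
                        Size3 size, Spacing3 spacing)
{
  const std::size_t nx = size.x, ny = size.y, nz = size.z;

  // Start from zero so boundary voxels keep a zero gradient component.
  grad_x.assign(nx * ny * nz, 0.f);
  grad_y.assign(nx * ny * nz, 0.f);
  grad_z.assign(nx * ny * nz, 0.f);

  for (std::size_t k = 0; k < nz; ++k)
    for (std::size_t j = 0; j < ny; ++j)
      for (std::size_t i = 0; i < nx; ++i)
      {
        const std::size_t id = (k * ny + j) * nx + i;
        // Neighbour offsets: +1 along x, +nx along y, +nx*ny along z.
        if (i + 1 < nx) grad_x[id] = (in[id + 1] - in[id]) / spacing.x;
        if (j + 1 < ny) grad_y[id] = (in[id + nx] - in[id]) / spacing.y;
        if (k + 1 < nz) grad_z[id] = (in[id + nx * ny] - in[id]) / spacing.z;
      }
}
```

<p class="MsoNormal">Comparing its output voxel by voxel with the buffers copied back from the GPU, with a small floating-point tolerance, should show whether the device results actually reached the host.</p>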
</div>
</div>
<p class="MsoNormal">_______________________________________________<br>
Rtk-users mailing list<br>
<a href="mailto:Rtk-users@public.kitware.com" target="_blank">Rtk-users@public.kitware.com</a><br>
<a href="https://public.kitware.com/mailman/listinfo/rtk-users" target="_blank">https://public.kitware.com/mailman/listinfo/rtk-users</a></p>
</blockquote>
</div>
</div>
</div>
</blockquote></div>