<div dir="ltr"><div>Hi,</div><div>Sounds like a challenge. When you say you set fixed numbers, did you check that the kernel actually reaches the point where you set them? You can use cuprintf to check what's going on in the kernel.</div><div>One thing I noticed that looks wrong: you call size.Fill in a loop, which is odd because Fill sets every component of size, so size ends up filled with the last value of the loop.<br></div><div>I hope this helps,</div><div>Simon<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Feb 27, 2019 at 9:39 PM Kabelitz, Gordian <<a href="mailto:Gordian.Kabelitz@medma.uni-heidelberg.de">Gordian.Kabelitz@medma.uni-heidelberg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div class="gmail-m_2590231864960196398WordSection1">
<p class="MsoNormal" style="margin-right:0cm;margin-bottom:3.75pt;margin-left:0cm;line-height:12pt">
Hi rtk-users,<u></u><u></u></p>
<p class="MsoNormal" style="margin-right:0cm;margin-bottom:3.75pt;margin-left:0cm;line-height:12pt">
I am facing an oddity which I cannot explain.<u></u><u></u></p>
<p class="MsoNormal" style="margin-right:0cm;margin-bottom:3.75pt;margin-left:0cm;line-height:12pt">
I want to implement a new gradient filter. The input is a CudaImage<float,3> and the output should be a CudaImage<CovariantVector<float,3>,3>. The filter runs without any CUDA errors, but the output (pout_(xyz)) has not changed at all. The kernel function
is called and the printout from there seems to be okay. I tried to explicitly copy the content of the GPUBuffer into the CPUBuffer, still without success. Even when I write fixed numbers to the output image in the kernel, nothing changes. I use CUDA 9.0, Visual
Studio 2015, ITK 5.0, RTK 2.0 as a remote module, CMake 3.13, and Windows 7 64-bit. The relevant code snippets are below.<u></u><u></u></p>
<p class="MsoNormal" style="margin-right:0cm;margin-bottom:3.75pt;margin-left:0cm;line-height:12pt">
Am I missing something obvious? Any recommendations are welcome.<u></u><u></u></p>
<p class="MsoNormal" style="margin-right:0cm;margin-bottom:3.75pt;margin-left:0cm;line-height:12pt">
With kind regards,<u></u><u></u></p>
<p class="MsoNormal" style="margin-right:0cm;margin-bottom:3.75pt;margin-left:0cm;line-height:12pt">
Gordian<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">The GPUGenerateData function:<u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">GPUGenerateData()<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> {<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> inputSize[3];<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputSize[3];<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> inputSpacing[3];<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputSpacing[3];<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">for</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
i = 0; i<3; i++)<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> {<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> inputSize[i] =
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetInput()->GetBufferedRegion().GetSize()[i];<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputSize[i] =
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetOutput()->GetBufferedRegion().GetSize()[i];<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> inputSpacing[i] =
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetInput()->GetSpacing()[i];<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputSpacing[i] =
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetOutput()->GetSpacing()[i];<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> ((inputSize[i] != outputSize[i]) || (inputSpacing[i] != outputSpacing[i]))<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> {<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> std::cerr
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan"><<</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">"The CUDA gradient filter can only handle input and output regions of equal size and spacing"</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan"><<</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> std::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">endl</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">exit</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(1);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> }<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> }<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *pin = *(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetInput()->GetCudaDataManager()->GetGPUBufferPointer());<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> // This is a test area<u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt;text-indent:36pt">
<span style="font-size:9.5pt;font-family:Consolas;color:blue">typename</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">InputImageType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkolivegreen">IndexType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
index;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> index.Fill(0);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">typename</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">InputImageType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkolivegreen">SizeType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
size;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">for</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
i = 0; i < 3; ++i)<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> size.Fill(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetInput()->GetLargestPossibleRegion().GetSize()[i]);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">typename</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">InputImageType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkolivegreen">RegionType</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
region(index, size);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> // images for gradients<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue" lang="DE">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE"> grad_x
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan" lang="DE">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan" lang="DE">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue" lang="DE">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">,
3>::New();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE"> grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan" lang="DE">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">SetRegions(region);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black" lang="DE">
</span><span style="font-size:9.5pt;font-family:Consolas;color:black">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">Allocate();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">FillBuffer(1);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_y
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
3>::New();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetRegions(region);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">Allocate();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_z
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
3>::New();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetRegions(region);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">Allocate();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *pout_x = *(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)(grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetCudaDataManager()</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetGPUBufferPointer());<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *pout_y = *(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)(grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetCudaDataManager()</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetGPUBufferPointer());<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *pout_z = *(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)(grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetCudaDataManager()</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetGPUBufferPointer());<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">CUDA_gradient</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(inputSize, inputSpacing, pin, pout_x, pout_y, pout_z); // after this call, none of the pout_(xyz)
images has changed.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> // put the gradient images in a single covariant vector image<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">auto</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> itk::</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">ComposeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
3>, </span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CudaImage</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">CovariantVector</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
3>,3>>::New();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetInput1(grad_x);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetInput2(grad_y);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">SetInput3(grad_z);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">Update();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">this</span><span style="font-size:9.5pt;font-family:Consolas;color:black">->GetOutput()->Graft(CompositeImageFilter</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">-></span><span style="font-size:9.5pt;font-family:Consolas;color:black">GetOutput());<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"> }<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
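<p class="MsoNormal">A note on the size.Fill loop in the listing above: Fill sets all components at once, so after the loop every component equals the size of the last dimension. The sketch below contrasts that with per-component assignment. Size3, SizeViaFillInLoop, SizeViaIndexing and CheckSizeSketch are hypothetical names introduced only so that the example compiles without ITK; Size3 merely imitates the Fill and operator[] behaviour of itk::Size.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for itk::Size<3>, only so the sketch builds without ITK.
struct Size3
{
  std::array<std::size_t, 3> m{};

  // Like itk::Size::Fill: sets every component to the same value.
  void Fill(std::size_t v) { m.fill(v); }

  std::size_t & operator[](std::size_t i) { return m[i]; }
  std::size_t operator[](std::size_t i) const { return m[i]; }
};

// Pattern from the listing: Fill inside the loop, so the last dimension wins.
Size3 SizeViaFillInLoop(const Size3 & source)
{
  Size3 size;
  for (std::size_t i = 0; i < 3; ++i)
    size.Fill(source[i]);
  return size;
}

// Per-component assignment keeps each dimension distinct.
Size3 SizeViaIndexing(const Size3 & source)
{
  Size3 size;
  for (std::size_t i = 0; i < 3; ++i)
    size[i] = source[i];
  return size;
}

// Small self-check on a non-cubic volume (4 x 5 x 6).
void CheckSizeSketch()
{
  Size3 source;
  source[0] = 4;
  source[1] = 5;
  source[2] = 6;

  const Size3 filled = SizeViaFillInLoop(source);
  assert(filled[0] == 6 && filled[1] == 6 && filled[2] == 6);

  const Size3 indexed = SizeViaIndexing(source);
  assert(indexed[0] == 4 && indexed[1] == 5 && indexed[2] == 6);
}
```

<p class="MsoNormal">For a cubic volume both variants give the same result, which may be why the loop goes unnoticed. With ITK itself, taking the size directly from GetLargestPossibleRegion().GetSize() would avoid the loop altogether.</p>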
<p class="MsoNormal"><span style="color:black">The CUDA/kernel function:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">__global__</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">gradient_kernel</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
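<p class="MsoNormal">One thing that may be worth checking in the CUDA_gradient wrapper that follows: dev_out_x, dev_out_y and dev_out_z arrive by value, and their addresses are then passed to cudaMalloc. That overwrites the local copies with fresh allocations, so anything written through them afterwards would not land in the buffers owned by the CudaImage objects. The host-only sketch below illustrates the effect under that assumption. FakeMalloc is a hypothetical stand-in for cudaMalloc built on std::malloc, and WriteAfterRealloc, WriteInPlace and CheckPointerSketch are likewise illustrative names, not part of RTK or CUDA.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical host-only analogue of cudaMalloc: stores a fresh allocation in *ptr.
// Returns 0 on success and 1 on failure.
static int FakeMalloc(void ** ptr, std::size_t bytes)
{
  *ptr = std::malloc(bytes);
  return *ptr != nullptr ? 0 : 1;
}

// Mirrors the wrapper's pattern: 'out' is a copy of the caller's pointer,
// and FakeMalloc overwrites that copy with a new buffer.
void WriteAfterRealloc(float * out, std::size_t n)
{
  if (FakeMalloc((void **)&out, n * sizeof(float)) != 0)
    return;
  for (std::size_t i = 0; i < n; ++i)
    out[i] = 42.0f; // lands in the new allocation, not in the caller's buffer
  std::free(out);
}

// Writes through the pointer it was given, so the caller sees the result.
void WriteInPlace(float * out, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    out[i] = 42.0f;
}

// Small self-check showing which variant changes the caller's buffer.
void CheckPointerSketch()
{
  float buffer[4] = { 1.0f, 1.0f, 1.0f, 1.0f };

  WriteAfterRealloc(buffer, 4);
  assert(buffer[0] == 1.0f && buffer[3] == 1.0f); // unchanged

  WriteInPlace(buffer, 4);
  assert(buffer[0] == 42.0f && buffer[3] == 42.0f); // updated
}
```

<p class="MsoNormal">If this is what happens here, dropping the cudaMalloc calls for the output pointers, so that the kernel writes into the buffers obtained from GetGPUBufferPointer(), may be enough. This is a guess based on the visible part of the wrapper only.</p>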
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">CUDA_gradient</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[3],<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[3],<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">)<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">{<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dev_Size
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">make_int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2]);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dev_Spacing
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">make_float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2]);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:green">// Output volume</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> outputMemorySize =
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2] *
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">sizeof</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMalloc</span><span style="font-size:9.5pt;font-family:Consolas;color:black">((</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black">**)&</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
0, outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
0, outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaMemset</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">dev_out_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
0, outputMemorySize);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">printf</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">"Device Variable Copying:\t</span><span style="font-size:9.5pt;font-family:Consolas;color:mediumseagreen">%s</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">\n"</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaGetErrorString</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaGetLastError</span><span style="font-size:9.5pt;font-family:Consolas;color:black">()));<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:green">// Thread Block Dimensions</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dimBlock
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3(</span><span style="font-size:9.5pt;font-family:Consolas;color:black">16, 4, 4</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">)</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInX =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[0],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInY =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[1],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> blocksInZ =
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">iDivUp</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[2],
dimBlock.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> dimGrid
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">=</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">dim3(</span><span style="font-size:9.5pt;font-family:Consolas;color:black">blocksInX, blocksInY, blocksInZ</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">)</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">gradient_kernel</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> <<< dimGrid, dimBlock >>> (dev_in, dev_out_x, dev_out_y, dev_out_z, dev_Size, dev_Spacing);<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaDeviceSynchronize</span><span style="font-size:9.5pt;font-family:Consolas;color:black">();<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">printf</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">"Kernel Execution:\t</span><span style="font-size:9.5pt;font-family:Consolas;color:mediumseagreen">%s</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(163,21,21)">\n"</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaGetErrorString</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">cudaGetLastError</span><span style="font-size:9.5pt;font-family:Consolas;color:black">()));<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">CUDA_CHECK_ERROR</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">}<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)">__global__</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:blue">void</span><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:darkslateblue">gradient_kernel</span><span style="font-size:9.5pt;font-family:Consolas;color:black">(</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">float</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">int3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">,
</span><span style="font-size:9.5pt;font-family:Consolas;color:darkcyan">float3</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">)<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">{<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">unsigned</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> i = blockIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* blockDim.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> + threadIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">unsigned</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> j = blockIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* blockDim.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> + threadIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">unsigned</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> k = blockIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
* blockDim.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> + threadIdx.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (i >=
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
|| j >= </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
|| k >= </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">)<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">return</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> id = (k *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ j) * </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ i;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> id_x = (k *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ j) * </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ i + 1;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> id_y = (k *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ j + 1)* </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ i;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">long</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">int</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> id_z = ((k + 1) *
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ j) * </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
+ i;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (i == (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
- 1)) </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = 0;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">else</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id_x]
- </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id]) /
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">x</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (j == (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
- 1)) </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = 0;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">else</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id_y]
- </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id]) /
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">y</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">if</span><span style="font-size:9.5pt;font-family:Consolas;color:black"> (k == (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Size</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
- 1)) </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = 0;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:blue">else</span><span style="font-size:9.5pt;font-family:Consolas;color:black">
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">grad_z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id] = (</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id_z]
- </span><span style="font-size:9.5pt;font-family:Consolas;color:gray">in</span><span style="font-size:9.5pt;font-family:Consolas;color:black">[id]) /
</span><span style="font-size:9.5pt;font-family:Consolas;color:gray">c_Spacing</span><span style="font-size:9.5pt;font-family:Consolas;color:black">.</span><span style="font-size:9.5pt;font-family:Consolas;color:darkred">z</span><span style="font-size:9.5pt;font-family:Consolas;color:black">;<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:9.5pt;font-family:Consolas;color:black">}<u></u><u></u></span></p>
</div>
</div>
</blockquote></div>
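<div>One possible explanation for the unchanged output, offered as a guess because the quoted snippet does not show the host function's signature: if dev_out_x, dev_out_y and dev_out_z are pointer parameters passed by value that already point at the output image's GPU buffer, then calling cudaMalloc on them only redirects the local copies. The kernel would then write into fresh allocations while the caller's buffer keeps its old contents. The plain C++ sketch below mimics this with std::malloc standing in for cudaMalloc; write_ones and caller_sees_writes are hypothetical names used only for illustration.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a host wrapper that receives its output pointer by value.
// With reallocate == true it mimics calling cudaMalloc((void**)&out, ...) on the parameter.
void write_ones(float* out, std::size_t n, bool reallocate)
{
    float* scratch = nullptr;
    if (reallocate)
    {
        // Only the local copy of the pointer changes; the caller's buffer is untouched.
        scratch = static_cast<float*>(std::malloc(n * sizeof(float)));
        assert(scratch != nullptr);
        out = scratch;
    }
    for (std::size_t i = 0; i < n; ++i)
        out[i] = 1.0f; // stands in for the kernel writing its result
    std::free(scratch); // free(nullptr) is a no-op
}

// Returns true if the caller's zero-initialized buffer was modified by write_ones.
bool caller_sees_writes(bool reallocate)
{
    std::vector<float> buffer(8, 0.0f);
    write_ones(buffer.data(), buffer.size(), reallocate);
    for (float v : buffer)
        if (v != 1.0f)
            return false;
    return true;
}
```

<div>Under this assumption, caller_sees_writes(true) returns false and caller_sees_writes(false) returns true. If the real host function has the same shape, removing the cudaMalloc calls so that the kernel writes through the pointers that were passed in would be worth trying.</div>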