From colddiesun at gmail.com  Fri Feb  1 10:30:25 2019
From: colddiesun at gmail.com (tao sun)
Date: Fri, 1 Feb 2019 10:30:25 -0500
Subject: [Rtk-users] cast itk:::Image to itk::CudaImage
In-Reply-To:
References:
Message-ID:

Hi,

I have now fixed the problem. It seems that I previously had issues loading
the rtkcuda libraries: my runtime argument failed to tell the program to
load them.

Thanks a lot for your help.

Tao

On 2019-01-30 at 4:22, Simon Rit wrote:

> I don't see what has changed since this commit...
> For Tao, I would suggest trying a minimal example in a separate main,
> just to be sure that it is indeed this line that causes the problem and
> not some bad memory management before it. I really don't see what the issue
> is here since you are using CUDA code from the same RTK version.
>
> On Wed, Jan 30, 2019 at 6:17 PM Chao Wu wrote:
>
>> L.S.,
>>
>> I recently encountered the "Cuda Error # 3" with a Quadro P4000 when running
>> rtkfdk, but I am not sure if the cause is the same as Tao's problem.
>> CUDA_CHECK() marks a Driver API error with a "#" and a Runtime API error
>> with a ":", so it is CUresult 3 (CUDA_ERROR_NOT_INITIALIZED).
>> Indeed, when I add CUDA_CHECK to cuInit(0) in itkCudaContextManager.cxx I
>> already get a CUresult 100 (CUDA_ERROR_NO_DEVICE) there.
>> In the same itkCudaContextManager.cxx file, itk::CudaGetAvailableDevices
>> (which calls the Runtime API) works well, so only the Driver API doesn't work
>> properly.
>> The same build works fine with a Tesla P40.
>>
>> I don't have time to investigate further, but I found that a previous build
>> on my computer, based on commit 5717b6d02675ee10f03200038566f06dfcc2ad19 from
>> May 2018, doesn't have this issue with the Quadro card.
>> So I guess it is not a problem with the card or the driver, but a
>> problem introduced by a later commit.
>>
>> Best regards,
>> Chao
>>
>> On 2019-01-30 at 5:33, Andreas Andersen wrote:
>>
>>> I've encountered the "Cuda Error # 3" issue two times.
>>> As far as I remember, both were caused by the GPU being unreachable at
>>> runtime:
>>> * A missing or outdated (relative to CUDA version) nVidia driver
>>> * The code running within a non-native environment.
>>> The second point can also be triggered if the binary is located on
>>> remote storage.
>>>
>>> However, both of these cases were on Windows, and by the path
>>> "/home/tsun/bin/" I would guess you're on some Linux distro.
>>>
>>> /Andreas
>>>
>>> __________________________________
>>>
>>> Andreas Gravgaard Andersen
>>> Department of Oncology,
>>> Aarhus University Hospital
>>> Nørrebrogade 44,
>>> 8000, Aarhus C
>>> Mail: agravgaard at protonmail.com
>>> Cell: +45 3165 8140
>>>
>>> On Wed, 30 Jan 2019 at 17:29, Simon Rit wrote:
>>>
>>>> Can you send the code if you want us to help?
>>>>
>>>> On Wed, Jan 30, 2019 at 5:21 PM tao sun wrote:
>>>>
>>>>> No I am not using that. But the error was thrown before GRAFT()
>>>>> function was called. It happens when I initialized the backprojector:
>>>>> bp = rtk::CudaRayCastBackProjectionImageFilter::New().
>>>>>
>>>>> Tao
>>>>>
>>>>> On 2019-01-30 at 11:13, Simon Rit wrote:
>>>>>
>>>>>> Are you using the HEAD version of the git ? Because I recently
>>>>>> corrected a bug in the Graft function (commit
>>>>>> b2d73642ce171ba9890af2c107a1a31f923454b5).
>>>>>> Simon
>>>>>>
>>>>>> On Wed, Jan 30, 2019 at 5:05 PM tao sun wrote:
>>>>>>
>>>>>>> Hi Simon,
>>>>>>>
>>>>>>> CUDA_HAVE_GPU is on. So is CUDA_FOUND. I can run examples like
>>>>>>> rtkfdk with gpu on without any problem though,
>>>>>>> By the way I am using CUDA 9.2.88.
>>>>>>>
>>>>>>> Tao
>>>>>>>
>>>>>>> On 2019-01-30 at 1:25, Simon Rit wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> No, not really. In my experience, this occurs mainly when you don't
>>>>>>>> have a GPU properly configured for CUDA. Can you check the value of
>>>>>>>> CUDA_HAVE_GPU in cmake ? If it's OFF, then this is indeed the problem.
>>>>>>>> Simon
>>>>>>>>
>>>>>>>> On Wed, Jan 30, 2019 at 1:23 AM tao sun wrote:
>>>>>>>>
>>>>>>>>> Hi again,
>>>>>>>>>
>>>>>>>>> Finally I got time to work on this.
>>>>>>>>> I was able to compile the
>>>>>>>>> program this time using the Graft() function. However, there's a runtime
>>>>>>>>> error when I run the program:
>>>>>>>>>
>>>>>>>>> /home/tsun/bin/RTK-1.4.0/utilities/ITKCudaCommon/src/itkCudaDataManager.cxx:38
>>>>>>>>> @ unknown : Cuda Error #3
>>>>>>>>> terminate called after throwing an instance of
>>>>>>>>> 'itk::ExceptionObject'
>>>>>>>>> what():
>>>>>>>>> /home/tsun/bin/RTK-1.4.0/utilities/ITKCudaCommon/src/itkCudaDataManager.cxx:38:
>>>>>>>>> Cuda Error # 3
>>>>>>>>> Aborted
>>>>>>>>>
>>>>>>>>> It happens when a new gpu backprojector is created:
>>>>>>>>> bp = rtk::CudaRayCastBackProjectionImageFilter::New();
>>>>>>>>>
>>>>>>>>> Any insights for this?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Tao
>>>>>>>>>
>>>>>>>>> On 2019-01-19 at 8:51, tao sun wrote:
>>>>>>>>>
>>>>>>>>>> Thank you all! I will give a try using your solutions.
>>>>>>>>>> Tao
>>>>>>>>>>
>>>>>>>>>> On 2019-01-17 at 12:26, Simon Rit wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> That's one solution. Two other:
>>>>>>>>>>> - graft the output to a Cuda image
>>>>>>>>>>> itk::CudaImage::Pointer cuImg = itk::CudaImage 3>::New();
>>>>>>>>>>> cuImg->Graft(projectionReader->GetOutput())
>>>>>>>>>>> - use the rtk::ImportImageFilter which is templated over image
>>>>>>>>>>> type to allow precisely this (I used it in Gate here).
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Simon
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jan 16, 2019 at 11:06 PM Andreas Andersen <
>>>>>>>>>>> andreasga22 at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Tao,
>>>>>>>>>>>>
>>>>>>>>>>>> I think you want the CastImageFilter from ITK.
>>>>>>>>>>>>
>>>>>>>>>>>> Something like this:
>>>>>>>>>>>> using castToImageType = itk::CastImageFilter<
>>>>>>>>>>>> itk::Image, itk::CudaImage>;
>>>>>>>>>>>> typename castToImageType::Pointer castfilter =
>>>>>>>>>>>> castToImageType::New();
>>>>>>>>>>>> castfilter->SetInput(projectionReader->GetOutput());
>>>>>>>>>>>> castfilter->Update();
>>>>>>>>>>>> auto cuda_image = castfilter->GetOutput();
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards Andreas
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, 16 Jan 2019 at 22:59, tao sun wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have read in some image using itk::ImportImageFilter.
>>>>>>>>>>>>> ImportFilterType::Pointer projectionReader =
>>>>>>>>>>>>> ImportFilterType::New();
>>>>>>>>>>>>> ...
>>>>>>>>>>>>> projectionReader->Update();
>>>>>>>>>>>>>
>>>>>>>>>>>>> The type of the image is itk::Image. I wonder if
>>>>>>>>>>>>> there is any way I can cast it to itk::CudaImage?
>>>>>>>>>>>>> In rtkforwardprojections.cxx the imageReaderType is defined as
>>>>>>>>>>>>> CudaImageType so there is no such problem.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Tao
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Rtk-users mailing list
>>>>>>>>>>>>> Rtk-users at public.kitware.com
>>>>>>>>>>>>> https://public.kitware.com/mailman/listinfo/rtk-users

From simon.rit at creatis.insa-lyon.fr  Wed Feb  6 05:21:21 2019
From: simon.rit at creatis.insa-lyon.fr (Simon Rit)
Date: Wed, 6 Feb 2019 11:21:21 +0100
Subject: [Rtk-users] Release of RTK 2.0.0
Message-ID:

Dear RTK users,

RTK v2.0.0 has just been released, about one year after RTK v1.4.0. The
change of major version indicates that there has been a major refactoring
of the code to let RTK compile as an ITK external or remote module.

Release notes:
* Refactoring of the repository to follow the structure of ITK's modules.
* Removal of SimpleRTK wrapping.
* Development of ITK's python wrapping.
* Spectral CT developments: implementation of Mechlem's one-step inversion,
  adaptation of the conjugate gradient algorithm to vector projections.
* Implementation of (ordered subsets) expectation maximization.
* Development of attenuated forward and backprojectors.

Many thanks to all contributors, in alphabetical order for this release:
Andreas Gravgaard Andersen, Antoine Robert, Chao Wu, Cyril Mory, Dženan
Zukić, Hans Johnson, Lucas Gandel, Matt McCormick, Sébastien Brousmiche,
Simon Rit and Thomas Baudier.

As usual, be aware that we don't focus on releases since we have a public
github repository that we try to keep stable. I still recommend the use of
the master HEAD over releases to enjoy the new RTK developments before
their release. We still have a few on-going projects for which we will use
and enhance RTK.

Simon (for the RTK consortium)

From simon.rit at creatis.insa-lyon.fr  Tue Feb 12 04:53:04 2019
From: simon.rit at creatis.insa-lyon.fr (Simon Rit)
Date: Tue, 12 Feb 2019 10:53:04 +0100
Subject: [Rtk-users] Release of RTK 2.0.0
In-Reply-To:
References:
Message-ID:

The remote module file for RTK v2 has been integrated in ITK's master
branch:
https://github.com/InsightSoftwareConsortium/ITK/commit/aa9e2d1797a23ba46a78afa294f51d970e6f54c3
You can now compile RTK when you compile ITK by turning on the advanced
cmake option "Module_RTK". This is a great step!

Moreover, I have compiled the RTK python packages for all versions
available for ITK (includes several python versions on Linux, MacOS and
Windows). To install the RTK package, simply run

    python -m pip install --upgrade pip
    python -m pip install --upgrade --pre itk
    python -m pip install itk-rtk

You can test the FirstReconstruction.py example (from master, I have just
fixed a small issue). Let us know if something is not working as expected,
this package probably needs more thorough testing.
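
After installing, a quick import check can confirm that the package is visible to the interpreter before trying FirstReconstruction.py. This is only a minimal sketch, not part of RTK: the helper name `rtk_available` is hypothetical, and it assumes the module is exposed as `itk.RTK` (the form `from itk import RTK as rtk` used in Python scripts on this list).

```python
import importlib.util


def rtk_available():
    """Return True if the itk package and its RTK module can be imported."""
    # Hypothetical helper for illustration; not part of ITK or RTK.
    if importlib.util.find_spec("itk") is None:
        return False  # itk itself is not installed
    try:
        # Assumption: the pip package itk-rtk exposes the module as itk.RTK.
        from itk import RTK  # noqa: F401
    except Exception:
        return False  # itk is present but the RTK module could not be loaded
    return True


if __name__ == "__main__":
    print("RTK importable:", rtk_available())
```

If this prints False after a successful pip run, the interpreter used for the check is probably not the one pip installed into.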

Simon

From amarnaths161 at gmail.com  Wed Feb 13 02:02:34 2019
From: amarnaths161 at gmail.com (Amarnath S)
Date: Wed, 13 Feb 2019 12:32:34 +0530
Subject: [Rtk-users] WaterPreCorrectionFilter Example - Python
Message-ID:

Hi,

Thanks for creating the Python version of RTK, and enabling us to get it
via pip. I downloaded it yesterday, and successfully ran the First
Reconstruction Python example.

I am now trying to convert the WaterPreCorrection example, which was
earlier available using SimpleRTK, on this page
http://wiki.openrtk.org/index.php/WaterPreCorrection

I am using the same dataset for this, from the archive "beam_hardening.tgz"
available on the same page. However, I am getting an error while running my
code. I have created a minimal Python script that reproduces the error, and
it is given here.

    import sys
    import itk
    from itk import RTK as rtk
    import glob
    import os

    # Script parameter
    dir = "F:\\RTKDir\\output\\"

    # List of filenames
    fileNames = []
    for file in os.listdir(dir):
        if file.startswith("attenuation") and file.endswith(".mha"):
            fileNames.append(dir + file)

    projReader = rtk.ProjectionsReader.New()
    projReader.SetFileNames(fileNames)
    print("Filenames set")

    waterPre = rtk.WaterPrecorrectionImageFilter.New();
    wpcoeffs = [0, 0, 0, 0, 0, 0, 1]
    print("Water Pre defined")
    waterPre.SetCoefficients(wpcoeffs)
    waterPre.SetInput(projReader.GetOutput())
    print("Gives error on the above line - Expecting argument of type itkImageF3 or itkImageSourceIF3")

The error occurs on the line where we set the input to waterPre, the
last-but-one line of this script: waterPre.SetInput(projReader.GetOutput()).
The error message is
TypeError: Expecting argument of type itkImageF3 or itkImageSourceIF3.

As far as I know, I have replicated the code from the example available for
SimpleRTK. Note that I have created a folder called F:\RTKDir\, into which
the contents of the archive "beam_hardening.tgz" have been extracted.

So, my questions are:
1. Have the input types been changed for the WaterPreCorrection Filter,
   since the posting of the earlier example?
2. If so, how can I specify projections as read in from 360 files as inputs
   to this filter?

Any pointers in this regard will be greatly appreciated.

Thanks and Regards
- Amarnath

From Pierre.Daye at iba-group.com  Mon Feb 18 01:46:02 2019
From: Pierre.Daye at iba-group.com (Pierre Daye)
Date: Mon, 18 Feb 2019 06:46:02 +0000
Subject: [Rtk-users] segfault in RTK 2.0
Message-ID: <3fafba3534f7423283092c4471bcc952@LLNEXMBX1.goiba.net>

Hello,

First of all, congrats on the update of RTK and the new Python integration.
This is really great for Python users who don't want the whole C++ armada
to test a small reconstruction principle!
I could easily run the first reconstruction and it worked perfectly!
However, I then tried to first save the images using these methods within
a simulator python class:

    def buildGeometry(self, offsetZ, offsetY):
        geometry = rtk.ThreeDCircularProjectionGeometry.New()
        for x in range(0, self.param['numberOfProjections']):
            angle = self.param['firstAngle'] + x * self.param['angularArc'] / self.param['numberOfProjections']
            geometry.AddProjection(self.param['sid'] + offsetZ,
                                   self.param['sdd'], angle, self.param['isox'],
                                   self.param['isoy'], self.param['outOfPlaneAngle'],
                                   self.param['inPlaneAngle'], self.param['sourceOffsetX'], offsetY)

        return geometry

    def runCBCT(self, filename, offsetZ, offsetY):
        geometry = self.buildGeometry(offsetZ, offsetY)

        constantImageSource = rtk.ConstantImageSource[self.param['TImageType']].New()
        constantImageSource.SetOrigin( self.param['origin'] )
        constantImageSource.SetSpacing( self.param['spacing'] )
        constantImageSource.SetSize( self.param['sizeOutput'] )
        constantImageSource.SetConstant(0.0)
        source = constantImageSource.GetOutput()

        self.rei.SetGeometry(geometry)
        self.rei.SetInput(source)

        projections = self.rei.GetOutput()

        writer = itk.ImageFileWriter[self.param['TImageType']].New()
        writer.SetFileName(filename)
        writer.SetInput(projections)
        writer.Update()

This part works perfectly and the generated images are OK (checked with
ImageJ).
The problems start when I want to reconstruct the volume.

    def buildVolumeFromFile(self, file, fileOUT, CBCT):
        geometry = CBCT.buildGeometry(0, 0)
        print("Performing reconstruction")
        TImageType = self.param['TImageType']
        reader = itk.ImageFileReader[TImageType].New()
        reader.SetFileName(file)
        reiImage = reader.GetOutput()

        # Create reconstructed image
        constantImageSource2 = rtk.ConstantImageSource[TImageType].New()
        origin = [ -63.5, -63.5, -63.5 ]
        sizeOutput = [ 128, 128, 128 ]
        constantImageSource2.SetOrigin( origin )
        constantImageSource2.SetSpacing( [1, 1, 1] )
        constantImageSource2.SetSize( sizeOutput )
        constantImageSource2.SetConstant(0.0)
        source2 = constantImageSource2.GetOutput()

        print("Performing reconstruction")
        feldkamp = rtk.FDKConeBeamReconstructionFilter[TImageType].New()
        feldkamp.SetGeometry( geometry )
        feldkamp.SetInput(0, source2)
        feldkamp.SetInput(1, reiImage)
        image = feldkamp.GetOutput()

        print("Masking field-of-view")
        fov = rtk.FieldOfViewImageFilter[TImageType, TImageType].New()
        fov.SetGeometry(geometry)
        fov.SetProjectionsStack(reiImage)
        fov.SetInput(image)
        image = fov.GetOutput()

        writer = itk.ImageFileWriter[TImageType].New()
        writer.SetFileName ( fileOUT )
        writer.SetInput(image)
        writer.Update()

I cannot run this script. It generates a cryptic segfault... What am I
doing wrong here?
If one of you RTK gurus could help me, I would greatly appreciate it!

Thanks,

Pierre

Disclaimer | Use of IBA e-communication
The contents of this e-mail message and any attachments are intended solely
for the recipient(s) named above. This communication is intended to be and
to remain confidential and may be protected by intellectual property
rights. Any use of the information contained herein (including but not
limited to, total or partial reproduction, communication or distribution of
any form) by persons other than the designated recipient(s) is prohibited.
Please notify the sender immediately by e-mail if you have received this
e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or error-free. Ion
Beam Applications does not accept liability for any such errors. Thank you
for your cooperation.

From simon.rit at creatis.insa-lyon.fr  Tue Feb 19 17:15:12 2019
From: simon.rit at creatis.insa-lyon.fr (Simon Rit)
Date: Tue, 19 Feb 2019 23:15:12 +0100
Subject: [Rtk-users] segfault in RTK 2.0
In-Reply-To: <3fafba3534f7423283092c4471bcc952@LLNEXMBX1.goiba.net>
References: <3fafba3534f7423283092c4471bcc952@LLNEXMBX1.goiba.net>
Message-ID:

Hi,

Thanks for sharing your experience. I tried to use your code to reproduce
your problem but I couldn't. Enclosed is the code that I run with

    import daye
    simulator = daye.simulator()
    simulator.runCBCT('proj.mha', 0, 0)
    simulator.buildVolumeFromFile('proj.mha','recon.mha',simulator)

Does it work for you? If yes, can you send your full example?

Simon

-------------- next part --------------
A non-text attachment was scrubbed...
Name: daye.py
Type: text/x-python
Size: 3703 bytes
Desc: not available
URL:

From Gordian.Kabelitz at medma.uni-heidelberg.de  Wed Feb 20 10:06:04 2019
From: Gordian.Kabelitz at medma.uni-heidelberg.de (Kabelitz, Gordian)
Date: Wed, 20 Feb 2019 15:06:04 +0000
Subject: [Rtk-users] rtk::CudaVectorImage available
Message-ID:

Hello,

I am looking for an ITKCudaCommon datatype that resembles the
itk::VectorImage.

Is there a possibility to use itk::CudaImage somehow, or are there other
workarounds?

Thanks in advance,
Gordian

From simon.rit at creatis.insa-lyon.fr  Wed Feb 20 10:26:47 2019
From: simon.rit at creatis.insa-lyon.fr (Simon Rit)
Date: Wed, 20 Feb 2019 16:26:47 +0100
Subject: [Rtk-users] rtk::CudaVectorImage available
In-Reply-To:
References:
Message-ID:

Hi Gordian,

I don't think so. We have successfully used images of vectors (e.g.,
itk::CudaImage,2>) but I have never used VectorImage and I think you would
need to rewrite an itk::CudaVectorImage to do so. I'm not sure how
difficult that would be...

Simon

From Gordian.Kabelitz at medma.uni-heidelberg.de  Thu Feb 21 17:43:49 2019
From: Gordian.Kabelitz at medma.uni-heidelberg.de (Kabelitz, Gordian)
Date: Thu, 21 Feb 2019 22:43:49 +0000
Subject: [Rtk-users] rtk::CudaVectorImage available
In-Reply-To:
References:
Message-ID: <2c20b332a8644aecbb2302034376cc5b@exch06.ad.uni-heidelberg.de>

Hi Simon,

I asked because of the description of the itk::VectorImage
(https://itk.org/Doxygen/html/classitk_1_1VectorImage.html):

"Conceptually, a VectorImage< TPixel, 3 > is the same as a Image<
VariableLengthVector< TPixel >, 3 >. The difference lies in the memory
organization. The latter results in a fragmented organization with each
location in the Image holding a pointer to an VariableLengthVector holding
the actual pixel. The former stores the k pixels instead of a pointer
reference, which apart from avoiding fragmentation of memory also avoids
storing a 8 bytes of pointer reference for each pixel. The parameter k can
be set using SetVectorLength."

Does it make any performance differences on the CPU algorithms?
I guess that the memory is reorganized in the rtk::CudaImageFilter to make
it more efficient on the GPU (like a new image for each vector component).
Or is it handled in a different way?

I think it will be more time consuming and error prone to do it by myself
since I am not very familiar with GPU memory management.

Gordian

From simon.rit at creatis.insa-lyon.fr  Thu Feb 21 17:54:50 2019
From: simon.rit at creatis.insa-lyon.fr (Simon Rit)
Date: Thu, 21 Feb 2019 23:54:50 +0100
Subject: [Rtk-users] rtk::CudaVectorImage available
In-Reply-To: <2c20b332a8644aecbb2302034376cc5b@exch06.ad.uni-heidelberg.de>
References: <2c20b332a8644aecbb2302034376cc5b@exch06.ad.uni-heidelberg.de>
Message-ID:

Hi,

I'm not very familiar with VariableLengthVector but I think it is not the
same as a Vector. An itk::CudaImage,2> buffer is not fragmented. In terms
of performance, it depends how you use them. In our usage, I think it's
more efficient to have all values belonging to a pixel next to each other
in memory.

No, there is no memory reorganization for the GPU, it's a straight copy
from the CPU to the GPU. We use textures internally sometimes but this is
done when executing the filter.

Don't hesitate to describe the problem you want to solve, I can try to
help...
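
The difference between keeping all values of a pixel next to each other and keeping one image per vector component can be illustrated with plain index arithmetic. This is a sketch in pure Python, not ITK or RTK code: the helper names `interleaved_offset` and `planar_offset` are hypothetical, and a row-major 2D image with `n_comp` components per pixel is assumed.

```python
def interleaved_offset(x, y, c, width, n_comp):
    """Buffer offset of component c of pixel (x, y) when the components
    of each pixel are stored next to each other."""
    return (y * width + x) * n_comp + c


def planar_offset(x, y, c, width, height):
    """Buffer offset of the same value when every component is stored
    as its own scalar image, one image after the other."""
    return c * width * height + y * width + x


if __name__ == "__main__":
    width, height, n_comp = 4, 3, 3
    # Interleaved: the three components of pixel (1, 2) are contiguous.
    print([interleaved_offset(1, 2, c, width, n_comp) for c in range(n_comp)])
    # Planar: the same three values are one whole image (12 values) apart.
    print([planar_offset(1, 2, c, width, height) for c in range(n_comp)])
```

With the interleaved layout a single contiguous copy moves whole pixels at once, which matches the straight CPU-to-GPU copy described above.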

Simon

From Gordian.Kabelitz at medma.uni-heidelberg.de  Wed Feb 27 15:32:09 2019
From: Gordian.Kabelitz at medma.uni-heidelberg.de (Kabelitz, Gordian)
Date: Wed, 27 Feb 2019 20:32:09 +0000
Subject: [Rtk-users] GPU kernel do not change output variables
Message-ID:

Hi rtk-users,

I am facing an oddity which I cannot explain.
I want to implement a new gradient filter. The input is a CudaImage and
the output should be a CudaImage,3>.
The filter runs without any cuda errors but the output (pout_(xyz)) has not
changed at all. The kernel function is reached and the printout from there
seems to be okay.
I tried to explicitly copy the content of the GPUBuffer into the CPUBuffer.
Still no success.
Even if I write fixed numbers to the output image in the kernel, nothing
changes.

I use CUDA 9.0, Visual Studio 2015, ITK 5.0, RTK 2.0 as remote module,
CMake 3.13, Windows 7 64bit.

The relevant code snippets are below. Do I miss something obvious?
Any recommendations are welcome.
With kind regards,
Gordian

The GPUGenerateData function:

GPUGenerateData()
{
  int inputSize[3];
  int outputSize[3];
  float inputSpacing[3];
  float outputSpacing[3];

  for (int i = 0; i<3; i++)
  {
    inputSize[i] = this->GetInput()->GetBufferedRegion().GetSize()[i];
    outputSize[i] = this->GetOutput()->GetBufferedRegion().GetSize()[i];
    inputSpacing[i] = this->GetInput()->GetSpacing()[i];
    outputSpacing[i] = this->GetOutput()->GetSpacing()[i];

    if ((inputSize[i] != outputSize[i]) || (inputSpacing[i] != outputSpacing[i]))
    {
      std::cerr << "The CUDA laplacian filter can only handle input and output regions of equal size and spacing" << std::endl;
      exit(1);
    }
  }

  float *pin = *(float**)(this->GetInput()->GetCudaDataManager()->GetGPUBufferPointer());

  // This is a test area
  typename InputImageType::IndexType index;
  index.Fill(0);
  typename InputImageType::SizeType size;
  for (auto i = 0; i < 3; ++i)
    size.Fill(this->GetInput()->GetLargestPossibleRegion().GetSize()[i]);
  typename InputImageType::RegionType region(index, size);
  // images for gradients
  auto grad_x = CudaImage::New();
  grad_x->SetRegions(region);
  grad_x->Allocate();
  grad_x->FillBuffer(1);
  auto grad_y = CudaImage::New();
  grad_y->SetRegions(region);
  grad_y->Allocate();
  auto grad_z = CudaImage::New();
  grad_z->SetRegions(region);
  grad_z->Allocate();

  float *pout_x = *(float**)(grad_x->GetCudaDataManager()->GetGPUBufferPointer());
  float *pout_y = *(float**)(grad_y->GetCudaDataManager()->GetGPUBufferPointer());
  float *pout_z = *(float**)(grad_z->GetCudaDataManager()->GetGPUBufferPointer());

  CUDA_gradient(inputSize, inputSpacing, pin, pout_x, pout_y, pout_z); // after this line neither of the pout_(xyz) images have changed.

  // put the gradient images in a single covariant vector image
  auto CompositeImageFilter = itk::ComposeImageFilter, CudaImage,3>>::New();
  CompositeImageFilter->SetInput1(grad_x);
  CompositeImageFilter->SetInput2(grad_y);
  CompositeImageFilter->SetInput3(grad_z);
  CompositeImageFilter->Update();

  this->GetOutput()->Graft(CompositeImageFilter->GetOutput());
}

The cuda/kernel function

__global__ void gradient_kernel(float * in, float * grad_x, float * grad_y, float * grad_z, int3 c_Size, float3 c_Spacing);

void
CUDA_gradient(
  int size[3],
  float spacing[3],
  float *dev_in,
  float *dev_out_x,
  float *dev_out_y,
  float *dev_out_z)
{
  int3 dev_Size = make_int3(size[0], size[1], size[2]);
  float3 dev_Spacing = make_float3(spacing[0], spacing[1], spacing[2]);

  // Output volume
  long int outputMemorySize = size[0] * size[1] * size[2] * sizeof(float);
  cudaMalloc((void**)&dev_out_x, outputMemorySize);
  cudaMalloc((void**)&dev_out_y, outputMemorySize);
  cudaMalloc((void**)&dev_out_z, outputMemorySize);
  cudaMemset(dev_out_x, 0, outputMemorySize);
  cudaMemset(dev_out_y, 0, outputMemorySize);
  cudaMemset(dev_out_z, 0, outputMemorySize);
  printf("Device Variable Copying:\t%s\n", cudaGetErrorString(cudaGetLastError()));

  // Thread Block Dimensions
  dim3 dimBlock = dim3(16, 4, 4);

  int blocksInX = iDivUp(size[0], dimBlock.x);
  int blocksInY = iDivUp(size[1], dimBlock.y);
  int blocksInZ = iDivUp(size[2], dimBlock.z);

  dim3 dimGrid = dim3(blocksInX, blocksInY, blocksInZ);

  gradient_kernel <<< dimGrid, dimBlock >>> (dev_in, dev_out_x, dev_out_y, dev_out_z, dev_Size, dev_Spacing);
  cudaDeviceSynchronize();
  printf("Device Variable Copying:\t%s\n", cudaGetErrorString(cudaGetLastError()));
  CUDA_CHECK_ERROR;
}

__global__
void
gradient_kernel(float * in, float * grad_x, float * grad_y, float * grad_z, int3 c_Size, float3 c_Spacing)
{
  unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
  unsigned int j = blockIdx.y * blockDim.y + threadIdx.y;
  unsigned int k = blockIdx.z * blockDim.z + threadIdx.z;

  if (i >= c_Size.x
      || j >= c_Size.y || k >= c_Size.z)
    return;

  long int id = (k * c_Size.y + j) * c_Size.x + i;
  long int id_x = (k * c_Size.y + j) * c_Size.x + i + 1;
  long int id_y = (k * c_Size.y + j + 1)* c_Size.x + i;
  long int id_z = ((k + 1) * c_Size.y + j) * c_Size.x + i;

  if (i == (c_Size.x - 1)) grad_x[id] = 0;
  else grad_x[id] = (in[id_x] - in[id]) / c_Spacing.x;

  if (j == (c_Size.y - 1)) grad_y[id] = 0;
  else grad_y[id] = (in[id_y] - in[id]) / c_Spacing.y;

  if (k == (c_Size.z - 1)) grad_z[id] = 0;
  else grad_z[id] = (in[id_z] - in[id]) / c_Spacing.z;
}

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From simon.rit at creatis.insa-lyon.fr Wed Feb 27 15:58:14 2019
From: simon.rit at creatis.insa-lyon.fr (Simon Rit)
Date: Wed, 27 Feb 2019 21:58:14 +0100
Subject: [Rtk-users] GPU kernel do not change output variables
In-Reply-To:
References:
Message-ID:

Hi,

Sounds like a challenge. When you say you set fixed numbers, did you check that you reach the point where you set this number? You can use cuprintf to check what's going on in the kernel. One thing wrong I noticed: you use size.Fill in a loop, which is a bit odd because it will Fill the size with the last value of the loop.

I hope this helps,
Simon

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Gordian.Kabelitz at medma.uni-heidelberg.de Wed Feb 27 17:30:59 2019
From: Gordian.Kabelitz at medma.uni-heidelberg.de (Kabelitz, Gordian)
Date: Wed, 27 Feb 2019 22:30:59 +0000
Subject: [Rtk-users] GPU kernel do not change output variables
In-Reply-To:
References:
Message-ID: <7bc75960047840dcacdb792f8ae6e738@exch08.ad.uni-heidelberg.de>

Hi Simon,

I used the printf in the kernel to print the id and the fixed value. Therefore I know that the kernel is reached.
It looked similar to this snippet:

gradient_kernel(float * in, float * grad_x, float * grad_y, float * grad_z, int3 c_Size, float3 c_Spacing)
{
  unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
  unsigned int j = blockIdx.y * blockDim.y + threadIdx.y;
  unsigned int k = blockIdx.z * blockDim.z + threadIdx.z;

  if (i >= c_Size.x || j >= c_Size.y || k >= c_Size.z)
    return;

  long int id = (k * c_Size.y + j) * c_Size.x + i;

  grad_x[id] = 10.f;
  printf("ID: %i -> %f\n", id, grad_x[id]);
}

With the expected result of "ID: xxx -> 10.0000".

To get the printout, a cudaDeviceSynchronize(); is needed after the call of CUDA_gradient(inputSize, inputSpacing, pin, pout_x, pout_y, pout_z);.

I used the file writer to write out the grad_(xyz) CudaImages right after the call of CUDA_gradient(inputSize, inputSpacing, pin, pout_x, pout_y, pout_z);. Those images are the same as the ones I created before.

auto filewriter = itk::ImageFileWriter>::New();
filewriter->SetFileName("outX.nrrd");
filewriter->SetInput(grad_x);
filewriter->Update();

The size.Fill() should be a size[i]. Since all dimensions in my test case have the same size, this wasn't a problem.

Still no idea.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
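
One possible reading of the code posted in this thread, not confirmed by any of the posters: CUDA_gradient receives dev_out_x, dev_out_y and dev_out_z by value and then calls cudaMalloc on the addresses of those parameters. That rebinds only the local copies, so the kernel would write into freshly allocated device buffers rather than the buffers owned by grad_x, grad_y and grad_z. The sketch below reproduces the same pattern on the host so that it runs without a GPU. std::malloc stands in for cudaMalloc, and the function names fill_reallocating and fill_in_place are hypothetical, chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the posted CUDA_gradient: the pointer parameter is
// re-allocated inside the callee (std::malloc plays the role of cudaMalloc),
// so every write goes to the new block, not to the caller's buffer.
void fill_reallocating(float *out, std::size_t n)
{
  out = static_cast<float *>(std::malloc(n * sizeof(float))); // rebinds the local copy only
  assert(out != nullptr);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = 10.f;
  std::free(out); // the posted code never frees; freed here to keep the sketch leak-free
}

// Variant without the re-allocation: writes land in the buffer the caller owns.
void fill_in_place(float *out, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    out[i] = 10.f;
}
```

Calling fill_reallocating on an array initialised to 1 leaves every element at 1, whereas fill_in_place sets every element to 10. If this is indeed what happens in the posted filter, dropping the cudaMalloc and cudaMemset calls on dev_out_(xyz) would be the analogous change, since the CudaImage objects already own device buffers. Whether ITK's CudaDataManager also has to be told that the GPU buffer is newer than the CPU buffer is a separate question that this sketch does not address.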