|
|
(One intermediate revision by one other user not shown) |
| This page outlines the proposed GPU acceleration framework in ITK v4. The GPU has become a cost-effective parallel computing platform for computationally expensive problems. Although many ITK image filters could benefit from the GPU, ITK currently has no GPU support. We propose adding a new data structure, a framework, and some basic image operations that support the GPU, so that ITK developers can easily implement filters that run on both the CPU and the GPU.
| |
|
| |
|
| == Goals ==
| | This page now forwards to: |
|
| |
|
| * Add support for GPU processing in ITK | | * [[ITK_Release_4/GPU Acceleration|GPU Acceleration - V4]] |
| ** GPU image class
| |
| ** Extension of ITK multithreading model to support the GPU
| |
| ** Basic GPU image operators
| |
| | |
| == Authors ==
| |
| | |
| GPU acceleration for ITK v4 has been proposed by Harvard University and the University of Utah.
| |
| | |
| * Won-Ki Jeong (wkjeong -at- seas.harvard.edu)
| |
| * Hanspeter Pfister (pfister -at- seas.harvard.edu)
| |
| * Ross Whitaker (whitaker -at- cs.utah.edu)
| |
| | |
| == Plans ==
| |
| | |
| === GPU image class ===
| |
| | |
| We propose a new GPU image class, itk::GPUImage,
| |
| which provides a GPU data container and functions for implicit and explicit data transfers
| |
| between the CPU and the GPU memory spaces. itk::GPUImage will contain two snapshots
| |
| of the current image—one on the CPU and one on the GPU—but provide the functionality of
| |
| a single image to the user. itk::GPUImage inherits all the public functions from itk::Image,
| |
| so it can be used with the existing CPU ITK image filters as before. All the pixel operators,
| |
| for example GetPixel(), and the image iterators can be used to modify pixel values on the
| |
| CPU side. Conversely, GPU code will modify the pixel values on the GPU side. We propose
| |
| an automatic synchronization mechanism between the CPU and GPU buffers, transparent to
| |
| the user. Specifically, we propose the following functionalities for the ITK GPU image class:
| |
| | |
| * Efficient GPU memory management
| |
| * CPU and GPU synchronization scheme
| |
| * GPU buffer interface for direct access
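
The synchronization scheme described above could be sketched with a pair of "stale" flags, one per buffer, so that data is copied only when the other side has written since the last transfer. This is a minimal illustration under stated assumptions, not the itk::GPUImage implementation: the class name `TwoSidedBuffer` and all of its members are hypothetical, and the "GPU" buffer is simulated with host memory so the sketch runs without a device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an image with a CPU copy and a GPU copy.
// This is NOT the itk::GPUImage API; the "GPU" buffer is ordinary host
// memory so that the sketch can run on any machine.
class TwoSidedBuffer
{
public:
  explicit TwoSidedBuffer(std::size_t n) : m_Cpu(n, 0.0f), m_Gpu(n, 0.0f) {}

  // CPU-side write: refresh the CPU copy if needed, then mark the GPU copy stale.
  void SetPixel(std::size_t i, float v)
  {
    assert(i < m_Cpu.size());
    SyncToCpu();
    m_Cpu[i] = v;
    m_GpuStale = true;
  }

  // CPU-side read: copies back from the GPU only if the CPU copy is stale.
  float GetPixel(std::size_t i)
  {
    assert(i < m_Cpu.size());
    SyncToCpu();
    return m_Cpu[i];
  }

  // Direct access for GPU-side code: refresh the GPU copy if needed,
  // then mark the CPU copy stale because the caller may write to it.
  std::vector<float> & GetGpuBufferForWrite()
  {
    SyncToGpu();
    m_CpuStale = true;
    return m_Gpu;
  }

  // Number of whole-buffer transfers performed so far.
  int TransferCount() const { return m_Transfers; }

private:
  void SyncToCpu()
  {
    if (m_CpuStale) { m_Cpu = m_Gpu; m_CpuStale = false; ++m_Transfers; }
  }
  void SyncToGpu()
  {
    if (m_GpuStale) { m_Gpu = m_Cpu; m_GpuStale = false; ++m_Transfers; }
  }

  std::vector<float> m_Cpu;
  std::vector<float> m_Gpu;
  bool m_CpuStale = false;
  bool m_GpuStale = false;
  int  m_Transfers = 0;
};
```

In this sketch, several consecutive CPU writes trigger no transfer at all; the copy happens lazily the first time the other side asks for the data, which is one way to keep the synchronization transparent to the user.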
| |
| | |
| === GPU support for ITK multithreading model ===
| |
| | |
| We will extend the ITK multithreaded architecture by introducing two new virtual functions, GPUGenerateData()
| |
| and GPUThreadedGenerateData(). These methods will borrow the implicit thread management
| |
| design from the existing architecture but manage threads based on GPU resources and
| |
| not CPU resources. When the filter is called, a superclass of the filter will decide between
| |
| single- or multi-threaded execution and determine where to run the code, either on a CPU or
| |
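| GPU. The superclass will spawn threads and call one of the four functions accordingly: the existing GenerateData() or ThreadedGenerateData(), or the new GPUGenerateData() or GPUThreadedGenerateData().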
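
The dispatch described in this section might look roughly like the following. It is a sketch only: the four `*GenerateData()` names come from the proposal, while `SketchFilter`, its setters, the `unit` argument, and the logging are assumptions added for illustration. Work units run sequentially here rather than on real threads or GPU resources.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical filter superclass. Only the four *GenerateData() names come
// from the proposal; every other name is an assumption for illustration.
class SketchFilter
{
public:
  virtual ~SketchFilter() = default;

  void SetUseGPU(bool useGPU) { m_UseGPU = useGPU; }
  void SetMultiThreaded(bool multiThreaded) { m_MultiThreaded = multiThreaded; }
  void SetNumberOfWorkUnits(int n)
  {
    assert(n > 0);
    m_NumberOfWorkUnits = n;
  }

  // Decide how (single or multi) and where (CPU or GPU) to run, then call
  // one of the four functions. Work units run sequentially to keep the
  // sketch deterministic; a real implementation would distribute them.
  void Update()
  {
    m_Log.clear();
    if (!m_MultiThreaded)
    {
      if (m_UseGPU) { GPUGenerateData(); } else { GenerateData(); }
      return;
    }
    for (int unit = 0; unit < m_NumberOfWorkUnits; ++unit)
    {
      if (m_UseGPU) { GPUThreadedGenerateData(unit); } else { ThreadedGenerateData(unit); }
    }
  }

  // Names of the functions called by the last Update(), in call order.
  const std::vector<std::string> & Log() const { return m_Log; }

protected:
  void Record(const std::string & what) { m_Log.push_back(what); }

  // Default bodies only record the call; a concrete filter would override them.
  virtual void GenerateData() { Record("GenerateData"); }
  virtual void ThreadedGenerateData(int unit) { Record("ThreadedGenerateData:" + std::to_string(unit)); }
  virtual void GPUGenerateData() { Record("GPUGenerateData"); }
  virtual void GPUThreadedGenerateData(int unit) { Record("GPUThreadedGenerateData:" + std::to_string(unit)); }

private:
  bool m_UseGPU = false;
  bool m_MultiThreaded = false;
  int  m_NumberOfWorkUnits = 1;
  std::vector<std::string> m_Log;
};
```

A filter author would override only the functions the filter supports; the caller just invokes `Update()` and does not need to know which of the four paths was taken.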
| |
| | |
| === Filter API to support GPU code ===
| |
| | |
| We will implement a filter class that has an API to execute GPU code written in OpenCL.
| |
| | |
| === Basic GPU image operators ===
| |
| | |
| We propose a set of basic GPU image operators and filters that can be used as building blocks for more complicated numerical algorithms, such as:
| |
| | |
| * Addition, subtraction, division, multiplication, inner product, reduction, copy and assignment operators
| |
| * Neighborhood operator filter (for convolution-type filters)
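
To make the intended semantics of a few of these building blocks concrete, here is a CPU reference sketch in plain C++. It is not GPU code and not part of the proposed API: the names `Buffer`, `Add`, `Sum`, `InnerProduct`, and `ApplyNeighborhood` are hypothetical, images are flattened to 1D, and the neighborhood operator assumes zero values outside the image. A GPU implementation would be expected to produce the same results as a reference of this kind.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

using Buffer = std::vector<float>;

// Element-wise addition of two buffers of equal size.
Buffer Add(const Buffer & a, const Buffer & b)
{
  assert(a.size() == b.size());
  Buffer out(a.size());
  std::transform(a.begin(), a.end(), b.begin(), out.begin(), std::plus<float>());
  return out;
}

// Reduction: sum of all elements.
float Sum(const Buffer & a)
{
  return std::accumulate(a.begin(), a.end(), 0.0f);
}

// Inner product of two buffers of equal size.
float InnerProduct(const Buffer & a, const Buffer & b)
{
  assert(a.size() == b.size());
  return std::inner_product(a.begin(), a.end(), b.begin(), 0.0f);
}

// 1D neighborhood operator with an odd-sized kernel of radius r:
// out[i] = sum over k in [-r, r] of kernel[k + r] * in[i + k],
// treating samples outside the buffer as zero.
Buffer ApplyNeighborhood(const Buffer & in, const Buffer & kernel)
{
  assert(kernel.size() % 2 == 1);
  const std::ptrdiff_t r = static_cast<std::ptrdiff_t>(kernel.size() / 2);
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(in.size());
  Buffer out(in.size(), 0.0f);
  for (std::ptrdiff_t i = 0; i < n; ++i)
  {
    for (std::ptrdiff_t k = -r; k <= r; ++k)
    {
      const std::ptrdiff_t j = i + k;
      if (j >= 0 && j < n)
      {
        out[i] += kernel[k + r] * in[j];
      }
    }
  }
  return out;
}
```

Each output element of `Add` and `ApplyNeighborhood` depends only on a fixed set of input elements, which is why operators of this shape map naturally onto one GPU work item per pixel; the reduction and inner product need a combining step across elements and are typically the harder ones to parallelize.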
| |
| | |
| == Target architecture ==
| |
| | |
| We will implement GPU code in OpenCL for wide applicability across vendors (Intel, AMD, and NVIDIA). If required, we will also consider supporting NVIDIA CUDA, for example to allow the use of existing GPU libraries such as cuFFT or cuBLAS.
| |