VTK/ImageBenchmark

Revision as of 03:07, 14 March 2016 by Dgobbi (talk | contribs) (→‎Options)

In VTK 7.1, a benchmarking utility called ImageBenchmark was added to the VTK examples. This utility makes it easy to benchmark certain VTK image filters, to assist developers who wish to optimize the performance of those filters. The source code is in VTK/Examples/ImageProcessing/Cxx/. The benchmarking utility is written in C++ to make it available to as broad a variety of developers as possible, and also to avoid any concern that wrapping might invalidate the results (though it most certainly would not).

Options

There are several command-line options that can be used to control how the benchmarking is performed.

--runs N
The number of runs to perform; 6 is recommended for accurate results.
--threads N
Request a specific number of threads.
--threads N-M
Repeat the benchmark for each number of threads from N through M.
--filter <filter_and_options>
The filter to benchmark (see second column of the results pasted in the next section).
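
To make the --threads syntax concrete, here is a small Python sketch that expands the three documented forms (N, N-M, and the comma-separated N,M,O form listed in the full usage text) into the thread counts they name. The function name parse_threads is hypothetical and this is not the parser used by ImageBenchmark itself, only an illustration of what each form requests.

```python
from typing import List


def parse_threads(spec: str) -> List[int]:
    """Expand a --threads argument into the thread counts it names.

    Hypothetical helper for illustration; not taken from ImageBenchmark.
    """
    if "-" in spec:
        # Range form "N-M": every thread count from N through M inclusive.
        first, last = (int(part) for part in spec.split("-"))
        if first > last:
            raise ValueError("range must be ascending: " + spec)
        return list(range(first, last + 1))
    # Single value "N" or comma-separated list "N,M,O".
    return [int(part) for part in spec.split(",")]


if __name__ == "__main__":
    for example in ("4", "1-12", "1,2,4"):
        print(example, "->", parse_threads(example))
```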

Here is the full list of options:

Usage: ImageBenchmark [options]

Options:
  --runs N                      The number of runs to perform
  --threads N (or N-M or N,M,O) Request a certain number of threads
  --split-mode slab|beam|block  Use the specified splitting mode
  --enable-smp on|off           Use vtkSMPTools vs. vtkMultiThreader
  --clear-cache MBytes          Attempt to clear CPU cache between runs
  --bytes-per-piece N           Ask for N bytes per piece [65536]
  --min-piece-size XxYxZ        Minimum dimensions per piece [16x1x1]
  --size XxYxZ                  The image size [256x256x256]
  --type uchar|short|float      The data type for the input [uchar]
  --source <imagegsource>       Set the data source [random-valued voxels]
  --filter <filter>[:options]   Set the filter to benchmark [median]
  --output filename.png         Output middle slice as a png file.
  --units mvps|mvptps|seconds   The output units (see below for details).
  --header                      Print a header line before the results.
  --verbose                     Print verbose output to stdout.
  --version                     Print the VTK version and exit.
  --help                        Print this message.

This program prints benchmark results to stdout in csv format.  The default
units are megavoxels per second, but the --units option can specify units
of seconds, megavoxels per second (mvps), or megavoxels per thread per
second (mvptps).

If more than three runs are done (by use of --runs), then the mean and
standard deviation over all of the runs except the first will be printed
(use --header to get the column headings).

Sources: these are how the initial data set is produced.
  gaussian    A centered 3D gaussian.
  noise       Pseudo-random noise.
  grid        A grid, for checking rotations.
  mandelbrot  The mandelbrot set.

Filters: these are the algorithms that can be benchmarked.
  median:kernelsize=3        Test vtkImageMedian3D.
  reslice:kernel=nearest     Test vtkImageReslice (see below).
  resize:kernelsize=1        Test vtkImageResize.
  convolve:kernelsize=3      Test vtkImageConvolve.
  separable:kernelsize=3     Test vtkImageSeparableConvolution.
  gaussian:kernelsize=3      Test vtkImageGaussianSmooth.
  bspline:degree=3           Test vtkImageBSplineCoefficients.
  fft                        Test vtkImageFFT.
  histogram:stencil          Test vtkImageHistogram.
  colormap:components=3      Test vtkImageMapToColors.

The reslice filter takes the following options:
  stencil                    Spherical stencil (ignore voxels outside).
  kernel=nearest|linear|cubic|sinc|bspline   The interpolator to use.
  kernelsize=4               The kernelsize (sinc, bspline only).
  rotation=0/0/0/0           Rotation angle (degrees) and axis.

The colormap filter takes the following options:
  components=3               Output components (3=RGB, 4=RGBA).
  greyscale                  Rescale but do not apply a vtkLookupTable.
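
The three output units are related by simple arithmetic. The Python sketch below shows one plausible reading of them; it assumes that a megavoxel means 10^6 voxels and that mvptps is simply mvps divided by the thread count. Neither assumption is spelled out in the help text, and the function names are hypothetical.

```python
def seconds_to_mvps(seconds, size=(256, 256, 256)):
    """Convert a run time in seconds to megavoxels per second.

    Assumes 1 megavoxel = 1e6 voxels (an assumption, not confirmed above).
    """
    voxels = size[0] * size[1] * size[2]
    return voxels / 1.0e6 / seconds


def mvps_to_mvptps(mvps, threads):
    """Convert megavoxels per second to megavoxels per thread per second.

    Assumes a plain division by the number of threads.
    """
    return mvps / threads


if __name__ == "__main__":
    # A default-sized 256x256x256 image processed in 2 seconds on 4 threads.
    rate = seconds_to_mvps(2.0)
    print(rate, mvps_to_mvptps(rate, 4))
```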

Sample Output

When run with no arguments, ImageBenchmark will give results for a variety of filters, as shown below. Results are reported in megavoxels per second, where a higher number indicates faster execution. The output format is CSV (comma-separated values), which can be read by most plotting packages.

191.353,colormap:components=3
180.957,colormap:components=4
824.309,colormap:components=1:greyscale
799.908,colormap:components=2:greyscale
562.842,colormap:components=3:greyscale
452.914,colormap:components=4:greyscale
417.646,resize:kernelsize=1
151.029,resize:kernelsize=2
82.5898,resize:kernelsize=4
47.8107,resize:kernelsize=6
958.911,reslice:kernel=nearest:rotation=0/0/0/1
550.164,reslice:kernel=nearest:rotation=90/0/0/1
29.5369,reslice:kernel=nearest:rotation=90/0/1/0
101.706,reslice:kernel=nearest:rotation=45/0/0/1
39.1603,reslice:kernel=nearest:rotation=60/0/1/1
25.3316,reslice:kernel=linear:rotation=60/0/1/1
11.5771,reslice:kernel=cubic:rotation=60/0/1/1
8.26904,reslice:kernel=bspline:rotation=60/0/1/1
3.64946,reslice:kernel=sinc:rotation=60/0/1/1
5.42244,reslice:kernel=sinc:rotation=60/0/1/1:stencil
70.2169,gaussian:kernelsize=3
14.3122,convolve:kernelsize=3
7.81935,separable:kernelsize=3
111.109,resize:kernelsize=3
9.00278,median:kernelsize=3
1239.28,histogram
2086.42,histogram:stencil
11.6615,bspline:degree=3

When given a specific filter to benchmark, only results for that filter will be printed. Note that if --runs is greater than 2, the mean and standard deviation over all runs except the first will be computed (the first run is ignored because it is considered to be less reliable than the rest).

ImageBenchmark --filter median --runs 6 --header

R0,R1,R2,R3,R4,R5,Average,StdDev
8.77757,8.26124,8.04351,9.01285,9.078,8.90538,8.6602,0.473952
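
The summary columns in the row above can be reproduced by hand. The following Python sketch parses the CSV text, drops the first run (R0), and recomputes the mean and standard deviation. It assumes the StdDev column is the sample standard deviation (n - 1 in the denominator); that is not stated in the help text, but it agrees with the printed value to within the rounding of the printed inputs.

```python
import csv
import io
import statistics

# The two lines printed by: ImageBenchmark --filter median --runs 6 --header
SAMPLE = """R0,R1,R2,R3,R4,R5,Average,StdDev
8.77757,8.26124,8.04351,9.01285,9.078,8.90538,8.6602,0.473952
"""


def summarize(csv_text):
    """Return (mean, stdev) over all runs except the first, per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    results = []
    for record in reader:
        # Keep only the per-run columns R0, R1, ... in their original order.
        runs = [float(record[name]) for name in reader.fieldnames
                if name.startswith("R") and name[1:].isdigit()]
        kept = runs[1:]  # the first run is ignored
        results.append((statistics.mean(kept), statistics.stdev(kept)))
    return results


if __name__ == "__main__":
    for mean, stdev in summarize(SAMPLE):
        print("mean=%.4f stdev=%.4f" % (mean, stdev))
```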

When given a range of thread counts, results will be given for each thread count:

ImageBenchmark --filter median --runs 6 --threads 1-12 --split-mode block --size 256x256x256 --header

Threads,R0,R1,R2,R3,R4,R5,Average,StdDev
1,9.16398,9.24523,9.20793,9.23568,9.2334,9.31117,9.24668,0.0385995
2,17.2775,17.525,17.5154,17.3023,17.339,17.4302,17.4224,0.100752
3,24.709,24.9179,24.8035,24.9186,24.9367,23.4266,24.6007,0.658473
4,32.6784,32.9238,32.825,32.8771,32.9512,33.0941,32.9342,0.101429
5,32.6383,32.8922,33.0193,32.9771,30.1123,33.0321,32.4066,1.2837
6,45.0957,45.6274,45.5593,45.5477,45.6546,45.5925,45.5963,0.0450496
7,44.9937,45.5494,45.5663,45.0725,44.7736,45.2493,45.2422,0.334605
8,67.1954,67.4694,67.4336,67.7359,67.3987,67.7304,67.5536,0.165785
9,67.2081,67.6509,67.7047,67.4066,67.6089,67.4696,67.5681,0.125472
10,67.1577,67.7227,67.8311,67.7851,67.8229,67.8133,67.795,0.0440062
11,67.2433,67.735,67.4784,67.7421,67.7862,66.6508,67.4785,0.478284
12,56.548,90.3603,81.876,88.6338,83.4625,82.8865,85.4438,3.79284
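
A table like this is usually read in terms of speedup and parallel efficiency. The Python sketch below computes both from the Average column for a few of the thread counts above, taking the single-thread average as the baseline. The definitions used (speedup = rate / single-thread rate, efficiency = speedup / threads) are the conventional ones and are not something ImageBenchmark reports itself.

```python
# Average megavoxels per second for selected thread counts,
# copied from the Average column of the table above.
AVERAGE_MVPS = {
    1: 9.24668,
    2: 17.4224,
    4: 32.9342,
    8: 67.5536,
    12: 85.4438,
}


def scaling(averages):
    """Return {threads: (speedup, efficiency)} relative to one thread."""
    baseline = averages[1]
    table = {}
    for threads, rate in sorted(averages.items()):
        speedup = rate / baseline
        table[threads] = (speedup, speedup / threads)
    return table


if __name__ == "__main__":
    print("Threads,Speedup,Efficiency")
    for threads, (speedup, efficiency) in scaling(AVERAGE_MVPS).items():
        print("%d,%.3f,%.3f" % (threads, speedup, efficiency))
```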

ImageBenchmarkDriver

There is another benchmarking example called ImageBenchmarkDriver, which calls ImageBenchmark repeatedly with various parameters. You can use it to find the best combination of DesiredBytesPerPiece and SplitMode for a particular filter. It takes the same arguments as ImageBenchmark, plus a --prefix option that gives the path prefix for the several output files it writes:

Usage: ImageBenchmarkDriver --prefix <path/prefix> ...

Options:
  --prefix <path/prefix>  Prefix for output filenames.
  Any options from ImageBenchmark can also be used.

As an example, it can be used like this:

ImageBenchmarkDriver --filter median --runs 6 --prefix mybenchmarks/
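
To illustrate the kind of parameter sweep described above, here is a Python sketch that builds one ImageBenchmark command line for each combination of --split-mode and --bytes-per-piece. It is not the logic of ImageBenchmarkDriver, whose sweep values and output files are not described here. The byte values are hypothetical, and the sketch only prints the commands rather than running them; it uses only flags that appear in the ImageBenchmark usage text.

```python
import itertools
import shlex

SPLIT_MODES = ["slab", "beam", "block"]   # values listed for --split-mode
BYTES_PER_PIECE = [16384, 65536, 262144]  # hypothetical sweep values


def build_commands(filter_name="median", runs=6):
    """Return one ImageBenchmark argument list per parameter combination."""
    commands = []
    for mode, nbytes in itertools.product(SPLIT_MODES, BYTES_PER_PIECE):
        commands.append([
            "ImageBenchmark",
            "--filter", filter_name,
            "--runs", str(runs),
            "--split-mode", mode,
            "--bytes-per-piece", str(nbytes),
        ])
    return commands


if __name__ == "__main__":
    for command in build_commands():
        # Print each command as a shell-quoted line (no execution).
        print(" ".join(shlex.quote(arg) for arg in command))
```

Each printed line could then be run by hand or passed to subprocess.run, with the resulting CSV rows compared to pick the fastest combination.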