ParaView/Users Guide/Python Programmable Filter


Introduction

ParaView UG ProgrammableFilter.png

The Programmable Filter is a ParaView filter that processes one or more input datasets based on a Python script provided by the user. The parameters of the filter include the output data type, the script and a toggle that controls whether the input arrays are copied to the output. In this document, we introduce the use of the Programmable Filter and give a summary of the API available to the user.

Note that the Programmable Filter depends on Python. All ParaView binaries distributed by Kitware are built with Python enabled. If you have built ParaView yourself, you have to make sure that PARAVIEW_ENABLE_PYTHON is turned on when configuring the ParaView build.

Since the entire VTK API as well as any module that can be imported through Python is available through this filter, we can only skim the surface of what can be accomplished with this filter here. If you are not familiar with Python, we recommend first taking a look at one of the introductory guides such as the official Python Tutorial. Also, if you are going to do any programming beyond the very basic stuff, we recommend reading up on the VTK API. The VTK website has links to VTK books and online documentation. For reference, you may need to look at the VTK class documentation. There is also more information about the Programmable Filter and some good recipes on the ParaView Wiki (Python_Programmable_Filter).

Basic Use

Requirements:

  1. You are applying Programmable Filter to a "simple" dataset and not a composite dataset such as multi-block or AMR.
  2. You have NumPy installed.

The most basic reason to use the Programmable Filter is to add a new array, possibly derived from arrays in the input. This can also be achieved with the Python Calculator. One reason to use the Programmable Filter instead is that the calculation is more involved and hard to write as a single expression. Another is that you need a control-flow construct such as if or for. In any case, the Programmable Filter can do everything the Calculator does and more.

Note: Since what we describe here builds on some of the concepts introduced in the Python Calculator section, please read it first if you are not familiar with the Calculator.

If you leave the "Output Dataset Type" parameter at its default setting of "Same as Input", the Programmable Filter will copy the topology and geometry of the input to the output before calling your Python script. Therefore, if you Apply the filter without filling in the script, you should see a copy of the input without any of its arrays in the output. If you also check the "Copy Arrays" option, the output will have all of the input arrays. This behavior allows you to focus on creating new arrays without worrying about the mesh.

Let's try an example. Create a Sphere source and then apply the Programmable Filter. Use the following script.

<source lang="python">
normals = inputs[0].PointData['Normals']
output.PointData.append(normals[:,0], "Normals_x")
</source>

This should create a sphere with one array called "Normals_x". There are a few things to note here:

  • You cannot refer to arrays directly by name as in the Python Calculator. You need to access arrays using the .PointData and .CellData qualifiers.
  • Unlike the Python Calculator, you have to explicitly add an array to the output using the append function. Note that this function takes the name of the array as the second argument.

You can use any of the functions available in the Calculator in the Programmable Filter. For example, the following code creates two new arrays and adds them to the output.

<source lang="python">
normals = inputs[0].PointData['Normals']
output.PointData.append(sin(normals[:,0]), "sin of Normals_x")
output.PointData.append(normals[:,1] + 1, "Normals_y + 1")
</source>
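
To see what these expressions compute without launching ParaView, the sketch below reproduces them with plain NumPy. It is only an illustration under stated assumptions: the normals array is a hand-made stand-in for the sphere's Normals point array, and new_arrays is a hypothetical dictionary standing in for output.PointData.

```python
import numpy as np

# Hypothetical stand-in for inputs[0].PointData['Normals']:
# one row per point, one column per component (x, y, z).
normals = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.6, 0.8, 0.0],
])

# Hypothetical stand-in for output.PointData: array name -> values.
new_arrays = {}

# normals[:, 0] selects the x component of every point.
new_arrays["Normals_x"] = normals[:, 0]

# Functions such as sin are applied element by element.
new_arrays["sin of Normals_x"] = np.sin(normals[:, 0])

# Arithmetic with a scalar is also applied to every element.
new_arrays["Normals_y + 1"] = normals[:, 1] + 1

for name, values in new_arrays.items():
    print(name, values)
```

Each new array has one value per point, which is what allows it to be attached to the output as point data.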

Intermediate Use

Mixing VTK and NumPy APIs

The examples above demonstrate how the Programmable Filter can be used as an advanced Python Calculator. However, the full power of the Programmable Filter can only be harnessed by using the VTK API. Let's start with a simple example. Create a Sphere source and apply the Programmable Filter with the following script.

<source lang="python">
input = inputs[0]

newPoints = vtk.vtkPoints()
numPoints = input.GetNumberOfPoints()
for i in range(numPoints):
    x, y, z = input.GetPoint(i)
    newPoints.InsertPoint(i, x, y, 1 + z*0.3)

output.SetPoints(newPoints)
</source>

This requires some explanation. We start by creating a new instance of vtkPoints.

<source lang="python"> newPoints = vtk.vtkPoints() </source>

vtkPoints is a data structure that VTK uses to store the coordinates of points. Next, we loop over all points of the input and insert a new point in the output with coordinates (x, y, 1+z*0.3).

<source lang="python">
for i in range(numPoints):
    x, y, z = input.GetPoint(i)
    newPoints.InsertPoint(i, x, y, 1 + z*0.3)
</source>

Finally, we replace the output points with the new points we created using the following.

<source lang="python"> output.SetPoints(newPoints) </source>

Note: Python is an interpreted language and Python scripts do not execute as efficiently as compiled C++ code. Therefore, using a for loop that iterates over all points or cells may be a significant bottleneck when processing large datasets.

The NumPy and VTK APIs can be mixed to achieve good performance. Even though this may seem a bit complicated at first, it can be used with great effect. For example, the example above can be rewritten as follows.

<source lang="python">
from paraview.vtk.dataset_adapter import numpyTovtkDataArray

input = inputs[0]

newPoints = vtk.vtkPoints()

zs = 1 + input.Points[:,2]*0.3
coords = hstack([input.Points[:,0:2],zs])

newPoints.SetData(numpyTovtkDataArray(coords))

output.SetPoints(newPoints)
</source>

Even though this produces exactly the same result, it is much more efficient because we eliminated the for loop. Well, we didn't really eliminate it. We rather moved it from Python to C. Under the hood, NumPy uses C and Fortran for tight loops.
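
The equivalence of the two versions can be checked outside ParaView with plain NumPy. The following is a minimal sketch, not the filter's implementation: points is a made-up (N, 3) coordinate array standing in for input.Points, and the helper names transform_loop and transform_vectorized are hypothetical. It uses numpy.column_stack, which accepts a one-dimensional column directly.

```python
import numpy as np

def transform_loop(points):
    """Point-by-point version, mirroring the for loop over GetPoint()."""
    result = np.empty_like(points)
    for i in range(len(points)):
        x, y, z = points[i]
        result[i] = (x, y, 1 + z * 0.3)
    return result

def transform_vectorized(points):
    """Whole-array version: the loop runs inside NumPy's compiled code."""
    zs = 1 + points[:, 2] * 0.3
    # column_stack joins the (N, 2) x/y block with the 1-D z column.
    return np.column_stack([points[:, 0:2], zs])

# Made-up coordinates standing in for input.Points.
points = np.array([
    [0.0, 0.0, -0.5],
    [0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5],
])

looped = transform_loop(points)
vectorized = transform_vectorized(points)

# Both versions should agree on every coordinate.
print(np.allclose(looped, vectorized))
```

On a three-point array the difference in run time is negligible; the benefit of the vectorized form shows up as the number of points grows.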

If you have read the Python Calculator documentation, this example is straightforward except for the use of numpyTovtkDataArray(). First of all, note that we are mixing two APIs here: the VTK API and NumPy. VTK and NumPy use different types of objects to represent arrays. The basic examples we used previously carefully hide this from you. However, once you start manipulating VTK objects using NumPy, you have to start converting objects between the two APIs. Note that for the most part this conversion happens without "deep copying" arrays, i.e. without copying the raw contents from one memory location to another. Rather, pointers are passed between VTK and NumPy whenever possible.

The dataset_adapter module provides two functions to do the conversions described above:

  • vtkDataArrayToVTKArray: This function creates a NumPy-compatible array from a vtkDataArray. Note that VTKArray is actually a subclass of numpy.matrix and can be used anywhere a matrix can be used. This function always copies the pointer and not the contents. Important: You should not directly change the values of the resulting array if the argument is an array from the input.
  • numpyTovtkDataArray: Converts a NumPy array (or a VTKArray) to a vtkDataArray. This function copies the pointer if the argument is a contiguous array. There are various ways of creating non-contiguous arrays with NumPy, including using hstack and striding. See the NumPy documentation for details.
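
Whether a NumPy array is contiguous can be inspected through its flags attribute. The sketch below uses only standard NumPy calls (flags, ascontiguousarray, shares_memory) on a made-up array; it does not call numpyTovtkDataArray, and it is an assumption that packing an array beforehand is what you want, since that step itself copies the data.

```python
import numpy as np

# Made-up data: 4 rows, 3 columns, stored row by row.
data = np.arange(12, dtype=float).reshape(4, 3)

# A freshly created array is stored in one contiguous block.
print(data.flags['C_CONTIGUOUS'])

# Selecting a single column yields a strided view of the same memory.
column = data[:, 0]
print(column.flags['C_CONTIGUOUS'])

# Taking every other row also yields a strided view.
every_other_row = data[::2]
print(every_other_row.flags['C_CONTIGUOUS'])

# ascontiguousarray copies a strided view into a new contiguous block.
packed = np.ascontiguousarray(column)
print(packed.flags['C_CONTIGUOUS'], np.shares_memory(packed, data))
```

The strided views still point into the memory of data, while packed owns a separate copy of the column values.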

Multiple Inputs

Like the Python Calculator, the Programmable Filter can accept multiple inputs. First select two or more pipeline objects in the pipeline browser and then apply the Programmable Filter. Then each input can be accessed using the inputs[] variable. Note that if the Output Dataset Type is set to Same as Input, the filter will copy the mesh from the first input to the output. If Copy Arrays is on, it will also copy arrays from the first input. As an example, the following script compares the Pressure attribute from two inputs using the difference operator.

<source lang="python">
# Arrays are appended to the output's PointData, as in the earlier examples.
output.PointData.append(inputs[1].PointData['Pressure'] - inputs[0].PointData['Pressure'], "difference")
</source>
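
Subtracting two arrays element by element only makes sense when both inputs have the same number of points. A minimal sketch of that check follows, using plain NumPy arrays as stand-ins for the two Pressure arrays; the helper name pressure_difference and the sample values are hypothetical.

```python
import numpy as np

def pressure_difference(first, second):
    """Return second - first, refusing arrays whose shapes differ."""
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    if first.shape != second.shape:
        raise ValueError(
            "inputs have different shapes: %s vs %s" % (first.shape, second.shape)
        )
    return second - first

# Made-up values standing in for the Pressure arrays of inputs[0] and inputs[1].
pressure_0 = np.array([1.0, 2.0, 3.0])
pressure_1 = np.array([1.5, 2.0, 2.0])

difference = pressure_difference(pressure_0, pressure_1)
print(difference)
```

Raising an error early is one possible design choice; without the check, NumPy may either raise its own error or silently broadcast arrays of compatible but different shapes.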

Dealing with Composite Datasets

So far, none of the examples we used apply to multi-block or AMR datasets. When talking about the Python Calculator, we did not have to differentiate between simple and composite datasets. This is because the calculator loops over all of the leaf blocks of composite datasets and applies the expression to each one. Therefore, inputs in an expression are guaranteed to be simple datasets. On the other hand, the Programmable Filter does not perform this iteration and passes the input, composite or simple, as it is to the script. Even though this makes basic scripting harder for composite datasets, it provides enormous flexibility.

The only thing that you have to know to work with composite datasets is how to iterate over them to access the leaf nodes. Let's start with a simple example.

<source lang="python">
for block in inputs[0]:
    print(block)
</source>

Here we iterate over all of the non-NULL leaf nodes (i.e. simple datasets) of the input and print them to the Output Messages console. Note that this will work only if the input is multi-block or AMR.
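
Conceptually, this iteration is a depth-first walk over a tree that yields only the non-empty leaves. The sketch below illustrates the idea in pure Python and is not how the dataset adapter is implemented: nested lists stand in for multi-block datasets, strings stand in for simple datasets, None stands in for a NULL block, and iter_leaves is a hypothetical helper.

```python
def iter_leaves(node):
    """Yield every non-None leaf of a nested list, depth first."""
    if node is None:
        # NULL blocks are skipped entirely.
        return
    if isinstance(node, list):
        # A list plays the role of a composite node: descend into children.
        for child in node:
            for leaf in iter_leaves(child):
                yield leaf
    else:
        # Anything else plays the role of a simple dataset.
        yield node

# Made-up tree: two top-level leaves, one NULL block and one nested group.
composite = ["block 0", None, ["block 2.0", ["block 2.1.0", None]], "block 3"]

for block in iter_leaves(composite):
    print(block)
```

The nesting depth does not matter to the caller: the loop body only ever sees leaves, which is the property the filter script relies on.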

Note that when Output Dataset Type is set to "Same as Input", the Programmable Filter will copy the composite dataset to the output; it copies only the mesh unless Copy Arrays is on. Therefore, you can also iterate over the output. A simple trick is to turn on Copy Arrays and then use the arrays from the output when generating new ones. Below is an example. We use the can.ex2 file from the ParaView testing dataset collection.

<source lang="python">
def process_block(block):
    displ = block.PointData['DISPL']
    block.PointData.append(displ[:,0], "displ_x")

for block in output:
    process_block(block)
</source>

Alternatively, you can use the MultiCompositeDataIterator to iterate over the input and output blocks simultaneously. The following is equivalent to the previous example.

<source lang="python">
def process_block(input_block, output_block):
    displ = input_block.PointData['DISPL']
    output_block.PointData.append(displ[:,0], "displ_x")

from paraview.vtk.dataset_adapter import MultiCompositeDataIterator
iter = MultiCompositeDataIterator([inputs[0], output])

for input_block, output_block in iter:
    process_block(input_block, output_block)
</source>

Advanced (but read it anyway)

Changing Output Type

So far, all of the examples we discussed depended on the output type being the same as the input and on the Programmable Filter copying the input mesh to the output. If we set the output type to something other than Same as Input, we are on our own: the Programmable Filter will create an empty output of the type we specified but will not copy any information. Even though it may be more work, this gives us a lot of flexibility. Since we are approaching the realm of full-blown VTK filter authoring, we will pick a very simple example here. If you are already familiar with the VTK API, you will realize that this is a great way of prototyping VTK filters. If you are not, reading up on VTK is a good idea.

Create a Wavelet source, apply a Programmable Filter, set the output type to vtkTable and use the following script:

<source lang="python">
rtdata = inputs[0].PointData['RTData']

output.RowData.append(min(rtdata), 'min')
output.RowData.append(max(rtdata), 'max')
</source>

Here, we added two columns to the output table. The first has a single value, the minimum of RTData, and the second has the maximum of RTData. When you apply this filter, the output should automatically be shown in a Spreadsheet view. We could also use this sort of script to chart part of our input data. For example, the output of the following script can be displayed as a line chart.

<source lang="python">
rtdata = inputs[0].PointData['RTData']
output.RowData.append(rtdata, 'rtdata')
</source>

Dealing with Structured Data Output

- example with resampling to image data