ParaView/Users Guide/Python Programmable Filter
The Programmable Filter is a ParaView filter that processes one or more input datasets based on a Python script provided by the user. The parameters of the filter include the output data type, the script and a toggle that controls whether the input arrays are copied to the output. In this document, we introduce the use of the Programmable Filter and give a summary of the API available to the user.
Note that the Programmable Filter depends on Python. All ParaView binaries distributed by Kitware are built with Python enabled. If you have built ParaView yourself, you have to make sure that PARAVIEW_ENABLE_PYTHON is turned on when configuring the ParaView build.
Since the entire VTK API as well as any module that can be imported through Python is available through this filter, we can only skim the surface of what can be accomplished with this filter here. If you are not familiar with Python, we recommend first taking a look at one of the introductory guides such as the official Python Tutorial. Also, if you are going to do any programming beyond the very basic stuff, we recommend reading up on the VTK API. The VTK website has links to VTK books and online documentation. For reference, you may need to look at the VTK class documentation. There is also more information about the Programmable Filter and some good recipes on the ParaView Wiki (Python_Programmable_Filter).
- You are applying Programmable Filter to a "simple" dataset and not a composite dataset such as multi-block or AMR.
- You have NumPy installed.
The most basic reason to use the Programmable Filter is to add a new array by possible deriving it from arrays in the input. This can be achieved by using the Python Calculator. One reason to use the Programmable Filter instead may be that the calculation is more involved and trying to do it in one expression may be difficult. Another reason may be that you need access to a program flow construct such as if or for. In any case. the Programmable Filter can be used to do everything the Calculator does and more.
Note: Since what we describe here builds on some of the concepts introduced in the Python Calculator section, please read it first if you are not familiar with the Calculator.
If you leave the "Output Dataset Type" parameter in the default setting of "Same as Input", the Programmable Filter will copy the topology and geometry of the input to the output before calling your Python script. Therefore, if you Apply the filter without filling the script, you should see a copy of the input without any of its arrays in the output. If you also check the "Copy Arrays" option, the output will have all of the input arrays. This behavior allows you to focus on creating new arrays without worrying about the mesh.
Let's try an example. Create a Sphere source and then apply the Programmable Filter. Use the following script.
normals = inputs.PointData['Normals'] output.PointData.append(normals[:,0], "Normals_x")
This should create a sphere with on array called "Normals_x". There a few things to note here:
- You cannot refer to arrays directly by name as in the Python Calculator. You need to access arrays using the .PointData and .CellData qualifiers.
- Unlike the Python Calculator, you have to explicitly add an array to the output using the append function. Note that this function takes the name of the array as the second argument.
You can use any of the functions available in the Calculator in the Programmable Filter. For example, the following code creates two new arrays and adds them to the output.
normals = inputs.PointData['Normals'] output.PointData.append(sin(normals[:,0]), "sin of Normals_x") output.PointData.append(normals[:,1] + 1, "Normals_y + 1")
Mixing VTK and NumPy APIs
The examples above demonstrate how the Programmable Filter can be used as an advanced Python Calculator. However, the full power of the Programmable Filter can only be harnessed by using the VTK API. Let's start with a simple example. Create a Sphere source and apply the Programmable Filter with the following script.
input = inputs newPoints = vtk.vtkPoints() numPoints = input.GetNumberOfPoints() for i in range(numPoints): x, y, z = input.GetPoint(i) newPoints.InsertPoint(i, x, y, 1 + z*0.3) output.SetPoints(newPoints)
This requires some explanation. We start with creating a new instance of vtkPoints.
newPoints = vtk.vtkPoints()
vtkPoints is a data structure that VTK uses to store the coordinates of points. Next, we loop over all points of the input and insert a new point in the output with coordinates (x, y, 1+z*0.3)
for i in range(numPoints): x, y, z = input.GetPoint(i) newPoints.InsertPoint(i, x, y, 1 + z*0.3)
Finally, we replace the output points with the new points we created using the following.
Note: Python is an interpreted language and Python scripts do not execute as efficiently as compiled C++ code. Therefore, using a for loop that iterates over all points or cells may be a significant bottleneck when processing large datasets.
The NumPy and VTK APIs can be mixed to achieve good performance. Even though this may seem a bit complicated at first, it can be used with great effect. For example, the example above can be rewritten as follows.
from paraview.vtk.dataset_adapter import numpyTovtkDataArray input = inputs newPoints = vtk.vtkPoints() zs = 1 + input.Points[:,2]*0.3 coords = hstack([input.Points[:,0:2],zs]) newPoints.SetData(numpyTovtkDataArray(coords)) output.SetPoints(newPoints)
Even though this produces exactly the same result, it is much more efficient because we eliminated the for loop. Well, we didn't really eliminate it. We rather moved it from Python to C. Under the hood, NumPy uses C and Fortran for tight loops.
If you read the Python Calculator documentation, this example is pretty straightforward except the use of numpyTovtkDataArray(). First of all, note that we are mixing two APIs here: the VTK API and NumPy. VTK and NumPy uses different types of objects to represents arrays. The basic examples we previously used carefully hide this from you. However, once you start manipulating VTK objects using NumPy, you have to start converting objects between two APIs. Note that for the most part this conversion happens without "deep copying" arrays, e.g. copying the raw contents from one memory location to another. Rather, pointers are passed between VTK and NumPy whenever possible.
The dataset_adapter provides two methods to do the conversions described above:
- vtkDataArrayToVTKArray: This function create a NumPy compatible array from a vtkDataArray. Note that VTKArray is actually a subclass of numpy.matrix and can be used anywhere matrix can be used. This function always copies the pointer and not the contents. Important: You should not directly change the values of the resulting array if the argument is an array from the input.
- numpyTovtkDataArray: Converts a NumPy array (or a VTKArray) to a vtkDataArray. This function copies the pointer if the argument is a contiguous array. There are various ways of creating discontinuous arrays with NumPy including using hstack and striding. See NumPy documentation for details.
Dealing with Composite Datasets
So far, none of the example we used apply to multi-block or AMR datasets. When talking about the Python Calculator, we did not have to differentiate between simple and composite datasets. This is because the calculator loops over all of the leaf blocks of composite datasets and applies the expression to each one. Therefore, inputs in an expression are guaranteed to be simple datasets. On the other hand, the Programmable Filter does not perform this iteration and passes the input, composite or simple, as it is to the script. Even though this makes basic scripting harder for composite datasets, it provides enormous flexibility.
The only thing that you have to know to work with composite datasets is how to iterate over them to access the leaf nodes. Let's start with a simple example.
for block in inputs: print block
Here we iterate over all of the non-NULL leaf nodes (i.e. simple datasets) of the input and print them to the Output Messages console. Note that this will work only if the input is multi-block or AMR.
Note that when Output Dataset Type is set to "Same as Input", the Programmable Filter will copy composite dataset to the output - it will copy only the mesh unless Copy Arrays is on. Therefore, you can also iterate over the output. A simple trick is to turn on Copy Arrays and then use the arrays from the output when generating new ones. Below is an example. We use the can.ex2 file from the ParaView testing dataset collection.
def process_block(block): displ = block.PointData['DISPL'] block.PointData.append(displ[:,0], "displ_x") for block in output: process_block(block)
Alternatively, you can use the MultiCompositeDataIterator to iterate over the input and output block simultaneously. The following is equivalent to the previous example.
def process_block(input_block, output_block): displ = input_block.PointData['DISPL'] output_block.PointData.append(displ[:,0], "displ_x") from paraview.vtk.dataset_adapter import MultiCompositeDataIterator iter = MultiCompositeDataIterator([inputs, output]) for input_block, output_block in iter: process_block(input_block, output_block)
Advanced (but read it anyway)
Changing Output Type
- example with a table
Dealing with Structured Data Output
- example with resampling to image data