VTK/Rendering Update
This page attempts to summarize internal discussions and current directions with the VTK rendering update. The update to our rendering infrastructure involves several distinct types of rendering that have been identified as important, along with several key pieces of rendering infrastructure to accomplish these goals and lay the groundwork for the next decade. Please bear in mind that these plans are fluid; they are being entered onto the VTK wiki for both wider dissemination and feedback before too much has been set in stone.
The main rendering targets can be roughly divided into three major areas:
- Volumes
- Polygonal geometry
- Glyphs and implicitly ray-traced geometry (e.g. spheres generated with fragment shaders)
The first two are likely familiar. Glyph rendering has been present for quite some time, and implicit geometry is another approach that takes advantage of the parallelism of fragment shaders to generate smooth shapes that can be expressed mathematically. On the OpenGL side the major change is moving from the deprecated fixed-function pipeline of OpenGL 1.1 to the modern programmable pipeline, where most rendering is performed on GPU-resident buffers using vertex and fragment shaders.
Overview of Changes
Some of the major changes proposed are:
- Move to OpenGL 2.1 on the desktop, and OpenGL ES 2.0 on embedded systems
- Use a common subset of APIs where possible to maximize code reuse
- GL 2.1 is fully supported in recent versions of Mesa and Regal, and ES 2.0 in Angle
- Major shift from uploading single vertices to batching copies of geometry
- Heavy use of vertex buffer objects, shaders, framebuffer objects
- Provide hooks for interop with OpenGL or CUDA generated buffers
- Move to a rendering scene to enable more optimization
- Ability to optimize rendering of large scenes
- Minimization of state changes, coalescing draws
- Advanced rendering techniques are easier to implement
Updating the rendering in VTK is a large undertaking, and will require shifts in how we perform rendering functions. It is also clear that we need a simpler, slimmer API where rendering can shift from tight loops with many branches to batched operations. It also becomes more important to support shared resources among OpenGL contexts, ideally shifting from a highly CPU-bound render loop to a highly GPU-bound render loop. The updates will make it simpler to add new OpenGL code to existing approaches, or override specific pieces.
Key Technologies
In order to minimize development time several libraries are being adopted, including Eigen for linear algebra and GLEW to handle extensions on the desktop. Newer libraries and abstractions such as Regal and Angle are also being examined, in addition to Mesa, to widen the range of systems on which the OpenGL rendering code can be used effectively. These choices are driven by previous experience in the VES and Avogadro projects, which both explored approaches using scene graphs and more modern OpenGL APIs.
Initial Development Emphasis
Right now we are focused on several key areas to demonstrate viability and start discussions about the new API being developed.
- Implement a very simple scene API
- Bootstrapping from previous efforts, largely to house new OpenGL developments
- Get basic polydata rendering working
- Show the basics, get early measurements on performance, memory use
- Extend out to encompass features in existing APIs
- Concentrate on optimizing for batching of operations
- Use of GL 2.1 on the desktop, extend to ES 2.0 testing
- Get new volume rendering code working
- Take advantage of new OpenGL features, simplify API
- Bring in key features used in other toolkits/codes
- Double dispatch to enable runtime extension
- Central registration/management of shader code
- Shared resources between rendering contexts
Removing some things from the rendering API that are not available everywhere/deprecated:
- No glVertex, no matrix stacks, shift to only using triangles with VBOs for geometry
- Move to using uniforms, attribute arrays, shaders
- Subsections of buffers can be updated; are there new opportunities there?
- Much smaller OpenGL state machine; state changes are expensive, so minimize/batch them
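One way to read the last point is that each draw call can carry a key describing the state it needs, and sorting by that key before submission reduces the number of state switches. The following is a minimal CPU-only sketch under that assumption. `DrawCommand`, `countStateChanges` and `sortByState` are hypothetical names, not part of VTK, and no OpenGL calls are made.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical record of one draw call and the state it requires.
struct DrawCommand {
  int shaderId;   // which shader program must be bound
  bool blending;  // whether blending must be enabled
  int meshId;     // which geometry to draw
};

// Count how often the (shader, blending) state differs between consecutive
// commands. The first command always costs one state setup.
inline std::size_t countStateChanges(const std::vector<DrawCommand> &cmds)
{
  if (cmds.empty())
    return 0;
  std::size_t changes = 1;
  for (std::size_t i = 1; i < cmds.size(); ++i) {
    assert(cmds[i].shaderId >= 0);
    if (cmds[i].shaderId != cmds[i - 1].shaderId
        || cmds[i].blending != cmds[i - 1].blending)
      ++changes;
  }
  return changes;
}

// Stable sort so commands sharing state become adjacent, while keeping the
// original submission order inside each state group.
inline void sortByState(std::vector<DrawCommand> &cmds)
{
  std::stable_sort(cmds.begin(), cmds.end(),
                   [](const DrawCommand &a, const DrawCommand &b) {
                     return std::tie(a.blending, a.shaderId)
                          < std::tie(b.blending, b.shaderId);
                   });
}
```

This sketch ignores depth ordering, which translucent geometry generally needs, so a real renderer would sort within each pass rather than across the whole scene.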
New Scene API
None of this is set in stone, but it represents the current API as it is in testing. The new code is currently in a topic on Gerrit, with some further changes being prepared. It is in a new VTK module, with some tests already implemented, and it depends upon Eigen and GLEW (both external right now). Basic geometry rendering is implemented; it needs wider testing but has been demonstrated on an NVIDIA Quadro and an Intel embedded card. Initial tests show lower memory use and a switch from CPU-bound to GPU-bound rendering. The basic rendering API has two base classes: Node and Visitor.
Node
The Node class does very little. The GroupNode has child nodes, and the GeometryNode has Drawable children. The Drawable objects are derived objects that can be rendered in one way or another, such as MeshGeometry, which can render a triangle mesh. The base class can be seen below, with the critical virtual functions being accept, traverse and ascend, which are used by the visitors to move through the graph.
<source lang="cpp">
class Node
{
public:
  Node();
  virtual ~Node();

  /** Accept a visit from our friendly visitor. */
  virtual void accept(Visitor &) { return; }

  /** Traverse any children the node might have, and call accept on them. */
  virtual void traverse(Visitor &) { return; }

  /** Ascend to the parent and call accept on that. */
  virtual void ascend(Visitor &);

  /** Get a pointer to the node's parent. */
  const GroupNode * parent() const;
  GroupNode * parent();
};
</source>
Visitor
The Visitor class is the base class of anything that traverses the scene. There is a RenderVisitor that renders things in the scene, and a GeometryVisitor that calculates the overall geometry of a scene. Many more could be added, but these are enough to perform basic rendering at this stage. The base class can be seen below, with the visit virtuals for the different node types.
<source lang="cpp">
class Visitor
{
public:
  Visitor();
  virtual ~Visitor();

  /** The overloaded visit functions, the base versions of which do nothing. */
  virtual void visit(Node &) { return; }
  virtual void visit(GroupNode &) { return; }
  virtual void visit(GeometryNode &) { return; }
  virtual void visit(TransformNode &) { return; }
  virtual void visit(Drawable &) { return; }
  virtual void visit(MeshGeometry &) { return; }
};
</source>
Double Dispatch
The core of the rendering abstraction is the double dispatch: accept is called on a node, which in turn calls visit on the visitor with its own concrete type, so the function that runs depends on both the node type and the visitor type. A very simple render might look like the following if we were to render all opaque geometry in the scene, then translucent geometry, and finally the overlays.
<source lang="cpp">
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
applyProjection();

RenderVisitor visitor(m_camera);
// Setup for opaque geometry
visitor.setRenderPass(OpaquePass);
glEnable(GL_DEPTH_TEST);
glDisable(GL_BLEND);
m_scene.rootNode().accept(visitor);

// Setup for transparent geometry
visitor.setRenderPass(TranslucentPass);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
m_scene.rootNode().accept(visitor);

// Setup for 3d overlay rendering
visitor.setRenderPass(Overlay3DPass);
glClear(GL_DEPTH_BUFFER_BIT);
m_scene.rootNode().accept(visitor);

// Setup for 2d overlay rendering
visitor.setRenderPass(Overlay2DPass);
visitor.setCamera(m_overlayCamera);
glDisable(GL_DEPTH_TEST);
m_scene.rootNode().accept(visitor);
</source>
Once inside the accept method, it will typically call visit as in the GroupNode,
<source lang="cpp">
void GroupNode::accept(Visitor &visitor)
{
  visitor.visit(*this);
}
</source>
This will then call the GroupNode version of the visit member on the visitor (in this case the RenderVisitor),
<source lang="cpp">
void RenderVisitor::visit(GroupNode &group)
{
  group.traverse(*this);
}
</source>
This causes the node to call accept on all child nodes,
<source lang="cpp">
void GroupNode::traverse(Visitor &visitor)
{
  for (std::vector<Node *>::iterator it = m_children.begin();
       it != m_children.end(); ++it) {
    (*it)->accept(visitor);
  }
}
</source>
The next interesting action is to call the visit for a Drawable,
<source lang="cpp">
void RenderVisitor::visit(Drawable &geometry)
{
  if (geometry.renderPass() == m_renderPass) {
    geometry.render(m_camera);
  }
}
</source>
Similarly, a possible implementation for a transform node is,
<source lang="cpp">
void RenderVisitor::visit(TransformNode &transform)
{
  Camera old = m_camera;
  m_camera.setModelView(m_camera.modelView() * transform.transform());
  transform.traverse(*this);
  m_camera = old;
}
</source>
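To see the pieces above working together outside of VTK, here is a self-contained sketch that compiles on its own. It makes several simplifying assumptions that are not in the current code: Drawable derives directly from Node rather than hanging off a GeometryNode, the transform is a single x offset instead of an Eigen matrix, and the hypothetical `RecordingRenderVisitor` records what it would draw instead of calling OpenGL.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class GroupNode;
class TransformNode;
class Drawable;

// Base visitor: one overload per node type, all doing nothing by default.
class Visitor {
public:
  virtual ~Visitor() {}
  virtual void visit(GroupNode &) {}
  virtual void visit(TransformNode &) {}
  virtual void visit(Drawable &) {}
};

class Node {
public:
  virtual ~Node() {}
  // First dispatch: the dynamic type of the node selects the override.
  virtual void accept(Visitor &) {}
  virtual void traverse(Visitor &) {}
};

class GroupNode : public Node {
public:
  // Second dispatch: *this is a GroupNode here, so overload resolution
  // picks Visitor::visit(GroupNode &), resolved virtually on the visitor.
  void accept(Visitor &v) override { v.visit(*this); }
  void traverse(Visitor &v) override
  {
    for (auto &child : m_children)
      child->accept(v);
  }
  void addChild(std::unique_ptr<Node> child)
  {
    assert(child != nullptr);
    m_children.push_back(std::move(child));
  }
private:
  std::vector<std::unique_ptr<Node>> m_children;
};

// Stand-in for a real transform: a translation along x only.
class TransformNode : public GroupNode {
public:
  explicit TransformNode(double dx) : m_dx(dx) {}
  void accept(Visitor &v) override { v.visit(*this); }
  double dx() const { return m_dx; }
private:
  double m_dx;
};

enum RenderPass { OpaquePass, TranslucentPass };

class Drawable : public Node {
public:
  Drawable(std::string name, RenderPass pass)
    : m_name(std::move(name)), m_pass(pass) {}
  void accept(Visitor &v) override { v.visit(*this); }
  const std::string &name() const { return m_name; }
  RenderPass renderPass() const { return m_pass; }
private:
  std::string m_name;
  RenderPass m_pass;
};

// Records (name, accumulated offset) for each drawable in the active pass.
class RecordingRenderVisitor : public Visitor {
public:
  explicit RecordingRenderVisitor(RenderPass pass) : m_pass(pass), m_offset(0.0) {}
  void visit(GroupNode &group) override { group.traverse(*this); }
  void visit(TransformNode &transform) override
  {
    double old = m_offset;       // save the current state
    m_offset += transform.dx();  // applied once for all children
    transform.traverse(*this);
    m_offset = old;              // restore on the way back up
  }
  void visit(Drawable &d) override
  {
    if (d.renderPass() == m_pass)
      drawn.push_back(std::make_pair(d.name(), m_offset));
  }
  std::vector<std::pair<std::string, double>> drawn;
private:
  RenderPass m_pass;
  double m_offset;
};
```

Calling `root.accept(visitor)` on a GroupNode walks the whole graph. Drawables under a TransformNode are recorded with the accumulated offset, siblings outside it are recorded with the offset restored, and drawables belonging to another pass are skipped.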
These calls loop through the graph using the most derived types available, and the transform is calculated once per traversal and applied to all child nodes. The actual rendering code then uses OpenGL to render the geometry. The addition of runtime double dispatch registration enables users of the API to register new node types or visitor types at runtime, and traversal would always execute the most derived form of a type in the hierarchy. This allows us to override the RenderVisitor call for our derived ProceduralMeshGeometry object, but leave the default implementation of the GeometryVisitor call for the less derived MeshGeometry type, as there is nothing different there.
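Runtime registration has not been described in detail here, so the following is only one possible shape for it, not the design in the Gerrit topic. It assumes a hypothetical `HandlerRegistry` per visitor, keyed on `std::type_index`, where each node type registers the type it falls back to. Lookup then walks from the dynamic type towards the base until a handler is found.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct Node { virtual ~Node() {} };
struct MeshGeometry : Node {};
struct ProceduralMeshGeometry : MeshGeometry {};

// Hypothetical registry owned by one visitor type.
class HandlerRegistry {
public:
  typedef std::function<std::string(Node &)> Handler;

  // Declare that Derived falls back to Base when it has no handler of its own.
  template <class Derived, class Base> void registerType()
  {
    static_assert(std::is_base_of<Base, Derived>::value
                    && !std::is_same<Base, Derived>::value,
                  "Base must be a proper base class of Derived");
    m_parent.insert(std::make_pair(std::type_index(typeid(Derived)),
                                   std::type_index(typeid(Base))));
  }

  // Attach a handler to exactly type T.
  template <class T> void registerHandler(Handler h)
  {
    assert(static_cast<bool>(h));
    m_handlers[std::type_index(typeid(T))] = h;
  }

  // Walk from the dynamic type towards the base until a handler is found.
  std::string dispatch(Node &node) const
  {
    std::type_index t(typeid(node));
    for (;;) {
      auto h = m_handlers.find(t);
      if (h != m_handlers.end())
        return h->second(node);
      auto p = m_parent.find(t);
      if (p == m_parent.end())
        return "unhandled";
      t = p->second;
    }
  }

private:
  std::map<std::type_index, Handler> m_handlers;
  std::map<std::type_index, std::type_index> m_parent;
};
```

With one registry standing in for the RenderVisitor and another for the GeometryVisitor, a ProceduralMeshGeometry can have its own render handler while the geometry registry falls back to the MeshGeometry handler, which mirrors the override described above.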
User Facing API
So...what does the user need to do to render a simple mesh with this new API? I have added three tests, TestPLY, TestPLYLegacy and TestPLYMapper. The common part of the tests (loading everything, getting the render window etc),
<source lang="cpp">
vtkNew<vtkSceneActor> actor;
vtkNew<vtkRenderer> renderer;
renderer->SetBackground(0.0, 0.0, 0.0);
vtkNew<vtkRenderWindow> renderWindow;
renderWindow->SetSize(300, 300);
renderWindow->AddRenderer(renderer.Get());
renderer->AddActor(actor.Get());

const char* fileName = vtkTestUtilities::ExpandDataFileName(argc, argv,
                                                            "Data/dragon.ply");
vtkNew<vtkPLYReader> reader;
reader->SetFileName(fileName);
reader->Update();
vtkNew<vtkPolyDataNormals> computeNormals;
computeNormals->SetInputConnection(reader->GetOutputPort());
computeNormals->Update();
vtkPolyData *poly = computeNormals->GetOutput();
</source>
Here the actor is a vtkSceneActor, in the other two tests this is a vtkActor. Just using the scene the current test will,
<source lang="cpp">
vtkgl::Scene *scene = actor->GetScene();
vtkgl::GeometryNode *geometry(new vtkgl::GeometryNode);
vtkgl::MeshGeometry *mesh(new vtkgl::MeshGeometry);
mesh->setColor(vtkgl::Vector3ub(255, 255, 255));
ConvertTriangles(poly, mesh);
geometry->addDrawable(mesh);
scene->rootNode().addChild(geometry);
</source>
The version using a vtkScenePolyDataMapper as a scene factory does the following,
<source lang="cpp">
vtkNew<vtkScenePolyDataMapper> mapper;
mapper->SetInputConnection(computeNormals->GetOutputPort());
actor->SetMapper(mapper.Get());
</source>
whereas the test using the old API does,
<source lang="cpp">
vtkNew<vtkPolyDataMapper> mapper;
mapper->SetInputConnection(computeNormals->GetOutputPort());
actor->SetMapper(mapper.Get());
</source>
For completeness, the rest of the test provides some timing, multiple renders to compare, and then allows normal interaction.
<source lang="cpp">
vtkNew<vtkRenderWindowInteractor> interactor;
interactor->SetRenderWindow(renderWindow.Get());
renderWindow->SetMultiSamples(0);
interactor->Initialize();

vtkNew<vtkTimerLog> timer;
double time(0.0);
for (int i = 0; i < 10; ++i) {
  timer->StartTimer();
  renderWindow->Render();
  timer->StopTimer();
  cout << "Rendering frame " << i << ": " << timer->GetElapsedTime() << endl;
  time += timer->GetElapsedTime();
}
cout << "Average time: " << time / 10.0 << endl;

interactor->Start();

delete [] fileName;

return EXIT_SUCCESS;
</source>
I think there is still a strong case for keeping the concepts of mappers, and using them as factories to create objects in the scene from outputs of the pipeline. The main question then becomes should we use a default scene for any given renderer or have the user pass in the intended scene for the mapper (or support both).
New OpenGL API
The OpenGL API needs some updates to be able to use new features, and to make it easier to support some of the new concepts introduced in the last few revisions. Other languages, such as Python, already have great wrapped GL features, so I propose excluding all new OpenGL VTK API from wrapping, which enables the use of newer C++ concepts that the wrappers will not necessarily cope with. One example of this is the buffer object,
<source lang="cpp">
class BufferObject
{
public:
  enum ObjectType {
    ArrayBuffer,
    ElementArrayBuffer
  };

  BufferObject(ObjectType type = ArrayBuffer);
  ~BufferObject();

  /** Get the type of the buffer object. */
  ObjectType type() const;

  /** Get the handle of the buffer object. */
  int handle() const;

  /** Determine if the buffer object is ready to be used. */
  bool ready() const { return m_dirty == false; }

  /**
   * Upload data to the buffer object. The BufferObject::type() must match
   * @a type or be uninitialized.
   *
   * The T type must have tightly packed values of T::value_type accessible by
   * reference via T::operator[]. Additionally, the standard size() and empty()
   * methods must be implemented. The std::vector class is an example of such a
   * supported container.
   */
  template <class T>
  bool upload(const T &array, ObjectType type);

  /**
   * Bind the buffer object ready for rendering.
   * @note Only one ARRAY_BUFFER and one ELEMENT_ARRAY_BUFFER may be bound at
   * any time.
   */
  bool bind();

  /**
   * Release the buffer. This should be done after rendering is complete.
   */
  bool release();

  /**
   * Return a string describing errors.
   */
  std::string error() const { return m_error; }
};
</source>
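The container requirements in the upload comment can be illustrated without an OpenGL context. The sketch below is an assumption about what such a template needs from T, not the actual implementation. `FakeBuffer` and `uploadPacked` are hypothetical, and the `memcpy` stands in for the point where a real implementation would hand the pointer and byte count to `glBufferData`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for GPU storage so the sketch runs without an OpenGL context.
struct FakeBuffer {
  std::vector<unsigned char> bytes;
};

// Copy a tightly packed container into the buffer. T must provide
// value_type, operator[], size() and empty(), matching the requirements
// listed on BufferObject::upload.
template <class T>
bool uploadPacked(const T &array, FakeBuffer &buffer)
{
  if (array.empty())
    return false;  // nothing to upload
  const std::size_t byteCount = array.size() * sizeof(typename T::value_type);
  buffer.bytes.resize(byteCount);
  // &array[0] is only usable as a block pointer if storage is contiguous.
  std::memcpy(buffer.bytes.data(), &array[0], byteCount);
  assert(buffer.bytes.size() == byteCount);
  return true;
}

// Example element type: three floats with no padding expected.
struct Vertex { float x, y, z; };
static_assert(sizeof(Vertex) == 3 * sizeof(float),
              "Vertex must be tightly packed");
```

A `std::vector<Vertex>` satisfies these requirements. A container such as `std::list` would fail to compile because it lacks `operator[]`, which is arguably the desired behaviour.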
The BufferObject class makes use of templates and assumes the array is already tightly packed. Higher level concepts can be layered on top, but this provides very low level access to buffer objects (vertex or index buffers right now). The Camera class uses Eigen to manage the projection and model-view matrices, and is really a container for these two things, which are then passed to the shaders as uniforms. Effective ways of registering/sharing VBOs, shader programs etc are needed, and I have not yet done very much in this area. I also want to look at minimizing state changes, and ideally coalescing similar objects into a single VBO/draw call to enable efficient rendering of many small objects in the graph. This was demonstrated at GTC in 2014, with big improvements in cases where the individual VBOs would otherwise have been small.
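Coalescing many small objects into one buffer mostly comes down to appending vertex data and rebasing indices. The sketch below shows that step on the CPU only. `Mesh` and `coalesce` are hypothetical names, and uploading the merged arrays and issuing the single draw call are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side mesh: positions are x, y, z per vertex and indices
// are three per triangle, local to this mesh.
struct Mesh {
  std::vector<float> positions;
  std::vector<unsigned int> indices;
};

// Append every mesh to one shared vertex array and one shared index array,
// offsetting each mesh's indices by the number of vertices already written.
inline Mesh coalesce(const std::vector<Mesh> &meshes)
{
  Mesh merged;
  for (const Mesh &m : meshes) {
    assert(m.positions.size() % 3 == 0);
    const unsigned int base =
      static_cast<unsigned int>(merged.positions.size() / 3);
    merged.positions.insert(merged.positions.end(),
                            m.positions.begin(), m.positions.end());
    for (unsigned int idx : m.indices)
      merged.indices.push_back(base + idx);
  }
  return merged;
}
```

The merged arrays could then be uploaded once as an array buffer and an element array buffer and drawn with a single call, provided the meshes share the same shader and state.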
Looking to the RenderWindow, I would propose separating the implementation into a RenderWindow and a RenderWindowDevice. The RenderWindow can then be further derived by users (we have had several requests), and device specific actions can be overridden in platform specific RenderWindowDevices. VTK can still take care of ensuring the correct device is instantiated for the platform. I have begun work on an updated render window class to enable us to jump through the necessary hoops to get an OpenGL 3/4 context, and I would really like to add EGL support so that we can test ES 2.0 on the desktop/dashboards too.
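The proposed split could look roughly like the following. This is a sketch of the idea only: the method names, the stub devices and the `makeDevice` factory are all hypothetical, and the stubs return canned answers where real devices would wrap GLX, WGL, Cocoa or EGL.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Platform-specific half: context creation and similar device actions.
class RenderWindowDevice {
public:
  virtual ~RenderWindowDevice() {}
  virtual bool createContext(int major, int minor) = 0;
  virtual std::string name() const = 0;
};

// Stub pretending to be a desktop device that can provide up to GL 4.x.
class StubDesktopDevice : public RenderWindowDevice {
public:
  bool createContext(int major, int) override { return major <= 4; }
  std::string name() const override { return "desktop"; }
};

// Stub pretending to be an EGL device that only provides ES 2.x.
class StubEGLDevice : public RenderWindowDevice {
public:
  bool createContext(int major, int) override { return major == 2; }
  std::string name() const override { return "egl"; }
};

// Platform-independent half that users can derive from freely.
class RenderWindow {
public:
  explicit RenderWindow(std::unique_ptr<RenderWindowDevice> device)
    : m_device(std::move(device))
  {
    assert(m_device != nullptr);
  }
  virtual ~RenderWindow() {}
  bool initialize(int major, int minor)
  {
    return m_device->createContext(major, minor);
  }
  std::string deviceName() const { return m_device->name(); }
private:
  std::unique_ptr<RenderWindowDevice> m_device;
};

// Factory standing in for VTK choosing the right device for the platform.
inline std::unique_ptr<RenderWindowDevice> makeDevice(bool embedded)
{
  if (embedded)
    return std::unique_ptr<RenderWindowDevice>(new StubEGLDevice);
  return std::unique_ptr<RenderWindowDevice>(new StubDesktopDevice);
}
```

A user-derived RenderWindow never needs to know which device it holds, which is what would let an EGL device be swapped in for ES 2.0 testing on a desktop dashboard.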