Difference between revisions of "Setting up a ParaView Server"
Revision as of 22:11, 15 March 2011
ParaView is designed to work well in client/server mode. In this way, users can have the full advantage of using a shared remote high-performance rendering cluster without leaving their offices. This document is designed to help get you started with building and setting up your own ParaView server. It also serves as a collection point for the "tribal knowledge" acquired to make parallel rendering and other aspects of parallel and client/server processing most efficient. You may also want to look at Configuring ParaView for Vis Clusters.
Ideally, we would like to provide precompiled binaries of ParaView for all of our users to make installing it more convenient. Unfortunately, the large variety of hardware, operating systems, and MPI implementations makes this task impossible. Thus, if you wish to use ParaView on a parallel server, you will have to compile ParaView from source.
After downloading ParaView, follow the Building and Installation instructions. When following these instructions, be sure to compile in MPI support by setting the PARAVIEW_USE_MPI CMake flag to ON and setting the appropriate paths to the MPI include directory and libraries.
One problem many people face when compiling with MPI is that their MPI implementation provides multiple libraries, many of which are required when compiling ParaView. If there are only two such libraries, you can add them separately in the MPI_LIBRARY and MPI_EXTRA_LIBRARY CMake variables. If you need to link in more than two libraries, you can specify multiple libraries in the MPI_LIBRARY variable by separating them with semicolons (;). You can apply the same trick to the MPI_INCLUDE_PATH to specify several include directories.
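As a minimal sketch of the semicolon trick described above, the following Python helper assembles the CMake arguments. The variable names come from this page; the install paths are hypothetical and only serve as placeholders.

```python
def cmake_list(paths):
    """Join several paths into one CMake list value (semicolon-separated)."""
    return ";".join(paths)


def mpi_cmake_args(include_dirs, libraries):
    """Build -D arguments that enable MPI and pass several paths per variable."""
    return [
        "-DPARAVIEW_USE_MPI=ON",
        "-DMPI_INCLUDE_PATH=" + cmake_list(include_dirs),
        "-DMPI_LIBRARY=" + cmake_list(libraries),
    ]


# Hypothetical install locations, for illustration only.
args = mpi_cmake_args(
    ["/opt/mpi/include", "/opt/mpi/include/extra"],
    ["/opt/mpi/lib/libmpi.so", "/opt/mpi/lib/libmpi_cxx.so", "/opt/mpi/lib/libextra.so"],
)
print(" ".join(args))
```

If you type such values directly at a shell prompt, quote them, because an unquoted semicolon is a command separator in most shells.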
Another problem sometimes encountered is the lack of graphics libraries. There are many circumstances where you would want to compile the ParaView server on a parallel computer with no graphics hardware and thus no OpenGL implementation. In this case, most people use the Mesa 3D Graphics Library, which is a portable, software-only implementation of the OpenGL API. A cluster built using a Linux operating system probably already has a version of Mesa installed, but otherwise you can always download the source code from http://mesa3d.org.
One of the most difficult problems people face when installing a ParaView server is establishing X connections. This whole problem can be circumvented by using the OSMesa library. However, Mesa is strictly a CPU rendering library, so use the OSMesa solution if and only if your server hardware does not have rendering hardware. If your cluster does not have graphics hardware, then compile ParaView with OSMesa support and use the --use-offscreen-rendering flag when launching the server.
The first step to compiling OSMesa support is to make sure that you are compiling with the Mesa 3D Graphics Library. It is difficult to tell an installation of Mesa from any other OpenGL implementation (although the existence of an osmesa.h header and a libOSMesa library is a good clue). If you are not sure, you can always download your own copy from http://mesa3d.org.
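The clue mentioned above (an osmesa.h header plus a libOSMesa library) can be checked with a few lines of Python. This is only a sketch: it assumes the conventional layout of an installation prefix (include/GL/osmesa.h and a lib or lib64 directory), and the function name is made up for illustration.

```python
import glob
import os


def looks_like_osmesa(prefix):
    """Return True if `prefix` holds both GL/osmesa.h and a libOSMesa library."""
    header = os.path.join(prefix, "include", "GL", "osmesa.h")
    libs = []
    # Assumed layout: libraries live in <prefix>/lib or <prefix>/lib64.
    for libdir in ("lib", "lib64"):
        libs += glob.glob(os.path.join(prefix, libdir, "libOSMesa*"))
    return os.path.isfile(header) and bool(libs)
```

A call such as looks_like_osmesa("/usr") returning False does not prove Mesa is absent; it only means the files are not where this sketch expects them.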
Now set the CMake variable OPENGL_INCLUDE_DIR to point to the Mesa include directory (the one containing the GL subdirectory), and set the OPENGL_gl_LIBRARY and OPENGL_glu_LIBRARY to the libGL and libGLU library files, respectively. Next, change the VTK_OPENGL_HAS_OSMESA variable to ON. After you configure again you will see a new CMake variable called OSMESA_LIBRARY. Set this to the libOSMesa library file. After you configure and generate your makefiles, you should be ready to build with OSMesa support.
Again, even after you build with OSMesa support, it will not take effect unless you launch the server with the --use-offscreen-rendering flag.
Please be aware that OSMesa support is not the same thing as mangled Mesa (although they are often used for the same thing). Mangled Mesa is not supported with ParaView. Mangled Mesa provides a mechanism to use either hardware acceleration or CPU-only rendering. Some organizations use this to provide a single build for multiple servers, some with and some without hardware rendering. We find it easier to simply provide a separate build for each server.
Running the Server
The ParaView client is a serial application and is always run with the paraview command. The server is a parallel MPI program that must be launched as a parallel job. Different implementations of MPI may have different ways to launch parallel programs, but the most common way is to use the mpirun command. Ask your system administrator if you are not sure how to launch your MPI programs. This document will assume you are using mpirun.
The ParaView server is almost always started with the pvserver command. Thus, the simplest configuration would have it launched as something like the following.
mpirun -np 4 ./pvserver
An integral part of configuring the ParaView server is setting up the client for starting the server. However, when initially configuring your server, it is best to do it in stages to better identify problems as they occur. Thus, as you are first trying to set up your server, set up your client for manual startup. That way, you can launch the server with mpirun at the command prompt. You will be able to immediately see any output on the stdout and stderr streams and retry when something goes wrong.
Note that ParaView is designed to work well when the server and client are run remotely from each other. The idea is that the client can be run locally on the user's desktop/laptop
pvserver vs. pvrenderserver and pvdataserver
There are two modes in which you can launch the ParaView server. In the first mode, all data processing and rendering are handled in the same parallel job. This server is launched with the pvserver command. In the second mode, data processing is handled in one parallel job and the rendering is handled in another parallel job launched with the pvdataserver and pvrenderserver programs, respectively.
The point of having a separate data server and render server is the ability to use two different parallel computers, one with high performance CPUs and the other with GPU hardware. However, splitting the server functionality in two necessitates repartitioning and transferring the data from one to the other. This overhead is seldom much smaller than the cost of just performing both data processing and rendering in the same job.
Thus, we recommend in almost all instances simply using the single pvserver. This document does not describe how to launch data server / render server jobs. Even if you really feel like this mode is right for you, it is best first to configure your server in single server mode. From there, establishing the data server / render server should be easier.
Connecting Through a Firewall
Security policies often require either the ParaView server or client to be behind a firewall or some other network limiting technology. Such a configuration will add challenges to configuring your server to connect with a client. The type of networking safeguards, as well as the policies that govern them, vary greatly. We cannot possibly provide solutions for all of them, but we can give suggestions that might help you on your way. The main goal here is to establish a socket connection between client and (the first node of the) server. This socket by default is on port 11111.
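Before involving ParaView at all, it can help to check whether a plain TCP connection to the server port gets through. The following is a minimal sketch using only the Python standard library; the function name and the example host are hypothetical, and 11111 is the default port mentioned above.

```python
import socket


def can_connect(host, port=11111, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, timed out, unreachable, or name lookup failed.
        return False
    conn.close()
    return True


# Example (hypothetical host name), run from the client machine
# while pvserver is waiting for a connection:
# print(can_connect("vizcluster.example.org"))
```

A False result while pvserver is listening suggests that something between the two machines, such as a firewall, is blocking the connection.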
Many firewalls will deny incoming connection requests but will allow outgoing connection requests. If only one side of the connection is behind such a firewall, then establishing the connection is easy. By default, the client connects to the server, so if the client is the one behind a firewall, nothing needs to be done. If the server is behind the firewall, you can reverse the connection direction: the server will connect back to the client. The server is instructed to perform a reverse connection by simply adding the -rc flag to its command line.
mpirun -np 4 ./pvserver -rc --client-host=myhost.mydomain.com
It is similarly straightforward to specify a reverse connection using the ParaView GUI or server configuration using XML configuration files.
If your firewall does not allow outbound connections or if client and server are each behind their own firewall, then no direct connection is possible. Now is probably a good time to talk to your system administrator about options.
One option that has proven to be effective when available is to use a VPN (virtual private network) connection on the client. A proper VPN connection can make the network of the client computer behave as if it is connected behind the firewall of the server, and thus the two can connect directly. Be aware that establishing a VPN connection will make the hostname and IP address for the client machine look different to the server, which may complicate specifying the connection.
If VPN is not available, you may be able to achieve a connection using the port forwarding feature of ssh. The ssh protocol allows for forwarding a socket request from one side of the connection to the other, assuming that the configuration settings allow for this. You might be able to use this feature to "punch through" the firewall. For more information, see the reverse connection and port forwarding page.
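As a rough sketch of the port forwarding idea, the helper below builds an ssh command that forwards a local port to the server port using ssh's standard -L option (-N means no remote command is run). It assumes you have ssh access to a host that can reach the first server node and that forwarding is permitted; the host names are hypothetical.

```python
def ssh_forward_command(login_host, server_host="localhost", port=11111):
    """Build an ssh command forwarding local `port` to server_host:port."""
    forward = "%d:%s:%d" % (port, server_host, port)
    return ["ssh", "-N", "-L", forward, login_host]


# Hypothetical host names, for illustration only.
print(" ".join(ssh_forward_command("user@login.example.org", "node01")))
```

With such a tunnel running, the client would connect to localhost on the forwarded port instead of connecting to the server host directly.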
Even though X11 forwarding might be available, you should not run the client remotely and forward its X calls. ParaView will run much more efficiently if you run the client locally and you let ParaView directly handle the data transfer between local and remote machines.
One of the most common problems people have with setting up the ParaView server is allowing the server processes to open windows on the graphics card on each process's node. When ParaView needs to do parallel rendering, each process will create a window that it will use to render. This window is necessary because you need the X window before you can create an OpenGL context on the graphics hardware.
There is a way around this. If you are using Mesa as your OpenGL implementation, then you can also use the supplemental OSMesa library to create an OpenGL context without an X window. However, Mesa is strictly a CPU rendering library, so use the OSMesa solution if and only if your server hardware does not have rendering hardware. If your cluster does not have graphics hardware, then compile ParaView with OSMesa support and use the --use-offscreen-rendering flag when launching the server.
Assuming that your cluster does have graphics hardware, you will need to establish the following three things.
- Have xdm run on each cluster node at startup. Although xdm is almost always run at startup on workstation installations, it is not as commonplace to be run on cluster nodes. Talk to your system administrators for help in setting this up.
- Disable all security on the X server. That is, allow any process to open a window on the X server without having to log in. Again, talk to your system administrators for help.
- Use the -display flag for pvserver to make sure that each process is connecting to the display localhost:0 (or just :0).
To enable the last condition, you would run something like
mpirun -np 4 ./pvserver -display localhost:0
An easy way to test your setup is to use the glxgears program. Unlike pvserver, it will quickly tell you (or, rather, fail to start) if it cannot connect to the local X server.
mpirun -np 4 /usr/X11R6/bin/glxgears -display localhost:0
Multiple GPUs Per Node
It is becoming commonplace to put multiple GPUs on each node in a cluster. Taking advantage of these multiple GPUs can be tricky.
Typically, each of these GPUs will have its own display. For example, if you have two GPUs on a node, they are probably referenced by the displays localhost:0.0 and localhost:0.1. When you run an X program with the display parameter or flag set to one of these, all X windows will open on that respective GPU and any graphics acceleration will also happen on that GPU. Thus, you can take advantage of both GPUs by launching different pvserver processes with different arguments to point to different displays.
Unfortunately, the method used to invoke an MPI job (usually through mpirun) is not part of the MPI specification and varies between implementations. In particular, the syntax used to declare different command lines for different processes can vary quite a bit.
OpenMPI (not to be confused with OpenMP, which is totally different) has a particularly easy way to specify multiple command lines. Simply separate the different command lines, along with the -np flag, with a colon (:). In our case, the command lines should be identical except with different display flags. We also need to use the -bynode flag, which assigns processes in a round robin fashion. Basically, this makes sure that each node is assigned a pair (or more) of processes that use the different displays. As an example, the following command line when run on an 8 node cluster launches a 16 process job with each node having two processes, each using a different display.
mpirun -bynode -np 8 ./pvserver -display :0.0 : -np 8 ./pvserver -display :0.1
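The pattern generalizes to any number of GPUs per node. The sketch below builds the OpenMPI command line for a given cluster shape; it assumes the displays are named :0.0, :0.1, and so on, which, as noted above, is likely but not guaranteed. The function name is made up for illustration.

```python
def mpirun_multi_gpu(nodes, gpus_per_node, pvserver="./pvserver"):
    """Build an OpenMPI command with one pvserver process per GPU."""
    segments = []
    for gpu in range(gpus_per_node):
        # One colon-separated segment per display, `nodes` processes each.
        segments.append("-np %d %s -display :0.%d" % (nodes, pvserver, gpu))
    return "mpirun -bynode " + " : ".join(segments)


print(mpirun_multi_gpu(8, 2))
```

For 8 nodes and 2 GPUs per node this reproduces the command shown above.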
If you have set up a parallel job with multiple GPUs per node using a different MPI implementation, please contribute back by documenting it here.
If you are trying to verify which displays each pvserver node is using, you can use the Programmable Source to identify which processes are using which displays. After connecting to your pvserver, create a Programmable Source and set the following script.
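import os
import subprocess

# DISPLAY as seen by this pvserver process
display = os.getenv('DISPLAY')
# communicate() returns a (stdout, stderr) tuple, so take the first element
hostname = subprocess.Popen(['hostname'], stdout=subprocess.PIPE).communicate()[0].decode().strip()
print("%s %s" % (hostname, display))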
After you apply the filter, this script will run on each process and output like the following will be printed to the pvserver terminal.
Process id: 3 >> vs8 :0.0
Process id: 4 >> vs14 :0.1
Process id: 7 >> vs8 :0.1
Process id: 0 >> vs14 :0.0
Process id: 5 >> vs30 :0.1
Process id: 1 >> vs30 :0.0
Process id: 2 >> vs2 :0.0
Process id: 6 >> vs2 :0.1
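With many processes this output is tedious to read by eye. The following sketch groups it by host and flags hosts where two processes ended up on the same display. It assumes the line format shown above ("Process id: N >> host display"); the function names are hypothetical.

```python
from collections import defaultdict

SAMPLE = """\
Process id: 3 >> vs8 :0.0
Process id: 4 >> vs14 :0.1
Process id: 7 >> vs8 :0.1
Process id: 0 >> vs14 :0.0
Process id: 5 >> vs30 :0.1
Process id: 1 >> vs30 :0.0
Process id: 2 >> vs2 :0.0
Process id: 6 >> vs2 :0.1
"""


def displays_by_host(output):
    """Map each host name to the sorted list of displays its processes use."""
    hosts = defaultdict(list)
    for line in output.splitlines():
        if ">>" not in line:
            continue
        fields = line.split(">>", 1)[1].split()
        if len(fields) != 2:
            continue  # not a "host display" pair, skip it
        host, display = fields
        hosts[host].append(display)
    return {host: sorted(found) for host, found in hosts.items()}


def shared_displays(hosts):
    """Return only the hosts where some display is used more than once."""
    return {h: d for h, d in hosts.items() if len(set(d)) < len(d)}


hosts = displays_by_host(SAMPLE)
for host in sorted(hosts):
    print(host, hosts[host])
print("hosts sharing a display:", shared_displays(hosts))
```

For the sample output every host uses both displays exactly once, so no host is reported as sharing a display.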
Sharing GPUs Amongst Processes
Visualization clusters that have GPUs are not always built with a one-to-one correspondence between CPUs and GPUs. In fact, industry trends at the time of this writing often lead to having more CPUs than GPUs on each node. For example, our current visualization cluster contains 4 CPUs per node but only 2 GPUs.
The pvserver is rather dumb about the number of GPUs. It assumes that each process has equal access to local rendering. This means that there is no special mechanism to, for example, coordinate the rendering between pairs of processes.
This leaves you with two options. Either you can launch one process per GPU and let CPUs go idle or you can launch one process per CPU and let multiple processes send rendering requests to the same GPU. The first option maximizes rendering speed but performs most other operations more slowly. The second option will maximize the speed of filter processing, but will throttle the rendering speed as GPU processors and buses must be shared.
There was once a time when rendering speed was the bottleneck for visualization. That, however, is no longer the case. The time spent in rendering is minimal, especially when compared to the time spent processing filters. The rendering speed can be throttled quite a bit before making a serious impact on visualization performance, even when running interactively. We thus recommend the second option, sharing GPUs.
GPUs can be implicitly shared by pointing multiple processes to the same display on the same host. One problem is that many GPUs will not correctly render two windows on top of each other. The two windows share memory space and clobber each other's memory. To get around this problem, use the --use-offscreen-rendering flag. This will create each rendering context in its own offscreen buffer and guarantees that the memory will not overrun that of another rendering context. As an example, here is the mpirun command you might use on a cluster with 8 nodes, each containing 4 cores (for a total of 32) and 1 GPU.
mpirun -np 32 ./pvserver -display :0.0 --use-offscreen-rendering
You can still share GPUs when you have multiple GPUs per node (see Multiple GPUs Per Node). You use the same techniques as in that section, except that you allow processes to have the same display and you add the --use-offscreen-rendering flag to each command. So, for example, if your cluster has 8 nodes, each containing 4 cores and 2 GPUs, the OpenMPI mpirun command could look like the following.
mpirun -bynode -np 16 ./pvserver -display :0.0 --use-offscreen-rendering : \
       -np 16 ./pvserver -display :0.1 --use-offscreen-rendering
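The process counts in these commands follow from simple arithmetic: one process per core, divided evenly among the displays. Here is a small sketch of that calculation, assuming the cores on each node divide evenly among its GPUs and that displays are named :0.0, :0.1, and so on.

```python
def processes_per_display(nodes, cores_per_node, gpus_per_node):
    """Return {display: -np value} when every core runs one pvserver process."""
    if cores_per_node % gpus_per_node:
        raise ValueError("cores per node must divide evenly among the GPUs")
    total = nodes * cores_per_node
    per_display = total // gpus_per_node
    return {":0.%d" % gpu: per_display for gpu in range(gpus_per_node)}


print(processes_per_display(8, 4, 1))
print(processes_per_display(8, 4, 2))
```

The first call corresponds to the single-GPU example (32 processes on :0.0) and the second to the two-GPU example (16 processes per display).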
Using a Tiled Display
ParaView has the ability to render directly to a tiled display. Furthermore, when rendering to a tiled display ParaView uses a built in library, IceT, to perform the rendering in a parallel and efficient manner.
To put ParaView in a tiled display mode, give pvserver (or pvrenderserver) the X and Y dimensions of the 2D grid of displays that make up the tiled display with the --tile-dimensions-x (or -tdx) and --tile-dimensions-y (or -tdy) arguments. For example, to drive a 3 X 2 tiled display, you launch the server with a command like the following.
mpirun -np 16 ./pvserver -display localhost:0 -tdx=3 -tdy=2
There must be at least as many processes in the MPI job as there are tiles in the display; however, adding more processes than tiles is recommended as they will all be used to perform the parallel rendering. In the above example, I arbitrarily picked 16 processes. As few as 6 processes would work, but 32 would be even better if you had that many GPUs.
ParaView assumes that the first T processes have their displays connected directly to one of the tiles in a T tile display. The processes are assigned in row major order from left to right and top to bottom. For example, in a 3 X 2 display the processes are assigned as follows.
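The row-major assignment can be sketched in a few lines of Python. This is an illustration of the rule stated above, not ParaView code; it assumes MPI ranks are numbered from 0, and the function name is hypothetical.

```python
def tile_assignment(tdx, tdy, num_processes):
    """Return rows of MPI ranks, top row first, for a tdx-by-tdy tiled display."""
    tiles = tdx * tdy
    if num_processes < tiles:
        raise ValueError("need at least %d processes for %d tiles" % (tiles, tiles))
    # Row-major order: left to right within a row, rows from top to bottom.
    return [[row * tdx + col for col in range(tdx)] for row in range(tdy)]


for row in tile_assignment(3, 2, 16):
    print(" ".join(str(rank) for rank in row))
# prints:
# 0 1 2
# 3 4 5
```

Any remaining processes (ranks 6 through 15 in this example) are not attached to a tile but still take part in the parallel rendering.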
The only way to adjust which processes are connected to which tiles is to reconfigure the machines configuration of MPI.
The tiled display will not be driven correctly if the server is run with the --use-offscreen-rendering flag for obvious reasons.
Here we capture the most common problems people run into with setting up client/server.
Specifying multiple MPI include directories
You can add multiple directories to the MPI_INCLUDE_PATH CMake variable by separating them with semicolons (;). See the Compiling section for more details.
Specifying multiple MPI libraries
You can use both the MPI_LIBRARY and MPI_EXTRA_LIBRARY CMake variables for specifying MPI libraries. You can also add multiple libraries to MPI_LIBRARY by separating the files with semicolons (;). See the Compiling section for more details.
Do not bother to use Mangled Mesa. Compiling a mangled version of Mesa is typically more trouble than it is worth and is incompatible with the OSMesa support instructions given on this page.
ParaView does not scale
Many users have reported to the mailing list that ParaView failed as they tried to scale up the data size on their server. First, let me assure you that ParaView’s parallel visualization and rendering are efficient and scalable. We (at Sandia National Laboratories) have been able to use ParaView to visualize 6 billion cell grids and have clocked rendering speeds of over 8 billion polygons per second.
When a user reports that ParaView is not scaling to large data sets, it is almost always because the server is misconfigured for parallel rendering. The problem is often misinterpreted as a scaling problem because ParaView will use serial rendering for small data and parallel rendering for large data. So when the server is misconfigured and cannot perform parallel rendering, it sometimes misbehaves when the data gets big enough to use parallel rendering.
Parallel rendering is built right into ParaView. There is nothing special you have to compile to set this up. However, to perform parallel rendering (or any rendering, for that matter), the ParaView server needs to have an OpenGL context. This is usually done through X connections. However, most parallel programs have no need to open an X window, so most clusters are not configured to allow X connections. For help on how to configure your cluster, see the X Connections section.
Before reporting scalability problems with ParaView, please verify that parallel rendering is working correctly. You can do that with the following procedure.
- Open the Settings dialog box (Edit -> Settings) and go to the Server tab.
- Make sure the checkbox next to Remote Render Threshold is checked, and move the associated slider all the way to the left (0 MBytes).
- Make sure the checkbox next to Subsample Rate is checked and move the slider to the right (4 Pixels or more).
- Create or load any data (the cone source works fine) and rotate the data with the mouse. While rotating, the image should look pixelated (blocky). When you let go of the mouse, the full resolution picture is restored.
The Remote Render Threshold option, when set to 0, tells ParaView to always use parallel rendering. The Subsample Rate tells ParaView to render smaller images during interaction to make the GUI more responsive. Subsampling is very noticeable when the subsample rate is high, and it is only used during parallel rendering. So if you are not seeing the subsample effect (or if something went clearly wrong before that), then your parallel rendering is not working.
Reverse connection does not work
A "standard" connection has the server (pvserver) listen for a connection from the client (paraview). ParaView also has the ability to perform a "reverse connection" where the client waits for the server. Creating a reverse connection is straightforward. Simply use the --reverse-connection (-rc) command line option on the server and specify a reverse connection in the ParaView GUI (you will have to add a new server in the Choose Server dialog box; see Starting the server). If you can get the standard connection to work but not the reverse connection, one of the following may be occurring.
- A firewall or some other network configuration may be preventing you from connecting from server to client. To test this, try swapping the location of the server and client and test the forward connection again.
- Make sure that both the client and the server are set up to do a reverse connection. Make sure that the server is being launched with the reverse connection flag and that the GUI is configured to connect with a reverse connection.
- Make sure that the client is started first and ready to receive a connection before starting the server. When doing a reverse connection, the client must already be started and waiting for a connection before starting the server. If you try to start the server before the client is ready, it may fail to connect and then give up before the client starts waiting for the connection. If you are starting the server from the client GUI, this should not be an issue.
Cannot launch paraview with mpirun
Occasionally users report problems with trying to run the ParaView client (paraview) with mpirun like this:
mpirun paraview [args]
Don't do this! The ParaView client is a serial application. It is not meant to be run under mpirun. Only run the server (pvserver) with mpirun.
The client only connects to one node on the server
Users sometimes ask how to get the client to make a socket connection to every process on the server. You don't. ParaView is not supposed to run like that. When running in client/server mode, ParaView connects to process 0 of the server, and nothing else. All communication with the client goes through process 0, and process 0 of the server uses the MPI interconnect to pass data to and from the other nodes in the server.
ParaView is implemented in this way for convenience and scalability. It is not scalable for every process in the server to connect to the client because all communication will eventually have to go through the same network interface on the client side. Also, the MPI interconnect on the server is almost always faster than the socket communication between client and server.
Server processes always have 100% CPU usage
It has often been observed that when running pvserver under mpirun, many of the processes launched for the job always use 100% of a CPU, even when they should be sitting idle. The most common pattern is for one process (the root process) to actually be idle while the rest are constantly running.
This observed behavior is due to the implementation of the MPI layer. The OpenMPI and MPICH implementations, the two most common we encounter, both exhibit this behavior. In these implementations when a process is waiting for a message (which is the case when pvserver is supposed to be sitting idle waiting for a message), the process actually sits in a busy wait loop. (The root process is the single exception as it is waiting for a message on a socket, not an MPI message.)
This behavior is intentionally added by the MPI library developers (not so much the ParaView developers) for efficiency. The general idea is to keep each process "attached" to the core it is running on. Once a process goes idle, it is likely to be scheduled off that core by the OS process scheduler and then scheduled back on to a different core. This can have detrimental effects on things like memory access as each core can have its own cache hierarchy.
The parameters for controlling this behavior vary based on the MPI implementation. The following links provide some documentation on turning off this behavior for OpenMPI.
If you have information on disabling the busy wait using a different MPI implementation, please contribute back by documenting it here.
Server reports: Failed to set up server socket
You might get an error like the following
Listen on port: 11111
ERROR: In /Users/kmorel/src/ParaView/Servers/Common/vtkProcessModuleConnectionManager.cxx, line 191
vtkProcessModuleConnectionManager (0xb380540): Failed to set up server socket.
The error is meant to tell you that there is already some process using port 11111. If you are on Linux or a system that has similar command line tools (like Mac OS X), you can confirm this with
$ netstat -na | grep 11111
$ lsof -i:11111
The latter will tell you the name of the process that is blocking the port (the first won't). If that name is cropped and not unique, try adding "+c15" to the lsof command line. Chances are there is still an old pvserver process running and waiting on that socket. Either kill the process that blocks the port or have pvserver use a different port with
$ pvserver --server-port=11112
You can get the same problem on the client side when doing a reverse connection. The solution is the same: free the port on the client machine or choose a different one.
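If netstat or lsof is not available, a short Python check can tell you whether a port is free by trying to bind to it. This is a minimal sketch with hypothetical function names; it only reports whether the port can be bound, not which process holds it.

```python
import socket


def port_is_free(port, host=""):
    """Return True if a TCP socket can currently be bound to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError:
        # Typically "address already in use".
        return False
    finally:
        sock.close()
    return True


def first_free_port(start=11111, attempts=10):
    """Return the first bindable port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None
```

The result of first_free_port() could then be passed to pvserver through the --server-port option shown above.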
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.