[Paraview] paraview - client-server

burlen burlen.loring at gmail.com
Fri Apr 30 14:02:07 EDT 2010


I see what you mean. QProcess::close() only closes the ssh process
spawned by PV, not the other processes spawned by ssh on the remote
host. And for ssh to clean up after itself on the remote host, it needs
a tty there.
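
To make the tty point concrete, here is a minimal sketch of the launch
command with a forced tty, using the -tt option from Pat's message quoted
below. The helper name pv_tunnel_cmd is hypothetical and not part of
ParaView; the host, port, and remote command are the ones used in this
thread. It only prints the command line; it does not run ssh.

```shell
# Hypothetical helper (not part of ParaView): print the ssh command line
# for a reverse tunnel with a forced remote tty.
pv_tunnel_cmd() {
    host=$1        # machine to ssh to, e.g. service-node
    port=$2        # port forwarded back to the client, e.g. 11111
    remote_cmd=$3  # command to run on the remote host
    # -tt forces tty allocation even when ssh itself has no local tty,
    # which is the situation when ssh is started from QProcess.
    printf 'ssh -tt -R %s:localhost:%s %s "%s"\n' \
        "$port" "$port" "$host" "$remote_cmd"
}

pv_tunnel_cmd service-node 11111 'qsub start_pvserver.sh && sleep 1d'
```

The printed line is what one would paste into the server startup command
in the client.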

Thanks for the clarification.
Burlen

pat marion wrote:
> I have applied your patch.  I agree that paraview should explicitly 
> close the child process.  But... what I am pointing out is that 
> calling QProcess::close() does not help in this situation.  What I am 
> saying is that, even when paraview does kill the process, any commands 
> run by ssh on the other side of the netpipe will be orphaned by sshd.  
> Are you sure you can't reproduce it?
>
> $ ssh localhost sleep 1d
> $ < press control-c >
> $ pidof sleep
> $ # sleep is still running
>
> Pat
>
> On Fri, Apr 30, 2010 at 2:08 AM, burlen <burlen.loring at gmail.com> wrote:
>
>     Hi Pat,
>
>     From my point of view the issue is philosophical, because
>     practically speaking I couldn't reproduce the orphans without
>     doing something a little odd, namely ssh ... && sleep 1d.
>     The fact that a user reported it, though, suggests that it may
>     occur in the real world as well. The question is this: should an
>     application explicitly clean up resources it allocates? Or should
>     an application rely on the user not only knowing that there is the
>     potential for a resource leak but also knowing enough to do the
>     right thing to avoid it (e.g. ssh -tt ...)? In my opinion, as a
>     matter of principle, if PV spawns a process it should explicitly
>     clean it up and there should be no way it can become an orphan. In
>     this case the fact that the orphan can hold ports open is
>     particularly insidious, because further connection attempts on
>     that port fail with no helpful error information. Also, it is not
>     very difficult to clean up a spawned process. What it comes down to
>     is a little bookkeeping to hang on to the QProcess handle and a few
>     lines of code called from the pqCommandServerStartup destructor to
>     make certain it's cleaned up. This is from the patch I submitted
>     when I filed the bug report.
>
>     +    // close running process
>     +    if (this->Process->state()==QProcess::Running)
>     +      {
>     +      this->Process->close();
>     +      }
>     +    // free the object
>     +    delete this->Process;
>     +    this->Process=NULL;
>
>     I think if the cluster admins out there knew which ssh options
>     (GatewayPorts etc.) are important for ParaView to work seamlessly,
>     then they might be willing to open them up. It's my impression
>     that the folks that build clusters want tools like PV to be easy
>     to use, but they don't necessarily know all the ins and outs of
>     configuring and running PV.
>
>     Thanks for looking at this again! The -tt option to ssh is indeed
>     a good find.
>
>     Burlen
>
>     pat marion wrote:
>
>         Hi all!
>
>         I'm bringing this thread back- I have learned a couple new
>         things...
>
>         -----------------------
>         No more orphans:
>
>         Here is an easy way to create an orphan:
>
>           $ ssh localhost sleep 1d
>           $ <press control c>
>
>         The ssh process is cleaned up, but sshd orphans the sleep
>         process.  You can avoid this by adding '-t' to ssh:
>
>          $ ssh -t localhost sleep 1d
>
>         Works like a charm!  But then there is another problem... try
>         this command from paraview (using QProcess) and it still
>         leaves an orphan, doh!  Go back and re-read ssh's man page and
>         you have the solution, use '-t' twice: ssh -tt
>
>         -------------------------
>         GatewayPorts and portfwd workaround:
>
>         In this scenario we have 3 machines: workstation,
>         service-node, and compute-node.  I want to ssh from
>         workstation to service-node and submit a job that will run
>         pvserver on compute-node.  When pvserver starts on
>         compute-node I want it to reverse connect to service-node and
>         I want service-node to forward the connection to workstation.
>          So here I go:
>
>           $ ssh -R11111:localhost:11111 service-node qsub start_pvserver.sh
>
>         Oops, the qsub command returns immediately and closes my ssh
>         tunnel.  Let's pretend that the scheduler doesn't provide an
>         easy way to keep the command alive, so I have resorted to
>         using 'sleep 1d'.  So here I go, using -tt to prevent orphans:
>
>          $ ssh -tt -R11111:localhost:11111 service-node "qsub start_pvserver.sh && sleep 1d"
>
>         Well, this will only work if GatewayPorts is enabled in
>         sshd_config on service-node.  If GatewayPorts is not enabled,
>         the ssh tunnel will only accept connections from localhost; it
>         will not accept a connection from compute-node.  We can ask
>         the sysadmin to enable GatewayPorts, or we could use portfwd.
>         You can run portfwd on service-node to forward port 22222 to
>         port 11111, then have compute-node connect to
>         service-node:22222.  So your job script would launch pvserver
>         like this:
>
>          pvserver -rc -ch=service-node -sp=22222
>
>         Problem solved!  Also convenient, we can use portfwd to
>         replace 'sleep 1d'.  So the final command, executed by
>         paraview client:
>
>          ssh -tt -R 11111:localhost:11111 service-node "qsub start_pvserver.sh && portfwd -g -c fwd.cfg"
>
>         Where fwd.cfg contains:
>
>          tcp { 22222 { => localhost:11111 } }
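
As a small illustration of the config line above, the sketch below
generates it for arbitrary ports. The function name fwd_cfg is
hypothetical; the output follows the portfwd syntax exactly as quoted in
this message, and nothing beyond that one-line form is assumed about
portfwd's config format.

```shell
# Hypothetical helper: print a portfwd config line that forwards
# listen_port to localhost:tunnel_port, in the syntax quoted above.
fwd_cfg() {
    listen_port=$1   # port compute-node connects to, e.g. 22222
    tunnel_port=$2   # port the ssh tunnel listens on, e.g. 11111
    printf 'tcp { %s { => localhost:%s } }\n' "$listen_port" "$tunnel_port"
}

fwd_cfg 22222 11111
```

Redirecting the output, e.g. `fwd_cfg 22222 11111 > fwd.cfg`, produces
the file passed to portfwd with -c in the command above.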
>
>
>         Hope this helps!
>
>         Pat
>
>         On Fri, Feb 12, 2010 at 7:06 PM, burlen
>         <burlen.loring at gmail.com> wrote:
>
>
>                Incidentally, this brings up an interesting point about
>                ParaView with client/server.  It doesn't try to clean up
>                its child processes, AFAIK.  For example, if you set up
>                this ssh tunnel inside the ParaView GUI (e.g., using a
>                command instead of a manual connection), and you cancel
>                the connection, it will leave the ssh running.  You have
>                to track down the ssh process and kill it yourself.  It's
>                a minor thing, but it can also prevent future connections
>                if you don't realize there's a zombie ssh that kept your
>                ports open.
>
>            I attempted to reproduce on my kubuntu 9.10, qt 4.5.2 system,
>            with slightly different results, which may be qt/distro/os
>            specific.
>
>            On my system, as long as the process ParaView spawns finishes
>            on its own there is no problem. That's usually how one would
>            expect things to work out, since when the client disconnects
>            the server closes, followed by ssh. But you are right that PV
>            never explicitly kills or otherwise cleans up after the
>            process it starts. So if the spawned process for some reason
>            doesn't finish, orphan processes are introduced.
>
>            I was able to produce orphan ssh processes by giving the PV
>            client a server startup command that doesn't finish, e.g.
>
>              ssh ... pvserver ... && sleep 100d
>
>            I get the situation you described, which prevents further
>            connection on the same ports. Once PV tries and fails to
>            connect on the open ports, there is a crash soon after.
>
>            I filed a bug report with a patch:
>            http://www.paraview.org/Bug/view.php?id=10283
>
>
>
>            Sean Ziegeler wrote:
>
>                Most batch systems have an option to wait until the job is
>                finished before the submit command returns.  I know PBS
>                uses "-W block=true" and that SGE and LSF have similar
>                options (but I don't recall the precise flags).
>
>                If your batch system doesn't provide that, I'd recommend
>                adding some shell scripting to loop through checking the
>                queue for job completion and not return until it's done.
>                The sleep thing would work, but wouldn't exit when the
>                server finishes, leaving the ssh tunnels (and other things
>                like portfwd if you put them in your scripts) lying around.
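
The polling loop described above could look roughly like the sketch
below. The function name wait_for_job is hypothetical, and it rests on
one assumption that is scheduler and site dependent: the status command
passed in exits non-zero once the job has left the queue.

```shell
# Hypothetical helper: block until a status command reports the job gone.
# Assumption: the status command exits 0 while the job is queued or
# running, and non-zero once it has left the queue.
wait_for_job() {
    interval=$1    # seconds to sleep between checks
    shift          # remaining arguments: the status command to run
    while "$@" >/dev/null 2>&1; do
        sleep "$interval"
    done
}

# Demo with a stand-in status command: `false` means "job already gone",
# so the loop body never runs and the call returns immediately.
wait_for_job 0 false && echo "job finished"
```

With PBS one might call it as
`jobid=$(qsub start_pvserver.sh) && wait_for_job 30 qstat "$jobid"`,
assuming qsub prints the job id and qstat fails for a job that is no
longer known; both behaviours vary between batch systems. Unlike a fixed
sleep, the ssh session then ends when the job does.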
>
>                Incidentally, this brings up an interesting point about
>                ParaView with client/server.  It doesn't try to clean up
>                its child processes, AFAIK.  For example, if you set up
>                this ssh tunnel inside the ParaView GUI (e.g., using a
>                command instead of a manual connection), and you cancel
>                the connection, it will leave the ssh running.  You have
>                to track down the ssh process and kill it yourself.  It's
>                a minor thing, but it can also prevent future connections
>                if you don't realize there's a zombie ssh that kept your
>                ports open.
>
>
>                On 02/08/10 21:03, burlen wrote:
>
>                    I am curious to hear what Sean has to say.
>
>                    But, say the batch system returns right away after the
>                    job is submitted, I think we can doctor the command so
>                    that it will live for a while longer. What about
>                    something like this:
>
>                    ssh -R XXXX:localhost:YYYY remote_machine "submit_my_job.sh && sleep 100d"
>
>
>                    pat marion wrote:
>
>                        Hey, just checked out the wiki page, nice! One
>                        question: wouldn't this command hang up and close
>                        the tunnel after submitting the job?
>
>                        ssh -R XXXX:localhost:YYYY remote_machine submit_my_job.sh
>
>                        Pat
>
>                        On Mon, Feb 8, 2010 at 8:12 PM, pat marion
>                        <pat.marion at kitware.com> wrote:
>
>                        Actually I didn't write the notes at the hpc.mil
>                        link.
>
>                        Here is something, and maybe this is the problem
>                        that Sean refers to: in some cases, when I have set
>                        up a reverse ssh tunnel from login node to
>                        workstation (command executed from workstation),
>                        the forward does not work when the compute node
>                        connects to the login node. However, if I have the
>                        compute node connect to the login node on port
>                        33333, then use portfwd to forward that to
>                        localhost:11111, where the ssh tunnel is listening
>                        on port 11111, it works like a charm. The portfwd
>                        tricks it into thinking the connection is coming
>                        from localhost and allows the ssh tunnel to work.
>                        Hope that made a little sense...
>
>                        Pat
>
>
>                        On Mon, Feb 8, 2010 at 6:29 PM, burlen
>                        <burlen.loring at gmail.com> wrote:
>
>                        Nice, thanks for the clarification. I am guessing
>                        that your example should probably be the
>                        recommended approach rather than the portfwd method
>                        suggested on the PV wiki. :) I took the initiative
>                        to add it to the Wiki. KW let me know if this is
>                        not the case!
>
>                        http://paraview.org/Wiki/Reverse_connection_and_port_forwarding#Reverse_connection_over_an_ssh_tunnel
>
>                        Would you mind taking a look to be sure I didn't
>                        miss anything or bollix it up?
>
>                        The sshd config options you mentioned may be why
>                        your method doesn't work on the Pleiades system;
>                        either that or there is a firewall between the
>                        front ends and compute nodes. In either case I
>                        doubt the NAS sys admins are going to reconfigure
>                        for me :) So at least for now I'm stuck with the
>                        two-hop ssh tunnels and interactive batch jobs. If
>                        there were some way to script the ssh tunnel in my
>                        batch script I would be golden...
>
>                        By the way, I put the details of the two-hop ssh
>                        tunnel on the wiki as well, and a link to Pat's
>                        hpc.mil notes. I don't dare try to summarize them
>                        since I've never used portfwd and it refuses to
>                        compile both on my workstation and the cluster.
>
>                        Hopefully putting these notes on the Wiki will save
>                        future ParaView users some time and headaches.
>
>
>                        Sean Ziegeler wrote:
>
>                        Not quite: the pvsc calls ssh with both the tunnel
>                        options and the commands to submit the batch job.
>                        You don't even need a pvsc; it just makes the
>                        interface fancier. As long as you or PV executes
>                        something like this from your machine:
>
>                        ssh -R XXXX:localhost:YYYY remote_machine submit_my_job.sh
>
>                        This means that port XXXX on remote_machine will be
>                        the port to which the server must connect. Port
>                        YYYY (e.g., 11111) on your client machine is the
>                        one on which PV listens. You'd have to tell the
>                        server (in the batch submission script, for
>                        example) the name of the node and port XXXX to
>                        which to connect.
>
>                        One caveat that might be causing you problems: port
>                        forwarding (and "gateway ports" if the server is
>                        running on a different node than the login node)
>                        must be enabled in the remote_machine's
>                        sshd_config. If not, no ssh tunnels will work at
>                        all (see: man ssh and man sshd_config). That's
>                        something that an administrator would need to set
>                        up for you.
>
>                        On 02/08/10 12:26, burlen wrote:
>
>                        So to be sure about what you're saying: your .pvsc
>                        script ssh's to the front end and submits a batch
>                        job, and when it's scheduled, your batch script
>                        creates a -R style tunnel and starts pvserver using
>                        PV reverse connection? Or are you using portfwd or
>                        a second ssh session to establish the tunnel?
>
>                        If you're doing this all from your .pvsc script
>                        without a second ssh session and/or portfwd, that's
>                        awesome! I haven't been able to script this;
>                        something about the batch system prevents the
>                        tunnel created within the batch job's ssh session
>                        from working. I don't know if that's particular to
>                        this system or a general fact of life about batch
>                        systems.
>
>                        Question: How are you creating the tunnel in your
>                        batch script?
>
>                        Sean Ziegeler wrote:
>
>                        Both ways will work for me in most cases, i.e. a
>                        "forward" connection with ssh -L or a reverse
>                        connection with ssh -R.
>
>                        However, I find that the reverse method is more
>                        scriptable. You can set up a .pvsc file that the
>                        client can load and that will call ssh with the
>                        appropriate options and commands for the remote
>                        host, all from the GUI. The client will simply wait
>                        for the reverse connection from the server, whether
>                        it takes 5 seconds or 5 hours for the server to get
>                        through the batch queue.
>
>                        Using the forward connection method, if the server
>                        isn't started soon enough, the client will attempt
>                        to connect and then fail. I've always had to log in
>                        separately, wait for the server to start running,
>                        then tell my client to connect.
>
>                        -Sean
>
>                        On 02/06/10 12:58, burlen wrote:
>
>                        Hi Pat,
>
>                        My bad. I was looking at the PV wiki, and thought
>                        you were talking about doing this without an ssh
>                        tunnel and using only port forward and paraview's
>                        --reverse-connection option. Now that I am reading
>                        your hpc.mil post I see what you mean :)
>
>                        Burlen
>
>
>                        pat marion wrote:
>
>                        Maybe I'm misunderstanding what you mean by local
>                        firewall, but usually as long as you can ssh from
>                        your workstation to the login node you can use a
>                        reverse ssh tunnel.
>
>


