[Paraview] paraview - client-server
burlen
burlen.loring at gmail.com
Fri Apr 30 14:02:07 EDT 2010
I see what you mean. QProcess::close() only closes the ssh process
spawned by PV, not the other processes spawned by ssh on the remote
host. & for ssh to clean up after itself on the remote host it needs a
tty there.
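
For the record, here is the same quick demo with a forced tty; with -tt
the control-c (or a killed ssh) reaches the remote command, so nothing is
left behind (localhost is used only for illustration):

$ ssh -tt localhost sleep 1d
$ < press control-c >
$ pidof sleep
$ # no output this time, the remote sleep was cleaned up
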
Thanks for the clarification.
Burlen
pat marion wrote:
> I have applied your patch. I agree that paraview should explicitly
> close the child process. But... what I am pointing out is that
> calling QProcess::close() does not help in this situation. What I am
> saying is that, even when paraview does kill the process, any commands
> run by ssh on the other side of the netpipe will be orphaned by sshd.
> Are you sure you can't reproduce it?
>
> $ ssh localhost sleep 1d
> $ < press control-c >
> $ pidof sleep
> $ # sleep is still running
>
> Pat
>
> On Fri, Apr 30, 2010 at 2:08 AM, burlen <burlen.loring at gmail.com> wrote:
>
> Hi Pat,
>
>     From my point of view the issue is philosophical, because
>     practically speaking I couldn't reproduce the orphans without
>     doing something a little odd, namely ssh ... && sleep 1d.
>     Although the fact that a user reported it suggests that it may
>     occur in the real world as well. The question is this: should an
>     application explicitly clean up resources it allocates? Or should
>     an application rely on the user not only knowing that there is the
>     potential for a resource leak but also knowing enough to do the
>     right thing to avoid it (e.g. ssh -tt ...)? In my opinion, as a
>     matter of principle, if PV spawns a process it should explicitly
>     clean it up and there should be no way it can become an orphan. In
>     this case the fact that the orphan can hold ports open is
>     particularly insidious, because further connection attempts on that
>     port fail with no helpful error information. Also it is not very
>     difficult to clean up a spawned process. What it comes down to is
>     a little bookkeeping to hang on to the QProcess handle and a few
>     lines of code called from the pqCommandServerStartup destructor to
>     make certain it's cleaned up. This is from the patch I submitted
>     when I filed the bug report.
>
>     +  // close running process
>     +  if (this->Process->state() == QProcess::Running)
>     +    {
>     +    this->Process->close();
>     +    }
>     +  // free the object
>     +  delete this->Process;
>     +  this->Process = NULL;
>
>     I think if the cluster admins out there knew which ssh options
>     (GatewayPorts etc.) are important for ParaView to work seamlessly,
>     then they might be willing to open them up. It's my impression
>     that the folks that build clusters want tools like PV to be easy
>     to use, but they don't necessarily know all the ins and outs of
>     configuring and running PV.
>
> Thanks for looking at this again! The -tt option to ssh is indeed
> a good find.
>
> Burlen
>
> pat marion wrote:
>
> Hi all!
>
> I'm bringing this thread back- I have learned a couple new
> things...
>
> -----------------------
> No more orphans:
>
> Here is an easy way to create an orphan:
>
> $ ssh localhost sleep 1d
> $ <press control c>
>
> The ssh process is cleaned up, but sshd orphans the sleep
> process. You can avoid this by adding '-t' to ssh:
>
> $ ssh -t localhost sleep 1d
>
> Works like a charm! But then there is another problem... try
> this command from paraview (using QProcess) and it still
> leaves an orphan, doh! Go back and re-read ssh's man page and
> you have the solution, use '-t' twice: ssh -tt
>
> -------------------------
> GatewayPorts and portfwd workaround:
>
> In this scenario we have 3 machines: workstation,
> service-node, and compute-node. I want to ssh from
> workstation to service-node and submit a job that will run
> pvserver on compute-node. When pvserver starts on
> compute-node I want it to reverse connect to service-node and
> I want service-node to forward the connection to workstation.
> So here I go:
>
>         $ ssh -R11111:localhost:11111 service-node qsub start_pvserver.sh
>
> Oops, the qsub command returns immediately and closes my ssh
> tunnel. Let's pretend that the scheduler doesn't provide an
> easy way to keep the command alive, so I have resorted to
> using 'sleep 1d'. So here I go, using -tt to prevent orphans:
>
>         $ ssh -tt -R11111:localhost:11111 service-node "qsub start_pvserver.sh && sleep 1d"
>
> Well, this will only work if GatewayPorts is enabled in
> sshd_config on service-node. If GatewayPorts is not enabled,
>         the ssh tunnel will only accept connections from localhost; it
> will not accept a connection from compute-node. We can ask
> the sysadmin to enable GatewayPorts, or we could use portfwd.
> You can run portfwd on service-node to forward port 22222 to
> port 11111, then have compute-node connect to
> service-node:22222. So your job script would launch pvserver
> like this:
>
> pvserver -rc -ch=service-node -sp=22222
>
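>         A rough sketch of what start_pvserver.sh might look like; the
>         PBS directives, node count, and paths here are placeholders
>         rather than anything from an actual setup:
>
>         #!/bin/sh
>         #PBS -l nodes=1:ppn=8
>         #PBS -l walltime=01:00:00
>         cd $PBS_O_WORKDIR
>         # reverse connect to the port portfwd listens on at the service node
>         mpirun -np 8 /path/to/pvserver -rc -ch=service-node -sp=22222
>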
> Problem solved! Also convenient, we can use portfwd to
> replace 'sleep 1d'. So the final command, executed by
> paraview client:
>
>         ssh -tt -R 11111:localhost:11111 service-node "qsub start_pvserver.sh && portfwd -g -c fwd.cfg"
>
> Where fwd.cfg contains:
>
> tcp { 22222 { => localhost:11111 } }
>
>
> Hope this helps!
>
> Pat
>
>     On Fri, Feb 12, 2010 at 7:06 PM, burlen <burlen.loring at gmail.com> wrote:
>
>
>         Incidentally, this brings up an interesting point about
>         ParaView with client/server. It doesn't try to clean up its
>         child processes, AFAIK. For example, if you set up this ssh
>         tunnel inside the ParaView GUI (e.g., using a command instead
>         of a manual connection), and you cancel the connection, it
>         will leave the ssh running. You have to track down the ssh
>         process and kill it yourself. It's a minor thing, but it can
>         also prevent future connections if you don't realize there's a
>         zombie ssh that kept your ports open.
>
> I attempted to reproduce on my kubuntu 9.10, qt 4.5.2
> system, with
> slightly different results, which may be qt/distro/os specific.
>
> On my system as long as the process ParaView spawns finishes on
> its own there is no problem. That's usually how one would
> expect
> things to work out since when the client disconnects the server
> closes followed by ssh. But, you are right that PV never
> explicitly kills or otherwise cleans up after the process it
>     starts. So if the spawned process for some reason doesn't finish,
>     orphan processes are introduced.
>
>     I was able to produce orphan ssh processes by giving the PV client
>     a server startup command that doesn't finish, e.g.
>
>     ssh ... pvserver ... && sleep 100d
>
>     I get the situation you described which prevents further
>     connection on the same ports. Once PV tries and fails to connect
>     on the open ports, there is a crash soon after.
>
> I filed a bug report with a patch:
> http://www.paraview.org/Bug/view.php?id=10283
>
>
>
> Sean Ziegeler wrote:
>
> Most batch systems have an option to wait until the job is
> finished before the submit command returns. I know PBS
> uses
> "-W block=true" and that SGE and LSF have similar
> options (but
> I don't recall the precise flags).
>
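>         Combined with the reverse tunnel, that might look something
>         like this (an untested sketch; -W block=true is the PBS
>         spelling):
>
>         $ ssh -R XXXX:localhost:YYYY remote_machine "qsub -W block=true submit_my_job.sh"
>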
> If your batch system doesn't provide that, I'd recommend
> adding some shell scripting to loop through checking
> the queue
> for job completion and not return until it's done. The
> sleep
> thing would work, but wouldn't exit when the server
> finishes,
> leaving the ssh tunnels (and other things like portfwd
> if you
> put them in your scripts) lying around.
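>
>         A minimal sketch of that polling idea (PBS-style commands; the
>         job-id handling is a guess and will likely need adjusting for
>         your scheduler):
>
>         jobid=$(qsub submit_my_job.sh)
>         # keep this shell (and the ssh tunnel that ran it) alive until the job leaves the queue
>         while qstat "$jobid" >/dev/null 2>&1; do
>             sleep 30
>         done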
>
>         Incidentally, this brings up an interesting point about
>         ParaView with client/server. It doesn't try to clean up its
>         child processes, AFAIK. For example, if you set up this ssh
>         tunnel inside the ParaView GUI (e.g., using a command instead
>         of a manual connection), and you cancel the connection, it
>         will leave the ssh running. You have to track down the ssh
>         process and kill it yourself. It's a minor thing, but it can
>         also prevent future connections if you don't realize there's a
>         zombie ssh that kept your ports open.
>
>
> On 02/08/10 21:03, burlen wrote:
>
> I am curious to hear what Sean has to say.
>
>             But, say the batch system returns right away after the job
>             is submitted, I think we can doctor the command so that it
>             will live for a while longer, what about something like this:
>
>             ssh -R XXXX:localhost:YYYY remote_machine "submit_my_job.sh && sleep 100d"
>
>
> pat marion wrote:
>
>                 Hey just checked out the wiki page, nice! One question,
>                 wouldn't this command hang up and close the tunnel
>                 after submitting the job?
>
>                 ssh -R XXXX:localhost:YYYY remote_machine submit_my_job.sh
>
>                 Pat
>
>                 On Mon, Feb 8, 2010 at 8:12 PM, pat marion <pat.marion at kitware.com> wrote:
>
>                     Actually I didn't write the notes at the hpc.mil link.
>
> Here is something- and maybe this is the
> problem that
> Sean refers
> to- in some cases, when I have set up a reverse ssh
> tunnel from
> login node to workstation (command executed from
> workstation) then
> the forward does not work when the compute node
> connects to the
> login node. However, if I have the compute node
> connect to the
> login node on port 33333, then use portfwd to
> forward
> that to
> localhost:11111, where the ssh tunnel is
> listening on
> port 11111,
>                     it works like a charm. The portfwd tricks it into
>                     thinking the connection is coming from localhost
>                     and allows the ssh tunnel to work. Hope that made
>                     a little sense...
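>
>                     In portfwd's configuration syntax that forward
>                     would be something like the following (untested,
>                     same form as the fwd.cfg example above):
>
>                     tcp { 33333 { => localhost:11111 } }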
>
> Pat
>
>
>                     On Mon, Feb 8, 2010 at 6:29 PM, burlen <burlen.loring at gmail.com> wrote:
>
> Nice, thanks for the clarification. I am
> guessing that
> your
> example should probably be the recommended
> approach rather
> than the portfwd method suggested on the PV
> wiki. :) I
> took
> the initiative to add it to the Wiki. KW let me
> know
> if this
> is not the case!
>
>
> http://paraview.org/Wiki/Reverse_connection_and_port_forwarding#Reverse_connection_over_an_ssh_tunnel
>
>
>
> Would you mind taking a look to be sure I
> didn't miss
> anything
> or bollix it up?
>
>                         The sshd config options you mentioned may be
>                         why your method doesn't work on the Pleiades
>                         system; either that, or there is a firewall
>                         between the front ends and compute nodes. In
>                         either case I doubt the NAS sys admins are
>                         going to reconfigure for me :) So at least for
>                         now I'm stuck with the two hop ssh tunnels and
>                         interactive batch jobs. If there were some way
>                         to script the ssh tunnel in my batch script I
>                         would be golden...
>
>                         By the way I put the details of the two hop
>                         ssh tunnel on the wiki as well, and a link to
>                         Pat's hpc.mil notes. I don't dare try to
>                         summarize them since
> I've never
> used portfwd and it refuses to compile both on my
> workstation
> and the cluster.
>
> Hopefully putting these notes on the Wiki will
> save future
> ParaView users some time and headaches.
>
>
> Sean Ziegeler wrote:
>
> Not quite- the pvsc calls ssh with both the
> tunnel options
> and the commands to submit the batch job. You
> don't even
> need a pvsc; it just makes the interface
> fancier. As long
> as you or PV executes something like this from your
> machine:
> ssh -R XXXX:localhost:YYYY remote_machine
> submit_my_job.sh
>
> This means that port XXXX on remote_machine
> will be the
> port to which the server must connect. Port
> YYYY (e.g.,
> 11111) on your client machine is the one on
> which PV
> listens. You'd have to tell the server (in the
> batch
> submission script, for example) the name of the
> node and
> port XXXX to which to connect.
>
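>                             To make the placeholders concrete (ports
>                             chosen arbitrarily), the two pieces might
>                             look like this, using the same pvserver
>                             reverse-connection flags (-rc, -ch, -sp)
>                             that appear in the portfwd example above:
>
>                             # run from the workstation (or by the ParaView client)
>                             ssh -R 22222:localhost:11111 remote_machine submit_my_job.sh
>                             # and in the batch script, on the compute node
>                             pvserver -rc -ch=remote_machine -sp=22222
>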
>                             One caveat that might be causing you problems: port
> forwarding (and "gateway ports" if the server
> is running
> on a different node than the login node) must
> be enabled
> in the remote_machine's sshd_config. If not, no ssh
> tunnels will work at all (see: man ssh and man
> sshd_config). That's something that an
> administrator
> would need to set up for you.
>
> On 02/08/10 12:26, burlen wrote:
>
>                                 So to be sure about what you're saying:
>                                 your .pvsc script ssh's to the front
>                                 end and submits a batch job; when it's
>                                 scheduled, your batch script creates a
>                                 -R style tunnel and starts pvserver
>                                 using PV reverse connection? Or are you
>                                 using portfwd or a second ssh session
>                                 to establish the tunnel?
>
>                                 If you're doing this all from your
>                                 .pvsc script without a second ssh
>                                 session and/or portfwd that's awesome!
>                                 I haven't been able to script this;
>                                 something about the batch system
>                                 prevents the tunnel created within the
>                                 batch job's ssh session from working.
>                                 I don't know if that's particular to
>                                 this system or a general fact of life
>                                 about batch systems.
>
> Question: How are you creating the tunnel in your
> batch script?
>
> Sean Ziegeler wrote:
>
> Both ways will work for me in most cases, i.e. a
> "forward" connection
> with ssh -L or a reverse connection with ssh -R.
>
> However, I find that the reverse method is more
> scriptable. You can
> set up a .pvsc file that the client can load and
> will call ssh with
> the appropriate options and commands for the
> remote host, all from the
> GUI. The client will simply wait for the reverse
> connection from the
> server, whether it takes 5 seconds or 5 hours for
> the server to get
> through the batch queue.
>
> Using the forward connection method, if the server
> isn't started soon
> enough, the client will attempt to connect and
> then fail. I've always
> had to log in separately, wait for the server to
> start running, then
> tell my client to connect.
>
> -Sean
>
> On 02/06/10 12:58, burlen wrote:
>
> Hi Pat,
>
>                                         My bad. I was looking at the
>                                         PV wiki, and thought you were
>                                         talking about doing this
>                                         without an ssh tunnel, using
>                                         only port forwarding and
>                                         paraview's --reverse-connection
>                                         option. Now that I am reading
>                                         your hpc.mil post I see what
>                                         you mean :)
>
> Burlen
>
>
> pat marion wrote:
>
> Maybe I'm misunderstanding what you mean
> by local firewall, but
> usually as long as you can ssh from your
> workstation to the login node
> you can use a reverse ssh tunnel.
>
>