[Paraview-developers] 100% CPU usage once again
Paul Kapinos
kapinos at itc.rwth-aachen.de
Mon Oct 9 03:08:12 EDT 2017
(cross-post from 'paraview at paraview.org' 10/04/2017)
Dear ParaView developers,
in [1] you say
> If you have information on disabling the busy wait using a different MPI
> implementation, please contribute back by documenting it here.
Here we go.
a)
It is possible to build 'pvserver' using Intel MPI (and GCC compilers).
By setting the I_MPI_WAIT_MODE environment variable to 'enable', you can
effectively prevent busy waiting; see [2], p. 10. (Tested with Intel MPI
5.1(.3.181).)
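For illustration, a minimal sketch of such a start ('-np 8' and the binary path
are only examples; the essential part is the environment variable from [2]):
   # switch Intel MPI from busy polling to a wait mode
   export I_MPI_WAIT_MODE=enable
   # start a 'pvserver' that was built against Intel MPI
   mpiexec -np 8 /path/to/pvserver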
Q/FR.1 Could you please provide a version of 'pvserver' built using Intel MPI in
the official release of ParaView? [4]
b)
It is (obviously) possible to build 'pvserver' using MPICH (and GCC compilers).
MPICH itself can be *configured* with the '--with-device=ch3:sock' option, which
is described as 'slower' than '--with-device=ch3:nemesis' (cf. [3]).
In our experiments it turned out that 'pvserver' compiled and linked against
MPICH configured with 'ch3:sock' did not show the busy-waiting aka
100%-CPU behaviour.
Note that this is a *configure-time* parameter of the MPI library.
(Tested with MPICH 3.2)
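For reference, a minimal sketch of such an MPICH build (install prefix and
compiler selection are only examples):
   # configure MPICH 3.2 with the (slower, but non-spinning) sock channel
   ./configure --with-device=ch3:sock --prefix=$HOME/mpich-3.2-sock \
       CC=gcc CXX=g++ FC=gfortran
   make && make install
   # afterwards, build and link 'pvserver' against this MPICH installation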
Q/FR.2 Could you please provide a version of 'pvserver' built using MPICH
*configured with the '--with-device=ch3:sock' option* in the official release of
ParaView? [4]
Such a binary will very likely even be compatible with 'standard' MPICH
installations; we were able to start it successfully, and without busy waiting,
even with Intel MPI's 'mpiexec'. However, yes, you will need to have an MPICH
release built with this option.
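For concreteness, such a mixed start could look like this (the binary path is
only an example):
   # 'pvserver' built against MPICH (ch3:sock), started via Intel MPI's 'mpiexec'
   mpiexec -np 8 /path/to/pvserver-mpich-sock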
c)
It is possible to build 'pvserver' using Open MPI (and GCC compilers).
In [1] you document how to 'turn off' the busy-waiting behaviour in Open MPI
(cf. [5], [6]).
Unfortunately this *did not work* in our environment (InfiniBand QDR and Intel
OmniPath clusters, Open MPI 1.10.4).
Note that we are likely not alone, cf. [7].
Note also that the switch likely just moves the spinning from the MPI library
itself to the fabrics library (cf. the screenshots: without 'mpi_yield_when_idle
1' (= default) the 'pvserver' processes spin 'green', i.e. 100% user time, and
with 'mpi_yield_when_idle 1' the processes keep spinning, but now with a large
'red', i.e. kernel-time, portion).
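For completeness, the two usual Open MPI ways of passing that switch ('-np 8'
and the binary path are again only examples):
   # via an MCA parameter on the mpiexec command line
   mpiexec --mca mpi_yield_when_idle 1 -np 8 /path/to/pvserver
   # or via the environment
   export OMPI_MCA_mpi_yield_when_idle=1
   mpiexec -np 8 /path/to/pvserver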
Conclusion: the way to disable busy waiting for Open MPI documented in [1] is
not useful for us. We do not know whether this is a peculiarity of our site or a
general issue; we think some survey on this point could be useful.
Q/FR.3 In case you want to provide precompiled versions of 'pvserver' for
Open MPI, remember that the ABI changes between major versions. So you would
likely need to compile and link *three* versions of 'pvserver': against Open MPI
1.10.x (still the default in many Linux distributions), 2.x (current), and 3.x
(new).
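If it helps, a rough sketch of how one such variant can be built (paths and the
'-j' value are only examples; we assume the desired Open MPI's compiler wrappers
and 'mpiexec' are first in PATH so that CMake's FindMPI picks them up, and that
PARAVIEW_USE_MPI is the relevant CMake switch in the ParaView version at hand):
   # repeat once per Open MPI major version (1.10.x / 2.x / 3.x)
   cmake -DPARAVIEW_USE_MPI=ON /path/to/ParaView-source
   make -j 8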
d) In [10] there is a note about how to disable busy waiting on yet another
two MPI implementations:
> ... IBM MPI has the MP_WAIT_MODE and
> the SUPER-UX MPI/XS library has MPISUSPEND to choose the waiting mode,
> either polling or sleeping.
Somebody with access to these MPIs could evaluate this, maybe.
Have a nice day,
Paul Kapinos
[1] https://www.paraview.org/Wiki/Setting_up_a_ParaView_Server#Server_processes_always_have_100.25_CPU_usage
[2] https://software.intel.com/sites/products/Whitepaper/Clustertools/amplxe_inspxe_interop_with_mpi.pdf (cf. p. 10)
[3] https://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_does_my_MPI_program_run_much_slower_when_I_use_more_processes.3F
[4] https://www.paraview.org/download/
[5] http://www.open-mpi.org/faq/?category=running#oversubscribing
[6] http://www.open-mpi.org/faq/?category=running#force-aggressive-degraded
[7] https://www.paraview.org/pipermail/paraview/2008-December/010349.html
[8] http://blogs.cisco.com/performance/polling-vs-blocking-message-passingprogress
[9] https://www.open-mpi.org/community/lists/users/2010/10/14505.php
[10] http://comp.parallel.mpi.narkive.com/3oXMDXno/non-busy-waiting-barrier-in-mpi
[11] https://stackoverflow.com/questions/14560714/probe-seems-to-consume-the-cpu
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915