<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Hi Michel, <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr"><p>Indeed, I built PVSB 5.2 with the <span>Intel 2016.2.181 and Intel MPI <span>5.1.3.181 compilers, then </span></span><span style="font-size:12pt">ran the resulting pvserver </span><span style="font-size:12pt">on
</span><span style="font-size:12pt">Haswell CPU nodes (</span><span style="font-size:12pt">Intel E5-2680v3), which support AVX2 instructions. </span><span style="font-size:12pt">So this fits exactly the known issue you mentioned in your email. </span></p></div></div></div></blockquote><div>Yep, that'll do it. The problem is due to a bug in the Intel compiler performing over-aggressive vectorized code generation. I'm not sure if it's fixed in &gt;= 17 or not, but I definitely know it's broken in &lt;= 16.x. GALLIUM_DRIVER=SWR is going to give you the best performance in this situation anyway, and it is the recommended OSMesa driver on x86_64 CPUs.<br></div><div> <br><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr">
<p><span style="font-size:12pt"><span><span>Exporting the GALLIUM_DRIVER env variable to swr then leads to an interesting behavior. </span></span></span><span style="font-size:12pt">With the swr driver</span><span style="font-size:12pt">, the good news
is that I can connect my pvserver built in release mode without crashing. </span><span style="font-size:12pt"></span></p></div></div></div></blockquote><div>Great!<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr"><p><span style="font-size:12pt">For the record, the llvmpipe driver compiled in release mode crashes during the client/server connection, whereas the llvmpipe driver compiled
in debug mode works fine.</span></p></div></div></div></blockquote><div>This lines up with the issue being bad vectorization, since the compiler won't be doing most of those optimizations in a debug build.<br><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr">
<p>However, our PBS scheduler quickly killed my interactive job because the virtual memory was exhausted, which was puzzling. <span style="font-size:12pt">Increasing the number of cores requested for my job and keeping some of them idle allowed me to increase
the available memory at the cost of wasted CPU resources.</span></p></div></div></div></blockquote><div>I suspect the problem is a massive oversubscription of threads by swr. The default behavior of swr is to use all available CPU cores on the node. However, when running multiple MPI processes per node, they have no way of knowing about each other. So if you've got 24 cores per node and run 24 pvservers, you'll end up with 24^2 = 576 rendering threads on a node; not so great. You can control this with the KNOB_MAX_WORKER_THREADS environment variable. Typically you'll want to set it to the number of cores per node divided by the number of processes per node your job is running. So if your node has 24 cores and you run 24 processes per node, then set KNOB_MAX_WORKER_THREADS to 1, but if you're running 4 processes per node, then set it to 6; you get the idea. That should address the virtual memory problem. It's a balance, since typically rendering will perform better with fewer ppn and more threads per process, but the filters, like Contour, parallelize at the MPI level and work better with a higher ppn.
You'll need to find the right balance for your use case depending on whether it's render-heavy or pipeline-processing heavy.<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr"><div id="m_8550353998187298182m_3945460361337263553divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif" dir="ltr">Would you also know if this known issue with the llvmpipe driver will be fixed in PV 5.3 (agreeing that the swr driver should be faster on Intel CPUs, provided it does not exhaust memory)?</div></div></div></blockquote><div><br></div><div>It's actually an Intel compiler bug and not a ParaView (or even Mesa, for that matter) issue, so probably not. It may be fixed in future releases of icc, but I wouldn't know without testing it.<br></div><br><br></div><div class="gmail_quote">- Chuck<br></div><br></div></div>
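P.S. In case it helps, here's a minimal sketch of the environment setup for a job script, assuming 24-core Haswell nodes and, as an example, 4 pvserver processes per node (the variable names and the launch line are placeholders; adapt them to your PBS setup):

```shell
# Sketch: configure swr for pvserver under MPI (example values, not a recipe)
CORES_PER_NODE=24   # physical cores per Haswell node (assumption for this example)
PPN=4               # MPI processes (pvservers) launched per node (example value)

export GALLIUM_DRIVER=swr                                   # select the swr OSMesa driver
export KNOB_MAX_WORKER_THREADS=$(( CORES_PER_NODE / PPN ))  # 24 / 4 = 6 threads per process

echo "swr worker threads per pvserver: $KNOB_MAX_WORKER_THREADS"
# mpirun launch line depends on your site; e.g. something along the lines of:
# mpirun -np <total ranks> pvserver ...
```

With 24 processes per node the same arithmetic gives KNOB_MAX_WORKER_THREADS=1, matching the numbers above; tune the ppn/threads split for render-heavy vs. filter-heavy workloads.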