[vtkusers] Large performance difference between vtkResampleWithDataSet in VTK 8.1.0 and Resample with Dataset filter in Paraview 5.5

Sujin Philip sujin.philip at kitware.com
Wed May 16 15:46:46 EDT 2018


It's not just on Windows. VTK 8.1.0 performs poorly on my Linux machine
also. In my earlier tests I was building the wrong version that I thought
had the old locator but actually had the new cell locator. That's why I was
seeing good performance. Once I correctly checked out and built version
8.1, I could see the poor performance.

Thanks
Sujin


On Wed, May 16, 2018 at 3:33 PM, Evan Kao <tossin at gmail.com> wrote:

> Hi Sujin,
>
> That makes sense.  I had a colleague run the test script on a Mac using
> VTK 8.0.1, and it was fast.
>
> Just curious, but why does the old cell locator perform so badly on
> Windows?
>
> - Evan
>
> On Wed, May 16, 2018 at 8:48 AM, Sujin Philip <sujin.philip at kitware.com>
> wrote:
>
>> Hi Evan,
>>
>> I was finally able to reproduce this. So this is due to the poor
>> performance of the old cell locator in 8.1.0. The new cell locator code is
>> in ParaView 5.5 and VTK master. Sorry about the confusion. Let me know if
>> you have any further questions.
>>
>> Thanks
>> Sujin
>>
>>
>> On Tue, May 15, 2018 at 6:28 PM, Evan Kao <tossin at gmail.com> wrote:
>>
>>> Hi Sujin,
>>>
>>> I thought I had put in the correct source/input, but I suppose I should
>>> have checked more closely.  Still, it didn't make that much of a difference
>>> (634s or 10.5 min).  I'll continue checking if this problem persists for me
>>> on other platforms.
>>>
>>> I also tried using vtkResampleWithDataSet inside a Programmable Filter
>>> in ParaView, and it performed quickly (0.98s).
>>>
>>> Is it possible to see the build flags for your version of VTK, or the
>>> ones that were used for the ParaView binaries?  Were you testing on
>>> VTK 8.1 or the latest version?
>>>
>>> Thanks,
>>> Evan Kao
>>>
>>> On Tue, May 15, 2018 at 2:36 PM, Sujin Philip <sujin.philip at kitware.com>
>>> wrote:
>>>
>>>> Hi Evan,
>>>>
>>>> Thanks for sharing the data. I tried your script on my Linux desktop
>>>> and the performance I see on both VTK and ParaView is similar and <1
>>>> second. This is even without threading enabled. I haven't tried this on
>>>> Windows yet.
>>>>
>>>> BTW, there is an error in the script you shared. The inputs to the
>>>> resample filter are in the wrong order and the
>>>> "vtkXMLUnstructuredGridWriter" throws an error saying that the data passed
>>>> to it is not an unstructured grid. I assume you want to resample the data
>>>> values from the structured grid on to the geometry provided by the
>>>> unstructured grid. The result will be an unstructured grid, which can be
>>>> written by the "vtkXMLUnstructuredGridWriter". For this the input should be
>>>> the "mesh" data and the source should be "image".
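The corrected wiring described above might look like the following sketch. It assumes the `vtk` Python package is installed; `resample_onto_mesh` is a hypothetical helper name, and `mesh` and `image` stand for the unstructured and structured datasets already loaded elsewhere. It is not the script attached earlier in the thread.

```python
def resample_onto_mesh(mesh, image):
    """Sample the values of `image` at the points of `mesh`.

    mesh  -- dataset that supplies the output geometry (the "input")
    image -- dataset that supplies the data values (the "source")
    """
    # Imported here so the helper can be defined even where VTK is absent.
    import vtk

    resample = vtk.vtkResampleWithDataSet()
    resample.SetInputData(mesh)    # geometry to resample onto
    resample.SetSourceData(image)  # values to be probed
    resample.Update()
    # The output has the same dataset type as the input (here, the mesh).
    return resample.GetOutput()
```

With an unstructured grid as `mesh`, the returned object is also an unstructured grid, so it can be handed to `vtkXMLUnstructuredGridWriter` through `SetInputData`.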
>>>>
>>>> So, currently I don't have a good explanation for what is causing the
>>>> performance degradation for you. It might be some issues with the builds,
>>>> or the Windows setup. You can maybe try building VTK yourself (make sure to
>>>> build in "Release" mode), or try another machine and see if the problem
>>>> persists. I will also try to reproduce this on a Windows machine.
>>>>
>>>> Thanks
>>>> Sujin
>>>>
>>>>
>>>> On Tue, May 15, 2018 at 4:47 PM, Evan Kao <tossin at gmail.com> wrote:
>>>>
>>>>> Hello Sujin,
>>>>>
>>>>> Using the TimerLog, I got the following time from ParaView:
>>>>>
>>>>> Execute vtkResampleWithDataSet id: 10345, 0.42 seconds
>>>>>
>>>>> As for the timeit module, you can see how I use it in the attached
>>>>> Python script.  I only use timeit's default_timer function to grab the time
>>>>> before and after completion of the vtkResampleWithDataSet method and take
>>>>> the difference as the time elapsed.  Regardless, qualitatively ParaView is
>>>>> near-instant while VTK takes a while.
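The timing approach described above can be sketched as follows. This is a minimal illustration rather than the attached script: `time_call` is a hypothetical helper, and `time.sleep` stands in for the filter's `Update()` call. `timeit.default_timer` only reads a clock, so the work runs once instead of being repeated as `timeit.timeit` does by default.

```python
import time
from timeit import default_timer


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed seconds)."""
    start = default_timer()            # clock reading before the call
    result = func(*args, **kwargs)
    elapsed = default_timer() - start  # difference is the wall-clock time
    return result, elapsed


if __name__ == "__main__":
    # Stand-in workload; replace with the call being measured.
    _, seconds = time_call(time.sleep, 0.05)
    print(f"elapsed: {seconds:.3f} s")
```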
>>>>>
>>>>> Google drive links to the datasets themselves are here (hopefully this
>>>>> doesn't trigger any mailing list filters): Unstructured Grid (35MB)
>>>>> <https://drive.google.com/open?id=1jvjiDlMJEJihB8OQneOeBzJXFiZKKYsR>
>>>>> | Structured Grid (70MB)
>>>>> <https://drive.google.com/open?id=1RYz4eORPWWf23n6G5am-9_F44zxEOHMl>
>>>>>
>>>>> If I get a chance, I'll take a look at using smaller data sets.
>>>>>
>>>>> - Evan
>>>>>
>>>>>
>>>>> On Tue, May 15, 2018 at 12:32 PM, Sujin Philip <
>>>>> sujin.philip at kitware.com> wrote:
>>>>>
>>>>>> Hi Evan,
>>>>>>
>>>>>> I tried testing this on my end and I am seeing expected performance
>>>>>> from VTK and ParaView. But the performance is dependent on the datasets
>>>>>> used. Is it possible for you to share your datasets and scripts with us?
>>>>>> Could you try this with smaller versions of your datasets and see if you
>>>>>> are able to reproduce this?
>>>>>>
>>>>>> I am not familiar with the timeit module in Python. From the
>>>>>> documentation it looks like it runs the code multiple times by default and
>>>>>> prints the total time. Can you confirm if you have taken this into
>>>>>> consideration in your script?
>>>>>>
>>>>>> A simple way to time operations in ParaView is to refer to the "Timer
>>>>>> Log" under the "Tools" menu. You should see a line like:
>>>>>>
>>>>>> Execute vtkResampleWithDataSet id: 6788, 2.70556 seconds
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Sujin
>>>>>>
>>>>>>
>>>>>> On Tue, May 15, 2018 at 1:05 PM, Evan Kao <tossin at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Shawn and Sujin,
>>>>>>>
>>>>>>> Thanks for the quick responses.  The CPU on the computer I'm using
>>>>>>> is an i7-6700
>>>>>>> <https://ark.intel.com/products/88196/Intel-Core-i7-6700-Processor-8M-Cache-up-to-4_00-GHz>
>>>>>>> with 4 cores, 8 threads, and 3.4 GHz frequency.
>>>>>>>
>>>>>>> Multi-threading may be a factor, but it's hard to tell because
>>>>>>> resampling in ParaView is so quick.  ParaView is capable of using 100% of
>>>>>>> the CPU, while VTK (in Python) will max out at 12-13%.  However, for these
>>>>>>> particular datasets, resampling doesn't appear to stress ParaView that much
>>>>>>> (11-16% when observing the Windows Task Manager, and some of that may be
>>>>>>> because of the rendering).  However, I was under the impression that at
>>>>>>> best multi-threading could only reduce the time by a factor of N threads (i.e.
>>>>>>> 8x), while the speed difference here is almost 1000x.  I measured the times
>>>>>>> for ParaView 5.5, VTK 8.1 (compiled elsewhere), and VTK 7.1 (compiled by
>>>>>>> our group):
>>>>>>>
>>>>>>>    1. ParaView 5.5 - 1.1s, using a stopwatch, multiple trials.
>>>>>>>    Timing started the moment I clicked "Apply".
>>>>>>>    2. VTK 8.1 - 922.47s, timed using Python's timeit module,
>>>>>>>    measuring only the vtkResampleWithDataSet.Update() method.
>>>>>>>    3. VTK 7.1 - 950.47s, timed the same way as above.
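As a quick check of the numbers in the list above, the sketch below computes the ratio of each VTK time to the ParaView time and compares it with the 8x ceiling that thread count alone would suggest. The values are copied from the list; the variable names are illustrative, and the ParaView figure came from a stopwatch, so the ratios are rough.

```python
# Timings reported in the thread, in seconds.
paraview_s = 1.1
vtk_times_s = {"VTK 8.1": 922.47, "VTK 7.1": 950.47}
hardware_threads = 8  # i7-6700: 4 cores, 8 threads

# How many times slower each VTK run was than ParaView.
slowdowns = {name: t / paraview_s for name, t in vtk_times_s.items()}

for name, ratio in slowdowns.items():
    # Factor left over after an ideal speedup across all hardware threads.
    beyond = ratio / hardware_threads
    print(f"{name}: about {ratio:.0f}x slower than ParaView, "
          f"about {beyond:.0f}x beyond an ideal {hardware_threads}-thread speedup")
```

Both ratios are far above 8, which is consistent with the observation that threading alone cannot account for the gap.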
>>>>>>>
>>>>>>> I'm aware of the difference in labeling between VTK and ParaView for
>>>>>>> Source and Input (which confuses me all the time).  I can verify the
>>>>>>> correct data sets were assigned by saving the output (which should be an
>>>>>>> unstructured grid) and viewing it in ParaView - it looks identical to the
>>>>>>> resampled data generated in ParaView (although it overwrites the point
>>>>>>> scalars array and adds some ghost information that needs to be removed).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Evan
>>>>>>>
>>>>>>> On Tue, May 15, 2018 at 7:38 AM, Sujin Philip <
>>>>>>> sujin.philip at kitware.com> wrote:
>>>>>>>
>>>>>>>> Hi Evan,
>>>>>>>>
>>>>>>>> As Shawn mentioned, it could be due to a lack of multi-threading.
>>>>>>>> Could you provide us the configuration of the system you are using? Like
>>>>>>>> the number of cores/threads and the CPU frequency? Also please share the
>>>>>>>> actual time that ParaView and VTK are taking. Is it possible for you to try
>>>>>>>> out a slightly older VTK version and see if the performance difference is
>>>>>>>> still there?
>>>>>>>>
>>>>>>>> Which dataset are you setting as input and which as source? The
>>>>>>>> names are unfortunately opposite between VTK and ParaView due to legacy
>>>>>>>> reasons. Probing with the unstructured grid as the source is much slower
>>>>>>>> than probing with the structured grid as the source. So please confirm that
>>>>>>>> the VTK pipeline is set up properly.
>>>>>>>>
>>>>>>>> Please let me know if none of these seem to be the cause of your
>>>>>>>> problem and I will dig deeper.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Sujin
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, May 15, 2018 at 9:52 AM, Shawn Waldon <
>>>>>>>> shawn.waldon at kitware.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Evan,
>>>>>>>>>
>>>>>>>>> I suspect the difference is that the ParaView binaries were
>>>>>>>>> compiled with TBB multithreading support and the Anaconda VTK was not.
>>>>>>>>> vtkResampleWithDataSet is set up to use TBB multithreading if available.
>>>>>>>>> Check the utilization of the cores on your computer when running each and
>>>>>>>>> you will see ParaView using all available cores and Anaconda's VTK probably
>>>>>>>>> only using one.  It is also possible the cell locator change improved
>>>>>>>>> things further but I'm not familiar with that.
>>>>>>>>>
>>>>>>>>> HTH,
>>>>>>>>>
>>>>>>>>> Shawn
>>>>>>>>>
>>>>>>>>> On Mon, May 14, 2018 at 7:54 PM, Evan Kao <tossin at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello all,
>>>>>>>>>>
>>>>>>>>>> I am trying to resample structured grid data (~1.4M points,
>>>>>>>>>> 1.3M cells) with an unstructured grid (~320K points, 480K cells).  In
>>>>>>>>>> ParaView 5.5, this resampling is nearly instant with the Resample With
>>>>>>>>>> Dataset filter.  Yet in a Python script using vtkResampleWithDataSet from
>>>>>>>>>> VTK 8.1.0, the same operation takes about 15 minutes (>2 orders of
>>>>>>>>>> magnitude difference in speed).  As far as I can tell from the VTK
>>>>>>>>>> repository on Gitlab, the only difference between the Paraview/release
>>>>>>>>>> version and the 8.1.0 or 8.1.1 tagged releases is a switch in the cell
>>>>>>>>>> locator.  Is this enough to explain the difference in the performance?  If
>>>>>>>>>> not, could someone enlighten me as to what the possible factors are here?
>>>>>>>>>>
>>>>>>>>>> Also, if it matters, this is all on a Windows 7 64-bit machine.
>>>>>>>>>> Paraview is installed from binaries, while VTK was downloaded from an
>>>>>>>>>> Anaconda distribution compiled by a third party.
>>>>>>>>>>
>>>>>>>>>> Thanks for your time,
>>>>>>>>>> Evan Kao
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Powered by www.kitware.com
>>>>>>>>>>
>>>>>>>>>> Visit other Kitware open-source projects at
>>>>>>>>>> http://www.kitware.com/opensource/opensource.html
>>>>>>>>>>
>>>>>>>>>> Please keep messages on-topic and check the VTK FAQ at:
>>>>>>>>>> http://www.vtk.org/Wiki/VTK_FAQ
>>>>>>>>>>
>>>>>>>>>> Search the list archives at: http://markmail.org/search/?q=vtkusers
>>>>>>>>>>
>>>>>>>>>> Follow this link to subscribe/unsubscribe:
>>>>>>>>>> https://vtk.org/mailman/listinfo/vtkusers
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

