[vtkusers] Large performance difference between vtkResampleWithDataSet in VTK 8.1.0 and Resample with Dataset filter in Paraview 5.5

Evan Kao tossin at gmail.com
Wed May 16 15:33:03 EDT 2018


Hi Sujin,

That makes sense.  I had a colleague run the test script on a Mac using VTK
8.0.1, and it was fast.

Just curious, but why does the old cell locator perform so badly on Windows?

- Evan

On Wed, May 16, 2018 at 8:48 AM, Sujin Philip <sujin.philip at kitware.com>
wrote:

> Hi Evan,
>
> I was finally able to reproduce this. So this is due to the poor
> performance of the old cell locator in 8.1.0. The new cell locator code is
> in ParaView 5.5 and VTK master. Sorry about the confusion. Let me know if
> you have any further questions.
>
> Thanks
> Sujin
>
>
> On Tue, May 15, 2018 at 6:28 PM, Evan Kao <tossin at gmail.com> wrote:
>
>> Hi Sujin,
>>
>> I thought I had put in the correct source/input, but I suppose I should
>> have checked more closely.  Still, it didn't make that much of a difference
>> (634s or 10.5 min).  I'll continue checking if this problem persists for me
>> on other platforms.
>>
>> I also tried using vtkResampleWithDataSet inside a Programmable Filter in
>> ParaView, and it performed quickly (0.98s).
>>
>> Is it possible to see the build flags for your version of VTK, or the
>> ones that were used for the ParaView binaries?  Were you testing on
>> VTK 8.1 or the latest version?
>>
>> Thanks,
>> Evan Kao
>>
>> On Tue, May 15, 2018 at 2:36 PM, Sujin Philip <sujin.philip at kitware.com>
>> wrote:
>>
>>> Hi Evan,
>>>
>>> Thanks for sharing the data. I tried your script on my Linux desktop and
>>> the performance I see on both VTK and ParaView is similar and <1 second.
>>> This is even without threading enabled. I haven't tried this on Windows yet.
>>>
>>> BTW, there is an error in the script you shared. The inputs to the
>>> resample filter are in the wrong order and the
>>> "vtkXMLUnstructuredGridWriter" throws an error saying that the data passed
>>> to it is not an unstructured grid. I assume you want to resample the data
>>> values from the structured grid onto the geometry provided by the
>>> unstructured grid. The result will be an unstructured grid, which can be
>>> written by the "vtkXMLUnstructuredGridWriter". For this the input should be
>>> the "mesh" data and the source should be "image".
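A minimal sketch of the input/source ordering described above, assuming the VTK Python bindings are installed and that `mesh` is the unstructured grid and `image` is the structured grid. The helper names `resample_onto_mesh` and `write_unstructured_grid` are hypothetical and are not taken from the script discussed in the thread.

```python
def resample_onto_mesh(mesh, image):
    """Sample the data values of `image` onto the geometry of `mesh`.

    Hypothetical helper: `mesh` is assumed to be a vtkUnstructuredGrid and
    `image` a structured dataset, as in the thread.
    """
    import vtk  # imported lazily so this module loads without VTK installed

    resampler = vtk.vtkResampleWithDataSet()
    resampler.SetInputData(mesh)    # input: geometry that the output keeps
    resampler.SetSourceData(image)  # source: data values to sample from
    resampler.Update()
    return resampler.GetOutput()


def write_unstructured_grid(grid, path):
    """Write `grid` to `path` using the XML unstructured grid writer."""
    import vtk

    writer = vtk.vtkXMLUnstructuredGridWriter()
    writer.SetFileName(path)
    writer.SetInputData(grid)
    writer.Write()
```

With this ordering the output has the same type as the input (an unstructured grid), which is what "vtkXMLUnstructuredGridWriter" expects.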
>>>
>>> So, currently I don't have a good explanation for what is causing the
>>> performance degradation for you. It might be some issues with the builds,
>>> or the Windows setup. You can maybe try building VTK yourself (make sure to
>>> build in "Release" mode), or try another machine and see if the problem
>>> persists. I will also try to reproduce this on a Windows machine.
>>>
>>> Thanks
>>> Sujin
>>>
>>>
>>> On Tue, May 15, 2018 at 4:47 PM, Evan Kao <tossin at gmail.com> wrote:
>>>
>>>> Hello Sujin,
>>>>
>>>> Using the TimerLog, I got the following time from ParaView:
>>>>
>>>> Execute vtkResampleWithDataSet id: 10345, 0.42 seconds
>>>>
>>>> As for the timeit module, you can see how I use it in the attached
>>>> Python script.  I only use timeit's default_timer function to grab the time
>>>> before and after completion of the vtkResampleWithDataSet method and take
>>>> the difference as the time elapsed.  Regardless, qualitatively ParaView is
>>>> near-instant while VTK takes a while.
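A minimal sketch of the single-run timing approach described above, assuming only the Python standard library. The helper name `time_once` and the stand-in workload are hypothetical; in the thread the timed call was the Update() method of vtkResampleWithDataSet.

```python
from timeit import default_timer


def time_once(fn, *args, **kwargs):
    """Call fn exactly once and return (result, elapsed_seconds)."""
    start = default_timer()
    result = fn(*args, **kwargs)
    elapsed = default_timer() - start
    return result, elapsed


# Stand-in workload (an assumption for illustration only).
result, elapsed = time_once(sum, range(1_000_000))
print(f"elapsed: {elapsed:.4f} s")
```

Because `default_timer` is only read before and after one call, the code runs once, unlike `timeit.timeit`, which repeats the statement many times by default.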
>>>>
>>>> Google Drive links to the datasets themselves are here (hopefully this
>>>> doesn't trigger any mailing list filters):
>>>>
>>>> Unstructured Grid (35MB):
>>>> <https://drive.google.com/open?id=1jvjiDlMJEJihB8OQneOeBzJXFiZKKYsR>
>>>> Structured Grid (70MB):
>>>> <https://drive.google.com/open?id=1RYz4eORPWWf23n6G5am-9_F44zxEOHMl>
>>>>
>>>> If I get a chance, I'll take a look at using smaller data sets.
>>>>
>>>> - Evan
>>>>
>>>>
>>>> On Tue, May 15, 2018 at 12:32 PM, Sujin Philip <
>>>> sujin.philip at kitware.com> wrote:
>>>>
>>>>> Hi Evan,
>>>>>
>>>>> I tried testing this on my end and I am seeing expected performance
>>>>> from VTK and ParaView. But the performance is dependent on the datasets
>>>>> used. Is it possible for you to share your datasets and scripts with us?
>>>>> Could you try this with smaller versions of your datasets and see if you
>>>>> are able to reproduce this?
>>>>>
>>>>> I am not familiar with the timeit module in Python. From the
>>>>> documentation it looks like it runs the code multiple times by default and
>>>>> prints the total time. Can you confirm if you have taken this into
>>>>> consideration in your script?
>>>>>
>>>>> A simple way to time operations in ParaView is to refer to the "Timer
>>>>> Log" under the "Tools" menu. You should see a line like:
>>>>>
>>>>> Execute vtkResampleWithDataSet id: 6788, 2.70556 seconds
>>>>>
>>>>>
>>>>> Thanks
>>>>> Sujin
>>>>>
>>>>>
>>>>> On Tue, May 15, 2018 at 1:05 PM, Evan Kao <tossin at gmail.com> wrote:
>>>>>
>>>>>> Hi Shawn and Sujin,
>>>>>>
>>>>>> Thanks for the quick responses.  The CPU on the computer I'm using is
>>>>>> an i7-6700
>>>>>> <https://ark.intel.com/products/88196/Intel-Core-i7-6700-Processor-8M-Cache-up-to-4_00-GHz>
>>>>>> with 4 cores, 8 threads, and 3.4 GHz frequency.
>>>>>>
>>>>>> Multi-threading may be a factor, but it's hard to tell because
>>>>>> resampling in ParaView is so quick.  ParaView is capable of using 100% of
>>>>>> the CPU, while VTK (in Python) will max out at 12-13%.  For these
>>>>>> particular datasets, though, resampling doesn't appear to stress ParaView
>>>>>> much (11-16% when observing the Windows Task Manager, and some of that may
>>>>>> be because of the rendering).  Moreover, I was under the impression that at
>>>>>> best multi-threading could only reduce the time by a factor of N threads
>>>>>> (i.e. 8x), while the speed difference here is almost 1000x.  I measured the
>>>>>> times for ParaView 5.5, VTK 8.1 (compiled elsewhere), and VTK 7.1 (compiled
>>>>>> by our group):
>>>>>>
>>>>>>    1. ParaView 5.5 - 1.1s, using a stopwatch, multiple trials.
>>>>>>    Timing started the moment I clicked "Apply".
>>>>>>    2. VTK 8.1 - 922.47s, timed using Python's timeit module,
>>>>>>    measuring only the vtkResampleWithDataSet.Update() method.
>>>>>>    3. VTK 7.1 - 950.47s, timed the same way as above.
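The argument above can be checked with a short calculation. This is a sketch that assumes ideal linear scaling as the upper bound for multi-threading and uses only the timings and thread count reported in this message; the variable names are illustrative.

```python
# Timings reported in the thread, in seconds.
paraview_s = 1.1
vtk_81_s = 922.47
vtk_71_s = 950.47

# i7-6700: 4 cores, 8 hardware threads. Under ideal linear scaling,
# threading alone could account for at most an 8x difference.
hardware_threads = 8

ratio_81 = vtk_81_s / paraview_s
ratio_71 = vtk_71_s / paraview_s

# Slowdown left over after granting threading its best-case speedup.
unexplained_81 = ratio_81 / hardware_threads

print(f"VTK 8.1 vs ParaView 5.5: {ratio_81:.0f}x slower")
print(f"VTK 7.1 vs ParaView 5.5: {ratio_71:.0f}x slower")
print(f"Not attributable to threading (VTK 8.1): {unexplained_81:.0f}x")
```

Even with perfect scaling across all 8 threads, a slowdown of roughly two orders of magnitude remains, so threading alone cannot account for the gap.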
>>>>>>
>>>>>> I'm aware of the difference in labeling between VTK and ParaView for
>>>>>> Source and Input (which confuses me all the time).  I can verify the
>>>>>> correct data sets were assigned by saving the output (which should be an
>>>>>> unstructured grid) and viewing it in ParaView - it looks identical to the
>>>>>> resampled data generated in ParaView (although it overwrites the point
>>>>>> scalars array and adds some ghost information that needs to be removed).
>>>>>>
>>>>>> Thanks,
>>>>>> Evan
>>>>>>
>>>>>> On Tue, May 15, 2018 at 7:38 AM, Sujin Philip <
>>>>>> sujin.philip at kitware.com> wrote:
>>>>>>
>>>>>>> Hi Evan,
>>>>>>>
>>>>>>> As Shawn mentioned it could be due to lack of multi-threading. Could
>>>>>>> you provide us the configuration of the system you are using? Like the
>>>>>>> number of cores/threads and the CPU frequency? Also please share the actual
>>>>>>> time that ParaView and VTK are taking. Is it possible for you to try out a
>>>>>>> slightly older VTK version and see if the performance difference is still
>>>>>>> there?
>>>>>>>
>>>>>>> Which dataset are you setting as input and which as source? The
>>>>>>> names are unfortunately opposite between VTK and ParaView due to legacy
>>>>>>> reasons. Probing with the unstructured grid as the source is much slower
>>>>>>> than probing with the structured grid as the source. So please confirm that
>>>>>>> the VTK pipeline is set up properly.
>>>>>>>
>>>>>>> Please let me know if none of these seem to be the cause of your
>>>>>>> problem and I will dig deeper.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Sujin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, May 15, 2018 at 9:52 AM, Shawn Waldon <
>>>>>>> shawn.waldon at kitware.com> wrote:
>>>>>>>
>>>>>>>> Hi Evan,
>>>>>>>>
>>>>>>>> I suspect the difference is that the ParaView binaries were compiled
>>>>>>>> with TBB multithreading support and the Anaconda VTK was not.
>>>>>>>> vtkResampleWithDataSet is set up to use TBB multithreading if available.
>>>>>>>> Check the utilization of the cores on your computer when running each and
>>>>>>>> you will see ParaView using all available cores and Anaconda's VTK probably
>>>>>>>> only using one.  It is also possible the cell locator change improved
>>>>>>>> things further but I'm not familiar with that.
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>>
>>>>>>>> Shawn
>>>>>>>>
>>>>>>>> On Mon, May 14, 2018 at 7:54 PM, Evan Kao <tossin at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I am trying to resample structured grid data (~1.4M points, 1.3M
>>>>>>>>> cells) with an unstructured grid (~320K points, 480K cells).  In ParaView
>>>>>>>>> 5.5, this resampling is nearly instant with the Resample With Dataset
>>>>>>>>> filter.  Yet in a Python script using vtkResampleWithDataSet from VTK
>>>>>>>>> 8.1.0, the same operation takes about 15 minutes (>2 orders of magnitude
>>>>>>>>> difference in speed).  As far as I can tell from the VTK repository on
>>>>>>>>> GitLab, the only difference between the Paraview/release version and the
>>>>>>>>> 8.1.0 or 8.1.1 tagged releases is a switch in the cell locator.  Is this
>>>>>>>>> enough to explain the difference in the performance?  If not, could someone
>>>>>>>>> enlighten me as to what the possible factors are here?
>>>>>>>>>
>>>>>>>>> Also, if it matters, this is all on a Windows 7 64-bit machine.
>>>>>>>>> ParaView is installed from binaries, while VTK was downloaded from an
>>>>>>>>> Anaconda distribution compiled by a third party.
>>>>>>>>>
>>>>>>>>> Thanks for your time,
>>>>>>>>> Evan Kao
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Powered by www.kitware.com
>>>>>>>>>
>>>>>>>>> Visit other Kitware open-source projects at
>>>>>>>>> http://www.kitware.com/opensource/opensource.html
>>>>>>>>>
>>>>>>>>> Please keep messages on-topic and check the VTK FAQ at:
>>>>>>>>> http://www.vtk.org/Wiki/VTK_FAQ
>>>>>>>>>
>>>>>>>>> Search the list archives at: http://markmail.org/search/?q=vtkusers
>>>>>>>>>
>>>>>>>>> Follow this link to subscribe/unsubscribe:
>>>>>>>>> https://vtk.org/mailman/listinfo/vtkusers
>>>>>>>>>
>>>>>>>>>