[vtkusers] Large performance difference between vtkResampleWithDataSet in VTK 8.1.0 and Resample with Dataset filter in Paraview 5.5

Sujin Philip sujin.philip at kitware.com
Wed May 16 16:56:06 EDT 2018


Hi David,

Yes, the new code is not in 8.1.1. It's there in master.

Thanks
Sujin


On Wed, May 16, 2018 at 4:42 PM, David Gobbi <david.gobbi at gmail.com> wrote:

> Does 8.1.1 also have the bad cell locator code?
>
>
> On Wed, May 16, 2018 at 1:46 PM, Sujin Philip <sujin.philip at kitware.com>
> wrote:
>
>> It's not just on Windows. VTK 8.1.0 performs poorly on my Linux machine
>> also. In my earlier tests I was building the wrong version that I thought
>> had the old locator but actually had the new cell locator. That's why I was
>> seeing good performance. Once I correctly checked out and built version
>> 8.1, I could see the poor performance.
>>
>> Thanks
>> Sujin
>>
>>
>> On Wed, May 16, 2018 at 3:33 PM, Evan Kao <tossin at gmail.com> wrote:
>>
>>> Hi Sujin,
>>>
>>> That makes sense.  I had a colleague run the test script on a Mac using
>>> VTK 8.0.1, and it was fast.
>>>
>>> Just curious, but why does the old cell locator perform so badly on
>>> Windows?
>>>
>>> - Evan
>>>
>>> On Wed, May 16, 2018 at 8:48 AM, Sujin Philip <sujin.philip at kitware.com>
>>> wrote:
>>>
>>>> Hi Evan,
>>>>
>>>> I was finally able to reproduce this. So this is due to the poor
>>>> performance of the old cell locator in 8.1.0. The new cell locator code is
>>>> in ParaView 5.5 and VTK master. Sorry about the confusion. Let me know if
>>>> you have any further questions.
>>>>
>>>> Thanks
>>>> Sujin
>>>>
>>>>
>>>> On Tue, May 15, 2018 at 6:28 PM, Evan Kao <tossin at gmail.com> wrote:
>>>>
>>>>> Hi Sujin,
>>>>>
>>>>> I thought I had put in the correct source/input, but I suppose I
>>>>> should have checked more closely.  Still, it didn't make that much of a
>>>>> difference (634s or 10.5 min).  I'll continue checking if this problem
>>>>> persists for me on other platforms.
>>>>>
>>>>> I also tried using vtkResampleWithDataSet inside a Programmable Filter
>>>>> in ParaView, and it performed quickly (0.98s).
>>>>>
>>>>> Is it possible to see the build flags for your version of VTK, or the
>>>>> ones that were used for the ParaView binaries?  Were you testing on
>>>>> VTK 8.1 or the latest version?
>>>>>
>>>>> Thanks,
>>>>> Evan Kao
>>>>>
>>>>> On Tue, May 15, 2018 at 2:36 PM, Sujin Philip <
>>>>> sujin.philip at kitware.com> wrote:
>>>>>
>>>>>> Hi Evan,
>>>>>>
>>>>>> Thanks for sharing the data. I tried your script on my Linux desktop
>>>>>> and the performance I see on both VTK and ParaView is similar and <1
>>>>>> second. This is even without threading enabled. I haven't tried this on
>>>>>> Windows yet.
>>>>>>
>>>>>> BTW, there is an error in the script you shared. The inputs to the
>>>>>> resample filter are in the wrong order and the
>>>>>> "vtkXMLUnstructuredGridWriter" throws an error saying that the data passed
>>>>>> to it is not an unstructured grid. I assume you want to resample the data
>>>>>> values from the structured grid on to the geometry provided by the
>>>>>> unstructured grid. The result will be an unstructured grid, which can be
>>>>>> written by the "vtkXMLUnstructuredGridWriter". For this the input should be
>>>>>> the "mesh" data and the source should be "image".
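For reference, a minimal sketch of the ordering described above, assuming `mesh` (the unstructured grid) and `image` (the structured grid) are already-loaded VTK data objects. The function name `resample_onto_mesh` is hypothetical and not taken from the original script; only the input/source assignment reflects what the message says.

```python
def resample_onto_mesh(mesh, image):
    """Resample data values from `image` onto the geometry of `mesh`.

    `mesh` and `image` are assumed to be VTK data objects that were
    loaded elsewhere (for example with the XML readers).
    """
    # Imported here so this sketch can be loaded without VTK installed.
    import vtk

    resampler = vtk.vtkResampleWithDataSet()
    # Input: the geometry to sample at (the unstructured "mesh").
    resampler.SetInputData(mesh)
    # Source: the dataset that provides the values (the structured "image").
    resampler.SetSourceData(image)
    resampler.Update()
    # The output takes the input's type, so an unstructured grid here.
    return resampler.GetOutput()
```

With this ordering the result should be an unstructured grid, which is the type that vtkXMLUnstructuredGridWriter accepts.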
>>>>>>
>>>>>> So, currently I don't have a good explanation for what is causing the
>>>>>> performance degradation for you. It might be some issues with the builds,
>>>>>> or the Windows setup. You can maybe try building VTK yourself (make sure to
>>>>>> build in "Release" mode), or try another machine and see if the problem
>>>>>> persists. I will also try to reproduce this on a Windows machine.
>>>>>>
>>>>>> Thanks
>>>>>> Sujin
>>>>>>
>>>>>>
>>>>>> On Tue, May 15, 2018 at 4:47 PM, Evan Kao <tossin at gmail.com> wrote:
>>>>>>
>>>>>>> Hello Sujin,
>>>>>>>
>>>>>>> Using the TimerLog, I got the following time from ParaView:
>>>>>>>
>>>>>>> Execute vtkResampleWithDataSet id: 10345, 0.42 seconds
>>>>>>>
>>>>>>> As for the timeit module, you can see how I use it in the attached
>>>>>>> Python script.  I only use timeit's default_timer function to grab the time
>>>>>>> before and after completion of the vtkResampleWithDataSet method and take
>>>>>>> the difference as the time elapsed.  Regardless, qualitatively ParaView is
>>>>>>> near-instant while VTK takes a while.
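The timing approach described above could look roughly like the sketch below. The attached script is not reproduced in this thread, so the helper name `time_once` is an assumption for illustration; the only part taken from the message is the use of `timeit.default_timer` before and after a single call.

```python
from timeit import default_timer


def time_once(func, *args, **kwargs):
    """Call `func` exactly once and return (result, elapsed_seconds)."""
    start = default_timer()
    result = func(*args, **kwargs)
    elapsed = default_timer() - start
    return result, elapsed


# Stand-in workload; in the real script this would be the filter's Update().
result, seconds = time_once(sum, range(1000))
print("elapsed: %.6f s" % seconds)
```

Because the call runs only once, this avoids the repeated-execution behaviour of `timeit.timeit`, which is the concern raised earlier in the thread.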
>>>>>>>
>>>>>>> Google drive links to the datasets themselves are here (hopefully
>>>>>>> this doesn't trigger any mailing list filters): Unstructured Grid
>>>>>>> (35MB)
>>>>>>> <https://drive.google.com/open?id=1jvjiDlMJEJihB8OQneOeBzJXFiZKKYsR>
>>>>>>> | Structured Grid (70MB)
>>>>>>> <https://drive.google.com/open?id=1RYz4eORPWWf23n6G5am-9_F44zxEOHMl>
>>>>>>>
>>>>>>> If I get a chance, I'll take a look at using smaller data sets.
>>>>>>>
>>>>>>> - Evan
>>>>>>>
>>>>>>>
>>>>>>> On Tue, May 15, 2018 at 12:32 PM, Sujin Philip <
>>>>>>> sujin.philip at kitware.com> wrote:
>>>>>>>
>>>>>>>> Hi Evan,
>>>>>>>>
>>>>>>>> I tried testing this on my end and I am seeing expected performance
>>>>>>>> from VTK and ParaView. But the performance is dependent on the datasets
>>>>>>>> used. Is it possible for you to share your datasets and scripts with us?
>>>>>>>> Could you try this with smaller versions of your datasets and see if you
>>>>>>>> are able to reproduce this?
>>>>>>>>
>>>>>>>> I am not familiar with the timeit module in Python. From the
>>>>>>>> documentation it looks like it runs the code multiple times by default and
>>>>>>>> prints the total time. Can you confirm if you have taken this into
>>>>>>>> consideration in your script?
>>>>>>>>
>>>>>>>> A simple way to time operations in ParaView is to refer to the
>>>>>>>> "Timer Log" under the "Tools" menu. You should see a line like:
>>>>>>>>
>>>>>>>> Execute vtkResampleWithDataSet id: 6788, 2.70556 seconds
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Sujin
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, May 15, 2018 at 1:05 PM, Evan Kao <tossin at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Shawn and Sujin,
>>>>>>>>>
>>>>>>>>> Thanks for the quick responses.  The CPU on the computer I'm using
>>>>>>>>> is an i7-6700
>>>>>>>>> <https://ark.intel.com/products/88196/Intel-Core-i7-6700-Processor-8M-Cache-up-to-4_00-GHz>
>>>>>>>>> with 4 cores, 8 threads, and 3.4 GHz frequency.
>>>>>>>>>
>>>>>>>>> Multi-threading may be a factor, but it's hard to tell because
>>>>>>>>> resampling in ParaView is so quick.  ParaView is capable of using 100% of
>>>>>>>>> the CPU, while VTK (in Python) will max out at 12-13%.  However, for these
>>>>>>>>> particular datasets, resampling doesn't appear to stress ParaView that much
>>>>>>>>> (11-16% when observing the Windows Task Manager, and some of that may be
>>>>>>>>> because of the rendering).  However, I was under the impression that at
>>>>>>>>> best multi-threading could only reduce the time it takes by a factor of N
>>>>>>>>> threads (i.e., 8x), while the speed difference here is almost 1000x.  I measured the times
>>>>>>>>> for ParaView 5.5, VTK 8.1 (compiled elsewhere), and VTK 7.1 (compiled by
>>>>>>>>> our group):
>>>>>>>>>
>>>>>>>>>    1. ParaView 5.5 - 1.1s, using a stopwatch, multiple trials.
>>>>>>>>>    Timing started the moment I clicked "Apply".
>>>>>>>>>    2. VTK 8.1 - 922.47s, timed using Python's timeit module,
>>>>>>>>>    measuring only the vtkResampleWithDataSet.Update() method.
>>>>>>>>>    3. VTK 7.1 - 950.47s, timed the same way as above.
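As a quick check on the numbers listed above, the short sketch below computes the ratios. The variable names are illustrative only; the timings and the count of 8 hardware threads are the ones reported in this message.

```python
# Timings reported above, in seconds.
paraview_55_s = 1.1
vtk_81_s = 922.47
vtk_71_s = 950.47

# The i7-6700 exposes 8 hardware threads, so an ideal threading
# speedup is at most about 8x.
hardware_threads = 8

speedup_81 = vtk_81_s / paraview_55_s
speedup_71 = vtk_71_s / paraview_55_s

# True only if perfect threading alone could account for the gap.
threading_alone_explains = speedup_81 <= hardware_threads

print("VTK 8.1 vs ParaView 5.5: %.0fx" % speedup_81)
print("VTK 7.1 vs ParaView 5.5: %.0fx" % speedup_71)
print("Explained by threading alone:", threading_alone_explains)
```

Both ratios come out at several hundred times, far above the 8x ceiling, which is why threading by itself does not account for the difference.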
>>>>>>>>>
>>>>>>>>> I'm aware of the difference in labeling between VTK and ParaView
>>>>>>>>> for Source and Input (which confuses me all the time).  I can verify the
>>>>>>>>> correct data sets were assigned by saving the output (which should be an
>>>>>>>>> unstructured grid) and viewing it in ParaView - it looks identical to the
>>>>>>>>> resampled data generated in ParaView (although it overwrites the point
>>>>>>>>> scalars array and adds some ghost information that needs to be removed).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Evan
>>>>>>>>>
>>>>>>>>> On Tue, May 15, 2018 at 7:38 AM, Sujin Philip <
>>>>>>>>> sujin.philip at kitware.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Evan,
>>>>>>>>>>
>>>>>>>>>> As Shawn mentioned it could be due to lack of multi-threading.
>>>>>>>>>> Could you provide us the configuration of the system you are using? Like
>>>>>>>>>> the number of cores/threads and the CPU frequency? Also please share the
>>>>>>>>>> actual time that ParaView and VTK are taking. Is it possible for you to try
>>>>>>>>>> out a slightly older VTK version and see if the performance difference is
>>>>>>>>>> still there?
>>>>>>>>>>
>>>>>>>>>> Which dataset are you setting as input and which as source? The
>>>>>>>>>> names are unfortunately opposite between VTK and ParaView due to legacy
>>>>>>>>>> reasons. Probing with the unstructured grid as the source is much slower
>>>>>>>>>> than probing with the structured grid as the source. So please confirm that
>>>>>>>>>> the VTK pipeline is set up properly.
>>>>>>>>>>
>>>>>>>>>> Please let me know if none of these seem to be the cause of your
>>>>>>>>>> problem and I will dig deeper.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Sujin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, May 15, 2018 at 9:52 AM, Shawn Waldon <
>>>>>>>>>> shawn.waldon at kitware.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Evan,
>>>>>>>>>>>
>>>>>>>>>>> I suspect the difference is that the ParaView binaries were
>>>>>>>>>>> compiled with TBB multithreading support and the Anaconda VTK was not.
>>>>>>>>>>> vtkResampleWithDataSet is set up to use TBB multithreading if available.
>>>>>>>>>>> Check the utilization of the cores on your computer when running each and
>>>>>>>>>>> you will see ParaView using all available cores and Anaconda's VTK probably
>>>>>>>>>>> only using one.  It is also possible the cell locator change improved
>>>>>>>>>>> things further but I'm not familiar with that.
>>>>>>>>>>>
>>>>>>>>>>> HTH,
>>>>>>>>>>>
>>>>>>>>>>> Shawn
>>>>>>>>>>>
>>>>>>>>>>> On Mon, May 14, 2018 at 7:54 PM, Evan Kao <tossin at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> I am trying to resample a structured grid data (~1.4M points,
>>>>>>>>>>>> 1.3M cells) with an unstructured grid (~320K points, 480K cells).  In
>>>>>>>>>>>> Paraview 5.5, this resampling is nearly instant with the Resample With
>>>>>>>>>>>> Dataset filter.  Yet in a Python script using vtkResampleWithDataSet from
>>>>>>>>>>>> VTK 8.1.0, the same operation takes about 15 minutes (>2 orders of
>>>>>>>>>>>> magnitude difference in speed).  As far as I can tell from the VTK
>>>>>>>>>>>> repository on Gitlab, the only difference between the Paraview/release
>>>>>>>>>>>> version and the 8.1.0 or 8.1.1 tagged releases is a switch in the cell
>>>>>>>>>>>> locator.  Is this enough to explain the difference in the performance?  If
>>>>>>>>>>>> not, could someone enlighten me as to what the possible factors are here?
>>>>>>>>>>>>
>>>>>>>>>>>> Also, if it matters, this is all on a Windows 7 64-bit
>>>>>>>>>>>> machine.  Paraview is installed from binaries, while VTK was downloaded
>>>>>>>>>>>> from an Anaconda distribution compiled by a third party.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your time,
>>>>>>>>>>>> Evan Kao
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Powered by www.kitware.com
>>>>>>>>>>>>
>>>>>>>>>>>> Visit other Kitware open-source projects at
>>>>>>>>>>>> http://www.kitware.com/opensource/opensource.html
>>>>>>>>>>>>
>>>>>>>>>>>> Please keep messages on-topic and check the VTK FAQ at:
>>>>>>>>>>>> http://www.vtk.org/Wiki/VTK_FAQ
>>>>>>>>>>>>
>>>>>>>>>>>> Search the list archives at: http://markmail.org/search/?q=
>>>>>>>>>>>> vtkusers
>>>>>>>>>>>>
>>>>>>>>>>>> Follow this link to subscribe/unsubscribe:
>>>>>>>>>>>> https://vtk.org/mailman/listinfo/vtkusers
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>