[Paraview] Slow XML readers
Berk Geveci
berk.geveci at kitware.com
Fri Jul 3 15:34:39 EDT 2009
Hi Bam,
You are partially correct. The output is the base64 encoding of
compressed binary data (not utf8 encoded). Still, your point is valid
and writing data this way is slow (and stupid). The XML writers can
write data in a few ways:
* Data is inlined in the XML section - the data, if binary, has to be
base64 encoded in this case because XML parsers cannot handle raw
binary data. I don't know why we even support this mode. Don't use it.
* Data is appended at the end of the XML section. Here you have the
option of compressing (using zlib) and base64 encoding the data. Why
would you do that? Maybe you want to upload it somewhere as valid XML
or something. I don't know.
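To give a rough sense of what the compress-and-encode path costs, here is a minimal sketch using only the Python standard library. It is not the VTK writer code: the helper names `encoding_overhead` and `decode_zlib_base64` and the sample payload are made up for illustration. It compares a raw float32 buffer with its base64 form and with its zlib-compressed, base64-encoded form, and shows the two steps a reader has to undo before it gets numbers back.

```python
import base64
import struct
import zlib


def encoding_overhead(values):
    """Return byte counts for three encodings of a list of floats.

    Hypothetical helper for illustration; not part of VTK or ParaView.
    """
    raw = struct.pack("<%df" % len(values), *values)  # little-endian float32
    b64 = base64.b64encode(raw)                       # XML-safe text form
    zb64 = base64.b64encode(zlib.compress(raw))       # compressed, then encoded
    return {"raw": len(raw), "base64": len(b64), "zlib_base64": len(zb64)}


def decode_zlib_base64(text, count):
    """Undo base64, then zlib, then unpack `count` float32 values."""
    raw = zlib.decompress(base64.b64decode(text))
    return list(struct.unpack("<%df" % count, raw))


if __name__ == "__main__":
    data = [i * 0.5 for i in range(3000)]
    print(encoding_overhead(data))
```

Base64 alone always grows the payload by about a third (4 output characters per 3 input bytes), and every value has to pass through a decode step on the way back in. Whether compression wins that space back depends entirely on the data.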
The issue was the fact that we did not expose the right options to
write data as appended, raw binary without compression and encoding. I
fixed this by exposing these settings in the user interface. As of 5
minutes ago, the default behavior is to write the data as appended
raw. I tested the performance. It is still not as good as the old
writers due to reasons too long to explain here but it is close. If
you want to try it out, you will have to build ParaView from source
using the CVS version.
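The "appended raw" layout can be pictured as a short XML header followed by one untouched block of bytes. The sketch below is a simplified stand-in, not the real VTK writer, and it is not guaranteed to produce a file ParaView will open. The element names only mimic the .vti layout as an assumption, and `write_appended_raw` and `read_appended_raw` are hypothetical helpers. It is meant to show why reading such a file can be a single bulk read with no per-character decoding.

```python
import struct

MARKER = b"_"  # assumed separator between the XML text and the raw block


def write_appended_raw(path, values):
    """Write a small XML-like header, then the floats as one raw block."""
    payload = struct.pack("<%df" % len(values), *values)  # float32, little-endian
    header = (
        b'<VTKFile type="ImageData" byte_order="LittleEndian">\n'
        b'  <DataArray type="Float32" format="appended" offset="0"/>\n'
        b'  <AppendedData encoding="raw">\n'
    )
    with open(path, "wb") as f:
        f.write(header)
        f.write(MARKER)
        f.write(struct.pack("<I", len(payload)))  # byte count precedes the data
        f.write(payload)                          # no base64, no zlib
        f.write(b"\n  </AppendedData>\n</VTKFile>\n")


def read_appended_raw(path):
    """Locate the raw block and unpack it in one step."""
    with open(path, "rb") as f:
        blob = f.read()
    # Search for the marker only after the AppendedData tag, since the
    # header text itself contains underscores (e.g. "byte_order").
    tag_pos = blob.index(b'encoding="raw">')
    start = blob.index(MARKER, tag_pos) + 1
    (nbytes,) = struct.unpack_from("<I", blob, start)
    data = blob[start + 4:start + 4 + nbytes]
    return list(struct.unpack("<%df" % (nbytes // 4), data))
```

The trade-off is that a file written this way is no longer valid XML, since the raw block can contain arbitrary bytes.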
Best,
-berk
On Thu, Jul 2, 2009 at 3:25 PM, Bam Ting <bampingting at gmail.com> wrote:
> I have noticed that when saving data from ParaView in the XML binary format,
> a binary file is not produced. Please check me on this! I think it is a bug!
>
> to reproduce:
>
> open ParaView
> sources->Mandelbrot
> dims 205,250,250
> File->Save Data-> vti, binary
> File->Save Data-> legacy binary
>
> look at the files produced, the XML vti is clearly not a binary file (and by
> the way on windows unix line feeds are used, not sure if that is
> desirable), while the legacy file clearly is a binary file.
>
> If I understand the XML reader/writer tuple correctly here is a potential
> performance caveat when using the XML reader/writer tuple: Be sure file is
> really binary!!!!
>
> When ParaView saves data as vti XML binary, the file is not really binary at
> all but rather base 64 encoded, utf8. This combination could not be slower!
> As the reader reads each character written this way, it first tests to see
> how many bytes the character has, then does the base 64 conversion to native
> float/double. Operating thus on each character is of course a terrible idea
> if one cares about performance!
>
>
> delete the pipeline
> Open both files
> tools->timer log
>
> You will likely see that the binary legacy file loads twice as fast as the
> XML.
>
> Experts, plz verify I am not smoking crack
> Thx
> Bam