[Paraview] Slow XML readers

Bam Ting bampingting at gmail.com
Fri Jul 3 18:31:52 EDT 2009


Hi Berk,

I was single-stepping through the code, and it looked to me like expat is
in fact treating the characters as UTF-8, even though ParaView is only using
ASCII characters. There is a test on each character. But this won't matter
anymore, as your fixes should remove expat from the equation when writing
binary data. As a new user I am finding the list invaluable. Thanks so much
for your help!

Bam
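
To make the per-character test concrete, here is a minimal Python sketch. It assumes the parser works out each character's UTF-8 length from its lead byte; the helper `utf8_sequence_length` and the sample payload are hypothetical and not taken from the parser or from ParaView. It only shows that base64 text is pure ASCII, so the test always gives the same answer but is still paid once per character.

```python
import base64
import struct


def utf8_sequence_length(lead_byte: int) -> int:
    """Return the byte length of a UTF-8 sequence, judged from its lead byte."""
    if lead_byte < 0x80:
        return 1  # plain ASCII
    if lead_byte >> 5 == 0b110:
        return 2
    if lead_byte >> 4 == 0b1110:
        return 3
    if lead_byte >> 3 == 0b11110:
        return 4
    raise ValueError("not a valid UTF-8 lead byte")


# Hypothetical payload: 1000 little-endian float32 values (4000 bytes).
payload = struct.pack("<1000f", *(i * 0.5 for i in range(1000)))
encoded = base64.b64encode(payload)

# Run the length test on every encoded character, as a UTF-8-aware
# reader would, and collect the distinct answers.
lengths = {utf8_sequence_length(b) for b in encoded}
print(lengths)       # {1}
print(len(encoded))  # 5336 characters for 4000 bytes of data
```

The set contains only 1 because the base64 alphabet is a subset of ASCII.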

On Fri, Jul 3, 2009 at 12:34 PM, Berk Geveci <berk.geveci at kitware.com> wrote:

> Hi Bam,
>
> You are partially correct. The output is the base64 encoding of
> compressed binary data (not utf8 encoded). Still, your point is valid
> and writing data this way is slow (and stupid). The XML writers can
> write data in a few ways:
>
> * Data is inlined in the XML section - the data, if binary, has to be
> base64 encoded in this case because XML parsers cannot handle raw
> binary data. I don't know why we even support this mode. Don't use it.
> * Data is appended at the end of the XML section. Here you have the
> option of compressing (using zlib) and base64 encoding the data. Why
> would you do that? Maybe you want to upload it somewhere as valid XML
> or something. I don't know.
>
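
The encoding stack described in the list above can be sketched with the Python standard library. This is an illustration only, under stated assumptions: the sample values, the variable names, and the helper `decode` are hypothetical, and the real writers add their own headers, so this is not the actual on-disk layout.

```python
import base64
import struct
import zlib

# Hypothetical data: 3000 float64 values, packed little-endian (24000 bytes).
values = [float(i % 17) for i in range(3000)]
raw = struct.pack(f"<{len(values)}d", *values)

# Base64 alone: valid inside XML, but 4 output characters per 3 input bytes.
plain_b64 = base64.b64encode(raw)

# zlib first, then base64: smaller text, but two decoding steps on read.
packed_b64 = base64.b64encode(zlib.compress(raw))


def decode(text: bytes, compressed: bool) -> list:
    """Undo the base64 step (and the zlib step, if used) and unpack the floats."""
    data = base64.b64decode(text)
    if compressed:
        data = zlib.decompress(data)
    return list(struct.unpack(f"<{len(data) // 8}d", data))


print(len(raw), len(plain_b64))  # 24000 32000
```

Both `decode(plain_b64, False)` and `decode(packed_b64, True)` return the original values; the price is the extra work on every read and write.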
> The issue was the fact that we did not expose the right options to
> write data as appended, raw binary without compression and encoding. I
> fixed this by exposing these settings in the user interface. As of 5
> minutes ago, the default behavior is to write the data as appended
> raw. I tested the performance. It is still not as good as the old
> writers due to reasons too long to explain here but it is close. If
> you want to try it out, you will have to build ParaView from source
> using the cvs version.
>
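
For contrast, here is a minimal sketch of what "raw, without compression and encoding" buys on the reading side. The layout is a deliberately simplified assumption (a 4-byte length prefix followed by the bytes); `write_block` and `read_block` are hypothetical helpers, not the writers' actual format or API.

```python
import io
import struct


def write_block(stream, values):
    """Write float32 values as raw bytes behind a 4-byte length prefix."""
    body = struct.pack(f"<{len(values)}f", *values)
    stream.write(struct.pack("<I", len(body)))  # hypothetical length prefix
    stream.write(body)


def read_block(stream):
    """Read one block back: no base64 step, no decompression, one unpack."""
    (nbytes,) = struct.unpack("<I", stream.read(4))
    body = stream.read(nbytes)
    return list(struct.unpack(f"<{nbytes // 4}f", body))


buf = io.BytesIO()
write_block(buf, [1.0, 2.5, -3.0])
buf.seek(0)
print(read_block(buf))  # [1.0, 2.5, -3.0]
```

The reader copies bytes straight into native floats, which is why this mode gets close to the legacy binary files.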
> Best,
> -berk
>
> On Thu, Jul 2, 2009 at 3:25 PM, Bam Ting<bampingting at gmail.com> wrote:
> > I have noticed that when saving data from ParaView in the XML binary format,
> > a binary file is not produced. Please check me on this! I think it is a bug!
> >
> >  to reproduce:
> >
> > open ParaView
> > sources->Mandelbrot
> > dims 205,250,250
> > File->Save Data-> vti, binary
> > File->Save Data-> legacy binary
> >
> > look at the files produced, the XML vti is clearly not a binary file (and by
> > the way, on Windows, Unix line feeds are used; not sure if that is
> > desirable), while the legacy file clearly is a binary file.
> >
> > If I understand the XML reader/writer tuple correctly, here is a potential
> > performance caveat when using it: be sure the file is really binary!!!!
> >
> > When ParaView saves data as vti XML binary, the file is not really binary at
> > all but rather base 64 encoded, utf8. This combination could not be slower!
> > As the reader reads each character written this way, it first tests to see
> > how many bytes the character has, then does the base 64 conversion to native
> > float/double. Operating thus on each character is of course a terrible idea
> > if one cares about performance!
> >
> >
> > delete the pipeline
> > Open both files
> > tools->timer log
> >
> > You will likely see that the binary legacy file loads twice as fast as the
> > XML.
> >
> > Experts, plz verify I am not smoking crack
> > Thx
> > Bam
>