This is a very tough nut to crack and, from my experience, is not completely solved in most systems.

You can and probably should put some lightweight sanity checks into the readers (and sprinkled elsewhere) to validate data. But believe me, this is not enough.
For example, one thing I've come across a lot is a polydata with a valid vtkCellArray and valid vtkPoints that still crashes. Why? Because the person writing the file thought that VTK is 1-indexed, so somewhere in the vtkCellArray there are cells that refer to a non-existent point, and crash!
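For concreteness, the kind of cheap check I have in mind is roughly the following. This is only a sketch: it uses the classic vtkCellArray traversal API, only walks the polys, and the function name is made up.

// Sketch: verify that every connectivity id in a polydata's polys
// refers to an existing point. Hypothetical helper, not a VTK class.
#include <vtkCellArray.h>
#include <vtkPolyData.h>

bool ConnectivityInRange(vtkPolyData* poly)
{
  vtkIdType numPts = poly->GetNumberOfPoints();
  vtkCellArray* polys = poly->GetPolys();

  vtkIdType npts, *pts;
  for (polys->InitTraversal(); polys->GetNextCell(npts, pts);)
  {
    for (vtkIdType i = 0; i < npts; ++i)
    {
      // A writer that assumed 1-indexing produces ids in [1, numPts],
      // so an id equal to numPts (or an id < 0) is the telltale sign.
      if (pts[i] < 0 || pts[i] >= numPts)
      {
        return false;
      }
    }
  }
  return true; // verts, lines and strips would need the same treatment
}

A check like this costs little compared to the I/O, which is why it is the sort of thing I'd put in the readers themselves.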
Okay, we can probably fix that, but what about the case of a badly ordered hex or wedge (negative Jacobian and/or a mangled cell)? A self-intersecting mesh? A zero-volume or zero-area cell? I could probably come up with a list of a dozen or more problems related to topology, geometry, and assumptions about attributes (normalized normals, etc.) that might bring down a particular filter. Some can be checked for pretty quickly, while others are very expensive to address and could badly impact the reader's performance. So if you put in a lot of validity checking, you then start hearing "VTK is slow..." You can't win :-)
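Some of these can be caught after the fact with existing tools; for example, vtkMeshQuality can be pressed into service to flag inverted or zero-volume hexes. A rough sketch (the measure and array names, and the API spellings, are from memory, so double-check them against the documentation):

// Sketch: count degenerate or inverted hexes by computing per-cell
// volume with vtkMeshQuality and scanning for non-positive values.
#include <vtkCellData.h>
#include <vtkDataArray.h>
#include <vtkMeshQuality.h>
#include <vtkSmartPointer.h>
#include <vtkUnstructuredGrid.h>

int CountBadHexes(vtkUnstructuredGrid* grid)
{
  vtkSmartPointer<vtkMeshQuality> quality =
    vtkSmartPointer<vtkMeshQuality>::New();
  quality->SetInputData(grid);
  quality->SetHexQualityMeasureToVolume(); // per-cell volume as the metric
  quality->Update();

  // vtkMeshQuality writes its result into a cell-data array named "Quality".
  vtkDataArray* q = quality->GetOutput()->GetCellData()->GetArray("Quality");

  int bad = 0;
  for (vtkIdType i = 0; q && i < q->GetNumberOfTuples(); ++i)
  {
    if (q->GetTuple1(i) <= 0.0) // zero-volume or inverted (negative) cell
    {
      ++bad;
    }
  }
  return bad;
}

But note that this walks every cell and does real geometry, so it is exactly the kind of check you would not want buried in a reader's inner loop.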
Not only that, but if we are not careful, the validity-checking code might be duplicated across multiple readers; you'd like to contain this somehow.

So I think it's a design balancing act: some code in the readers, and sprinkled through the code base, for quick sanity checks, with other heavy-duty filters/functions used for debugging (a sketch of such a filter follows).
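To make the "heavy-duty filter" idea concrete, what I picture is a pass-through filter that shallow-copies its input to its output and does nothing but complain. A minimal sketch, assuming the helper from the earlier sketch; the class name and the particular check are illustrative only, not an existing VTK class:

// Sketch of a pass-through "validator" filter that can be dropped into a
// pipeline while debugging and taken out (or disabled) once the data is
// known to be good. Hypothetical class, illustrative checks only.
#include <vtkInformation.h>
#include <vtkInformationVector.h>
#include <vtkObjectFactory.h>
#include <vtkPolyData.h>
#include <vtkPolyDataAlgorithm.h>

bool ConnectivityInRange(vtkPolyData* poly); // helper from the earlier sketch

class vtkPolyDataSanityCheck : public vtkPolyDataAlgorithm
{
public:
  static vtkPolyDataSanityCheck* New();
  vtkTypeMacro(vtkPolyDataSanityCheck, vtkPolyDataAlgorithm);

protected:
  int RequestData(vtkInformation*, vtkInformationVector** inputVector,
                  vtkInformationVector* outputVector) override
  {
    vtkPolyData* input = vtkPolyData::GetData(inputVector[0]);
    vtkPolyData* output = vtkPolyData::GetData(outputVector);

    // Cheap topological check; the expensive geometric ones (cell quality,
    // self-intersection) would also live here, guarded by user-set flags.
    if (!ConnectivityInRange(input))
    {
      vtkErrorMacro("Cell connectivity references non-existent points; "
                    "the file was probably written assuming 1-indexing.");
      return 0;
    }

    output->ShallowCopy(input); // pass the data through untouched
    return 1;
  }
};
vtkStandardNewMacro(vtkPolyDataSanityCheck);

Because it is pass-through, it costs next to nothing when the checks are disabled, and it can sit anywhere in the pipeline, not just behind the readers.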
Will

On Sun, Feb 14, 2010 at 2:15 PM, Berk Geveci <berk.geveci@kitware.com> wrote:
> On Sun, Feb 14, 2010 at 7:18 AM, Will Schroeder <will.schroeder@kitware.com> wrote:
>
>> Another option here is to create a filter (or family of filters) that can be inserted into the pipeline to check for validity (there could be different sorts of checks). The filter could be removed, or disabled with a flag, once the application is being tuned for performance and the data is known to be good. In my mind these validity checks are more like debugging tools and should be easy to remove when debugging is complete.
> I somewhat disagree. Yes, these are debugging tools, BUT they shouldn't always be optional. If I am a scientist who wants to load a dataset that I wrote using my own code, and ParaView crashes somewhere far, far downstream, how am I supposed to know that my file is wrong? I would probably send a message to the mailing list, then some poor developer would ask for the file, debug it, and find out that I don't know how to read the file format spec. That's assuming the file is not 20 GB or something. If, instead, the reader said:
>
>  - Hey dude, there is something wrong with this file, please read the file format spec. Also, to debug this further, you can use this nice program that can give you more details about the problem.
>
> then the user wouldn't even have to come to the mailing list.
>
> -berk
--
William J. Schroeder, PhD
Kitware, Inc.
28 Corporate Drive
Clifton Park, NY 12065
will.schroeder@kitware.com
http://www.kitware.com
(518) 881-4902