VTK/improved unicode support

From KitwarePublic
Revision as of 17:56, 14 October 2010 by Prwolfe (talk | contribs)

Improved Unicode read/write support

A series of e-mails on the TitanDevelopers list this week produced several proposals for Unicode support, and a consensus was reached on October 7th, 2010. Since the basic functionality will be implemented in VTK, I am posting here to get feedback on the proposed method and classes. The upshot is that anyone not linking to Qt will get the same support as before, anyone linking to Qt will get about 25 new encodings, and anyone who needs support for an additional encoding can add it without modifying any of the readers or writers they wish to use.

Proposed method

Current readers and writers (mainly the delimited text reader/writer) either use code written specifically to handle UTF-16 and ASCII or use a third-party library for UTF-8 support. Knowledge of each encoding is written into each reader, so adding a new encoding means directly editing each class. Since we expect more readers and writers to require Unicode support, and since we also expect that new encodings will continue to be added to the Unicode standard, we propose adding a factory method that finds a codec for a given format, and modifying the readers to accept that codec for use in translation.

Factory Pattern

We plan on a factory pattern similar to the one used to discover SQL database capabilities in the vtkSQLDatabase class. Each codec class will register a callback with the factory through the constructor of a static object, so linking the class into a program is enough to make it discoverable.
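
The registration scheme described above can be sketched in standalone C++. This is only an illustration under stated assumptions, not the proposed VTK code: the names Codec, CodecFactory, Register, CodecForName, AsciiCodec and AsciiCodecRegistrar are hypothetical, and the real classes may differ in naming, ownership and signatures.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical codec interface, reduced to what the factory needs.
struct Codec {
    virtual ~Codec() = default;
    virtual bool CanHandle(const std::string& encoding) const = 0;
};

// Hypothetical factory: it keeps a list of creation callbacks and asks
// each created codec whether it handles the requested encoding.
class CodecFactory {
public:
    using CreateFn = std::function<std::unique_ptr<Codec>()>;

    static void Register(CreateFn fn) {
        assert(fn != nullptr);
        Callbacks().push_back(std::move(fn));
    }

    // Returns the first registered codec that handles the encoding,
    // or nullptr when none does.
    static std::unique_ptr<Codec> CodecForName(const std::string& encoding) {
        for (const auto& create : Callbacks()) {
            std::unique_ptr<Codec> codec = create();
            if (codec && codec->CanHandle(encoding)) {
                return codec;
            }
        }
        return nullptr;
    }

private:
    // A function-local static is constructed on first use, so the list
    // exists even when a registrar in another file runs first.
    static std::vector<CreateFn>& Callbacks() {
        static std::vector<CreateFn> callbacks;
        return callbacks;
    }
};

// An example codec (assumed name) that only claims US-ASCII.
struct AsciiCodec : Codec {
    bool CanHandle(const std::string& encoding) const override {
        return encoding == "US-ASCII";
    }
};

// The constructor of this static object runs during static
// initialization, before main(), and registers the codec.
struct AsciiCodecRegistrar {
    AsciiCodecRegistrar() {
        CodecFactory::Register(
            [] { return std::unique_ptr<Codec>(new AsciiCodec); });
    }
};
static AsciiCodecRegistrar asciiCodecRegistrar;
```

With this arrangement a reader would call something like CodecFactory::CodecForName("US-ASCII") and never needs to know which codec classes were linked in. One caveat of the approach in general: when codecs live in a static library, some linkers drop object files that nothing references, so the registrar may need to be referenced explicitly.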


Abstract Base class for codecs

All codecs will conform to an abstract base class so that a given reader/writer can treat them uniformly. Note that although the name is Codec, the main use at this point is decoding; functions to return ASCII, UTF-8, etc. will have to be added over time.
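
As a rough sketch of what such a base class could look like, the following standalone C++ defines a decode-only interface and one concrete codec. The names TextCodec, Name, CanHandle, ToUnicode, Latin1Codec and DecodeBytes are assumptions made for illustration, not the proposed VTK interface. ISO-8859-1 is used as the example because each of its bytes maps to the Unicode code point with the same value.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical abstract base class: decode-only for now, matching the
// note above that encoding functions would be added later.
class TextCodec {
public:
    virtual ~TextCodec() = default;

    // Canonical name of the encoding this codec implements.
    virtual const char* Name() const = 0;

    // True when this codec can decode the named encoding.
    virtual bool CanHandle(const std::string& encoding) const = 0;

    // Decode the rest of the stream into Unicode code points.
    virtual std::u32string ToUnicode(std::istream& in) const = 0;
};

// Example concrete codec: every ISO-8859-1 byte maps to the code point
// with the same numeric value.
class Latin1Codec : public TextCodec {
public:
    const char* Name() const override { return "ISO-8859-1"; }

    bool CanHandle(const std::string& encoding) const override {
        return encoding == Name();
    }

    std::u32string ToUnicode(std::istream& in) const override {
        std::u32string out;
        char byte;
        while (in.get(byte)) {
            // Go through unsigned char so bytes >= 0x80 stay positive.
            out.push_back(static_cast<unsigned char>(byte));
        }
        return out;
    }
};

// Helper that decodes an in-memory byte string through the base class,
// the way a reader would use any codec without knowing its type.
std::u32string DecodeBytes(const TextCodec* codec, const std::string& bytes) {
    assert(codec != nullptr);
    std::istringstream in(bytes);
    return codec->ToUnicode(in);
}
```

Returning the whole decoded text at once keeps the sketch short; a real interface would more likely stream code points to the caller so that large files need not be held in memory.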


Sample use of a codec for read and write

This is the main use case for my customer, so I used it to show how things will change. While this file currently resides in VTK/Infovis, the consensus was that it should be in IO. It is not being moved at this time, pending an upcoming build reorganization. The old codecs will be available no matter what; the new ones from Qt will show up when linking against the library that contains them (the Titan library will now have QtIO to handle the interface to its codecs).
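
To illustrate how a reader might accept a codec instead of hard-coding an encoding, here is a minimal standalone sketch. It is not the actual delimited text reader: TextCodec, AsciiCodec, DelimitedTextReader and ReadRecord are hypothetical names, the whole stream is treated as a single record, and only the read side is shown.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical decode-only codec interface.
struct TextCodec {
    virtual ~TextCodec() = default;
    virtual std::u32string ToUnicode(std::istream& in) const = 0;
};

// Example codec: 7-bit ASCII. Bytes outside the ASCII range are
// replaced with U+FFFD, the Unicode replacement character.
struct AsciiCodec : TextCodec {
    std::u32string ToUnicode(std::istream& in) const override {
        std::u32string out;
        char byte;
        while (in.get(byte)) {
            const unsigned char value = static_cast<unsigned char>(byte);
            out.push_back(value < 0x80 ? static_cast<char32_t>(value)
                                       : U'\uFFFD');
        }
        return out;
    }
};

// Hypothetical reader: it knows about delimiters but nothing about
// encodings, and delegates all decoding to the codec it was given.
class DelimitedTextReader {
public:
    explicit DelimitedTextReader(const TextCodec* codec) : Codec(codec) {
        assert(codec != nullptr);
    }

    // Decode the whole stream and split it on the delimiter.
    // Empty fields are kept.
    std::vector<std::u32string> ReadRecord(std::istream& in,
                                           char32_t delimiter) const {
        const std::u32string text = Codec->ToUnicode(in);
        std::vector<std::u32string> fields(1);
        for (char32_t c : text) {
            if (c == delimiter) {
                fields.emplace_back();  // start the next field
            } else {
                fields.back().push_back(c);
            }
        }
        return fields;
    }

private:
    const TextCodec* Codec;  // not owned; must outlive the reader
};
```

Because the reader splits decoded code points rather than raw bytes, the same splitting logic would work unchanged with a multi-byte encoding; only the codec passed to the constructor would differ.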


Please feel free to comment/kibitz!

--prwolfe 14:49, 7 October 2010 (EDT)