[Insight-developers] Image writing with unicode filename impossible with MSVC

Bill Lorensen bill.lorensen at gmail.com
Mon Oct 26 16:38:12 EDT 2009


Tom,

Is there a way to try the portability of this solution without
touching any ITK classes? Can you check in a test that verifies the
functionality and portability of your proposed solution before we
commit to this solution?

Bill

On Mon, Oct 26, 2009 at 4:14 PM, Tom Vercauteren
<tom.vercauteren at m4x.org> wrote:
> Hi all,
>
> I'm back on the unicode filenames topic and really need your feedback
> before I start touching every IO class...
>
> I have uploaded a preliminary patch on the bug tracker that allows the
> use of utf-8 encoded strings on windows for several IO classes:
>  http://public.kitware.com/Bug/file_download.php?file_id=2601&type=bug
>
> Namely, what is working already is writing (and maybe reading) of
> unicode filenames on windows for the following formats:
> - jpeg
> - png
> - meta (mhd and mha)
> - tiff
>
> My approach was to convert the utf-8 encoded std::string to a utf-16
> encoded wstring (on windows only) when it becomes necessary. This is
> done using the utfcpp library:
>  http://utfcpp.sourceforge.net/
>
> For backward compatibility reasons, this conversion is activated by a
> cmake variable:
>  ITK_USE_REVIEW_UTF8_STRINGS
>
> For png, jpeg and tiff, no modifications were necessary to the
> underlying third party libraries.
>
> For metaio, one file had to be modified. For backward compatibility
> reasons, the new behavior is only activated if
>  METAIO_USE_REVIEW_UTF8_STRINGS
> is defined. Of course, turning ITK_USE_REVIEW_UTF8_STRINGS on in
> cmake turns METAIO_USE_REVIEW_UTF8_STRINGS on in the c++ code.
>
> Could you take a look at my preliminary patch and tell me if something
> along those lines could be accepted into ITK?
>
> Cheers,
> Tom
>
> On Tue, Oct 20, 2009 at 18:40, Tom Vercauteren <tom.vercauteren at m4x.org> wrote:
>> Hi all,
>>
>> Thanks for your constructive feedback.
>>
>> Benjamin and I have looked a bit further into this issue and into utfcpp.
>>
>> Unfortunately utfcpp does not provide the features we would
>> really like, namely:
>>
>> - It does not define a separate utf8 string class, it uses std::string
>> as a container
>>
>> - It does not allow the creation of a utf8 encoded std::string from a
>> std::string encoded with the default encoding
>>
>>
>> That being said, we can still make efficient use of it. Here is a proposal:
>>
>> 1) We keep the current API that only allows users to set char* or
>> std::string filenames
>>
>> 2) We specify in the documentation that these strings have to be
>> encoded in utf8 on MSVC (and other utf8-based systems as previously)
>>
>> 3) On MSVC, we use utfcpp to check whether the filename actually is
>> encoded in utf8 and we throw an exception otherwise
>>
>> 4) We write fopen-like functions in ITK (say itk::fopen) that work
>> with utf8 filenames (For MSVC, this will basically use utfcpp to
>> convert the utf8 encoded string to a utf16 encoded wstring and call
>> _wopen)
>>
>> 5) We use itk::fopen when possible instead of fopen
>>
>> Some preliminary experiments are shown here:
>> http://public.kitware.com/Bug/file_download.php?file_id=2574&type=bug
>>
>> The only drawback of this approach is that it is not strictly backward
>> compatible for MSVC. More specifically it will work as previously with
>> ASCII filenames but will not work without prior utf8 conversion for
>> non-ASCII filenames that could be represented in the local codepage.
>> We could of course add a cmake switch to maintain strict backward
>> compatibility if deemed necessary.
>>
>> Thoughts?
>>
>> Tom
>>
>>
>>
>> On Tue, Oct 20, 2009 at 14:56, Brad King <brad.king at kitware.com> wrote:
>>> Sean McBride wrote:
>>>>
>>>> On 10/19/09 11:15 AM, Brad King said:
>>>>
>>>>> As the primary maintainer of KWSys I prefer to put as little
>>>>> in the library as possible.
>>>>
>>>> Perhaps I haven't been following closely enough, but do you mean you
>>>> wouldn't want to create a utf8 lib from scratch in KWSys or that you
>>>> don't even want a thin wrapper over utf-cpp in KWSys?
>>>
>>> Both.
>>>
>>> There is no reason to create a utf8 lib from scratch when there are
>>> plenty of third-party libraries available.  We cannot do a thin-wrapper
>>> because KWSys cannot have third-party dependencies.
>>>
>>> IMO KWSys already has too much.  Originally it was just supposed to avoid
>>> duplicate Kitware-written code that was copied between VTK and ITK.  It
>>> was a/my mistake to add things like the MD5 hash implementation to it.
>>>
>>>> If the latter, that means we'd end up with both a vtkUnicodeString and
>>>> itkUnicodeString?  What if CMake needs to process utf8?
>>>
>>> We already have zlib in all three projects, named vtkzlib, itkzlib, and
>>> cmzlib.  Each project mangles the symbols to avoid conflicts, and they
>>> all support sharing a system-installed version.
>>>
>>> -Brad
>>>
>>
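
For reference, steps 3 and 4 of the quoted proposal (check that the
filename is valid utf8, convert it to utf16, then call the wide-character
open) could be sketched roughly as below. This is only an illustration
under stated assumptions, not the code from the patch: the namespace
"sketch" and the names Utf8ToUtf16 and FOpenUtf8 are hypothetical, the
hand-written decoder stands in for the utfcpp conversion so that the
example has no third-party dependency, and the Windows branch assumes
_wfopen from the Microsoft C runtime.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace, not part of ITK

// Decode a UTF-8 string into UTF-16 code units.
// Throws std::invalid_argument if the input is not well-formed UTF-8.
inline std::u16string Utf8ToUtf16(const std::string& in)
{
  std::u16string out;
  std::size_t i = 0;
  while (i < in.size())
  {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    std::uint32_t cp = 0;     // decoded code point
    std::size_t extra = 0;    // number of continuation bytes expected
    std::uint32_t minCp = 0;  // smallest code point allowed for this length
    if (lead < 0x80)                { cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; minCp = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; minCp = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; minCp = 0x10000; }
    else throw std::invalid_argument("invalid UTF-8 lead byte");

    if (i + extra >= in.size())
      throw std::invalid_argument("truncated UTF-8 sequence");
    for (std::size_t k = 1; k <= extra; ++k)
    {
      const unsigned char c = static_cast<unsigned char>(in[i + k]);
      if ((c & 0xC0) != 0x80)
        throw std::invalid_argument("invalid UTF-8 continuation byte");
      cp = (cp << 6) | (c & 0x3F);
    }
    i += extra + 1;

    // Reject overlong encodings, surrogates and out-of-range values.
    if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
      throw std::invalid_argument("invalid UTF-8 code point");

    if (cp < 0x10000)
    {
      out.push_back(static_cast<char16_t>(cp));
    }
    else
    {
      // Code points above the BMP become a surrogate pair.
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }
  return out;
}

// fopen-like helper taking a UTF-8 encoded filename.
// The name is validated on every platform; only Windows needs the
// UTF-16 form, other platforms pass the UTF-8 bytes straight through.
inline std::FILE* FOpenUtf8(const std::string& utf8Name, const char* mode)
{
  const std::u16string utf16 = Utf8ToUtf16(utf8Name);
#if defined(_WIN32)
  const std::wstring wname(utf16.begin(), utf16.end());
  const std::wstring wmode(mode, mode + std::strlen(mode));  // mode is ASCII
  return _wfopen(wname.c_str(), wmode.c_str());
#else
  (void)utf16;
  return std::fopen(utf8Name.c_str(), mode);
#endif
}

}  // namespace sketch
```

A caller would pass a utf8 encoded name, for example
sketch::FOpenUtf8(name, "wb"), and handle both a null return value and
the std::invalid_argument thrown for malformed input. Because nothing
here touches an ITK class, a helper of this kind could also be exercised
on its own by a small standalone test.
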

