[Insight-developers] Image writing with unicode filename impossible with MSVC

Tom Vercauteren tom.vercauteren at m4x.org
Tue Oct 20 13:40:07 EDT 2009


Hi all,

Thanks for your constructive feedback.

Benjamin and I have looked a bit further into this issue and into utfcpp.

Unfortunately utfcpp does not provide two features we would
really like, namely:

- It does not define a separate utf8 string class; it uses std::string
as a container

- It does not allow the creation of a utf8 encoded std::string from a
std::string encoded with the default encoding


That being said, we can still make efficient use of it. Here is a proposal:

1) We keep the current API that only allows users to set char* or
std::string filenames

2) We specify in the documentation that these strings have to be
encoded in utf8 on MSVC (as was already the case on utf8-based systems)

3) On MSVC, we use utfcpp to check whether the filename actually is
encoded in utf8 and we throw an exception otherwise

4) We write fopen-like functions in ITK (say itk::fopen) that work
with utf8 filenames (for MSVC, this will basically use utfcpp to
convert the utf8 encoded string to a utf16 encoded wstring and call
_wopen)

5) We use itk::fopen when possible instead of fopen
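
To make points 3 and 4 concrete, here is a rough, self-contained sketch.
It is an illustration under assumptions, not the code in the attachment
linked in this message: the namespace and function names are
hypothetical, and the utf8 check and utf8-to-utf16 conversion are
written by hand so that the example compiles without utfcpp. In the
proposal itself, utfcpp's utf8::is_valid and utf8::utf8to16 would play
that role. The sketch returns a FILE*, so it calls _wfopen rather than
_wopen.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace, not part of ITK

// Decode UTF-8 into UTF-16 code units. Returns false on malformed input:
// bad lead or continuation byte, truncated or overlong sequence,
// encoded surrogate, or code point above U+10FFFF.
inline bool Utf8ToUtf16(const std::string& in, std::u16string& out) {
  out.clear();
  for (std::size_t i = 0; i < in.size();) {
    const unsigned char b0 = static_cast<unsigned char>(in[i]);
    char32_t cp;
    std::size_t extra;  // number of continuation bytes expected
    char32_t lowest;    // smallest code point allowed for this length
    if (b0 < 0x80)                { cp = b0;        extra = 0; lowest = 0; }
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; lowest = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; lowest = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; lowest = 0x10000; }
    else return false;  // stray continuation byte or invalid lead byte
    if (in.size() - i - 1 < extra) return false;  // truncated sequence
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char b = static_cast<unsigned char>(in[i + k]);
      if ((b & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < lowest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
      return false;
    }
    if (cp < 0x10000) {
      out.push_back(static_cast<char16_t>(cp));
    } else {  // encode as a surrogate pair
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
    i += extra + 1;
  }
  return true;
}

// fopen-like wrapper taking a UTF-8 filename (stand-in for "itk::fopen").
// Throws if the name is not valid UTF-8, as in point 3.
inline std::FILE* OpenUtf8(const std::string& utf8Name, const char* mode) {
  std::u16string wide;
  if (!Utf8ToUtf16(utf8Name, wide)) {
    throw std::invalid_argument("filename is not valid UTF-8");
  }
#ifdef _WIN32
  const std::wstring wname(wide.begin(), wide.end());  // wchar_t is 16-bit here
  const std::string m(mode);
  const std::wstring wmode(m.begin(), m.end());        // mode strings are ASCII
  return _wfopen(wname.c_str(), wmode.c_str());
#else
  return std::fopen(utf8Name.c_str(), mode);  // assumes a utf8-based system
#endif
}

}  // namespace sketch
```

A caller would write sketch::OpenUtf8(name, "rb") wherever fopen(name,
"rb") is used today, which is the substitution point 5 describes.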

Some preliminary experiments are shown here:
http://public.kitware.com/Bug/file_download.php?file_id=2574&type=bug

The only drawback of this approach is that it is not strictly backward
compatible for MSVC. More specifically it will work as previously with
ASCII filenames but will not work without prior utf8 conversion for
non-ASCII filenames that could be represented in the local codepage.
We could of course add a cmake switch to maintain strict backward
compatibility if deemed necessary.
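
As an illustration of what "prior utf8 conversion" would mean for a
caller, here is a minimal sketch that assumes the filename is encoded in
ISO-8859-1 (Latin-1), where each byte maps directly to the Unicode code
point of the same value. The helper name is hypothetical. Real Windows
codepages generally differ from Latin-1 (Windows-1252, for instance,
assigns other characters to 0x80-0x9F), so on MSVC the conversion would
more plausibly go through MultiByteToWideChar with CP_ACP followed by
WideCharToMultiByte with CP_UTF8.

```cpp
#include <string>

// Hypothetical helper: re-encode an ISO-8859-1 (Latin-1) string as UTF-8.
inline std::string Latin1ToUtf8(const std::string& latin1) {
  std::string utf8;
  utf8.reserve(latin1.size());
  for (const char ch : latin1) {
    const unsigned char b = static_cast<unsigned char>(ch);
    if (b < 0x80) {
      // ASCII bytes are identical in both encodings.
      utf8.push_back(static_cast<char>(b));
    } else {
      // U+0080..U+00FF take two bytes in UTF-8: 110xxxxx 10xxxxxx.
      utf8.push_back(static_cast<char>(0xC0 | (b >> 6)));
      utf8.push_back(static_cast<char>(0x80 | (b & 0x3F)));
    }
  }
  return utf8;
}
```

With this, a Latin-1 name such as "caf\xE9.png" becomes
"caf\xC3\xA9.png", while a pure ASCII name passes through unchanged,
which is why ASCII filenames keep working as before.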

Thoughts?

Tom



On Tue, Oct 20, 2009 at 14:56, Brad King <brad.king at kitware.com> wrote:
> Sean McBride wrote:
>>
>> On 10/19/09 11:15 AM, Brad King said:
>>
>>> As the primary maintainer of KWSys I prefer to put as little
>>> in the library as possible.
>>
>> Perhaps I haven't been following closely enough, but do you mean you
>> wouldn't want to create a utf8 lib from scratch in KWSys or that you
>> don't even want a thin wrapper over utf-cpp in KWSys?
>
> Both.
>
> There is no reason to create a utf8 lib from scratch when there are
> plenty of third-party libraries available.  We cannot do a thin-wrapper
> because KWSys cannot have third-party dependencies.
>
> IMO KWSys already has too much.  Originally it was just supposed to avoid
> duplicate Kitware-written code that was copied between VTK and ITK.  It
> was a/my mistake to add things like the MD5 hash implementation to it.
>
>> If the latter, that means we'd end up with both a vtkUnicodeString and
>> itkUnicodeString?  What if CMake needs to process utf8?
>
> We already have zlib in all three projects, named vtkzlib, itkzlib, and
> cmzlib.  Each project mangles the symbols to avoid conflicts, and they
> all support sharing a system-installed version.
>
> -Brad
>

