ID: 0009623
Project: ITK
View Status: public
Date Submitted: 2009-09-30 10:15
Last Update: 2012-02-17 09:47
Reporter: Benjamin Tourne
Assigned To: Brad King
Priority: normal
Severity: major
Reproducibility: always
Status: assigned
Resolution: open
Summary: 0009623: ITK cannot use Unicode filenames on Visual Studio
Description:
Hi all,

It seems that the ITK API does not support Unicode filenames in the Visual Studio environment.

Here is a patch for ITK with a new test called itkImageFileWriterUnicodeTest.
This test tries to create an ImageFileWriter object with a file name that contains the Greek letter 'α' (lowercase alpha). This is done in 4 different ways:

1 Using the SetFileName(std::string filename, ...) function
2 Using the SetFileName(char * filename, ...) function
3 Using SetFileName with a UTF-8 encoded char array

(The last test is for Visual Studio users only:)
4 Using SetFileName with an array encoded in the Windows system locale code page.

On Visual Studio, the 4 tests fail: The file cannot be created, or the file is created with an incorrect name.

A good solution to this problem would be to add an overloaded function
SetFileName(wchar_t* filename) or SetFileName(std::wstring filename).

Best regards,

Benjamin Tourne.
Additional Information: The patch with the new test is attached to this report.
Tags: Unicode, visual
Attached Files:
itk-unicodewritetest-2009-09-30.patch (8,153 bytes) 2009-09-30 10:15
utfcpptest.zip (11,535 bytes) 2009-10-20 13:22
itk-msvc-unicode-2009-10-26.patch (70,177 bytes) 2009-10-26 15:56
itkUnicodeIOTest.zip (13,460 bytes) 2009-10-27 14:46
itkUnicodeIOTest-2009-11-02.zip (5,097 bytes) 2009-11-02 13:04
itkUnicodeIOTest.cxx (13,128 bytes) 2009-11-11 08:38
itk-unicodeio-2010-01-12.patch (19,782 bytes) 2010-01-12 06:38

  Notes
(0017844)
Tom Vercauteren (developer)
2009-09-30 12:05

For the record, on Linux, this unit test (itk-unicodewritetest-2009-09-30.patch) passes without issue.
(0018243)
Tom Vercauteren (developer)
2009-10-26 15:58

I have attached for review a preliminary patch (itk-msvc-unicode-2009-10-26.patch) that allows the use of utf-8 encoded filenames on windows for the following formats:
- jpeg
- png
- meta (mhd and mha)
- tiff

Feedback is welcome!
(0018248)
Brad King (manager)
2009-10-27 10:26

I applied itk-msvc-unicode-2009-10-26.patch locally and scrolled through the changes. I think blocks like

+#ifdef _MSC_VER
+ // Convert to utf16

should test _WIN32 instead...we want to convert to utf16 and use the wide character *windows* API. TIFFOpenW does this already.
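
To illustrate the point about guarding on _WIN32 rather than _MSC_VER, here is a minimal sketch of converting a UTF-8 name to UTF-16 and handing it to the wide-character C runtime. The names Utf8ToUtf16 and OpenUtf8 are hypothetical and are not taken from the patch; the decoder is deliberately simple and does not reject overlong encodings. It also assumes the Windows toolchain declares _wfopen, which a later note reports is the case for MSVC and MinGW.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#ifdef _WIN32
#  include <wchar.h> // _wfopen (assumed to be provided by the C runtime)
#endif

// Hypothetical helper, not from the patch: decode UTF-8 into UTF-16 code units.
// Validation is minimal: bad lead bytes, bad continuation bytes, truncated
// sequences and code points above U+10FFFF are rejected; overlong forms are not.
std::u16string Utf8ToUtf16(const std::string & utf8)
{
  std::u16string out;
  std::size_t    i = 0;
  while (i < utf8.size())
  {
    const unsigned char lead = static_cast<unsigned char>(utf8[i]);
    char32_t            cp = 0;
    std::size_t         extra = 0; // number of continuation bytes expected
    if (lead < 0x80)               { cp = lead; }
    else if ((lead >> 5) == 0x06)  { cp = lead & 0x1F; extra = 1; }
    else if ((lead >> 4) == 0x0E)  { cp = lead & 0x0F; extra = 2; }
    else if ((lead >> 3) == 0x1E)  { cp = lead & 0x07; extra = 3; }
    else { throw std::invalid_argument("invalid UTF-8 lead byte"); }

    if (i + extra >= utf8.size())
    {
      throw std::invalid_argument("truncated UTF-8 sequence");
    }
    for (std::size_t k = 1; k <= extra; ++k)
    {
      const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
      if ((c & 0xC0) != 0x80)
      {
        throw std::invalid_argument("invalid UTF-8 continuation byte");
      }
      cp = (cp << 6) | (c & 0x3F);
    }
    if (cp > 0x10FFFF)
    {
      throw std::invalid_argument("code point out of range");
    }
    i += extra + 1;

    if (cp < 0x10000)
    {
      out.push_back(static_cast<char16_t>(cp));
    }
    else
    {
      // Encode as a surrogate pair.
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }
  return out;
}

// Hypothetical wrapper: the guard is _WIN32, so every Windows compiler that
// ships the wide C runtime takes the UTF-16 path, not only MSVC.
std::FILE * OpenUtf8(const std::string & utf8Name, const char * mode)
{
  assert(mode != nullptr);
#ifdef _WIN32
  const std::u16string units = Utf8ToUtf16(utf8Name);
  const std::wstring   wname(units.begin(), units.end()); // wchar_t is 16 bits here
  const std::string    narrowMode(mode);
  const std::wstring   wmode(narrowMode.begin(), narrowMode.end());
  return _wfopen(wname.c_str(), wmode.c_str());
#else
  return std::fopen(utf8Name.c_str(), mode); // UTF-8 names pass through unchanged
#endif
}
```

A caller would write something like OpenUtf8("\xCE\xB1.mhd", "rb") and get the same behavior on every platform that reaches one of the two branches.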
(0018249)
Brad King (manager)
2009-10-27 10:28

For reference, here is the mailing list thread in which this bug is discussed:

  http://www.itk.org/mailman/private/insight-developers/2009-October/013464.html
(0018251)
Tom Vercauteren (developer)
2009-10-27 14:47

Apparently things are a bit more complex than I thought.

* cygwin (latest stable version) has no unicode support at all:
  * _wfopen and _wunlink are NOT available
  * std::ofstream and std::ifstream have NO open(wchar_t * filename) function

* mingw (latest stable version) has partial unicode support:
  * _wfopen and _wunlink are available
  * std::ofstream and std::ifstream have NO open(wchar_t * filename) function

My proposal fully works only on MSVC. Making it work on mingw will require more changes to metaio, i.e. moving from std::ofstream and std::ifstream to a FILE * approach.

The attached test project (itkUnicodeIOTest.zip) shows the results of my experiments.
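
The findings in this note can be summarized as a compile-time classification. The sketch below is only an illustration: the enum, its enumerators and both function names are hypothetical, and the mapping simply restates what was observed above for the compiler versions tested at the time.

```cpp
#include <cassert>

// Hypothetical classification of the wide-character file I/O available.
enum class WideIoSupport
{
  Utf8Native,    // narrow APIs accept UTF-8 names directly (e.g. Linux)
  WideStreams,   // _wfopen plus fstream::open(const wchar_t *)
  WideCFileOnly, // _wfopen and _wunlink, but no wide fstream::open
  None           // neither is available
};

inline WideIoSupport DetectWideIoSupport()
{
#if defined(_MSC_VER)
  return WideIoSupport::WideStreams;
#elif defined(__MINGW32__)
  return WideIoSupport::WideCFileOnly;
#elif defined(__CYGWIN__)
  return WideIoSupport::None; // as reported above for the stable cygwin release
#else
  return WideIoSupport::Utf8Native;
#endif
}

inline const char * Describe(WideIoSupport support)
{
  switch (support)
  {
    case WideIoSupport::Utf8Native:    return "UTF-8 names work with the narrow APIs";
    case WideIoSupport::WideStreams:   return "wide C runtime and wide fstream::open";
    case WideIoSupport::WideCFileOnly: return "wide C runtime only, FILE * required";
    case WideIoSupport::None:          return "no wide-character file API";
  }
  assert(false && "unhandled WideIoSupport value");
  return "unknown";
}
```

Code that needs streams on the WideCFileOnly branch would have to go through FILE * or a file descriptor, which is the metaio change mentioned above.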
(0018252)
Brad King (manager)
2009-10-27 15:13

FYI, I was able to read a unicode filename with the GNU compiler and C++ streams on cygwin like this:

$ cat myfile.txt
hello, world
$ cat stdio_filebuf.cxx
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/fcntl.h>
#include <ext/stdio_filebuf.h>
#include <iostream>
#include <io.h>

int main()
{
  int fd = _wopen(L"myfile.txt", O_RDONLY);
  __gnu_cxx::stdio_filebuf<char> ibuf(fd, std::ios::in);
  std::istream in(&ibuf);
  std::cout << in.rdbuf();
  return 0;
}
$ g++ -mno-cygwin stdio_filebuf.cxx
$ ./a.exe
hello, world

I think it also works with stdio.h-style C FILE* buffers. However, if there is no _wfopen then that may not be an option.
(0018256)
Tom Vercauteren (developer)
2009-10-28 10:59

Thanks for the information Brad!

I managed to use this code with MinGW but not with Cygwin (with gcc 4). On Cygwin I get:
  '_wopen' was not declared in this scope
(This is exactly the same as what I got for _wfopen).

This seems to be related to the -mno-cygwin flag:
  http://www.delorie.com/howto/cygwin/mno-cygwin-howto.html
However, adding the -mno-cygwin flag leads to
  g++: The -mno-cygwin flag has been removed; use a mingw-targeted cross-compiler.

This is apparently a known issue with cygwin's gcc-4:
  http://cygwin.com/ml/cygwin/2009-10/msg00061.html

Anyhow, I also found an alternative to __gnu_cxx::stdio_filebuf that consists of a single header file and is apparently portable to the platforms that we target:
  http://www.josuttis.com/cppcode/fdstream.html

More experimenting is required, I'll keep information coming on the bug tracker when I get some more time to work on it.
(0018318)
Tom Vercauteren (developer)
2009-11-02 13:16

I have been experimenting with fdstream. fdstream allows the creation of an istream or ostream from a file descriptor. It seems to work just fine on all platforms I tried (32-bit Linux with gcc, Windows with MSVC, cygwin's gcc and mingw).

Therefore, if a file with a Unicode-encoded filename can be opened, performing IO operations on a stream should work.

The attached test (itkUnicodeIOTest-2009-11-02.zip) showed that IO operations on files with Unicode filenames work on:
* linux
* windows with MSVC
* windows with MinGW


-----
Note that cygwin doesn't work. This does not contradict Brad's experiment because adding the -mno-cygwin flag to cygwin's compiler essentially turns the compiler into the mingw compiler as __MINGW32__ becomes defined and __CYGWIN__ becomes undefined:

/cygdrive/c/cygwin/bin/gcc-3.exe -mno-cygwin -dM -E - < /dev/null | sort

#define WIN32 1
#define WINNT 1
#define _WIN32 1
#define _X86_ 1
#define __CHAR_BIT__ 8
#define __DBL_DENORM_MIN__ 4.9406564584124654e-324
#define __DBL_DIG__ 15
#define __DBL_EPSILON__ 2.2204460492503131e-16
#define __DBL_HAS_INFINITY__ 1
#define __DBL_HAS_QUIET_NAN__ 1
#define __DBL_MANT_DIG__ 53
#define __DBL_MAX_10_EXP__ 308
#define __DBL_MAX_EXP__ 1024
#define __DBL_MAX__ 1.7976931348623157e+308
#define __DBL_MIN_10_EXP__ (-307)
#define __DBL_MIN_EXP__ (-1021)
#define __DBL_MIN__ 2.2250738585072014e-308
#define __DECIMAL_DIG__ 21
#define __FINITE_MATH_ONLY__ 0
#define __FLT_DENORM_MIN__ 1.40129846e-45F
#define __FLT_DIG__ 6
#define __FLT_EPSILON__ 1.19209290e-7F
#define __FLT_EVAL_METHOD__ 2
#define __FLT_HAS_INFINITY__ 1
#define __FLT_HAS_QUIET_NAN__ 1
#define __FLT_MANT_DIG__ 24
#define __FLT_MAX_10_EXP__ 38
#define __FLT_MAX_EXP__ 128
#define __FLT_MAX__ 3.40282347e+38F
#define __FLT_MIN_10_EXP__ (-37)
#define __FLT_MIN_EXP__ (-125)
#define __FLT_MIN__ 1.17549435e-38F
#define __FLT_RADIX__ 2
#define __GNUC_MINOR__ 4
#define __GNUC_PATCHLEVEL__ 4
#define __GNUC__ 3
#define __GXX_ABI_VERSION 1002
#define __INT_MAX__ 2147483647
#define __LDBL_DENORM_MIN__ 3.64519953188247460253e-4951L
#define __LDBL_DIG__ 18
#define __LDBL_EPSILON__ 1.08420217248550443401e-19L
#define __LDBL_HAS_INFINITY__ 1
#define __LDBL_HAS_QUIET_NAN__ 1
#define __LDBL_MANT_DIG__ 64
#define __LDBL_MAX_10_EXP__ 4932
#define __LDBL_MAX_EXP__ 16384
#define __LDBL_MAX__ 1.18973149535723176502e+4932L
#define __LDBL_MIN_10_EXP__ (-4931)
#define __LDBL_MIN_EXP__ (-16381)
#define __LDBL_MIN__ 3.36210314311209350626e-4932L
#define __LONG_LONG_MAX__ 9223372036854775807LL
#define __LONG_MAX__ 2147483647L
#define __MINGW32__ 1
#define __MSVCRT__ 1
#define __NO_INLINE__ 1
#define __PTRDIFF_TYPE__ int
#define __REGISTER_PREFIX__
#define __SCHAR_MAX__ 127
#define __SHRT_MAX__ 32767
#define __SIZE_TYPE__ unsigned int
#define __STDC_HOSTED__ 1
#define __USER_LABEL_PREFIX__ _
#define __USING_SJLJ_EXCEPTIONS__ 1
#define __VERSION__ "3.4.4 (cygming special, gdc 0.12, using dmd 0.125)"
#define __WCHAR_MAX__ 65535U
#define __WCHAR_TYPE__ short unsigned int
#define __WIN32 1
#define __WIN32__ 1
#define __WINT_TYPE__ unsigned int
#define __cdecl __attribute__((__cdecl__))
#define __declspec(x) __attribute__((x))
#define __fastcall __attribute__((__fastcall__))
#define __i386 1
#define __i386__ 1
#define __stdcall __attribute__((__stdcall__))
#define __tune_i686__ 1
#define __tune_pentiumpro__ 1
#define _cdecl __attribute__((__cdecl__))
#define _fastcall __attribute__((__fastcall__))
#define _stdcall __attribute__((__stdcall__))
#define i386 1
(0018319)
Brad King (manager)
2009-11-02 13:26

Does "-mwin32" help on cygwin?
(0018320)
Tom Vercauteren (developer)
2009-11-02 14:01

Unfortunately, adding the "-mwin32" flag does not help on cygwin. As far as I understand it, this is really a cygwin limitation that cannot be overcome. See also this (old) email thread:
http://www.mail-archive.com/cygwin@cygwin.com/msg66767.html
(0018326)
Tom Vercauteren (developer)
2009-11-03 03:22

Good news for cygwin. The new 1.7 version that is currently in beta gets closer to the linux/mac behavior. Namely, the default encoding for filenames is set to utf-8 and things work out of the box (as on linux and mac).
http://cygwin.com/1.7/cygwin-ug-net/ov-new1.7.html
(0019094)
Tom Vercauteren (developer)
2010-01-12 06:43

In an attempt to move a little further on this issue, I would like to put all the helper functions from my unit test
  http://www.itk.org/cgi-bin/viewcvs.cgi/Testing/Code/IO/itkUnicodeIOTest.cxx?root=Insight&sortby=date&view=markup
into one header file. I was thinking of using Code/Common/itkI18nIOHelpers.h and putting the functions in the itk::I18n namespace:
  https://public.kitware.com/Bug/file/2757/itk-unicodeio-2010-01-12.patch

Thoughts?
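
As an illustration of the kind of header-only helper such a file could hold, here is a sketch of a UTF-8 validity check. Which functions the patch actually contains is not shown in this thread, so the namespace sketch::i18n and the name IsValidUtf8 are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch
{
namespace i18n
{
// Returns true if the string is well-formed UTF-8: no truncated sequences,
// no overlong encodings, no surrogates and nothing above U+10FFFF.
inline bool IsValidUtf8(const std::string & s)
{
  std::size_t i = 0;
  while (i < s.size())
  {
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    std::size_t         len = 0;
    unsigned long       cp = 0;
    unsigned long       smallest = 0; // smallest code point allowed for this length
    if (b0 < 0x80)                { len = 1; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; smallest = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; smallest = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; smallest = 0x10000; }
    else { return false; } // stray continuation byte or invalid lead byte

    if (s.size() - i < len)
    {
      return false; // sequence runs past the end of the string
    }
    for (std::size_t k = 1; k < len; ++k)
    {
      const unsigned char b = static_cast<unsigned char>(s[i + k]);
      if ((b & 0xC0) != 0x80)
      {
        return false;
      }
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < smallest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    {
      return false;
    }
    i += len;
  }
  assert(i == s.size());
  return true;
}
} // namespace i18n
} // namespace sketch
```

A check like this lets a caller distinguish a UTF-8 filename from one encoded in a local code page before attempting any conversion.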
(0019097)
Brad King (manager)
2010-01-12 09:06

Fine with me.

BTW, I noticed the use of the "boost" namespace in "itkExtHdrs/fdstream.hpp". When the header was only included in .cxx files that was okay. Now that it may be included through a header we need to be more careful about conflicts. If an application really tries to use boost or has its own version of that header it may conflict. Can you move the code into an itk namespace?
(0019099)
Tom Vercauteren (developer)
2010-01-12 09:58

fdstream.hpp is not part of boost (http://www.josuttis.com/cppcode/fdstream.html). It has been proposed to boost but was not accepted. Anyway, changing the namespace to itk also seems cleaner to me. I'll commit it together with itkI18nIOHelpers.h tomorrow if I don't get any negative feedback from the itk list.
(0019527)
edice (reporter)
2010-02-15 01:20

Can this work be extended to add Unicode support to all of the VTK file readers, e.g. vtkPNGReader?

It would be good to be able to pass a std::wstring.
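
One way a std::wstring overload could sit on top of a UTF-8 based implementation is sketched below. This is an assumption about a possible design, not VTK or ITK API: WideToUtf8 and FileNameHolder are hypothetical names, and unpaired surrogates are encoded as-is without validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encode a wide string as UTF-8. Works whether wchar_t
// holds UTF-16 code units (Windows) or whole code points (most other systems).
inline std::string WideToUtf8(const std::wstring & wide)
{
  std::string out;
  for (std::size_t i = 0; i < wide.size(); ++i)
  {
    unsigned long cp = static_cast<unsigned long>(wide[i]);
    // Combine a UTF-16 surrogate pair into one code point.
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < wide.size())
    {
      const unsigned long low = static_cast<unsigned long>(wide[i + 1]);
      if (low >= 0xDC00 && low <= 0xDFFF)
      {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        ++i;
      }
    }
    assert(cp <= 0x10FFFF); // precondition: input holds Unicode values only
    if (cp < 0x80)
    {
      out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}

// Hypothetical stand-in for a reader or writer class: the name is stored as
// UTF-8 internally and the wide overload converts on entry.
class FileNameHolder
{
public:
  void SetFileName(const std::string & utf8) { m_FileName = utf8; }
  void SetFileName(const std::wstring & wide) { m_FileName = WideToUtf8(wide); }
  const std::string & GetFileName() const { return m_FileName; }

private:
  std::string m_FileName;
};
```

Keeping a single UTF-8 representation internally means only the code that finally opens the file needs platform-specific handling.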

 Issue History
Date Modified Username Field Change
2009-09-30 10:15 Benjamin Tourne New Issue
2009-09-30 10:15 Benjamin Tourne File Added: itk-unicodewritetest-2009-09-30.patch
2009-09-30 10:16 Benjamin Tourne Tag Attached: visual
2009-09-30 10:16 Benjamin Tourne Tag Attached: Unicode
2009-09-30 12:05 Tom Vercauteren Note Added: 0017844
2009-10-20 13:22 Tom Vercauteren File Added: utfcpptest.zip
2009-10-26 15:56 Tom Vercauteren File Added: itk-msvc-unicode-2009-10-26.patch
2009-10-26 15:58 Tom Vercauteren Note Added: 0018243
2009-10-27 10:26 Brad King Note Added: 0018248
2009-10-27 10:28 Brad King Note Added: 0018249
2009-10-27 14:46 Tom Vercauteren File Added: itkUnicodeIOTest.zip
2009-10-27 14:47 Tom Vercauteren Note Added: 0018251
2009-10-27 15:13 Brad King Note Added: 0018252
2009-10-28 10:59 Tom Vercauteren Note Added: 0018256
2009-11-02 13:04 Tom Vercauteren File Added: itkUnicodeIOTest-2009-11-02.zip
2009-11-02 13:16 Tom Vercauteren Note Added: 0018318
2009-11-02 13:26 Brad King Note Added: 0018319
2009-11-02 14:01 Tom Vercauteren Note Added: 0018320
2009-11-03 03:22 Tom Vercauteren Note Added: 0018326
2009-11-10 20:19 Tom Vercauteren File Added: itkUnicodeIOTest.cxx
2009-11-11 08:38 Tom Vercauteren File Deleted: itkUnicodeIOTest.cxx
2009-11-11 08:38 Tom Vercauteren File Added: itkUnicodeIOTest.cxx
2010-01-12 06:38 Tom Vercauteren File Added: itk-unicodeio-2010-01-12.patch
2010-01-12 06:43 Tom Vercauteren Note Added: 0019094
2010-01-12 09:06 Brad King Note Added: 0019097
2010-01-12 09:58 Tom Vercauteren Note Added: 0019099
2010-02-15 01:20 edice Note Added: 0019527
2010-11-07 09:01 Hans Johnson Status new => assigned
2010-11-07 09:01 Hans Johnson Assigned To => Brad King

