[vtk-developers] VTK strings

Todd Martin nztoddler at yahoo.com
Thu May 10 20:28:07 EDT 2018


I was worried that might be the case. I recently came across an article where the author suggested using UTF-8 for all internal C/C++ strings: "Using UTF-8 as the internal representation for strings in C and C++ with Visual Studio" (Nubaria Blog).

This seems like a sensible approach to me. It is also the solution adopted by the Free Pascal Compiler (FPC) team.
If the VTK readers/writers were instantiated with a particular encoding via a constructor parameter, would this not permit flexibility and certainty? Has there been any discussion about this? Is all this work a lower priority?
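
To make the constructor-parameter idea concrete, here is a minimal sketch in plain C++. Everything in it is hypothetical: Encoding, latin1ToUtf8 and EncodedNameReader are names made up for illustration and are not part of VTK. It also assumes strict ISO-8859-1 input, so the Windows-1252 characters in the 0x80-0x9F range would need a lookup table that is not shown here.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout: none of these exist in VTK.
enum class Encoding { Utf8, Latin1 };

// Convert ISO-8859-1 bytes to UTF-8. Each Latin-1 byte is the code point
// of the same value, so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string latin1ToUtf8(const std::string& in)
{
  std::string out;
  out.reserve(in.size());
  for (unsigned char c : in)
  {
    if (c < 0x80)
    {
      out.push_back(static_cast<char>(c));
    }
    else
    {
      out.push_back(static_cast<char>(0xC0 | (c >> 6)));
      out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    }
  }
  return out;
}

// A reader that is told the caller's encoding once, at construction,
// and keeps its file name as UTF-8 internally.
class EncodedNameReader
{
public:
  explicit EncodedNameReader(Encoding inputEncoding)
    : InputEncoding(inputEncoding)
  {
  }

  void SetFileName(const std::string& name)
  {
    this->FileNameUtf8 =
      (this->InputEncoding == Encoding::Latin1) ? latin1ToUtf8(name) : name;
  }

  const std::string& GetFileNameUtf8() const { return this->FileNameUtf8; }

private:
  Encoding InputEncoding;
  std::string FileNameUtf8;
};
```

With something like this, the caller states the encoding once and the rest of the class can assume UTF-8. The sketch stops short of opening the file, which is where the platform-specific work would be.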




Todd Martin, Ph.D.
Freelance Engineer/Software Architect.
 

    On Friday, May 11, 2018, 12:02:41 PM GMT+12, David Gobbi <david.gobbi at gmail.com> wrote:  
 
For the most part, VTK strings are in the local 8-bit encoding, whatever that happens to be.  On Linux and Mac, the local 8-bit encoding is pretty much guaranteed to be UTF-8.  On Windows, if you're in North America or western Europe, it's Latin-1 or, more precisely, Windows-1252.
The reason for this is that the IO classes (readers, writers, etc.) simply pass 8-bit strings (filenames, etc.) to the system IO functions.  VTK uses ifstream(const char *fname, ...) and lets the system decide how "fname" is encoded.  But this is not consistent across all of the readers, since some readers use third-party libraries to handle the IO, and then you're at the mercy of whatever encoding that third-party library uses.
On the display side of things (e.g. when using the VTK text mapper classes), I believe that VTK actually does use UTF-8, but I haven't experimented to be sure that all the VTK display classes work the same way.
In other words, strings are a bit of a mess in VTK unless you're willing to be satisfied with ASCII.
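
The reason ASCII is the safe case is that UTF-8 and Windows-1252 both encode the 7-bit ASCII range identically, so an ASCII-only string means the same thing under either interpretation. A minimal sketch of such a check follows. The helper name isAscii is made up for illustration and is not a VTK function.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper, not part of VTK.
// True when every byte is in the 7-bit ASCII range (0x00-0x7F), which
// UTF-8 and Windows-1252 encode identically.
bool isAscii(const std::string& s)
{
  return std::all_of(s.begin(), s.end(),
    [](unsigned char c) { return c < 0x80; });
}
```

A caller could run this on a file name before handing it to a reader and warn when it returns false, since that is the case where the platform's encoding starts to matter.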
The vtkUnicodeString is UCS-4 (32-bit code points).
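
To illustrate what "32-bit code points" means in relation to a UTF-8 std::string, here is a minimal sketch that decodes UTF-8 bytes into one 32-bit value per character. It uses std::u32string as a stand-in and does not use the vtkUnicodeString API. The function name utf8ToUcs4 is made up for illustration. It rejects truncated sequences and bad continuation bytes, but as a sketch it does not reject overlong encodings or surrogate values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of VTK.
// Decode UTF-8 into 32-bit code points. Returns false on malformed input
// (invalid lead byte, truncated sequence, or bad continuation byte).
bool utf8ToUcs4(const std::string& in, std::u32string& out)
{
  out.clear();
  std::size_t i = 0;
  while (i < in.size())
  {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    char32_t cp = 0;
    std::size_t extra = 0; // number of continuation bytes expected
    if (lead < 0x80)                { cp = lead;        extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else { return false; }

    if (i + extra >= in.size()) { return false; } // truncated sequence

    for (std::size_t k = 1; k <= extra; ++k)
    {
      const unsigned char c = static_cast<unsigned char>(in[i + k]);
      if ((c & 0xC0) != 0x80) { return false; }
      cp = (cp << 6) | (c & 0x3F);
    }
    out.push_back(cp);
    i += extra + 1;
  }
  return true;
}
```

The point is the size difference: a character such as U+00E9 takes two bytes in UTF-8 but occupies exactly one element in the 32-bit representation.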
 - David

On Thu, May 10, 2018 at 5:35 PM, Todd via vtk-developers <vtk-developers at vtk.org> wrote:

Can someone please tell me the default/expected encoding for a std::string in VTK? I'm assuming it is UTF-8. Therefore I expect vtkUnicodeString (a terrible name) is encoded as UTF-16. Is that correct?

  