[cmake-developers] [PATCH] [RFC] handle C dependencies for files with utf-8 BOM

Brad King brad.king at kitware.com
Mon Oct 14 11:02:18 EDT 2013


On 10/14/2013 10:47 AM, clinton at elemtech.com wrote:
> The patch appears to not handle empty files or files with less than 3
> characters.  Does it need to?
[snip]
> ----- Reply message -----
> From: "Evgeniy Dushistov" <dushistov at mail.ru>
> Attached is a possible solution to this problem (it passes all
> tests except two, but those fail even without this patch).

Incidentally, I was recently working on a fix to read CMake source
files with a leading BOM.  See below for a draft function that detects
one.  This is just work-in-progress, but it could be reviewed and ported
to C++ streams.

-Brad


#include <stdio.h>

enum cmBOM_e
{
  cmBOM_None,
  cmBOM_UTF8,
  cmBOM_UTF16BE,
  cmBOM_UTF16LE,
  cmBOM_UTF32BE,
  cmBOM_UTF32LE
};
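typedef enum cmBOM_e cmBOM;

/* Detect a byte-order mark at the start of f, which is expected to be
   positioned at offset 0.  On a match the stream is left just past the
   BOM.  Otherwise, including for empty files and files shorter than a
   BOM, the stream is rewound and cmBOM_None is returned.  */
static cmBOM cmBOM_Read(FILE* f)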
{
  unsigned char b[2];
  if(fread(b, 1, 2, f) == 2)
    {
    if(b[0] == 0xEF && b[1] == 0xBB)
      {
      if(fread(b, 1, 1, f) == 1 && b[0] == 0xBF)
        {
        return cmBOM_UTF8;
        }
      }
    else if(b[0] == 0xFE && b[1] == 0xFF)
      {
      return cmBOM_UTF16BE;
      }
    else if(b[0] == 0 && b[1] == 0)
      {
      if(fread(b, 1, 2, f) == 2 && b[0] == 0xFE && b[1] == 0xFF)
        {
        return cmBOM_UTF32BE;
        }
      }
    else if(b[0] == 0xFF && b[1] == 0xFE)
      {
      fpos_t p;
      fgetpos(f, &p);
      if(fread(b, 1, 2, f) == 2 && b[0] == 0 && b[1] == 0)
        {
        return cmBOM_UTF32LE;
        }
      fsetpos(f, &p);
      return cmBOM_UTF16LE;
      }
    }
  rewind(f);
  return cmBOM_None;
}
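
For comparison, here is a rough table-driven sketch of the same check.
It is not part of the draft above or of CMake: the names bom_kind,
bom_sig and bom_detect are hypothetical, and it assumes the stream is
opened in binary mode and positioned at offset 0.  It reads up to four
bytes in one call, so empty and short files simply match no signature.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names for illustration only; not CMake API. */
enum bom_kind
{
  BOM_NONE, BOM_UTF8, BOM_UTF16BE, BOM_UTF16LE, BOM_UTF32BE, BOM_UTF32LE
};

struct bom_sig
{
  enum bom_kind kind;
  size_t len;             /* number of significant bytes in 'bytes' */
  unsigned char bytes[4];
};

/* Longer signatures are listed first so that UTF-32LE (FF FE 00 00)
   is tried before its prefix UTF-16LE (FF FE).  */
static const struct bom_sig bom_sigs[] = {
  { BOM_UTF32BE, 4, { 0x00, 0x00, 0xFE, 0xFF } },
  { BOM_UTF32LE, 4, { 0xFF, 0xFE, 0x00, 0x00 } },
  { BOM_UTF8,    3, { 0xEF, 0xBB, 0xBF, 0x00 } },
  { BOM_UTF16BE, 2, { 0xFE, 0xFF, 0x00, 0x00 } },
  { BOM_UTF16LE, 2, { 0xFF, 0xFE, 0x00, 0x00 } }
};

/* Assumes f is a binary stream positioned at offset 0.  Leaves the
   stream just past the BOM on a match, or back at offset 0 otherwise.  */
static enum bom_kind bom_detect(FILE* f)
{
  unsigned char b[4];
  size_t i;
  /* fread may return 0..4 here; a short count just limits the matches. */
  size_t n = fread(b, 1, sizeof(b), f);
  for(i = 0; i < sizeof(bom_sigs) / sizeof(bom_sigs[0]); ++i)
    {
    const struct bom_sig* s = &bom_sigs[i];
    if(n >= s->len && memcmp(b, s->bytes, s->len) == 0)
      {
      /* Reposition to the first byte after the BOM. */
      fseek(f, (long)s->len, SEEK_SET);
      return s->kind;
      }
    }
  rewind(f);
  return BOM_NONE;
}
```

The trade-off is one fread plus a seek instead of incremental reads, at
the cost of requiring a seekable stream, which the draft's use of
rewind and fsetpos already assumes.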


