[Insight-developers] Spaces in filenames for metaImage regular expression

Padfield, Dirk R (GE Global Research) padfield at research.ge.com
Mon Dec 3 13:58:40 EST 2012


Hi Mark,

Thanks a lot for your thoughts.  I would be happy to change the double-quotes to backslashes instead.  But, on further reflection, this will cause problems when the referenced file is in a different location than the mhd file and one therefore needs to include the path in the filename.  Doing this is currently perfectly valid using both forward and backward slashes.  For example, if the list of files is in a subdirectory of the mhd file, you would write the following in the mhd file:

ElementDataFile = Test\myFileName%02d.tif 20 22 1

Where the subdirectory is called "Test".  This currently works, but it would break if we used backslash as an escape character.

Enabling regular expressions for filenames with spaces in them is very useful, especially when a large number of files need to be read.  Rather than having to write a list of all of the files or convert the files first to some other format, a simple mhd file with a regular expression can read them all correctly.  Lists of files with spaces in the names are common, especially in the output of many microscopes.  Also, filenames with spaces are legal when they are in a LIST, so it makes sense for this also to be possible when using a regular expression.

I have another solution that will not affect any backward compatibility.  Instead of changing the metaUtils.cxx code, we can just let it read all arguments and separate them by spaces.  In metaImage.cxx, when considering regular expressions (by searching for "%") there are already checks for how many arguments are extracted.  This code expects between 1 and 4 arguments representing filename, minV, maxV, and stepV.  If the number of arguments is less than 4, it provides defaults for the remaining arguments.  We could add one more check to this code to determine whether there are more than 4 arguments; in this case, it means that the filename had spaces and those spaces need to be joined back together into the correct filename.  This would require that the user always provide the three optional minV, maxV, and stepV arguments when using a filename that has spaces, but that is not a big price to pay, especially since, without this change, it is not possible to read files with spaces in the names at all.  And this approach will not affect any backward compatibility.
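To make the proposal concrete, here is a minimal sketch of the joining step.  It is not the actual metaImage.cxx code: "SplitOnSpaces" and "JoinFileNameWords" are hypothetical names used only for illustration, and the sketch assumes that whenever more than 4 words are found, the last three are minV, maxV, and stepV.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a line into words on whitespace,
// the way the existing parsing is described as doing.
std::vector<std::string> SplitOnSpaces(const std::string & line)
{
  std::vector<std::string> words;
  std::istringstream stream(line);
  std::string word;
  while (stream >> word)
  {
    words.push_back(word);
  }
  return words;
}

// Hypothetical helper: with 4 or fewer words, return them unchanged.
// With more than 4, assume the last three are minV, maxV, stepV and
// join everything before them back into a single file name.
std::vector<std::string> JoinFileNameWords(const std::vector<std::string> & words)
{
  if (words.size() <= 4)
  {
    return words;
  }
  const std::size_t nameWordCount = words.size() - 3;
  std::string fileName = words[0];
  for (std::size_t i = 1; i < nameWordCount; ++i)
  {
    fileName += ' ';
    fileName += words[i];
  }
  std::vector<std::string> result;
  result.push_back(fileName);
  result.insert(result.end(), words.end() - 3, words.end());
  return result;
}
```

One limitation of this sketch: because the words are rejoined with single spaces, a filename containing two or more consecutive spaces would not be reconstructed exactly.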

What do others think?  I am happy to go with the group consensus.

Thanks,
Dirk

________________________________
From: Mark Tsuchida [marktsuchida at gmail.com]
Sent: Wednesday, November 21, 2012 8:42 PM
To: Bill Lorensen
Cc: Padfield, Dirk R (GE Global Research); insight-developers at itk.org
Subject: Re: [Insight-developers] Spaces in filenames for metaImage regular expression

On Wed, Nov 21, 2012 at 2:10 PM, Padfield, Dirk R (GE Global Research)
<padfield at research.ge.com> wrote:
> Previously, “MET_StringToWordArray”
> parsed words based only on spaces, but now if it sees a double quote it
> ignores spaces until it sees the next double quote.  Surrounding the
> filename with quotes is consistent with how executables handle file names
> with spaces.
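
The quote-aware splitting described in the quoted text can be sketched roughly as follows.  This is not the actual MET_StringToWordArray implementation or the submitted patch; "SplitRespectingQuotes" is a hypothetical name, and the handling of edge cases (such as an unterminated quote) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: split on spaces, except that spaces between a pair
// of double quotes are kept as part of the word. The quote characters
// themselves are dropped. An unterminated quote simply runs to the end
// of the line (an assumption, not documented behavior).
std::vector<std::string> SplitRespectingQuotes(const std::string & line)
{
  std::vector<std::string> words;
  std::string current;
  bool inQuotes = false;
  bool haveWord = false;
  for (std::size_t i = 0; i < line.size(); ++i)
  {
    const char c = line[i];
    if (c == '"')
    {
      inQuotes = !inQuotes;
      haveWord = true;
    }
    else if (c == ' ' && !inQuotes)
    {
      if (haveWord)
      {
        words.push_back(current);
        current.clear();
        haveWord = false;
      }
    }
    else
    {
      current += c;
      haveWord = true;
    }
  }
  if (haveWord)
  {
    words.push_back(current);
  }
  return words;
}
```

With this sketch, the line "B - 2(fld 1 z %02d).tif" 1 90 1 (with the filename in double quotes) splits into four words: the filename, then 1, 90, and 1.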

I would point out that this would cause problems if anybody ever needed to handle filenames containing quote characters (which is perfectly legal on most systems).

I am just an occasional user of MetaIO and I don't know if the core developers feel that such a level of generality is necessary, but (short of implementing a complicated lexer for a C-like or shell-like quoted string syntax) one simple and general solution would be to use backslash as an escape character ("\ " = non-delimiting space, "\\" = single backslash, "\x" = "x" for any character x, all without the double quotes). This would also be consistent with one way in which file names with spaces and other special characters are handled on the (Unix) command line. The problem does remain of what to do if a backslash is encountered at the end of the (already split out) line.
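
A minimal sketch of the escape rules above, for illustration only.  "SplitWithBackslashEscapes" is a hypothetical name, not a MetaIO function, and keeping a trailing lone backslash as a literal character is an arbitrary assumption, since that case is left open.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: split on spaces, treating backslash as an escape
// character so that "\ " is a non-delimiting space, "\\" is a single
// backslash, and "\x" is "x" for any other character x.
std::vector<std::string> SplitWithBackslashEscapes(const std::string & line)
{
  std::vector<std::string> words;
  std::string current;
  bool haveWord = false;
  for (std::size_t i = 0; i < line.size(); ++i)
  {
    const char c = line[i];
    if (c == '\\' && i + 1 < line.size())
    {
      // Take the next character literally, whatever it is.
      current += line[++i];
      haveWord = true;
    }
    else if (c == ' ')
    {
      if (haveWord)
      {
        words.push_back(current);
        current.clear();
        haveWord = false;
      }
    }
    else
    {
      // Ordinary character, or a lone backslash at the end of the line
      // (kept literally here; this choice is an assumption).
      current += c;
      haveWord = true;
    }
  }
  if (haveWord)
  {
    words.push_back(current);
  }
  return words;
}
```

Under these rules, a line written as B\ -\ 2.tif 1 90 1 yields the filename "B - 2.tif" followed by three numeric words.  Note that an unescaped Windows-style path such as Test\myFileName%02d.tif would lose its backslash and become TestmyFileName%02d.tif.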

But then again, it might be considered weird that "\%" would behave just like "%"; there would still be no generally applicable way to handle filenames containing a "%" (a quick look at the code (metaImage.cxx) suggests that "50%dose.dat" would behave unexpectedly, as would "50%%dose.dat" and "50%%section.dat", while "50%section.dat" would result in a segfault). There may or may not be a (sufficiently) backward-compatible way to fix all of these issues.

Mark



On Wed, Nov 21, 2012 at 11:57 AM, Bill Lorensen <bill.lorensen at gmail.com> wrote:
Dirk,

If you do submit a patch, beware that you will not be able to merge it
since it is in the ThirdParty metaio tree. This is not a problem for
reviewers, just that someone from Kitware who has the permission to do
so will have to merge it. This is because meta io is used in several
packages other than ITK.

Looks like a good idea. You should probably add Stephen Aylward as a reviewer.

Bill

On Wed, Nov 21, 2012 at 2:10 PM, Padfield, Dirk R (GE Global Research)
<padfield at research.ge.com> wrote:
> Hi ITK Developers,
>
>
>
> I want to gather your thoughts before submitting a patch.  I frequently read
> lists of 2D images into ITK as a volume, and I use metaImage .mhd files to
> specify the header information.  Unfortunately, my code needs to handle
> images that have spaces in the names, and this doesn’t currently work with
> the regular expression functionality in metaImage.  I end up having to use
> “LIST” instead, but this becomes cumbersome.  So I modified the code of
> Modules\ThirdParty\MetaIO\src\MetaIO\metaUtils.cxx slightly to handle file
> names surrounded by double quotes.  Previously, “MET_StringToWordArray”
> parsed words based only on spaces, but now if it sees a double quote it
> ignores spaces until it sees the next double quote.  Surrounding the
> filename with quotes is consistent with how executables handle file names
> with spaces.
>
>
>
> For example, to read the first 90 slices of a series, I can now do this:
>
> ElementDataFile = "B - 2(fld 1 z %02d).tif" 1 90 1
>
>
>
> My question is: does anyone see any problem with making such a change?  I
> just wanted to gather some feedback before submitting the patch.
>
>
>
> Also, if anyone wants me to add them as a reviewer to the patch, please let
> me know.
>
>
>
> Thanks,
>
> Dirk
>
>
> _______________________________________________
> Powered by www.kitware.com
>
> Visit other Kitware open-source projects at
> http://www.kitware.com/opensource/opensource.html
>
> Kitware offers ITK Training Courses, for more information visit:
> http://kitware.com/products/protraining.php
>
> Please keep messages on-topic and check the ITK FAQ at:
> http://www.itk.org/Wiki/ITK_FAQ
>
> Follow this link to subscribe/unsubscribe:
> http://www.itk.org/mailman/listinfo/insight-developers
>



--
Unpaid intern in BillsBasement at noware dot com


