[Insight-users] moving image and memory

Fri Jul 9 08:15:59 EDT 2004

Hi Luis,

first of all, thanks for your reply.

Of course I was thinking about computing some kind of bounding box as 
well. However, I then thought that some sophisticated memory handling 
would be more appropriate. First of all, it's rather difficult to 
estimate the size of the security margin. If the region should be the 
same during the whole registration run and if we consider that there 
might be regions where the region covered by the initial transform and 
the region covered by the final transform don't even overlap, I can't 
really think about a good way to estimate the security margin.

Calculating the region for every iteration and reloading it every time 
is probably not quite efficient with respect to performance. Only 
loading that part from disk that has changed from iteration n-1 to 
iteration n is rather difficult, because it can't just be "appended" to 
the image (or can it?). So I would have to allocate a new region for the 
image and to copy the old and the new data (i.e the data loaded from 
disk in iteration n) in there. This is, however, time consuming as well...

Therefore I thought about using some caching strategy. Caching (between 
harddisk and memory) is already provided by the operating system in the 
form of demand paging. However, I guess this won't work well with our 
images as I've explained in the first mail. Therefore I guess it's 
better to implement some caching methods either independent from or 
supporting the OS paging.

Here some ideas that I've had concerning the problem:

1) A metric takes a fixed and a moving image mask. In my opinion it 
would be reasonable to have an option that ensures that the requested 
region of the fixed and the moving image is no larger than the bounding 
box of this mask. That way, a user has a chance to control the region of 
the moving image that has to be in memory. Maybe it's better to add a 
member "moving image region" that does this task (in order not to misuse 
the mask). Of course, the same can be reached by the user by inserting 
some filter that does this job right before connecting the pipeline to 
the registration method.

2) Concerning caching I thought about the following:
If I'm not mistaken, all metric functions in the basic registration 
framework indirectely access the moving image by the GetPixel() method. 
Indirectely, because it's acctually the interpolator that does it (in 
the case of the bspline interpolator not the moving but the coefficient 
image is accessed, as far as I've seen). If this is not correct, please 
let me know. I didn't go into too much detail yet, so I might be wrong. 
Considering this way of accessing pixels, it would be possible to write 
an image class that stores its data in blocks rather than row by row:

 -------------------
¦ A  ¦ B  ¦ C  ¦ D  ¦
¦1 2 ¦5 6 ¦... ¦    ¦
¦3 4 ¦7 8 ¦    ¦    ¦
¦-------------------¦
¦... ¦    ¦    ¦    ¦
¦    ¦    ¦    ¦    ¦
¦    ¦    ¦    ¦    ¦
¦-------------------¦
¦    ¦    ¦    ¦    ¦
¦    ¦    ¦    ¦    ¦
¦    ¦    ¦    ¦    ¦
 -------------------

It shouldn't be too much work to write a GetPixel() method that can 
access such a data structure. The advantage of such a structure is, that 
regions that are close to each other in the image are also close to each 
other in memory.

Using such a structure, I see basically see two ways to improve the 
situation. A first way is to rely on the OS paging. Since, less pages 
are needed to store a (typical) region of interest, paging would 
probably work much better with such a structure than with the original 
structure. Maybe a memory mapped file could be used to store the image 
on disk and maybe a more efficient ordering of the blocks (than A, B, C, 
...) could further increase locality of disk accesses.

Another option would be to store the blocks separately, that is not at 
contiguous memory locations. This would make it easier to dynamically 
load (and append) regions to the image. If a pixel on a block that is 
not in main memory is accessed (by GetPixel()), this block could then be 
loaded from disk and if a block has not been accessed for a long time 
the memory could be freed.

Do you see any major problems connected with such a data structure or 
any of the proposed options to use it? Of course, the GetPixel() method 
would become a bit less efficient than the GetPixel() method in 
itk::Image. However the lower number of disk accesses hopefully more 
than outweighs this.

Thanks for any comments,

Michael

Luis Ibanez wrote:

>
> Hi Michael,
>
> A naive option is to compute the bounding box of the projection
> of the fixed image onto the moving image given the current transform.
>
> Then add some security margins to that bounding box in order to
> define a region and use that region for extracting that piece of the
> moving image.
>
> If you are doing this in a distributed environment you could probably
> have one or several machines dedicated to providing the right subregions
> of the  moving image to the machines that are actually computing
> contributions to the image metric.
>
> This naive method will not be very effective if your transform has
> rotations with large angles. In that case you may even save some time
> by creating an intermediate resampled version of the moving image
> in order to get a better aligned dataset. At the end for the
> registration you should compensate for the transfrom used for creating
> the intermediate resampled image.
>
>
> We will be quite interested to hear back from you regarding the
> solution you find to this problem, since this situation will become
> more and more common as dataset sizes grow bigger.
>
>
>  Regards,
>
>
>    Luis
>
>
> --------------
> Michael wrote:
>
>>  Hi,
>>
>> I'm thinking about the following problem... In a registration task, 
>> there might be a fixed image that is much smaller than the 
>> corresponding moving image, i.e. it contains only a part of what is 
>> represented by the moving image. In this case, it's obviously a waste 
>> of memory to keep the entire moving image in memory during the whole 
>> registration run, since during each iteration only a small part is 
>> accessed.
>>
>> In my case (which differs slightly from above explanation), I want to 
>> calculate partial results of the metric value based on parts of the 
>> fixed image (similar as done in the ThreadedGetValue of the 
>> MatchCardinalityImageToImageMetric) on different computers. The case 
>> is similar to above situation, because a small fixed image (a 
>> subimage of the original fixed image) is mapped to a large moving 
>> image on each host. However, the case differs in that it there might 
>> be parts where the region of the initial mapping (i.e. initial 
>> transform parameters applied) is entirely outside of the region of 
>> the final mapping. Consider for example a 2D rigid transform: For a 
>> slightly wrong angle, a (transformed) region around the center of the 
>> transform is almost identical to the region for the correct angle. If 
>> the region is far away from the center, however, they might not even 
>> overlap. I'd say this situation is rather unusual for a registration 
>> where a small fixed image is registered to a large moving image. In 
>> my opinion, this fact makes it (even more) difficult to estimate the 
>> region of the moving image that will be accessed during the whole 
>> registration run.
>>
>> So, the question is, how is it efficiently possible to keep the right 
>> parts of the moving image in memory? Considering the access 
>> characteristics it seems to me that some kind of "caching" would be 
>> appropriate. That is, keeping those regions in memory that have been 
>> accessed recently and store regions that have not been accessed for a 
>> long time on disk. If a region is accessed and not in main memory, it 
>> would just have to be loaded from disk.
>>
>> Obviously, the operating system provides some support for this 
>> (paging). However, I'm not quite sure whether this support is quite 
>> optimal for the described problem. Consider the following (2D-) image:
>>
>> -----------------------------
>> ¦01 02 03 04 05 06 07 08 09 10¦
>> ¦                             ¦
>> ¦11 12 13 14 15 16 17 18 19 20¦
>> ¦                             ¦
>> ¦21 22 23 24 25 26 27 28 29 30¦
>> ¦                             ¦
>> ¦31 32 33 34 35 36 37 38 39 40¦
>> ¦                             ¦
>> ¦41 42 43 44 45 46 47 48 49 50¦
>> ¦                             ¦
>> ¦51 52 53 54 55 56 57 58 59 60¦
>> ¦                             ¦
>> ¦61 62 63 64 65 66 67 68 69 70¦
>> ¦-----------------------------¦
>>
>> and the indicated subimage:
>>
>> ------------------------------
>> ¦01 02 03 04 05 06 07 08 09 10¦
>> ¦         -----               ¦
>> ¦11 12 13¦14 15¦16 17 18 19 20¦
>> ¦        ¦     ¦              ¦
>> ¦21 22 23¦24 25¦26 27 28 29 30¦
>> ¦        ¦     ¦              ¦
>> ¦31 32 33¦34 35¦36 37 38 39 40¦
>> ¦        ¦     ¦              ¦
>> ¦41 42 43¦44 45¦46 47 48 49 50¦
>> ¦        ¦     ¦              ¦
>> ¦51 52 53¦54 55¦56 57 58 59 60¦
>> ¦         -----               ¦
>> ¦61 62 63 64 65 66 67 68 69 70¦
>> ¦-----------------------------¦
>>
>> I expect that the data is stored as a contiugous block in the order 
>> of the numbers in the pictures.
>>
>> Say one of the numbers in above images represents x bytes and we have 
>> a page size of 8*x bytes. In order to keep the indicated area in 
>> memory, we have to have 5 pages (one for each line of the region of 
>> interest) in main memory, that is 5*8*x bytes. However, we only 
>> require 5*2*x bytes (size of the region), which is 4 times less.
>>
>> Another problem with the OS support is, how can I avoid that large 
>> amounts of swap-space are used to store data that might never be 
>> accessed? Maybe the image could be stored on disk as an ordinary file 
>> instead of having (parts of) it swapped out on disk. That way swap 
>> space is free for other processes. Are there any reasons that argue 
>> for having the data in swap space instead of having it in an ordinary 
>> file?
>>
>> Because of above considerations, I think that it might be better to 
>> manually (that is, not by the OS) control which blocks have to be 
>> stored on disk and which ones have to be kept in memory. This would 
>> maybe even allow to apply some prediction methods (that might be more 
>> appropriate than those of the OS).
>>
>> Is there some built-in support in ITK, that helps to solve this 
>> problem (e.g. a way to store an image in blocks that can be loaded 
>> from disk if necessary)? Could maybe memory mapped files help to 
>> provide optimal access to the image data (it seems to me that the 
>> problem about page size remains...)? If yes, is there some support 
>> for this in ITK? Does anybody see a better solution than what I've 
>> called "caching" or any fundamental problems with that approach?
>>
>> Any comments are appreciated.
>>
>> Thanks,
>>
>> Michael
>>
>> _______________________________________________
>> Insight-users mailing list
>> Insight-users at itk.org
>> http://www.itk.org/mailman/listinfo/insight-users
>>
>
>
>
> _______________________________________________
> Insight-users mailing list
> Insight-users at itk.org
> http://www.itk.org/mailman/listinfo/insight-users
>
>