[Insight-users] Small problem using LBFGSB

Rupert Brooks rupe.brooks at gmail.com
Wed Apr 30 20:09:32 EDT 2008

Hi Tom,

>  > The line search makes a cubic polynomial fit to the function, and
>  > polynomial fits can be sensitive to erroneous data.  I got around the
>  > problem by making my gradient more well behaved.
>  I understand that, but how did you manage to make your gradient better
>  behaved? Similar to many ITK applications, I use an image similarity
>  metric with linear interpolation for the image interpolation and
>  nearest-neighbor interpolation for the image gradient interpolation. I
>  tried using both a smoothed gradient image and a central difference
>  gradient image to compute the gradient of my cost function but I still
>  have the above problem in some cases. Maybe I should try using  linear
>  interpolation also for the image gradient interpolation...

My case is kind of specific, so I didn't go into details.  But since
you ask :-)  I was computing the gradient of MI
inverse-compositionally, using the Mattes (or, originally,
Thevenaz-Unser) formulation.  That formulation is asymmetric, so the
gradient was not quite right.

The solution I found might be more general, at least if you're using a
metric with a probability distribution in the middle.
One of my colleagues here has a paper on pre-seeding the histogram
with a prior distribution.  By pre-seeding with a small, uniform
distribution (so just prefilling the bins) I avoided having any zero
bins in my PDF, and this seemed to clear up the problem.
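
A minimal sketch of the pre-seeding idea, assuming a plain joint
histogram over already-binned intensities.  This is not the code from
the paper or from ITK; the function names and the prior_count
parameter are hypothetical and only illustrate how a small uniform
prior removes zero bins from the PDF:

```python
import math

def joint_pdf(fixed_bins, moving_bins, n_bins, prior_count=0.0):
    """Joint histogram of two pre-binned samples, normalised to a PDF.

    prior_count (hypothetical parameter) is placed in every bin before
    the samples are counted, i.e. a uniform prior on the histogram.
    """
    hist = [[prior_count] * n_bins for _ in range(n_bins)]
    for f, m in zip(fixed_bins, moving_bins):
        hist[f][m] += 1.0
    total = sum(sum(row) for row in hist)
    return [[count / total for count in row] for row in hist]

def mutual_information(pdf):
    """Mutual information (in nats) of a joint PDF given as a 2-D list."""
    p_fixed = [sum(row) for row in pdf]
    p_moving = [sum(col) for col in zip(*pdf)]
    mi = 0.0
    for i, row in enumerate(pdf):
        for j, p in enumerate(row):
            if p > 0.0:  # empty bins contribute nothing, but log(p) is undefined there
                mi += p * math.log(p / (p_fixed[i] * p_moving[j]))
    return mi

# Toy data: perfectly aligned samples leave every off-diagonal bin empty.
fixed = [0, 0, 1, 1, 2, 2, 3, 3]
moving = [0, 0, 1, 1, 2, 2, 3, 3]

raw = joint_pdf(fixed, moving, 4)
seeded = joint_pdf(fixed, moving, 4, prior_count=0.01)

print("smallest raw bin:   ", min(min(row) for row in raw))
print("smallest seeded bin:", min(min(row) for row in seeded))
print("MI raw / seeded:    ", mutual_information(raw), mutual_information(seeded))
```

With the prior in place every bin is strictly positive, so terms such
as log(p) that appear in the metric and its derivative stay finite
everywhere; the price is a slightly lower MI value.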

The paper in question is
M. Toews, D.L. Collins and T. Arbel, "Maximum a posteriori local
histogram estimation for image registration", Proceedings of the 8th
International Conference on Medical Image Computing and Computer
Assisted Intervention, Lecture Notes in Computer Science, Vol. 3750,
pp. 163, Palm Springs, CA, Oct. 2005.
Solving this particular problem wasn't his original intent, though.

As for using a cached gradient image with nearest-neighbor
interpolation to compute the gradient, I've done a few experiments on
how accurate the gradient is using that.  I find it tends to be biased
a little low.  Using the DerivativeCalculator as in the
MattesMutualInformation metric is a bit better, but there's still a
small low bias.  I think this is because all these methods inherently
pre-filter with a small Gaussian, which smooths the high frequencies a
bit.  However, the bias is tiny and doesn't really seem to cause any
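
A 1-D sketch of why a finite-difference gradient can come out a little
low.  It is an analogue only, not ITK code, and it covers just the
central-difference case: for sin(omega * x) sampled with spacing h, the
central difference equals the true derivative scaled by
sin(omega * h) / (omega * h), which is below 1 for 0 < omega * h < pi.
The helper names are hypothetical:

```python
import math

def central_difference(f, x, h):
    """Central-difference estimate of f'(x) with sample spacing h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def amplitude_ratio(omega, h=1.0):
    """Estimated / true derivative of sin(omega * x), evaluated at x = 0."""
    f = lambda x: math.sin(omega * x)
    true_derivative = omega  # d/dx sin(omega * x) at x = 0
    return central_difference(f, 0.0, h) / true_derivative

# The higher the spatial frequency, the more the estimate is attenuated.
for omega in (0.1, 0.5, 1.0, 2.0):
    print(omega, amplitude_ratio(omega))
```

Low frequencies are reproduced almost exactly, while detail near the
sampling limit is noticeably attenuated, which is consistent with a
small low bias on typical image content.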

> Hessian-based optimizers is definitely something I would like to
> explore. Especially, I would strongly argue to get Gauss-Newton like
> optimizers (e.g. Levenberg-Marquardt, Powell's dog leg, ESM, etc.) to
> work with least-squares like image similarity criteria in ITK (Mean
> squared error, cross-correlation, etc.). These are not strictly
> speaking Hessian-based optimizers but can be seen as
> pseudo-Hessian-based and could be developed in an Hessian-based
> optimizer API.

In fact, I've been working on exactly this kind of stuff as part of my
thesis, which I'm in the process of finishing up now.  At the moment,
I'm pretty swamped with that, but once it's done, I'd be happy to


Rupert Brooks
McGill Centre for Intelligent Machines (www.cim.mcgill.ca)
Ph.D Student, Electrical and Computer Engineering
