[Insight-users] Small problem using LBFGSB
Rupert Brooks
rupe.brooks at gmail.com
Wed Apr 30 20:09:32 EDT 2008
Hi Tom,
> > The line search makes a cubic polynomial fit to the function, and
> > polynomial fits can be sensitive to erroneous data. I got around the
> > problem by making my gradient more well behaved.
>
> I understand that, but how did you manage to make your gradient better
> behaved? Similar to many ITK applications, I use an image similarity
> metric with linear interpolation for the image interpolation and
> nearest-neighbor interpolation for the image gradient interpolation. I
> tried using both a smoothed gradient image and a central difference
> gradient image to compute the gradient of my cost function but I still
> have the above problem in some cases. Maybe I should try using linear
> interpolation also for the image gradient interpolation...
My case is kind of specific, so I didn't go into details. But since
you ask :-) I was computing a gradient of MI inverse-compositionally,
using the Mattes (or, originally, Thevenaz-Unser) formulation. That
formulation is asymmetric, so the gradient was not quite right.
The solution I found might be more general - at least if you're using
a metric with a probability distribution in the middle.
One of my colleagues here has a paper on pre-seeding the histogram
with a prior distribution. By pre-seeding with a small, uniform
distribution (so just pre-filling the bins) I avoided having any zero
bins in my PDF, and this seemed to clear up the problem.
The paper in question is:
M. Toews, D.L. Collins and T. Arbel, "Maximum a posteriori local
histogram estimation for image registration", Proceedings of the 8th
International Conference on Medical Image Computing and Computer
Assisted Intervention (MICCAI), Lecture Notes in Computer Science,
Vol. 3750, p. 163, Palm Springs, CA, Oct. 2005.
Solving this particular problem wasn't his original intent, though.
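To make the idea concrete, here is a rough numpy sketch of what I mean
by pre-filling the bins (just an illustration, not the actual ITK
metric code, and the prior weight eps is an arbitrary choice):

  # Toy sketch of pre-seeding a joint histogram with a small uniform
  # prior before computing MI, so no bin in the PDF is exactly zero.
  import numpy as np

  def mutual_information(fixed, moving, bins=32, eps=1e-3):
      # Joint histogram of intensity pairs
      hist, _, _ = np.histogram2d(fixed.ravel(), moving.ravel(), bins=bins)
      # Pre-seed every bin with a small uniform count ("pre-filling the bins")
      hist = hist + eps
      # Normalize to a joint PDF and form the marginals
      pxy = hist / hist.sum()
      px = pxy.sum(axis=1, keepdims=True)
      py = pxy.sum(axis=0, keepdims=True)
      # MI = sum p(x,y) * log( p(x,y) / (p(x) p(y)) ); no zero bins, so
      # every log term stays finite
      return np.sum(pxy * np.log(pxy / (px * py)))

  # Example: two noisy versions of the same random "image"
  rng = np.random.default_rng(0)
  a = rng.random((64, 64))
  b = a + 0.05 * rng.standard_normal((64, 64))
  print(mutual_information(a, b))

The only point is that adding the small constant before normalizing
keeps every bin of the joint PDF strictly positive.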
As for using a cached gradient image with nearest-neighbor
interpolation to compute the gradient, I've done a few experiments on
how accurate the gradient is with that approach. I find it tends to be
biased a little low - using the DerivativeCalculator as in the
MattesMutualInformation metric is a bit better, but there's still a
small low bias. I think this is because all these methods inherently
pre-filter with a small Gaussian, which smooths the high frequencies a
bit. However, the bias is tiny and doesn't really seem to cause any
problems.
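For what it's worth, the kind of experiment I mean is roughly this
(a 1-D toy in numpy, with a test function I made up - nothing
ITK-specific):

  # Compare the gradient of a known smooth function, sampled at random
  # off-grid points, computed (a) analytically and (b) from a cached
  # central-difference gradient image looked up with nearest-neighbor
  # interpolation.
  import numpy as np

  n = 256
  x = np.arange(n, dtype=float)
  img = np.sin(2 * np.pi * x / 32.0)          # 1-D "image"
  grad_img = np.gradient(img)                  # cached central-difference gradient

  rng = np.random.default_rng(1)
  pts = rng.uniform(1, n - 2, size=10000)      # off-grid sample positions

  analytic = (2 * np.pi / 32.0) * np.cos(2 * np.pi * pts / 32.0)
  nn = grad_img[np.rint(pts).astype(int)]      # nearest-neighbor lookup

  # Ratio of RMS magnitudes: values below 1 indicate the low bias
  print(np.sqrt(np.mean(nn**2)) / np.sqrt(np.mean(analytic**2)))

The printed ratio comes out slightly below 1, which is the small low
bias I was referring to - the central-difference filter attenuates the
high frequencies a little.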
> Hessian-based optimizers is definitely something I would like to
> explore. Especially, I would strongly argue to get Gauss-Newton like
> optimizers (e.g. Levenberg-Marquardt, Powell's dog leg, ESM, etc.) to
> work with least-squares like image similarity criteria in ITK (Mean
> squared error, cross-correlation, etc.). These are not strictly
> speaking Hessian-based optimizers but can be seen as
> pseudo-Hessian-based and could be developed in an Hessian-based
> optimizer API.
In fact, I've been working on exactly this kind of stuff as part of my
thesis, which I'm in the process of finishing up now. At the moment,
I'm pretty swamped with that, but once it's done, I'd be happy to
discuss.
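In the meantime, the basic idea for a least-squares criterion
f(p) = sum_i r_i(p)^2 is to use J^T J as a pseudo-Hessian. A toy numpy
sketch of the Gauss-Newton iteration (the residual model here is a
made-up curve fit, not an image metric):

  # Gauss-Newton for a least-squares criterion f(p) = sum_i r_i(p)^2,
  # using J^T J as a pseudo-Hessian instead of the true Hessian.
  import numpy as np

  def residuals(p, t, y):
      a, b = p
      return a * np.exp(b * t) - y                     # r_i(p)

  def jacobian(p, t):
      a, b = p
      return np.column_stack([np.exp(b * t),           # dr/da
                              a * t * np.exp(b * t)])  # dr/db

  rng = np.random.default_rng(2)
  t = np.linspace(0, 1, 50)
  y = 2.0 * np.exp(-1.5 * t) + 0.01 * rng.standard_normal(t.size)

  p = np.array([1.0, -1.0])                            # initial guess
  for _ in range(10):
      r = residuals(p, t, y)
      J = jacobian(p, t)
      # Gauss-Newton step: solve (J^T J) dp = -J^T r
      dp = np.linalg.solve(J.T @ J, -J.T @ r)
      p = p + dp
  print(p)                                             # should end up near (2.0, -1.5)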
Regards,
Rupert
--
--------------------------------------------------------------
Rupert Brooks
McGill Centre for Intelligent Machines (www.cim.mcgill.ca)
Ph.D Student, Electrical and Computer Engineering
http://www.cyberus.ca/~rbrooks