[Insight-developers] Error in OnePlusOneEvolutionaryOptimizer (?): fix included
Zachary Pincus
zpincus at stanford.edu
Tue Mar 1 03:39:17 EST 2005
Hello folks,
I think there's an error in the OnePlusOneEvolutionaryOptimizer class
that prevents it from ever converging.
The quick summary is that the convergence criterion of this optimizer
is whether the Frobenius norm of the "A" matrix dips below a specified
threshold. The norm of the A matrix is supposed to decrease when the
optimizer takes an unfavorable step. After many unfavorable trial steps
around the current point (which suggests that the current point is
optimal), the Frobenius norm of A should be small enough to cause the
optimizer to declare convergence.
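For concreteness, here is roughly what that convergence test amounts
to (my own minimal sketch, not the actual ITK code; A, spaceDimension,
and epsilon stand in for the corresponding optimizer members):

#include <cmath>

// Frobenius norm: the square root of the sum of squares of all
// entries of the spaceDimension x spaceDimension matrix A.
double FrobeniusNorm(const double* A, unsigned int spaceDimension)
{
  double sum = 0.0;
  for (unsigned int i = 0; i < spaceDimension * spaceDimension; i++)
    {
    sum += A[i] * A[i];
    }
  return std::sqrt(sum);
}

// The optimizer declares convergence once this norm falls below a
// user-specified threshold.
bool HasConverged(const double* A, unsigned int spaceDimension,
                  double epsilon)
{
  return FrobeniusNorm(A, spaceDimension) < epsilon;
}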
This decrease is controlled by (ShrinkFactor - 1), where ShrinkFactor
is generally a number between zero and one. Since ShrinkFactor is less
than one, (ShrinkFactor - 1) is negative, and that negative value is
what produces a decrease in the Frobenius norm of the matrix.
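Spelled out (my own paraphrase of the update that the code comments
below and Appendix B of the paper describe, writing f for the Gaussian
sample f_norm, f' for its transpose, and adjust for the grow/shrink
factor), the intended rank-one update is:

  A := A + ((adjust - 1) / (f' * f)) * (A * f) * f'
     = A * (I + ((adjust - 1) / (f' * f)) * f * f')

where adjust is GrowthFactor (> 1) after a favorable step and
ShrinkFactor (< 1) after an unfavorable one. The right-hand factor
maps f to adjust * f and leaves directions orthogonal to f untouched,
so each unfavorable step shrinks A's action along the direction just
searched; a run of failed trials should therefore drive the Frobenius
norm of A down toward the convergence threshold.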
Unfortunately, the code effectively squares that (ShrinkFactor - 1)
factor by multiplying the alpha scalar in twice, in two separate
places, so the correction can never be negative! This prevents the
norm of A from ever decreasing, and thus prevents the optimizer from
ever converging.
I've gone over the code and the paper describing the 1+1ES strategy
(http://www.cs.unc.edu/~styner/docs/tmi00.pdf : Parametric estimate of
intensity inhomogeneities applied to MRI, Appendix B), and I am
reasonably confident that I am correct in this.
Here's the code in question (lines 205 to 226 of
itkOnePlusOneEvolutionaryOptimizer.cxx; note that f_norm is a sample
from a Gaussian, and not the Frobenius norm I spoke of earlier):
  // A := alpha*x*y' + A,
  // where y' = transpose(y), alpha is a scalar, x is an m element
  // vector, y is an n element vector, and A is an m by n matrix.
  // x = A * f_norm, y = f_norm,
  // alpha = (adjust - 1) / Blas_Dot_Prod(f_norm, f_norm)
  //A = A + (adjust - 1.0) * A ;
  double alpha = ((adjust - 1.0) / dot_product(f_norm, f_norm)) ;
  for ( unsigned int c = 0 ; c < spaceDimension ; c++ )
    {
    double temp = alpha * f_norm[c] ;
    if ( f_norm[c] == 0.0 )
      {
      temp = 1.0 ;
      }
    for (unsigned int r = 0 ; r < spaceDimension ; r++)
      {
      A(c, r) += alpha * delta[r] * temp ;
      }
    }
As you may note, alpha gets multiplied in *twice*, which is not what
the comments suggest, nor what the paper describes. By dropping one of
the alphas, I was able to get the optimizer to converge, which was
simply not happening before. (I charted the Frobenius norm versus time
for a variety of optimizer runs. Without the fix, these values were
monotonically increasing. With the fix, the norm increases and
decreases as intended.)
I'm also not sure what the business of setting temp to 1 when
f_norm[c] is zero is all about -- it isn't mentioned in the paper or
the comments, and it seems at some variance with the intent of this
step, which is to expand or contract the search along the directions
that were fruitless or fruitful. If f_norm[c] is zero, then no
searching was done in that direction, and that dimension should not be
expanded or contracted. However, there may be some reason that code
exists. Does anyone know why? If there's no good reason for it, the
entire for loop could be condensed as follows:
  for ( unsigned int c = 0 ; c < spaceDimension ; c++ )
    {
    for (unsigned int r = 0 ; r < spaceDimension ; r++)
      {
      A(c, r) += alpha * delta[r] * f_norm[c] ;
      }
    }
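To illustrate the difference, here is a small standalone test I put
together (my own sketch, not ITK code; it hard-codes spaceDimension =
2, an arbitrary A and sample f_norm, takes delta = A * f_norm as in
the surrounding optimizer code, and omits the f_norm[c] == 0 special
case, which never fires for a sample with no zero components). On an
unfavorable step (adjust = ShrinkFactor = 0.9) the fixed update
shrinks the Frobenius norm, while the current double-alpha update
grows it:

#include <cmath>
#include <cstdio>

const unsigned int spaceDimension = 2;

// Frobenius norm: sqrt of the sum of squares of all entries of A.
double FroNorm(const double A[spaceDimension][spaceDimension])
{
  double sum = 0.0;
  for (unsigned int c = 0; c < spaceDimension; c++)
    {
    for (unsigned int r = 0; r < spaceDimension; r++)
      {
      sum += A[c][r] * A[c][r];
      }
    }
  return std::sqrt(sum);
}

// One rank-one update of A. 'buggy' selects the current double-alpha
// code path; otherwise the proposed single-alpha fix is applied.
void Update(double A[spaceDimension][spaceDimension],
            const double f_norm[spaceDimension], double adjust,
            bool buggy)
{
  // delta = A * f_norm, as computed earlier in the optimizer.
  double delta[spaceDimension];
  double dot = 0.0;
  for (unsigned int c = 0; c < spaceDimension; c++)
    {
    delta[c] = 0.0;
    for (unsigned int r = 0; r < spaceDimension; r++)
      {
      delta[c] += A[c][r] * f_norm[r];
      }
    dot += f_norm[c] * f_norm[c];
    }
  double alpha = (adjust - 1.0) / dot;

  for (unsigned int c = 0; c < spaceDimension; c++)
    {
    // The fix: multiply alpha in once, not twice.
    double temp = buggy ? alpha * f_norm[c] : f_norm[c];
    for (unsigned int r = 0; r < spaceDimension; r++)
      {
      A[c][r] += alpha * delta[r] * temp;
      }
    }
}

int main()
{
  double f_norm[spaceDimension] = { 0.6, 0.8 }; // stand-in sample
  double buggyA[spaceDimension][spaceDimension] =
    { { 2.0, 0.0 }, { 0.0, 2.0 } };
  double fixedA[spaceDimension][spaceDimension] =
    { { 2.0, 0.0 }, { 0.0, 2.0 } };

  std::printf("norm before:        %f\n", FroNorm(buggyA));
  Update(buggyA, f_norm, 0.9, true);   // unfavorable step, current code
  Update(fixedA, f_norm, 0.9, false);  // unfavorable step, with the fix
  std::printf("norm after (buggy): %f\n", FroNorm(buggyA)); // increases
  std::printf("norm after (fixed): %f\n", FroNorm(fixedA)); // decreases
  return 0;
}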
Zach Pincus
Department of Biochemistry and Program in Biomedical Informatics
Stanford University School of Medicine