[Insight-developers] Error in OnePlusOneEvolutionaryOptimizer (?): fix included
Zachary Pincus
zpincus at stanford.edu
Tue Mar 1 03:39:17 EST 2005
Hello folks,
I think there's an error in the OnePlusOneEvolutionaryOptimizer class
that prevents it from ever converging.
The quick summary is that the convergence criterion of this optimizer
is whether the Frobenius norm of the "A" matrix dips below a specified
threshold. The norm of the A matrix is supposed to decrease when the
optimizer takes an unfavorable step. After many unfavorable trial steps
around the current point (which suggests that the current point is
optimal), the Frobenius norm of A should be small enough to cause the
optimizer to declare convergence.
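For concreteness, here is roughly what that convergence test amounts
to (my own minimal sketch, not the actual ITK code; A, spaceDimension,
and epsilon stand in for the corresponding optimizer members):

#include <cmath>

// Frobenius norm: the square root of the sum of squares of all
// entries of the spaceDimension x spaceDimension matrix A.
double FrobeniusNorm(const double* A, unsigned int spaceDimension)
{
  double sum = 0.0;
  for (unsigned int i = 0; i < spaceDimension * spaceDimension; i++)
    {
    sum += A[i] * A[i];
    }
  return std::sqrt(sum);
}

// The optimizer declares convergence once this norm falls below a
// user-specified threshold.
bool HasConverged(const double* A, unsigned int spaceDimension,
                  double epsilon)
{
  return FrobeniusNorm(A, spaceDimension) < epsilon;
}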
This decrease is controlled by (ShrinkFactor - 1), where ShrinkFactor
is generally a number between zero and one. Since ShrinkFactor is less
than one, (ShrinkFactor - 1) is negative, and that negative value is
what produces a decrease in the Frobenius norm of the matrix.
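Spelled out (my own paraphrase of the update that the code comments
below and Appendix B of the paper describe, writing f for the Gaussian
sample f_norm, f' for its transpose, and adjust for the grow/shrink
factor), the intended rank-one update is:

  A := A + ((adjust - 1) / (f' * f)) * (A * f) * f'
     = A * (I + ((adjust - 1) / (f' * f)) * f * f')

where adjust is GrowthFactor (> 1) after a favorable step and
ShrinkFactor (< 1) after an unfavorable one. The right-hand factor
maps f to adjust * f and leaves directions orthogonal to f untouched,
so each unfavorable step shrinks A's action along the direction just
searched; a run of failed trials should therefore drive the Frobenius
norm of A down toward the convergence threshold.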
Unfortunately, the code effectively squares that (ShrinkFactor - 1)
factor by multiplying the alpha scalar in twice, in two separate
places, so the correction can never be negative! This prevents the
norm of A from ever decreasing, and thus prevents the optimizer from
ever converging.
I've gone over the code and the paper describing the 1+1ES strategy
(http://www.cs.unc.edu/~styner/docs/tmi00.pdf : Parametric estimate of
intensity inhomogeneities applied to MRI, Appendix B), and I am
reasonably confident that I am correct in this.
Here's the code in question (lines 205 to 226 of
itkOnePlusOneEvolutionaryOptimizer.cxx; note that f_norm is a sample
from a Gaussian, and not the Frobenius norm I spoke of earlier):
  // A := alpha*x*y' + A,
  // where y' = transpose(y), alpha is a scalar, x is an m element
  // vector, y is an n element vector, and A is an m by n matrix.
  // x = A * f_norm, y = f_norm,
  // alpha = (adjust - 1) / Blas_Dot_Prod(f_norm, f_norm)
  //A = A + (adjust - 1.0) * A ;
  double alpha = ((adjust - 1.0) / dot_product(f_norm, f_norm)) ;
  for ( unsigned int c = 0 ; c < spaceDimension ; c++ )
    {
    double temp = alpha * f_norm[c] ;
    if ( f_norm[c] == 0.0 )
      {
      temp = 1.0 ;
      }
    for (unsigned int r = 0 ; r < spaceDimension ; r++)
      {
      A(c, r) += alpha * delta[r] * temp ;
      }
    }
As you may note, alpha gets multiplied in *twice*, which is not what
the comments suggest, nor what the paper describes. By dropping one of
the alphas, I was able to get the optimizer to converge, which was
simply not happening before. (I charted the Frobenius norm versus time
for a variety of optimizer runs. Without the fix, these values were
monotonically increasing. With the fix, the norm increases and
decreases as intended.)
I'm also not sure what the business of setting temp to 1 when
f_norm[c] is zero is all about -- it isn't mentioned in the paper or
the comments, and it seems at some variance with the intent of this
step, which is to expand or contract the search along the directions
that were fruitless or fruitful. If f_norm[c] is zero, then no
searching was done in that direction, and that dimension should not be
expanded or contracted. However, there may be some reason that code
exists. Does anyone know why? If there's no good reason for it, the
entire for loop could be condensed as follows:
  for ( unsigned int c = 0 ; c < spaceDimension ; c++ )
    {
    for (unsigned int r = 0 ; r < spaceDimension ; r++)
      {
      A(c, r) += alpha * delta[r] * f_norm[c] ;
      }
    }
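To illustrate the difference, here is a small standalone test I put
together (my own sketch, not ITK code; it hard-codes spaceDimension =
2, an arbitrary A and sample f_norm, takes delta = A * f_norm as in
the surrounding optimizer code, and omits the f_norm[c] == 0 special
case, which never fires for a sample with no zero components). On an
unfavorable step (adjust = ShrinkFactor = 0.9) the fixed update
shrinks the Frobenius norm, while the current double-alpha update
grows it:

#include <cmath>
#include <cstdio>

const unsigned int spaceDimension = 2;

// Frobenius norm: sqrt of the sum of squares of all entries of A.
double FroNorm(const double A[spaceDimension][spaceDimension])
{
  double sum = 0.0;
  for (unsigned int c = 0; c < spaceDimension; c++)
    {
    for (unsigned int r = 0; r < spaceDimension; r++)
      {
      sum += A[c][r] * A[c][r];
      }
    }
  return std::sqrt(sum);
}

// One rank-one update of A. 'buggy' selects the current double-alpha
// code path; otherwise the proposed single-alpha fix is applied.
void Update(double A[spaceDimension][spaceDimension],
            const double f_norm[spaceDimension], double adjust,
            bool buggy)
{
  // delta = A * f_norm, as computed earlier in the optimizer.
  double delta[spaceDimension];
  double dot = 0.0;
  for (unsigned int c = 0; c < spaceDimension; c++)
    {
    delta[c] = 0.0;
    for (unsigned int r = 0; r < spaceDimension; r++)
      {
      delta[c] += A[c][r] * f_norm[r];
      }
    dot += f_norm[c] * f_norm[c];
    }
  double alpha = (adjust - 1.0) / dot;

  for (unsigned int c = 0; c < spaceDimension; c++)
    {
    // The fix: multiply alpha in once, not twice.
    double temp = buggy ? alpha * f_norm[c] : f_norm[c];
    for (unsigned int r = 0; r < spaceDimension; r++)
      {
      A[c][r] += alpha * delta[r] * temp;
      }
    }
}

int main()
{
  double f_norm[spaceDimension] = { 0.6, 0.8 }; // stand-in sample
  double buggyA[spaceDimension][spaceDimension] =
    { { 2.0, 0.0 }, { 0.0, 2.0 } };
  double fixedA[spaceDimension][spaceDimension] =
    { { 2.0, 0.0 }, { 0.0, 2.0 } };

  std::printf("norm before:        %f\n", FroNorm(buggyA));
  Update(buggyA, f_norm, 0.9, true);   // unfavorable step, current code
  Update(fixedA, f_norm, 0.9, false);  // unfavorable step, with the fix
  std::printf("norm after (buggy): %f\n", FroNorm(buggyA)); // increases
  std::printf("norm after (fixed): %f\n", FroNorm(fixedA)); // decreases
  return 0;
}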
Zach Pincus
Department of Biochemistry and Program in Biomedical Informatics
Stanford University School of Medicine