[Insight-developers] Re: SPSA Optimizer
Stefan Klein
stefan at isi.uu.nl
Thu Mar 24 06:55:50 EST 2005
Hi,
>If we refer to the classical gradient descent optimizer,
>http://www.itk.org/cgi-bin/viewcvs.cgi/Code/Numerics/itkGradientDescentOptimizer.cxx?root=Insight&view=markup
>it seems there are two phases. In the first one, the gradient is computed
>without taking into account the scales
> m_CostFunction->GetValueAndDerivative(
> this->GetCurrentPosition(), m_Value, m_Gradient );
>In the second phase, the scales are applied to the gradient before
>updating the parameters
> transformedGradient[j] = m_Gradient[j] / scales[j];
Yep, that was also my first thought. In my first implementation I did it
like this.
>In your implementation, you take into account the scales in the first
>phase (gradient computation) when applying the perturbation
> m_Delta[j] /= sqrt(scales[j]);
This is what Daniel did in his implementation. And I think it is correct.
Take for example the rigid registration problem. Increasing the rotation
angle by 0.1 rad would have a big influence on the similarity
measure. However, increasing the x-translation by 0.1 mm would only have
a small influence on the similarity measure. This would mean that
"f(thetaplus) - f(thetamin)" would be entirely dominated by the
perturbation of the rotation. I think this is not good for the optimisation
process.
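As an illustration of the scaled perturbation, here is a minimal C++ sketch. It is not the ITK code itself: the function name ScaledPerturbation and the example scale values are assumptions made for this example, and only the division by sqrt(scales[j]) comes from the line quoted above.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draw a Bernoulli +/-1 perturbation vector and divide
// each component by sqrt(scales[j]), as in "m_Delta[j] /= sqrt(scales[j])".
std::vector<double> ScaledPerturbation(const std::vector<double>& scales,
                                       std::mt19937& rng)
{
  std::bernoulli_distribution coin(0.5);
  std::vector<double> delta(scales.size());
  for (std::size_t j = 0; j < scales.size(); ++j)
  {
    const double sign = coin(rng) ? 1.0 : -1.0;
    // A small scale gives a large perturbation for that parameter.
    delta[j] = sign / std::sqrt(scales[j]);
  }
  return delta;
}
```

With assumed scales {1.0, 1e-4} for (rotation, x-translation), the rotation component of delta is +/-1 and the translation component is +/-100. After multiplying by c = 0.001, the rotation is perturbed by 0.001 rad and the translation by 0.1 mm, so neither parameter dominates "f(thetaplus) - f(thetamin)" merely because of its units.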
Now suppose that the rotation is already close to its optimum. This means
that "f(thetaplus) - f(thetamin)" would be very small, because the
translation is only perturbed by 0.1 mm.
Moreover, suppose we can only obtain noisy measurements of our function (as
is assumed in the SPSA). So, instead of measuring f(theta) directly, we
measure:
F(theta) = f(theta) + epsilon
with epsilon normally distributed N(\mu, \sigma).
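The noisy measurement model can be sketched as a small C++ wrapper. This is only an illustration: the class name NoisyMeasurement, the Parameters alias, and the fixed seed are assumptions, and sigma is taken to be a standard deviation (the notation N(\mu, \sigma) does not say whether it is a standard deviation or a variance).

```cpp
#include <functional>
#include <random>
#include <utility>
#include <vector>

using Parameters = std::vector<double>;

// Hypothetical wrapper: returns F(theta) = f(theta) + epsilon,
// with epsilon drawn from a normal distribution (mean mu, std. dev. sigma).
class NoisyMeasurement
{
public:
  NoisyMeasurement(std::function<double(const Parameters&)> f,
                   double mu, double sigma, unsigned seed)
    : m_F(std::move(f)), m_Noise(mu, sigma), m_Rng(seed) {}

  double operator()(const Parameters& theta)
  {
    // Each call draws a fresh noise sample, so two evaluations at the
    // same theta generally differ.
    return m_F(theta) + m_Noise(m_Rng);
  }

private:
  std::function<double(const Parameters&)> m_F;
  std::normal_distribution<double> m_Noise;
  std::mt19937 m_Rng;
};
```

Because the noise is drawn independently for F(thetaplus) and F(thetamin), it does not cancel in the difference; if the true difference f(thetaplus) - f(thetamin) is much smaller than sigma, the estimate is mostly noise.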
'F(thetaplus) - F(thetamin)' would become extremely sensitive to noise
now. Normally we would increase c to be able to cope with high noise, but
this would cause the whole perturbation vector to become larger. For
example, instead of 0.1 mm, we would try a perturbation of 0.5 mm, and
instead of 0.1 rad we would perturb the rotation by 0.5 rad. This is way
too much for the rotation, and would make the registration fail.
Greetings!
Stefan.