[Insight-users] Parameter scales for registration (second try)
brian avants
stnava at gmail.com
Tue May 7 17:06:49 EDT 2013
we did not implement a RegularStep...Optimizerv4
that would be fairly trivial to implement in v4 - perhaps you are
interested in contributing?
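
a very rough sketch of the kind of thing such a contribution could start
from is below. to be clear, this is hypothetical code, not anything that
exists in ITK - it just assumes you derive from GradientDescentOptimizerv4
and override AdvanceOneStep() so the scaled gradient is renormalized to a
step length that shrinks whenever the update direction reverses:

#include "itkGradientDescentOptimizerv4.h"
#include "itkObjectFactory.h"

namespace itk
{
// Hypothetical "regular step" optimizer sketch (not part of ITK).
class RegularStepSketchOptimizerv4 : public GradientDescentOptimizerv4
{
public:
  typedef RegularStepSketchOptimizerv4 Self;
  typedef GradientDescentOptimizerv4   Superclass;
  typedef SmartPointer<Self>           Pointer;
  itkNewMacro(Self);

protected:
  RegularStepSketchOptimizerv4() : m_CurrentStepLength(1.0) {}

  virtual void AdvanceOneStep()
  {
    // Scale the raw gradient by the parameter scales, as the base class does.
    this->ModifyGradientByScales();

    // Halve the step length whenever the new scaled gradient points
    // "backwards" relative to the previous one (the regular-step rule).
    if( m_PreviousGradient.GetSize() == this->m_Gradient.GetSize() )
      {
      double dot = 0.0;
      for( SizeValueType i = 0; i < this->m_Gradient.GetSize(); ++i )
        {
        dot += this->m_Gradient[i] * m_PreviousGradient[i];
        }
      if( dot < 0.0 )
        {
        m_CurrentStepLength *= 0.5;
        }
      }
    m_PreviousGradient = this->m_Gradient;

    // Renormalize the update to length m_CurrentStepLength, then let the
    // transform apply it, exactly as in the base class.
    const double norm = this->m_Gradient.two_norm();
    if( norm > 1e-10 )
      {
      this->m_Gradient *= ( m_CurrentStepLength / norm );
      }
    this->m_Metric->UpdateTransformParameters( this->m_Gradient );
    this->InvokeEvent( IterationEvent() );
  }

private:
  double         m_CurrentStepLength;
  DerivativeType m_PreviousGradient;
};
} // namespace itk
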
brian
On Tue, May 7, 2013 at 3:42 PM, Joël Schaerer <joel.schaerer at gmail.com> wrote:
> Hi Brian,
>
> I did read this code, but if I understand correctly,
> ModifyGradientByScales does the equivalent of the first scaling that was
> already done in the old itkRegularStepGradientDescentOptimizer.
>
> However, the whole point of my post is that it would make sense (I think!)
> to apply the scales *again* after adjusting the gradient for step size (or
> "learning rate"), which doesn't seem to be done in this code either.
>
> Note that in itkGradientDescentOptimizerv4, the gradient isn't normalized,
> so I don't think my change is needed. But if you implemented a
> RegularStepGradientDescentOptimizerv4, I think re-applying the scales after
> scaling the gradient would be a good thing.
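>
> To make the distinction concrete, here is a small stand-alone illustration
> in plain C++ (not ITK code - all of the names below are made up, and the
> applyScalesAgain == false branch is only my reading of the old behaviour):
>
> #include <cmath>
> #include <cstddef>
> #include <vector>
>
> void UpdateParameters( std::vector<double> & p, std::vector<double> g,
>                        const std::vector<double> & scales, double stepLength,
>                        bool applyScalesAgain )
> {
>   // first scaling, as in the old itkRegularStepGradientDescentOptimizer
>   double norm = 0.0;
>   for( std::size_t j = 0; j < g.size(); ++j )
>     {
>     g[j] /= scales[j];
>     norm += g[j] * g[j];
>     }
>   norm = std::sqrt( norm );
>
>   // adjust for step size ("learning rate"), then optionally re-apply the
>   // scales so heavily penalized parameters also move less in the final update
>   for( std::size_t j = 0; j < g.size(); ++j )
>     {
>     double step = stepLength * g[j] / norm;
>     if( applyScalesAgain )
>       {
>       step /= scales[j];
>       }
>     p[j] += step;
>     }
> }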
>
> joel
>
>
>
> On 05/07/2013 06:07 PM, brian avants wrote:
>
> yes - there is something you are missing. read the code below:
>
> /* Begin threaded gradient modification.
>  * Scale by gradient scales, then estimate the learning
>  * rate if options are set to (using the scaled gradient),
>  * then modify by learning rate. The m_Gradient variable
>  * is modified in-place. */
> this->ModifyGradientByScales();
> this->EstimateLearningRate();
> this->ModifyGradientByLearningRate();
>
> the call to this->ModifyGradientByScales(); changes the gradient
> according to the scales, as the name suggests. the v4 optimizers all behave
> in this general manner, although this example is taken from the gradient
> descent class.
>
> so - the transform expects that the update was already modified by the
> scales, so that the only thing the transform needs to do (if anything at
> all) is multiply by a scalar.
>
> also, you are looking at the base class, which is only used if the
> derived class did not override UpdateTransformParameters. for instance,
> the GaussianDisplacementField transform will also smooth the parameters
> when this function is called.
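>
> as a purely hypothetical sketch of that pattern (only
> UpdateTransformParameters itself is a real ITK virtual - the derived class
> and the smoothing step are made up for illustration):
>
> #include "itkDisplacementFieldTransform.h"
> #include "itkObjectFactory.h"
>
> // Hypothetical transform that post-processes its parameters on every update.
> class MySmoothingTransformSketch
>   : public itk::DisplacementFieldTransform<double, 3>
> {
> public:
>   typedef MySmoothingTransformSketch                  Self;
>   typedef itk::DisplacementFieldTransform<double, 3>  Superclass;
>   typedef itk::SmartPointer<Self>                     Pointer;
>   itkNewMacro(Self);
>
>   virtual void UpdateTransformParameters( const DerivativeType & update,
>                                           ParametersValueType factor = 1.0 )
>   {
>     // the base class adds ( update * factor ) to the parameters ...
>     Superclass::UpdateTransformParameters( update, factor );
>
>     // ... and the derived class does its own post-processing here, e.g.
>     // smoothing the displacement field (left as a stub in this sketch).
>   }
> };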
>
> is this clear enough?
>
>
>
> brian
>
>
>
>
> On Tue, May 7, 2013 at 11:59 AM, Joël Schaerer <joel.schaerer at gmail.com> wrote:
>
>> I spent a while looking at the v4 optimization framework. I can follow
>> your reasoning until UpdateTransformParameters is called on the transform.
>> However, at this stage, the old scaling is still done:
>>
>> itkTransform.hxx:
>>
>>   if( factor == 1.0 )
>>     {
>>     for( NumberOfParametersType k = 0; k < numberOfParameters; k++ )
>>       {
>>       this->m_Parameters[k] += update[k];
>>       }
>>     }
>>   else
>>     {
>>     for( NumberOfParametersType k = 0; k < numberOfParameters; k++ )
>>       {
>>       this->m_Parameters[k] += update[k] * factor;
>>       }
>>     }
>>
>> which makes sense, since parameter scales are an optimizer concept that
>> transforms know nothing about.
>>
>> So (if I understand correctly), the code has been shuffled around quite a
>> bit, but the behavior is still the same.
>>
>> Is there something I'm missing?
>>
>> joel
>>
>>
>>
>> On 07/05/2013 16:40, brian avants wrote:
>>
>> also - to take away a bit of the "mystery" surrounding v4 optimization,
>> let's see how the gradient descent AdvanceOneStep function works:
>>
>> void
>> GradientDescentOptimizerv4
>> ::AdvanceOneStep()
>> {
>>   itkDebugMacro("AdvanceOneStep");
>>
>>   /* Begin threaded gradient modification.
>>    * Scale by gradient scales, then estimate the learning
>>    * rate if options are set to (using the scaled gradient),
>>    * then modify by learning rate. The m_Gradient variable
>>    * is modified in-place. */
>>   this->ModifyGradientByScales();
>>   this->EstimateLearningRate();
>>   this->ModifyGradientByLearningRate();
>>
>>   try
>>     {
>>     /* Pass gradient to transform and let it do its own updating */
>>     this->m_Metric->UpdateTransformParameters( this->m_Gradient );
>>     }
>>   catch ( ExceptionObject & )
>>     {
>>     this->m_StopCondition = UPDATE_PARAMETERS_ERROR;
>>     this->m_StopConditionDescription << "UpdateTransformParameters error";
>>     this->StopOptimization();
>>
>>     // Pass exception to caller
>>     throw;
>>     }
>>
>>   this->InvokeEvent( IterationEvent() );
>> }
>>
>>
>> i hope this does not look too convoluted. then the base metric class
>> does this:
>>
>> template<unsigned int TFixedDimension, unsigned int TMovingDimension, class TVirtualImage>
>> void
>> ObjectToObjectMetric<TFixedDimension, TMovingDimension, TVirtualImage>
>> ::UpdateTransformParameters( const DerivativeType & derivative,
>>                              ParametersValueType factor )
>> {
>>   /* Rely on transform::UpdateTransformParameters to verify proper
>>    * size of derivative */
>>   this->m_MovingTransform->UpdateTransformParameters( derivative, factor );
>> }
>>
>>
>> so the transform parameters should be updated in a way that is
>> consistent with:
>>
>>   newPosition[j] = currentPosition[j] + transformedGradient[j] * factor / scales[j];
>>
>> factor defaults to 1 ... anyway, as you can infer from the above
>> discussion, even the basic gradient descent optimizer can be used to take
>> "regular steps" if you want.
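>>
>> for example, something along these lines (just a sketch - 'metric' and
>> 'scalesEstimator' are assumed to be set up already, and the setter names
>> should be double checked against your ITK version):
>>
>> itk::GradientDescentOptimizerv4::Pointer optimizer =
>>   itk::GradientDescentOptimizerv4::New();
>> optimizer->SetMetric( metric );
>> optimizer->SetScalesEstimator( scalesEstimator );
>> // re-estimate the step at every iteration, but never move more than
>> // 0.5 physical units per iteration - effectively a "regular step" policy
>> optimizer->SetDoEstimateLearningRateAtEachIteration( true );
>> optimizer->SetMaximumStepSizeInPhysicalUnits( 0.5 );
>> optimizer->SetNumberOfIterations( 200 );
>> optimizer->StartOptimization();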
>>
>>
>>
>> brian
>>
>>
>>
>>
>> On Tue, May 7, 2013 at 10:23 AM, brian avants <stnava at gmail.com> wrote:
>>
>>> brad
>>>
>>> did this issue ever go up on jira? i do remember discussing it with you
>>> at a meeting. our solution is in the v4 optimizers.
>>>
>>> the trivial additive parameter update doesn't work in more general
>>> cases, e.g. when you need to compose parameters with parameter updates.
>>>
>>> to resolve this limitation, the v4 optimizers pass the update step to
>>> the transformations.
>>>
>>> this implements the idea that "the transforms know how to update
>>> themselves".
>>>
>>> there are several other differences, as nick pointed out, that reduce
>>> the need for users to experiment with scales.
>>>
>>> for basic scenarios like the one joel is discussing, i prefer the
>>> conjugate gradient optimizer with line search:
>>>
>>> itkConjugateGradientLineSearchOptimizerv4.h
>>>
>>> when combined with the scale estimators, this leads to registration
>>> algorithms with very few parameters to tune - just one parameter if you
>>> don't consider multi-resolution.
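>>>
>>> as a sketch of what that setup might look like (not a complete example -
>>> it assumes 'metric' is an already configured ImageToImageMetricv4 of type
>>> MetricType with its transform attached, and the estimator class and
>>> setter names should be checked against your ITK version):
>>>
>>> #include "itkConjugateGradientLineSearchOptimizerv4.h"
>>> #include "itkRegistrationParameterScalesFromPhysicalShift.h"
>>>
>>> typedef itk::RegistrationParameterScalesFromPhysicalShift< MetricType >
>>>   ScalesEstimatorType;
>>> ScalesEstimatorType::Pointer scalesEstimator = ScalesEstimatorType::New();
>>> scalesEstimator->SetMetric( metric );
>>>
>>> itk::ConjugateGradientLineSearchOptimizerv4::Pointer optimizer =
>>>   itk::ConjugateGradientLineSearchOptimizerv4::New();
>>> optimizer->SetMetric( metric );
>>> optimizer->SetScalesEstimator( scalesEstimator );  // no manual scales needed
>>> optimizer->SetDoEstimateLearningRateOnce( true );  // no manual step size either
>>> optimizer->SetNumberOfIterations( 100 );           // essentially the one knob left
>>> optimizer->StartOptimization();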
>>>
>>>
>>> brian
>>>
>>>
>>>
>>>
>>> On Tue, May 7, 2013 at 9:27 AM, Nick Tustison <ntustison at gmail.com> wrote:
>>>
>>>> Hi Brad,
>>>>
>>>> I certainly don't disagree with Joel's findings. It seems like a
>>>> good fix, which should be put up on gerrit. There were several
>>>> components that we kept when upgrading the registration framework.
>>>> The optimizers weren't one of them.
>>>>
>>>> Also, could you elaborate a bit more on the "convoluted" aspects
>>>> of parameter advancement? There's probably a reason for it and
>>>> we could explain why.
>>>>
>>>> Nick
>>>>
>>>>
>>>>
>>
>>
>
>