[Insight-developers] Numerical issues with new code

Wed Apr 14 12:47:01 EDT 2010

Luis,

You may need to handle the trivial solution when b = 0. I think the
old code did that.

Bill

On Wed, Apr 14, 2010 at 12:37 PM, Bill Lorensen <bill.lorensen at gmail.com> wrote:
> Luis,
>
> b is zero. This not good?
>
> Bill
>
> On Wed, Apr 14, 2010 at 12:29 PM, Bill Lorensen <bill.lorensen at gmail.com> wrote:
>> Actually, you don't need the borland compiler to see the issues. Look
>> at the output of one of the tests and you will see many nan's.
>>
>> I'm curious, who converted the fortran to c++, someone at Stanford?
>>
>> Bill
>>
>> On Wed, Apr 14, 2010 at 12:16 PM, Luis Ibanez <luis.ibanez at kitware.com> wrote:
>>>
>>> Bill,
>>>
>>> Good call !
>>>
>>> The Valgrind errors were indeed very enlightening.
>>>
>>>
>>> Here is what seems to happen:
>>>
>>> The array "w[n]" is a "working space", declared in
>>> line 338, of lsqrBase.cxx:
>>>
>>>                 double * w = new double[n];
>>>
>>> and only gets to be initialized if the variable "alpha"
>>> is larger than zero, in lines 364-368:
>>>
>>>   if( alpha > zero )
>>>     {
>>>     this->Scale( n, ( one / alpha ), v );
>>>     CopyVector( n, v, w );
>>>     }
>>>
>>> where "alpha" is the norm of he vector "v".
>>>
>>> In the original Fortran code, this is:
>>>
>>>     if (alpha > zero) then
>>>        call dscal (n, (one/alpha), v, 1)
>>>        w = v
>>>     end if
>>>
>>>
>>> To put this in context, from the paper:
>>> http://www.stanford.edu/group/SOL/software/lsqr/lsqr-toms82a.pdf
>>>
>>> The solver is dealing with the problem
>>>
>>>                           A*x = b
>>>
>>> where the matrix A is known,
>>> the vector b is known,
>>> and we are looking for the vector x.
>>>
>>> One of the first things that the solver does
>>> is to scale the problem by defining a vector "u"
>>> such that
>>>
>>>                      beta * u = b,
>>>
>>> and a vector "v" such that
>>>
>>>                  alpha * v = A' * u.
>>>
>>> where A' is the transpose of A.
>>>
>>> So,
>>>
>>> beta is the norm of the input vector "b",
>>>
>>> while
>>>
>>> alpha is the norm of the vector resulting
>>> from the product : A' * u.
>>>
>>>
>>> ---
>>>
>>> This also shows that I mis-translated a condition
>>> for early termination, that corresponds to the case
>>> of "alpha == 0" or "beta == 0".
>>>
>>>
>>> I'm correction that section at this point....
>>>
>>>
>>>
>>>        Luis
>>>
>>>
>>> -------------------------------------------------------------------------------------------------
>>> On Wed, Apr 14, 2010 at 8:27 AM, Bill Lorensen <bill.lorensen at gmail.com>
>>> wrote:
>>>>
>>>> Luis,
>>>>
>>>> Maybe these valgrind errors will help:
>>>>
>>>> http://www.cdash.org/CDash/viewDynamicAnalysis.php?buildid=586308
>>>>
>>>> Bill
>>>>
>>>> On Tue, Apr 13, 2010 at 4:18 PM, Luis Ibanez <luis.ibanez at kitware.com>
>>>> wrote:
>>>> >
>>>> > Hi Bill,
>>>> >
>>>> > Yeap, that seems to be the case.
>>>> >
>>>> > I'm updating my Borland build now...
>>>> >
>>>> >
>>>> >     Thanks for pointing this out.
>>>> >
>>>> >
>>>> >           Luis
>>>> >
>>>> > ---------------------------------------------------
>>>> > On Mon, Apr 12, 2010 at 3:15 PM, Bill Lorensen <bill.lorensen at gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> Luis,
>>>> >>
>>>> >> Looks like there are some numerical problems with the new
>>>> >> vnl_lsqr code.
>>>> >>
>>>> >> Borland is reporting
>>>> >> http://www.cdash.org/CDash/viewTest.php?onlyfailed&buildid=584274
>>>> >>
>>>> >> I think Borland is the only compiler that currently catches floating
>>>> >> point divide by zero, etc.
>>>> >>
>>>> >> This is probably due to a real error in the new code.
>>>> >>
>>>> >> Bill
>>>> >
>>>> >
>>>
>>>
>>
>