[Insight-developers] Numerical issues with new code

Wed Apr 14 12:37:23 EDT 2010

Luis,

b is zero. This not good?

Bill

On Wed, Apr 14, 2010 at 12:29 PM, Bill Lorensen <bill.lorensen at gmail.com> wrote:
> Actually, you don't need the borland compiler to see the issues. Look
> at the output of one of the tests and you will see many nan's.
>
> I'm curious, who converted the fortran to c++, someone at Stanford?
>
> Bill
>
> On Wed, Apr 14, 2010 at 12:16 PM, Luis Ibanez <luis.ibanez at kitware.com> wrote:
>>
>> Bill,
>>
>> Good call !
>>
>> The Valgrind errors were indeed very enlightening.
>>
>>
>> Here is what seems to happen:
>>
>> The array "w[n]" is a "working space", declared in
>> line 338, of lsqrBase.cxx:
>>
>>                 double * w = new double[n];
>>
>> and only gets to be initialized if the variable "alpha"
>> is larger than zero, in lines 364-368:
>>
>>   if( alpha > zero )
>>     {
>>     this->Scale( n, ( one / alpha ), v );
>>     CopyVector( n, v, w );
>>     }
>>
>> where "alpha" is the norm of he vector "v".
>>
>> In the original Fortran code, this is:
>>
>>     if (alpha > zero) then
>>        call dscal (n, (one/alpha), v, 1)
>>        w = v
>>     end if
>>
>>
>> To put this in context, from the paper:
>> http://www.stanford.edu/group/SOL/software/lsqr/lsqr-toms82a.pdf
>>
>> The solver is dealing with the problem
>>
>>                           A*x = b
>>
>> where the matrix A is known,
>> the vector b is known,
>> and we are looking for the vector x.
>>
>> One of the first things that the solver does
>> is to scale the problem by defining a vector "u"
>> such that
>>
>>                      beta * u = b,
>>
>> and a vector "v" such that
>>
>>                  alpha * v = A' * u.
>>
>> where A' is the transpose of A.
>>
>> So,
>>
>> beta is the norm of the input vector "b",
>>
>> while
>>
>> alpha is the norm of the vector resulting
>> from the product : A' * u.
>>
>>
>> ---
>>
>> This also shows that I mis-translated a condition
>> for early termination, that corresponds to the case
>> of "alpha == 0" or "beta == 0".
>>
>>
>> I'm correction that section at this point....
>>
>>
>>
>>        Luis
>>
>>
>> -------------------------------------------------------------------------------------------------
>> On Wed, Apr 14, 2010 at 8:27 AM, Bill Lorensen <bill.lorensen at gmail.com>
>> wrote:
>>>
>>> Luis,
>>>
>>> Maybe these valgrind errors will help:
>>>
>>> http://www.cdash.org/CDash/viewDynamicAnalysis.php?buildid=586308
>>>
>>> Bill
>>>
>>> On Tue, Apr 13, 2010 at 4:18 PM, Luis Ibanez <luis.ibanez at kitware.com>
>>> wrote:
>>> >
>>> > Hi Bill,
>>> >
>>> > Yeap, that seems to be the case.
>>> >
>>> > I'm updating my Borland build now...
>>> >
>>> >
>>> >     Thanks for pointing this out.
>>> >
>>> >
>>> >           Luis
>>> >
>>> > ---------------------------------------------------
>>> > On Mon, Apr 12, 2010 at 3:15 PM, Bill Lorensen <bill.lorensen at gmail.com>
>>> > wrote:
>>> >>
>>> >> Luis,
>>> >>
>>> >> Looks like there are some numerical problems with the new
>>> >> vnl_lsqr code.
>>> >>
>>> >> Borland is reporting
>>> >> http://www.cdash.org/CDash/viewTest.php?onlyfailed&buildid=584274
>>> >>
>>> >> I think Borland is the only compiler that currently catches floating
>>> >> point divide by zero, etc.
>>> >>
>>> >> This is probably due to a real error in the new code.
>>> >>
>>> >> Bill
>>> >
>>> >
>>
>>
>