[Insight-developers] Numerical issues with new code

Wed Apr 14 15:49:44 EDT 2010

Yes,

You are right,

If have now fixed the termination criteria to match what the
original code was doing in the case where:

                 this->Arnorm = alpha * beta   == zero

I just committed the changes (they run fine in the Linux build):

http://public.kitware.com/cgi-bin/viewcvs.cgi/Utilities/vxl/v3p/netlib/linalg/lsqrBase.cxx?root=Insight&r1=1.2&r2=1.3
http://public.kitware.com/cgi-bin/viewcvs.cgi/Utilities/vxl/v3p/netlib/linalg/lsqrBase.h?root=Insight&r1=1.1&r2=1.2

BTW: The process has also revealed that the FEM tests do
not verify the correctness of the numerical outputs,.... so
having the tests passing doesn't really tell us much about
the correctness of the code.

Tom is right in that we should add the original testing of
LSQR code from the Fortran 90 code...

----

If the code passes fine in the Borland compiler, and we
have a Green Dashboard tomorrow, I would like very
much to tag the repository as "ITK 3.18"....

Any objections ?

    Luis

------------------------------------------------------------------------------------------------
On Wed, Apr 14, 2010 at 12:47 PM, Bill Lorensen <bill.lorensen at gmail.com>wrote:

> Luis,
>
> You may need to handle the trivial solution when b = 0. I think the
> old code did that.
>
> Bill
>
> On Wed, Apr 14, 2010 at 12:37 PM, Bill Lorensen <bill.lorensen at gmail.com>
> wrote:
> > Luis,
> >
> > b is zero. This not good?
> >
> > Bill
> >
> > On Wed, Apr 14, 2010 at 12:29 PM, Bill Lorensen <bill.lorensen at gmail.com>
> wrote:
> >> Actually, you don't need the borland compiler to see the issues. Look
> >> at the output of one of the tests and you will see many nan's.
> >>
> >> I'm curious, who converted the fortran to c++, someone at Stanford?
> >>
> >> Bill
> >>
> >> On Wed, Apr 14, 2010 at 12:16 PM, Luis Ibanez <luis.ibanez at kitware.com>
> wrote:
> >>>
> >>> Bill,
> >>>
> >>> Good call !
> >>>
> >>> The Valgrind errors were indeed very enlightening.
> >>>
> >>>
> >>> Here is what seems to happen:
> >>>
> >>> The array "w[n]" is a "working space", declared in
> >>> line 338, of lsqrBase.cxx:
> >>>
> >>>                 double * w = new double[n];
> >>>
> >>> and only gets to be initialized if the variable "alpha"
> >>> is larger than zero, in lines 364-368:
> >>>
> >>>   if( alpha > zero )
> >>>     {
> >>>     this->Scale( n, ( one / alpha ), v );
> >>>     CopyVector( n, v, w );
> >>>     }
> >>>
> >>> where "alpha" is the norm of he vector "v".
> >>>
> >>> In the original Fortran code, this is:
> >>>
> >>>     if (alpha > zero) then
> >>>        call dscal (n, (one/alpha), v, 1)
> >>>        w = v
> >>>     end if
> >>>
> >>>
> >>> To put this in context, from the paper:
> >>> http://www.stanford.edu/group/SOL/software/lsqr/lsqr-toms82a.pdf
> >>>
> >>> The solver is dealing with the problem
> >>>
> >>>                           A*x = b
> >>>
> >>> where the matrix A is known,
> >>> the vector b is known,
> >>> and we are looking for the vector x.
> >>>
> >>> One of the first things that the solver does
> >>> is to scale the problem by defining a vector "u"
> >>> such that
> >>>
> >>>                      beta * u = b,
> >>>
> >>> and a vector "v" such that
> >>>
> >>>                  alpha * v = A' * u.
> >>>
> >>> where A' is the transpose of A.
> >>>
> >>> So,
> >>>
> >>> beta is the norm of the input vector "b",
> >>>
> >>> while
> >>>
> >>> alpha is the norm of the vector resulting
> >>> from the product : A' * u.
> >>>
> >>>
> >>> ---
> >>>
> >>> This also shows that I mis-translated a condition
> >>> for early termination, that corresponds to the case
> >>> of "alpha == 0" or "beta == 0".
> >>>
> >>>
> >>> I'm correction that section at this point....
> >>>
> >>>
> >>>
> >>>        Luis
> >>>
> >>>
> >>>
> -------------------------------------------------------------------------------------------------
> >>> On Wed, Apr 14, 2010 at 8:27 AM, Bill Lorensen <
> bill.lorensen at gmail.com>
> >>> wrote:
> >>>>
> >>>> Luis,
> >>>>
> >>>> Maybe these valgrind errors will help:
> >>>>
> >>>> http://www.cdash.org/CDash/viewDynamicAnalysis.php?buildid=586308
> >>>>
> >>>> Bill
> >>>>
> >>>> On Tue, Apr 13, 2010 at 4:18 PM, Luis Ibanez <luis.ibanez at kitware.com
> >
> >>>> wrote:
> >>>> >
> >>>> > Hi Bill,
> >>>> >
> >>>> > Yeap, that seems to be the case.
> >>>> >
> >>>> > I'm updating my Borland build now...
> >>>> >
> >>>> >
> >>>> >     Thanks for pointing this out.
> >>>> >
> >>>> >
> >>>> >           Luis
> >>>> >
> >>>> > ---------------------------------------------------
> >>>> > On Mon, Apr 12, 2010 at 3:15 PM, Bill Lorensen <
> bill.lorensen at gmail.com>
> >>>> > wrote:
> >>>> >>
> >>>> >> Luis,
> >>>> >>
> >>>> >> Looks like there are some numerical problems with the new
> >>>> >> vnl_lsqr code.
> >>>> >>
> >>>> >> Borland is reporting
> >>>> >> http://www.cdash.org/CDash/viewTest.php?onlyfailed&buildid=584274
> >>>> >>
> >>>> >> I think Borland is the only compiler that currently catches
> floating
> >>>> >> point divide by zero, etc.
> >>>> >>
> >>>> >> This is probably due to a real error in the new code.
> >>>> >>
> >>>> >> Bill
> >>>> >
> >>>> >
> >>>
> >>>
> >>
> >
> _______________________________________________
> Powered by www.kitware.com
>
> Visit other Kitware open-source projects at
> http://www.kitware.com/opensource/opensource.html
>
> Kitware offers ITK Training Courses, for more information visit:
> http://kitware.com/products/protraining.html
>
> Please keep messages on-topic and check the ITK FAQ at:
> http://www.itk.org/Wiki/ITK_FAQ
>
> Follow this link to subscribe/unsubscribe:
> http://www.itk.org/mailman/listinfo/insight-developers
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.itk.org/mailman/private/insight-developers/attachments/20100414/354e8c8c/attachment.htm>