[Insight-developers] numerics code optimization issue

Fri Aug 12 19:22:57 EDT 2005

Miller, James V (Research) wrote:

>Simon, 
>
>Problems in netlib with respect to optimization has been a long standing 
>issue.  I tracked down one of these issues a while back and turned 
>off optimization for the Microsoft compiler in slamch.c with a pragma
>
>#if defined(_MSC_VER)
># pragma optimize("", off)
>#endif
>
>I did not do this is dlamch.c (probably an oversight on my part on what the 
>the netlib naming conventions are).  This probably points to a bigger issue of a 
>lack of tests.
>
>Which intel compiler were you using? Linux of Windows?
>
>I think CMake may have the capability to specify compile flags on a per
>file basis.  I think the vxl guys wanted such a capability. I would lean to 
>putting in pragmas to turn off optimization for the offending compilers in 
>those files if possible.
>
>To do it in CMake, it looks like we would need to do something like:
>
>IF(CMAKE_COMPILER_IS_GNUCC)
>  SET_SOURCE_FILES_PROPERTIES(gpfa5f.c PROPERTIES COMPILE_FLAGS -O0)
>  SET_SOURCE_FILES_PROPERTIES(dgpfa5f.c PROPERTIES COMPILE_FLAGS -O0)
>  SET_SOURCE_FILES_PROPERTIES(zlarfx.c PROPERTIES COMPILE_FLAGS -Os)
>  SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -w")
>ENDIF(CMAKE_COMPILER_IS_GNUCC)
>
>So we'd need to switch on each compiler and specify the COMPILE_FLAGS for each
>suspect file.  Not sure this is any easier than using pragmas.  But it may be
>necessary if we cannot turn off optimization from within the source file.
>
>Are slamch.c and dlamch.c the only ones that need to be compiled non-optimized?
>Or does slarfg.c also need it?
>
>Jim
>
>  
>
I was using the Intel compiler on linux.

I found setting file specific properties in CMakeLists.txt such as :
SET_SOURCE_FILES_PROPERTIES(slamch.c PROPERTIES COMPILE_FLAGS -O0)
SET_SOURCE_FILES_PROPERTIES(dlamch.c PROPERTIES COMPILE_FLAGS -O0)

  wasn't effective because these flags precede the flags associated with 
the build type rather than replace any other specified flags.

What was effective was using a pragma:
#if defined(__INTEL_COMPILER)
# pragma optimize("", off)
#endif

This turns of optimization until the end of the current source file or 
the next occurrence of
# pragma optimize("", on)

Keeping optimization off in the slamch.c and dlamch.c file solves the 
issue.  slarfg.c was failing only because the results from slamch.c were 
wrong, and it doesn't need fixing.
Attached are two fixed files, but the solution is basically to add the 
three above lines near the top of each of the files.

It would be great if the fix made it into the CVS repository.
--
Simon
-------------- next part --------------
#include "f2c.h"
#include "netlib.h"
#include <stdio.h>

/* There are too many problems in this file created by the MSVC
   optimizer.  Just disable it.  */
#if defined(_MSC_VER)
# pragma optimize("", off)
#endif

#if defined(__INTEL_COMPILER)
# pragma optimize("", off)
#endif

void slamc1_(integer *beta, integer *t, logical *rnd, logical *ieee1);
void slamc2_(integer *beta, integer *t, logical *rnd, real *eps,
             integer *emin, real *rmin, integer *emax, real *rmax);
real slamc3_(real *a, real *b);
void slamc4_(integer *emin, real *start, integer *base);
void slamc5_(integer *beta, integer *p, integer *emin, logical *ieee, integer *emax, real *rmax);

real slamch_(const char *cmach)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    SLAMCH determines single precision machine parameters.

    Arguments
    =========

    CMACH   (input) CHARACTER*1
            Specifies the value to be returned by SLAMCH:
            = 'E' or 'e',   SLAMCH := eps
            = 'S' or 's ,   SLAMCH := sfmin
            = 'B' or 'b',   SLAMCH := base
            = 'P' or 'p',   SLAMCH := eps*base
            = 'N' or 'n',   SLAMCH := t
            = 'R' or 'r',   SLAMCH := rnd
            = 'M' or 'm',   SLAMCH := emin
            = 'U' or 'u',   SLAMCH := rmin
            = 'L' or 'l',   SLAMCH := emax
            = 'O' or 'o',   SLAMCH := rmax

            where

            eps   = relative machine precision
            sfmin = safe minimum, such that 1/sfmin does not overflow
            base  = base of the machine
            prec  = eps*base
            t     = number of (base) digits in the mantissa
            rnd   = 1.0 when rounding occurs in addition, 0.0 otherwise
            emin  = minimum exponent before (gradual) underflow
            rmin  = underflow threshold - base**(emin-1)
            emax  = largest exponent before overflow
            rmax  = overflow threshold  - (base**emax)*(1-eps)

   =====================================================================
*/

    /* Initialized data */
    static logical first = TRUE_;
    /* System generated locals */
    integer i__1;
    /* Local variables */
    static real base;
    static integer beta;
    static real emin, prec, emax;
    static integer imin, imax;
    static logical lrnd;
    static real rmin, rmax, t;
    static real small, sfmin;
    static integer it;
    static real rnd, eps;

    if (first) {
        first = FALSE_;
        slamc2_(&beta, &it, &lrnd, &eps, &imin, &rmin, &imax, &rmax);
        base = (real) beta;
        t = (real) it;
        if (lrnd) {
            rnd = 1.f;
            i__1 = 1 - it;
            eps = pow_ri(&base, &i__1) / 2;
        } else {
            rnd = 0.f;
            i__1 = 1 - it;
            eps = pow_ri(&base, &i__1);
        }
        prec = eps * base;
        emin = (real) imin;
        emax = (real) imax;
        sfmin = rmin;
        small = 1.f / rmax;
        if (small >= sfmin) {
            /* Use SMALL plus a bit, to avoid the possibility of rounding */
            /* causing overflow when computing  1/sfmin. */
            sfmin = small * (eps + 1.f);
        }
    }

    if      (*cmach=='E' || *cmach=='e') return eps;   /* 1.19209e-7f */
    else if (*cmach=='S' || *cmach=='s') return sfmin; /* 1.17549e-38f */
    else if (*cmach=='B' || *cmach=='b') return base;  /* 2.f */
    else if (*cmach=='P' || *cmach=='p') return prec;  /* 2.38419e-7.f */
    else if (*cmach=='N' || *cmach=='n') return t;     /* 24.f */
    else if (*cmach=='R' || *cmach=='r') return rnd;   /* 0.f */
    else if (*cmach=='M' || *cmach=='m') return emin;  /* -125.f */
    else if (*cmach=='U' || *cmach=='u') return rmin;  /* 1.17549e-38f */
    else if (*cmach=='L' || *cmach=='l') return emax;  /* 128.f */
    else if (*cmach=='O' || *cmach=='o') return rmax;  /* 3.40282e38f */
    else return 0.f; /* in case a non-documented argument was passed */
} /* slamch_ */

/* Subroutine */
void slamc1_(integer *beta, integer *t, logical *rnd, logical *ieee1)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    SLAMC1 determines the machine parameters given by BETA, T, RND, and
    IEEE1.

    Arguments
    =========

    BETA    (output) INTEGER
            The base of the machine.

    T       (output) INTEGER
            The number of ( BETA ) digits in the mantissa.

    RND     (output) LOGICAL
            Specifies whether proper rounding  ( RND = .TRUE. )  or
            chopping  ( RND = .FALSE. )  occurs in addition. This may not

            be a reliable guide to the way in which the machine performs
            its arithmetic.

    IEEE1   (output) LOGICAL
            Specifies whether rounding appears to be done in the IEEE
            'round to nearest' style.

    Further Details
    ===============

    The routine is based on the routine  ENVRON  by Malcolm and
    incorporates suggestions by Gentleman and Marovich. See

       Malcolm M. A. (1972) Algorithms to reveal properties of
          floating-point arithmetic. Comms. of the ACM, 15, 949-951.

       Gentleman W. M. and Marovich S. B. (1974) More on algorithms
          that reveal properties of floating point arithmetic units.
          Comms. of the ACM, 17, 276-277.

   =====================================================================
*/

    /* Initialized data */
    static logical first = TRUE_;
    /* System generated locals */
    real r__1, r__2;
    /* Local variables */
    static logical lrnd;
    static real a, b, c, f;
    static integer lbeta;
    static real savec;
    static logical lieee1;
    static real t1, t2;
    static integer lt;
    static real one = 1.f;

    if (first) {
        first = FALSE_;

        /* LBETA,  LIEEE1,  LT and  LRND  are the  local values  of  BETA, IEEE1, T and RND. */

        /* Throughout this routine  we use the function  SLAMC3  to ensure */
        /* that relevant values are  stored and not held in registers, or */
        /* are not affected by optimizers. */

        /* Compute  a = 2.0**m  with the  smallest positive integer m such that */
        /*      fl( a + 1.0 ) = a. */

        a = c = one;

        while (c == one) {
            a *= 2;
            c = slamc3_(&a, &one);
            r__1 = -a;
            c = slamc3_(&c, &r__1);
        }

        /* Now compute  b = 2.0**m  with the smallest positive integer m such that */
        /*  fl( a + b ) .gt. a. */

        b = one;
        c = slamc3_(&a, &b);

        /* The next two lines of code were replaced by Ian Scott from the original line
                     while (c==a) {
          During a optimised build under MSVC, the compiler was using the value of
          C still in a register in while loop test. This is an 80-bit value rather than
          the 64 bit value it uses after saving and loading from memory.
          So the 80 bit precision value was having 1 added, making it a different number
          and so not executing the loop.
          The call to slamc3_ in the loop condition forces the value to 64-bit precision
          as during the previous calculation.
        */
        r__1 = -a;
        while (slamc3_(&c, &r__1) == 0.f) {
            b *= 2;
            c = slamc3_(&a, &b);
        }

        /* Now compute the base.  a and c  are neighbouring floating point */
        /* numbers  in the  interval  ( beta**t, beta**( t + 1 ) )  and so */
        /* their difference is beta. Adding 0.25 to c is to ensure that it */
        /* is truncated to beta and not ( beta - 1 ). */

        savec = c;
        r__1 = -a;
        c = slamc3_(&c, &r__1);
        lbeta = (integer)(c + 0.25f);

        /* Now determine whether rounding or chopping occurs,  by adding a */
        /* bit  less  than  beta/2  and a  bit  more  than  beta/2  to a. */

        b = (real) lbeta;
        r__1 = b / 2;
        r__2 = -b / 100;
        f = slamc3_(&r__1, &r__2);
        c = slamc3_(&f, &a);
        if (c == a) {
            lrnd = TRUE_;
        } else {
            lrnd = FALSE_;
        }
        r__1 = b / 2;
        r__2 = b / 100;
        f = slamc3_(&r__1, &r__2);
        c = slamc3_(&f, &a);
        if (lrnd && c == a) {
            lrnd = FALSE_;
        }

        /* Try and decide whether rounding is done in the  IEEE  'round to */
        /* nearest' style. B/2 is half a unit in the last place of the two */
        /* numbers A and SAVEC. Furthermore, A is even, i.e. has last bit */
        /* zero, and SAVEC is odd. Thus adding B/2 to A should not  change */
        /* A, but adding B/2 to SAVEC should change SAVEC. */

        r__1 = b / 2;
        t1 = slamc3_(&r__1, &a);
        r__1 = b / 2;
        t2 = slamc3_(&r__1, &savec);
        lieee1 = t1 == a && t2 > savec && lrnd;

        /* Now find  the  mantissa, t.  It should  be the  integer part of */
        /* log to the base beta of a,  however it is safer to determine t */
        /* by powering.  So we find t as the smallest positive integer for */
        /* which */
        /*    fl( beta**t + 1.0 ) = 1.0. */

        lt = 0;
        a = c = one;

        while (c == one) {
            ++lt;
            a *= lbeta;
            c = slamc3_(&a, &one);
            r__1 = -a;
            c = slamc3_(&c, &r__1);
        }
    }

    *beta = lbeta;
    *t = lt;
    *rnd = lrnd;
    *ieee1 = lieee1;
} /* slamc1_ */

/* Subroutine */
void slamc2_(integer *beta, integer *t, logical *rnd,
             real *eps, integer *emin, real *rmin,
             integer *emax, real *rmax)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    SLAMC2 determines the machine parameters specified in its argument
    list.

    Arguments
    =========

    BETA    (output) INTEGER
            The base of the machine.

    T       (output) INTEGER
            The number of ( BETA ) digits in the mantissa.

    RND     (output) LOGICAL
            Specifies whether proper rounding  ( RND = .TRUE. )  or
            chopping  ( RND = .FALSE. )  occurs in addition. This may not

            be a reliable guide to the way in which the machine performs

            its arithmetic.

    EPS     (output) REAL
            The smallest positive number such that

               fl( 1.0 - EPS ) .LT. 1.0,

            where fl denotes the computed value.

    EMIN    (output) INTEGER
            The minimum exponent before (gradual) underflow occurs.

    RMIN    (output) REAL
            The smallest normalized number for the machine, given by
            BASE**( EMIN - 1 ), where  BASE  is the floating point value

            of BETA.

    EMAX    (output) INTEGER
            The maximum exponent before overflow occurs.

    RMAX    (output) REAL
            The largest positive number for the machine, given by
            BASE**EMAX * ( 1 - EPS ), where  BASE  is the floating point

            value of BETA.

    Further Details
    ===============

    The computation of  EPS  is based on a routine PARANOIA by
    W. Kahan of the University of California at Berkeley.

   =====================================================================
*/

    /* Initialized data */
    static logical first = TRUE_;
    static logical iwarn = FALSE_;
    /* System generated locals */
    integer i__1;
    real r__1, r__2;
    /* Local variables */
    static logical ieee;
    static real half = 0.5f;
    static logical lrnd;
    static real leps, zero = 0.f, a, b, c;
    static integer i, lbeta;
    static real rbase;
    static integer lemin, lemax, gnmin;
    static real small;
    static integer gpmin;
    static real third, lrmin, lrmax, sixth;
    static logical lieee1;
    static integer lt, ngnmin, ngpmin;
    static real one = 1.f;

    if (first) {
        first = FALSE_;

/*        LBETA, LT, LRND, LEPS, LEMIN and LRMIN  are the local values of
          BETA, T, RND, EPS, EMIN and RMIN.

          Throughout this routine  we use the function  SLAMC3  to ensure
          that relevant values are stored  and not held in registers, or
          are not affected by optimizers.

          SLAMC1 returns the parameters  LBETA, LT, LRND and LIEEE1.
*/

        slamc1_(&lbeta, &lt, &lrnd, &lieee1);

        /* Start to find EPS. */

        b = (real) lbeta;
        i__1 = -lt;
        a = pow_ri(&b, &i__1);
        leps = a;

        /* Try some tricks to see whether or not this is the correct  EPS. */

        b = 2.f / 3;
        r__1 = -half;
        sixth = slamc3_(&b, &r__1);
        third = slamc3_(&sixth, &sixth);
        b = slamc3_(&third, &r__1);
        b = slamc3_(&b, &sixth);
        b = abs(b);
        if (b < leps) {
            b = leps;
        }

        leps = one;

        while (leps > b && b > zero) {
            leps = b;
            r__1 = half * leps;
            r__2 = 32.0f * leps * leps;
            c = slamc3_(&r__1, &r__2);
            r__1 = -c;
            c = slamc3_(&half, &r__1);
            b = slamc3_(&half, &c);
            r__1 = -b;
            c = slamc3_(&half, &r__1);
            b = slamc3_(&half, &c);
        }

        if (a < leps) {
            leps = a;
        }

/*        Computation of EPS complete.

          Now find  EMIN.  Let A = + or - 1, and + or - (1 + BASE**(-3)).
          Keep dividing  A by BETA until (gradual) underflow occurs. This
          is detected when we cannot recover the previous A.
*/

        rbase = one / lbeta;
        small = one;
        for (i = 1; i <= 3; ++i) {
            r__1 = small * rbase;
            small = slamc3_(&r__1, &zero);
        }
        a = slamc3_(&one, &small);
        slamc4_(&ngpmin, &one, &lbeta);
        r__1 = -one;
        slamc4_(&ngnmin, &r__1, &lbeta);
        slamc4_(&gpmin, &a, &lbeta);
        r__1 = -a;
        slamc4_(&gnmin, &r__1, &lbeta);
        ieee = FALSE_;

        if (ngpmin == ngnmin && gpmin == gnmin) {
            if (ngpmin == gpmin) {
                lemin = ngpmin;
                /* ( Non twos-complement machines, no gradual underflow; e.g.,  VAX ) */
            } else if (gpmin - ngpmin == 3) {
                lemin = ngpmin - 1 + lt;
                ieee = TRUE_;
                /* ( Non twos-complement machines, with gradual underflow; e.g., IEEE standard followers ) */
            } else {
                lemin = min(ngpmin,gpmin);
                /* ( A guess; no known machine ) */
                iwarn = TRUE_;
            }
        } else if (ngpmin == gpmin && ngnmin == gnmin) {
            if (abs(ngpmin - ngnmin) == 1) {
                lemin = max(ngpmin,ngnmin);
                /* ( Twos-complement machines, no gradual underflow; e.g., CYBER 205 ) */
            } else {
                lemin = min(ngpmin,ngnmin);
                /* ( A guess; no known machine ) */
                iwarn = TRUE_;
            }
        } else if (abs(ngpmin - ngnmin) == 1 && gpmin == gnmin) {
            if (gpmin - min(ngpmin,ngnmin) == 3) {
                lemin = max(ngpmin,ngnmin) - 1 + lt;
                /* ( Twos-complement machines with gradual underflow; no known machine ) */
            } else {
                lemin = min(ngpmin,ngnmin);
                /* ( A guess; no known machine ) */
                iwarn = TRUE_;
            }
        } else {
            lemin = min(min(min(ngpmin,ngnmin),gpmin),gnmin);
/*         ( A guess; no known machine ) */
            iwarn = TRUE_;
        }
        /* ** Comment out this if block if EMIN is ok */
        if (iwarn) {
            first = TRUE_;
            printf("\n\n WARNING. The value EMIN may be incorrect: - ");
            printf("EMIN = %8i\n",lemin);
            printf("If, after inspection, the value EMIN looks acceptable");
            printf(" please comment out\n the IF block as marked within the");
            printf(" code of routine SLAMC2,\n otherwise supply EMIN");
            printf(" explicitly.\n");
        }
/* **     Assume IEEE arithmetic if we found denormalised  numbers above,
          or if arithmetic seems to round in the  IEEE style,  determined
          in routine SLAMC1. A true IEEE machine should have both  things
          true; however, faulty machines may have one or the other.
*/

        ieee = ieee || lieee1;

/*        Compute  RMIN by successive division by  BETA. We could compute
          RMIN as BASE**( EMIN - 1 ),  but some machines underflow during
          this computation.
*/

        lrmin = one;
        for (i = 1; i <= 1-lemin; ++i) {
            r__1 = lrmin * rbase;
            lrmin = slamc3_(&r__1, &zero);
        }

/*        Finally, call SLAMC5 to compute EMAX and RMAX. */

        slamc5_(&lbeta, &lt, &lemin, &ieee, &lemax, &lrmax);
    }

    *beta = lbeta;
    *t = lt;
    *rnd = lrnd;
    *eps = leps;
    *emin = lemin;
    *rmin = lrmin;
    *emax = lemax;
    *rmax = lrmax;
} /* slamc2_ */

/* Microsoft Visual C++ 2003 produces bad code when the following */
/* routine is optimized.  Turn off the optimization for this one  */
/* routine and turn back on any optimizations after this routine. */
#if defined(_WIN32) || defined(WIN32)
#if (_MSC_VER >= 1310)
# pragma optimize("", off)
#endif
#endif

real slamc3_(real *a, real *b)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    SLAMC3  is intended to force  A  and  B  to be stored prior to doing
    the addition of  A  and  B ,  for use in situations where optimizers
    might hold one of these in a register.

    Arguments
    =========

    A, B    (input) REAL
            The values A and B.

   =====================================================================
*/

    return *a + *b;
} /* slamc3_ */

/* Turn the optimizations back on for Visual Studio .NET 2003 */
#if defined(_WIN32) || defined(WIN32)
#if (_MSC_VER >= 1310)
# pragma optimize("", on)
#endif
#endif

/* Subroutine */
void slamc4_(integer *emin, real *start, integer *base)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    SLAMC4 is a service routine for SLAMC2.

    Arguments
    =========

    EMIN    (output) EMIN
            The minimum exponent before (gradual) underflow, computed by
            setting A = START and dividing by BASE until the previous A
            can not be recovered.

    START   (input) REAL
            The starting point for determining EMIN.

    BASE    (input) INTEGER
            The base of the machine.

   =====================================================================
*/

    /* System generated locals */
    real r__1;
    /* Local variables */
    static real zero = 0.f, a;
    static integer i;
    static real rbase, b1, b2, c1, c2, d1, d2;
    static real one = 1.f;

    a = *start;
    rbase = one / *base;
    *emin = 1;
    r__1 = a * rbase;
    b1 = slamc3_(&r__1, &zero);
    c1 = c2 = d1 = d2 = a;
    while (c1 == a && c2 == a && d1 == a && d2 == a) {
        --(*emin);
        a = b1;
        r__1 = a / *base;
        b1 = slamc3_(&r__1, &zero);
        r__1 = b1 * *base;
        c1 = slamc3_(&r__1, &zero);
        d1 = zero;
        for (i = 1; i <= *base; ++i) {
            d1 += b1;
        }
        r__1 = a * rbase;
        b2 = slamc3_(&r__1, &zero);
        r__1 = b2 / rbase;
        c2 = slamc3_(&r__1, &zero);
        d2 = zero;
        for (i = 1; i <= *base; ++i) {
            d2 += b2;
        }
    }
} /* slamc4_ */

/* Subroutine */
void slamc5_(integer *beta, integer *p, integer *emin,
             logical *ieee, integer *emax, real *rmax)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    SLAMC5 attempts to compute RMAX, the largest machine floating-point
    number, without overflow.  It assumes that EMAX + abs(EMIN) sum
    approximately to a power of 2.  It will fail on machines where this
    assumption does not hold, for example, the Cyber 205 (EMIN = -28625,
    EMAX = 28718).  It will also fail if the value supplied for EMIN is
    too large (i.e. too close to zero), probably with overflow.

    Arguments
    =========

    BETA    (input) INTEGER
            The base of floating-point arithmetic.

    P       (input) INTEGER
            The number of base BETA digits in the mantissa of a
            floating-point value.

    EMIN    (input) INTEGER
            The minimum exponent before (gradual) underflow.

    IEEE    (input) LOGICAL
            A logical flag specifying whether or not the arithmetic
            system is thought to comply with the IEEE standard.

    EMAX    (output) INTEGER
            The largest exponent before overflow

    RMAX    (output) REAL
            The largest machine floating-point number.

   =====================================================================
*/

    /* Table of constant values */
    static real c_b5 = 0.f;
    /* System generated locals */
    real r__1;
    /* Local variables */
    static integer lexp;
    static real oldy;
    static integer uexp, i;
    static real y, z;
    static integer nbits;
    static real recbas;
    static integer exbits, expsum, try;

/*     First compute LEXP and UEXP, two powers of 2 that bound
       abs(EMIN). We then assume that EMAX + abs(EMIN) will sum
       approximately to the bound that is closest to abs(EMIN).
       (EMAX is the exponent of the required number RMAX).
*/

    lexp = 1;
    exbits = 1;
    while ((try = lexp << 1) <= -(*emin)) {
        lexp = try;
        ++exbits;
    }
    if (lexp == -(*emin)) {
        uexp = lexp;
    } else {
        uexp = try;
        ++exbits;
    }

/*     Now -LEXP is less than or equal to EMIN, and -UEXP is greater
       than or equal to EMIN. EXBITS is the number of bits needed to
       store the exponent.
*/

    if (uexp + *emin > -lexp - *emin) {
        expsum = lexp << 1;
    } else {
        expsum = uexp << 1;
    }

    /* EXPSUM is the exponent range, approximately equal to EMAX - EMIN + 1 . */
    *emax = expsum + *emin - 1;
    nbits = exbits + 1 + *p;

/*     NBITS is the total number of bits needed to store a floating-point number. */

    if (nbits % 2 == 1 && *beta == 2) {
/*        Either there are an odd number of bits used to store a
          floating-point number, which is unlikely, or some bits are
          not used in the representation of numbers, which is possible,
          (e.g. Cray machines) or the mantissa has an implicit bit,
          (e.g. IEEE machines, Dec Vax machines), which is perhaps the
          most likely. We have to assume the last alternative.
          If this is true, then we need to reduce EMAX by one because
          there must be some way of representing zero in an implicit-bit
          system. On machines like Cray, we are reducing EMAX by one
          unnecessarily.
*/
        --(*emax);
    }

    if (*ieee) {
        /* Assume we are on an IEEE machine which reserves one exponent for infinity and NaN. */
        --(*emax);
    }

/*  Now create RMAX, the largest machine number, which should be equal to (1.0 - BETA**(-P)) * BETA**EMAX . */
/*  First compute 1.0 - BETA**(-P), being careful that the result is less than 1.0 . */

    recbas = 1.f / *beta;
    z = *beta - 1.f;
    y = 0.f;
    for (i = 1; i <= *p; ++i) {
        z *= recbas;
        if (y < 1.f) {
            oldy = y;
        }
        y = slamc3_(&y, &z);
    }
    if (y >= 1.f) {
        y = oldy;
    }

    /* Now multiply by BETA**EMAX to get RMAX. */
    for (i = 1; i <= *emax; ++i) {
        r__1 = y * *beta;
        y = slamc3_(&r__1, &c_b5);
    }

    *rmax = y;
} /* slamc5_ */
-------------- next part --------------
#include "f2c.h"
#include "netlib.h"
#include <stdio.h>

#if defined(__INTEL_COMPILER)
# pragma optimize("", off)
#endif

void dlamc1_(integer *beta, integer *t, logical *rnd, logical *ieee1);
void dlamc2_(integer *beta, integer *t, logical *rnd, doublereal *eps,
             integer *emin, doublereal *rmin, integer *emax, doublereal *rmax);
doublereal dlamc3_(doublereal *a, doublereal *b);
void dlamc4_(integer *emin, doublereal *start, integer *base);
void dlamc5_(integer *beta, integer *p, integer *emin, logical *ieee, integer *emax, doublereal *rmax);
doublereal dlamc33_(doublereal *a, doublereal *b);

doublereal dlamch_(const char *cmach)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    DLAMCH determines double precision machine parameters.

    Arguments
    =========

    CMACH   (input) CHARACTER*1
            Specifies the value to be returned by DLAMCH:
            = 'E' or 'e',   DLAMCH := eps
            = 'S' or 's ,   DLAMCH := sfmin
            = 'B' or 'b',   DLAMCH := base
            = 'P' or 'p',   DLAMCH := eps*base
            = 'N' or 'n',   DLAMCH := t
            = 'R' or 'r',   DLAMCH := rnd
            = 'M' or 'm',   DLAMCH := emin
            = 'U' or 'u',   DLAMCH := rmin
            = 'L' or 'l',   DLAMCH := emax
            = 'O' or 'o',   DLAMCH := rmax

            where

            eps   = relative machine precision
            sfmin = safe minimum, such that 1/sfmin does not overflow
            base  = base of the machine
            prec  = eps*base
            t     = number of (base) digits in the mantissa
            rnd   = 1.0 when rounding occurs in addition, 0.0 otherwise
            emin  = minimum exponent before (gradual) underflow
            rmin  = underflow threshold - base**(emin-1)
            emax  = largest exponent before overflow
            rmax  = overflow threshold  - (base**emax)*(1-eps)

   =====================================================================
*/
    /* Initialized data */
    static logical first = TRUE_;
    /* System generated locals */
    integer i__1;
    /* Local variables */
    static doublereal base;
    static integer beta;
    static doublereal emin, prec, emax;
    static integer imin, imax;
    static logical lrnd;
    static doublereal rmin, rmax, t, rmach;
    static doublereal small, sfmin;
    static integer it;
    static doublereal rnd, eps;

    if (first) {
        first = FALSE_;
        dlamc2_(&beta, &it, &lrnd, &eps, &imin, &rmin, &imax, &rmax);
        base = (doublereal) beta;
        t = (doublereal) it;
        if (lrnd) {
            rnd = 1.;
            i__1 = 1 - it;
            eps = pow_di(&base, &i__1) / 2;
        } else {
            rnd = 0.;
            i__1 = 1 - it;
            eps = pow_di(&base, &i__1);
        }
        prec = eps * base;
        emin = (doublereal) imin;
        emax = (doublereal) imax;
        sfmin = rmin;
        small = 1. / rmax;
        if (small >= sfmin) {

/*           Use SMALL plus a bit, to avoid the possibility of rounding
             causing overflow when computing  1/sfmin.
*/

            sfmin = small * (eps + 1.);
        }
    }

    if (lsame_(cmach, "E")) {
        rmach = eps;
    } else if (lsame_(cmach, "S")) {
        rmach = sfmin;
    } else if (lsame_(cmach, "B")) {
        rmach = base;
    } else if (lsame_(cmach, "P")) {
        rmach = prec;
    } else if (lsame_(cmach, "N")) {
        rmach = t;
    } else if (lsame_(cmach, "R")) {
        rmach = rnd;
    } else if (lsame_(cmach, "M")) {
        rmach = emin;
    } else if (lsame_(cmach, "U")) {
        rmach = rmin;
    } else if (lsame_(cmach, "L")) {
        rmach = emax;
    } else if (lsame_(cmach, "O")) {
        rmach = rmax;
    }

    return rmach;

} /* dlamch_ */

/* Subroutine */
void dlamc1_(integer *beta, integer *t, logical *rnd, logical *ieee1)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    DLAMC1 determines the machine parameters given by BETA, T, RND, and
    IEEE1.

    Arguments
    =========

    BETA    (output) INTEGER
            The base of the machine.

    T       (output) INTEGER
            The number of ( BETA ) digits in the mantissa.

    RND     (output) LOGICAL
            Specifies whether proper rounding  ( RND = .TRUE. )  or
            chopping  ( RND = .FALSE. )  occurs in addition. This may not

            be a reliable guide to the way in which the machine performs
            its arithmetic.

    IEEE1   (output) LOGICAL
            Specifies whether rounding appears to be done in the IEEE
            'round to nearest' style.

    Further Details
    ===============

    The routine is based on the routine  ENVRON  by Malcolm and
    incorporates suggestions by Gentleman and Marovich. See

       Malcolm M. A. (1972) Algorithms to reveal properties of
          floating-point arithmetic. Comms. of the ACM, 15, 949-951.

       Gentleman W. M. and Marovich S. B. (1974) More on algorithms
          that reveal properties of floating point arithmetic units.
          Comms. of the ACM, 17, 276-277.

   =====================================================================
*/
    /* Initialized data */
    static logical first = TRUE_;
    /* System generated locals */
    doublereal d__1, d__2;
    /* Local variables */
    static logical lrnd;
    static doublereal a, b, c, f;
    static integer lbeta;
    static doublereal savec;
    static logical lieee1;
    static doublereal t1, t2;
    static integer lt;
    static doublereal one = 1., qtr;

    if (first) {
        first = FALSE_;

/*        LBETA,  LIEEE1,  LT and  LRND  are the  local values  of  BETA,
          IEEE1, T and RND.

          Throughout this routine  we use the function  DLAMC3  to ensure
          that relevant values are  stored and not held in registers, or
          are not affected by optimizers.

          Compute  a = 2.0**m  with the  smallest positive integer m such
          that
                fl( a + 1.0 ) = a.
*/

        a = c = one;

        while (c == one) {
            a *= 2;
            c = dlamc3_(&a, &one);
            c = dlamc33_(&c, &a);
        }

/*        Now compute  b = 2.0**m  with the smallest positive integer m
          such that
                    fl( a + b ) .gt. a.
*/

        b = one;

        do {
            b *= 2;
            c = dlamc3_(&a, &b);
        } while (c == a);

/*        Now compute the base.  a and c  are neighbouring floating point
          numbers  in the  interval  ( beta**t, beta**( t + 1 ) )  and so
          their difference is beta. Adding 0.25 to c is to ensure that it
          is truncated to beta and not ( beta - 1 ).
*/

        qtr = one / 4;
        savec = c;
        c = dlamc33_(&c, &a);
        lbeta = (integer) (c + qtr);

/*        Now determine whether rounding or chopping occurs,  by adding a
          bit  less  than  beta/2  and a  bit  more  than  beta/2  to a.
*/

        b = (doublereal) lbeta;
        d__1 = b / 2;
        d__2 = -b / 100;
        f = dlamc3_(&d__1, &d__2);
        c = dlamc3_(&f, &a);
        if (c == a) {
            lrnd = TRUE_;
        } else {
            lrnd = FALSE_;
        }
        d__1 = b / 2;
        d__2 = b / 100;
        f = dlamc3_(&d__1, &d__2);
        c = dlamc3_(&f, &a);
        if (lrnd && c == a) {
            lrnd = FALSE_;
        }

/*        Try and decide whether rounding is done in the  IEEE  'round to
          nearest' style. B/2 is half a unit in the last place of the two
          numbers A and SAVEC. Furthermore, A is even, i.e. has last bit
          zero, and SAVEC is odd. Thus adding B/2 to A should not  change
          A, but adding B/2 to SAVEC should change SAVEC.
*/

        d__1 = b / 2;
        t1 = dlamc3_(&d__1, &a);
        d__1 = b / 2;
        t2 = dlamc3_(&d__1, &savec);
        lieee1 = t1 == a && t2 > savec && lrnd;

/*        Now find  the  mantissa, t.  It should  be the  integer part of
          log to the base beta of a,  however it is safer to determine t
          by powering.  So we find t as the smallest positive integer for
          which

             fl( beta**t + 1.0 ) = 1.0.
*/

        lt = 0;
        a = c = one;

        while (c == one) {
            ++lt;
            a *= lbeta;
            c = dlamc3_(&a, &one);
            c = dlamc33_(&c, &a);
        }
    }

    *beta = lbeta;
    *t = lt;
    *rnd = lrnd;
    *ieee1 = lieee1;
} /* dlamc1_ */

/* Subroutine */
void dlamc2_(integer *beta, integer *t, logical *rnd,
             doublereal *eps, integer *emin, doublereal *rmin,
             integer *emax, doublereal *rmax)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    DLAMC2 determines the machine parameters specified in its argument
    list.

    Arguments
    =========

    BETA    (output) INTEGER
            The base of the machine.

    T       (output) INTEGER
            The number of ( BETA ) digits in the mantissa.

    RND     (output) LOGICAL
            Specifies whether proper rounding  ( RND = .TRUE. )  or
            chopping  ( RND = .FALSE. )  occurs in addition. This may not

            be a reliable guide to the way in which the machine performs

            its arithmetic.

    EPS     (output) DOUBLE PRECISION
            The smallest positive number such that

               fl( 1.0 - EPS ) .LT. 1.0,

            where fl denotes the computed value.

    EMIN    (output) INTEGER
            The minimum exponent before (gradual) underflow occurs.

    RMIN    (output) DOUBLE PRECISION
            The smallest normalized number for the machine, given by
            BASE**( EMIN - 1 ), where  BASE  is the floating point value

            of BETA.

    EMAX    (output) INTEGER
            The maximum exponent before overflow occurs.

    RMAX    (output) DOUBLE PRECISION
            The largest positive number for the machine, given by
            BASE**EMAX * ( 1 - EPS ), where  BASE  is the floating point

            value of BETA.

    Further Details
    ===============

    The computation of  EPS  is based on a routine PARANOIA by
    W. Kahan of the University of California at Berkeley.

   =====================================================================
*/

    /* Initialized data */
    static logical first = TRUE_;
    static logical iwarn = FALSE_;
    /* System generated locals */
    integer i__1;
    doublereal d__1, d__2;
    /* Local variables */
    static logical ieee;
    static doublereal half = 0.5;
    static logical lrnd;
    static doublereal leps, zero = 0., a, b, c;
    static integer i, lbeta;
    static doublereal rbase;
    static integer lemin, lemax, gnmin;
    static doublereal small;
    static integer gpmin;
    static doublereal third, lrmin, lrmax, sixth;
    static logical lieee1;
    static integer lt, ngnmin, ngpmin;
    static doublereal one = 1.;

    if (first) {
        first = FALSE_;

/*        LBETA, LT, LRND, LEPS, LEMIN and LRMIN  are the local values of
          BETA, T, RND, EPS, EMIN and RMIN.

          Throughout this routine  we use the function  DLAMC3  to ensure
          that relevant values are stored  and not held in registers, or
          are not affected by optimizers.

          DLAMC1 returns the parameters  LBETA, LT, LRND and LIEEE1.
*/

        dlamc1_(&lbeta, &lt, &lrnd, &lieee1);

/*        Start to find EPS. */

        b = (doublereal) lbeta;
        i__1 = -lt;
        a = pow_di(&b, &i__1);
        leps = a;

/*        Try some tricks to see whether or not this is the correct  EPS. */

        b = 2. / 3;
        sixth = dlamc33_(&b, &half);
        third = dlamc3_(&sixth, &sixth);
        b = dlamc33_(&third, &half);
        b = dlamc3_(&b, &sixth);
        b = abs(b);
        if (b < leps) {
            b = leps;
        }

        leps = one;

        while (leps > b && b > zero) {
            leps = b;
            d__1 = half * leps;
            d__2 = 32. * (leps * leps);
            c = dlamc3_(&d__1, &d__2);
            c = dlamc33_(&half, &c);
            b = dlamc3_(&half, &c);
            c = dlamc33_(&half, &b);
            b = dlamc3_(&half, &c);
        }

        if (a < leps) {
            leps = a;
        }

/*        Computation of EPS complete.

          Now find  EMIN.  Let A = + or - 1, and + or - (1 + BASE**(-3)).
          Keep dividing  A by BETA until (gradual) underflow occurs. This
          is detected when we cannot recover the previous A.
*/

        rbase = one / lbeta;
        small = one;
        for (i = 1; i <= 3; ++i) {
            d__1 = small * rbase;
            small = dlamc3_(&d__1, &zero);
        }
        a = dlamc3_(&one, &small);
        dlamc4_(&ngpmin, &one, &lbeta);
        d__1 = -one;
        dlamc4_(&ngnmin, &d__1, &lbeta);
        dlamc4_(&gpmin, &a, &lbeta);
        d__1 = -a;
        dlamc4_(&gnmin, &d__1, &lbeta);
        ieee = FALSE_;

        if (ngpmin == ngnmin && gpmin == gnmin) {
            if (ngpmin == gpmin) {
                lemin = ngpmin;
/*            ( Non twos-complement machines, no gradual underflow; e.g.,  VAX ) */
            } else if (gpmin - ngpmin == 3) {
                lemin = ngpmin - 1 + lt;
                ieee = TRUE_;
/*            ( Non twos-complement machines, with gradual underflow; e.g., IEEE standard followers ) */
            } else {
                lemin = min(ngpmin,gpmin);
/*            ( A guess; no known machine ) */
                iwarn = TRUE_;
            }

        } else if (ngpmin == gpmin && ngnmin == gnmin) {
            if (abs(ngpmin - ngnmin) == 1) {
                lemin = max(ngpmin,ngnmin);
/*            ( Twos-complement machines, no gradual underflow; e.g., CYBER 205 ) */
            } else {
                lemin = min(ngpmin,ngnmin);
/*            ( A guess; no known machine ) */
                iwarn = TRUE_;
            }
        } else if (abs(ngpmin - ngnmin) == 1 && gpmin == gnmin) {
            if (gpmin - min(ngpmin,ngnmin) == 3) {
                lemin = max(ngpmin,ngnmin) - 1 + lt;
/*            ( Twos-complement machines with gradual underflow; no known machine ) */
            } else {
                lemin = min(ngpmin,ngnmin);
/*            ( A guess; no known machine ) */
                iwarn = TRUE_;
            }
        } else {
            lemin = min(min(min(ngpmin,ngnmin),gpmin),gnmin);
/*         ( A guess; no known machine ) */
            iwarn = TRUE_;
        }
/* ** Comment out this if block if EMIN is ok */
        if (iwarn) {
            first = TRUE_;
            printf("\n\n WARNING. The value EMIN may be incorrect: - ");
            printf("EMIN = %8i\n",lemin);
            printf("If, after inspection, the value EMIN looks acceptable ");
            printf("please comment out\n the IF block as marked within the ");
            printf("code of routine DLAMC2,\n otherwise supply EMIN ");
            printf("explicitly.\n");
        }
/* **     Assume IEEE arithmetic if we found denormalised  numbers above,
          or if arithmetic seems to round in the  IEEE style,  determined
          in routine DLAMC1. A true IEEE machine should have both  things
          true; however, faulty machines may have one or the other.
*/

        ieee = ieee || lieee1;

/*        Compute  RMIN by successive division by  BETA. We could compute
          RMIN as BASE**( EMIN - 1 ),  but some machines underflow during
          this computation.
*/

        lrmin = one;
        for (i = 1; i <= 1-lemin; ++i) {
            d__1 = lrmin * rbase;
            lrmin = dlamc3_(&d__1, &zero);
        }

/*        Finally, call DLAMC5 to compute EMAX and RMAX. */

        dlamc5_(&lbeta, &lt, &lemin, &ieee, &lemax, &lrmax);
    }

    *beta = lbeta;
    *t = lt;
    *rnd = lrnd;
    *eps = leps;
    *emin = lemin;
    *rmin = lrmin;
    *emax = lemax;
    *rmax = lrmax;
} /* dlamc2_ */

doublereal dlamc3_(doublereal *a, doublereal *b)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    DLAMC3  is intended to force  A  and  B  to be stored prior to doing
    the addition of  A  and  B ,  for use in situations where optimizers
    might hold one of these in a register.

    Arguments
    =========

    A, B    (input) DOUBLE PRECISION
            The values A and B.

   =====================================================================
*/

    return *a + *b;
} /* dlamc3_ */

doublereal dlamc33_(doublereal *a, doublereal *b)
{
/*  Purpose
    =======

    As DLAMC3, but subtract A and B instead of adding them.

    Arguments
    =========

    A, B    (input) DOUBLE PRECISION
            The values A and B.

   =====================================================================
*/
    return *a - *b;

} /* dlamc33_ */

/* Subroutine */
void dlamc4_(integer *emin, doublereal *start, integer *base)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    DLAMC4 is a service routine for DLAMC2.

    Arguments
    =========

    EMIN    (output) EMIN
            The minimum exponent before (gradual) underflow, computed by
            setting A = START and dividing by BASE until the previous A
            can not be recovered.

    START   (input) DOUBLE PRECISION
            The starting point for determining EMIN.

    BASE    (input) INTEGER
            The base of the machine.

   =====================================================================
*/
    /* System generated locals */
    doublereal d__1;
    /* Local variables */
    static doublereal zero = 0., a;
    static integer i;
    static doublereal rbase, b1, b2, c1, c2, d1, d2;
    static doublereal one = 1.;

    a = *start;
    rbase = one / *base;
    *emin = 1;
    d__1 = a * rbase;
    b1 = dlamc3_(&d__1, &zero);
    c1 = c2 = d1 = d2 = a;
    while (c1 == a && c2 == a && d1 == a && d2 == a) {
        --(*emin);
        a = b1;
        d__1 = a / *base;
        b1 = dlamc3_(&d__1, &zero);
        d__1 = b1 * *base;
        c1 = dlamc3_(&d__1, &zero);
        d1 = zero;
        for (i = 1; i <= *base; ++i) {
            d1 += b1;
        }
        d__1 = a * rbase;
        b2 = dlamc3_(&d__1, &zero);
        d__1 = b2 / rbase;
        c2 = dlamc3_(&d__1, &zero);
        d2 = zero;
        for (i = 1; i <= *base; ++i) {
            d2 += b2;
        }
    }
} /* dlamc4_ */

/* Subroutine */
void dlamc5_(integer *beta, integer *p, integer *emin,
             logical *ieee, integer *emax, doublereal *rmax)
{
/*  -- LAPACK auxiliary routine (version 2.0) --
       Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
       Courant Institute, Argonne National Lab, and Rice University
       October 31, 1992

    Purpose
    =======

    DLAMC5 attempts to compute RMAX, the largest machine floating-point
    number, without overflow.  It assumes that EMAX + abs(EMIN) sum
    approximately to a power of 2.  It will fail on machines where this
    assumption does not hold, for example, the Cyber 205 (EMIN = -28625,
    EMAX = 28718).  It will also fail if the value supplied for EMIN is
    too large (i.e. too close to zero), probably with overflow.

    Arguments
    =========

    BETA    (input) INTEGER
            The base of floating-point arithmetic.

    P       (input) INTEGER
            The number of base BETA digits in the mantissa of a
            floating-point value.

    EMIN    (input) INTEGER
            The minimum exponent before (gradual) underflow.

    IEEE    (input) LOGICAL
            A logical flag specifying whether or not the arithmetic
            system is thought to comply with the IEEE standard.

    EMAX    (output) INTEGER
            The largest exponent before overflow

    RMAX    (output) DOUBLE PRECISION
            The largest machine floating-point number.

   =====================================================================
*/

    /* Table of constant values */
    static doublereal c_b5 = 0.;
    /* System generated locals */
    doublereal d__1;
    /* Local variables */
    static integer lexp;
    static doublereal oldy;
    static integer uexp, i;
    static doublereal y, z;
    static integer nbits;
    static doublereal recbas;
    static integer exbits, expsum, try;

/*     First compute LEXP and UEXP, two powers of 2 that bound
       abs(EMIN). We then assume that EMAX + abs(EMIN) will sum
       approximately to the bound that is closest to abs(EMIN).
       (EMAX is the exponent of the required number RMAX).
*/

    lexp = 1;
    exbits = 1;
    while ((try = lexp << 1) <= -(*emin)) {
        lexp = try;
        ++exbits;
    }
    if (lexp == -(*emin)) {
        uexp = lexp;
    } else {
        uexp = try;
        ++exbits;
    }

/*     Now -LEXP is less than or equal to EMIN, and -UEXP is greater
       than or equal to EMIN. EXBITS is the number of bits needed to
       store the exponent.
*/

    if (uexp + *emin > -lexp - *emin) {
        expsum = lexp << 1;
    } else {
        expsum = uexp << 1;
    }

/*     EXPSUM is the exponent range, approximately equal to EMAX - EMIN + 1 . */

    *emax = expsum + *emin - 1;
    nbits = exbits + 1 + *p;

/*     NBITS is the total number of bits needed to store a floating-point number. */

    if (nbits % 2 == 1 && *beta == 2) {

/*        Either there are an odd number of bits used to store a
          floating-point number, which is unlikely, or some bits are
          not used in the representation of numbers, which is possible,
          (e.g. Cray machines) or the mantissa has an implicit bit,
          (e.g. IEEE machines, Dec Vax machines), which is perhaps the
          most likely. We have to assume the last alternative.
          If this is true, then we need to reduce EMAX by one because
          there must be some way of representing zero in an implicit-bit
          system. On machines like Cray, we are reducing EMAX by one
          unnecessarily.
*/

        --(*emax);
    }

    if (*ieee) {

/*        Assume we are on an IEEE machine which reserves one exponent for infinity and NaN.  */

        --(*emax);
    }

/*     Now create RMAX, the largest machine number, which should
       be equal to (1.0 - BETA**(-P)) * BETA**EMAX .

       First compute 1.0 - BETA**(-P), being careful that the result is less than 1.0 . */

    recbas = 1. / *beta;
    z = *beta - 1.;
    y = 0.;
    for (i = 1; i <= *p; ++i) {
        z *= recbas;
        if (y < 1.) {
            oldy = y;
        }
        y = dlamc3_(&y, &z);
    }
    if (y >= 1.) {
        y = oldy;
    }

/*     Now multiply by BETA**EMAX to get RMAX. */

    for (i = 1; i <= *emax; ++i) {
        d__1 = y * *beta;
        y = dlamc3_(&d__1, &c_b5);
    }

    *rmax = y;
} /* dlamc5_ */