ITK/Mutual Information

When you consider the pixel values of
  images A and B to be random variables,
  "a" and "b"; and estimate the entropy of
  their distributions you get
   H(A) = - Sum( p(a)* log2( p(a) ) )
   H(B) = - Sum( p(b)* log2( p(b) ) )
  Note that log2 is the logarithm in base
  two, not the natural logarithm that
  unfortunately is commonly used.
  When you use log2(), the units of entropy
  are "bits". It is again unfortunate that
  the usage of "bit" as a unit of information
  measurement has been distorted in order
  to become a symbol for binary encoding,
  or a unit of measurement for the raw capacity
  of memory storage (along with its derivatives
  the byte, KiloByte, MegaByte...).
  A digital image whose pixels are encoded
  with M bits can have 2^M different grayscale
  values in a pixel, and therefore its entropy can go
  up to the maximum theoretical value of log2(2^M),
  which, not coincidentally, is equal to M.
  In other words, if you compute the entropy of
  an image with PixelType unsigned char, whose
  pixels have grayscale values following a uniform
  distribution, the maximum value that you can get
  is "8", and if you want to be formal, you should
  mention the units and say:
  The Entropy of this image is:    "8 bits"
  In practice, of course, you get lower values.
  For example, the Entropy of the well-known Lena
  image (the cropped version that is politically
  correct) is
         Lena Entropy  = 7.44 bits
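
  As a minimal illustration (plain C++, not ITK code; the
  histogram is assumed to be already filled, one bin per
  grayscale value of an 8-bit image), the entropy in bits
  can be computed from the bin counts like this:

      #include <cmath>
      #include <cstddef>

      // Illustrative helper, not part of ITK.
      // Entropy, in bits, of a pixel value distribution given as raw
      // bin counts (for an 8-bit image: 256 bins, one per gray level).
      double EntropyInBits( const unsigned long * counts, std::size_t nbins )
      {
        double total = 0.0;
        for( std::size_t i = 0; i < nbins; ++i ) { total += counts[i]; }

        double entropy = 0.0;
        for( std::size_t i = 0; i < nbins; ++i )
          {
          if( counts[i] == 0 ) { continue; }               // 0 * log2(0) taken as 0
          const double p = counts[i] / total;
          entropy -= p * std::log( p ) / std::log( 2.0 );  // log2(p)
          }
        return entropy;    // in [0, log2(nbins)], i.e. at most M bits
      }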


   Now if you consider the mutual information measure,
   you have the following situation:


      Mutual Information = H(A) + H(B) - H(A,B)
             MI(A,B)     = H(A) + H(B) - H(A,B)
   In general, both H(A) and H(B) are bounded to the
   range [0:M], where "M" is the number of bits used for
   encoding their pixels.
      H(A,B) is in theory bounded to the range [0:2M].
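
   As a sketch of how the three terms fit together (plain C++,
   not the ITK metric; the joint histogram of the two images is
   assumed to be already filled), MI in bits can be computed as:

      #include <cmath>
      #include <cstddef>
      #include <vector>

      // Illustrative helper, not part of ITK.
      // joint[a][b] = number of pixel pairs with value a in A and b in B.
      double MutualInformationInBits(
                  const std::vector< std::vector< unsigned long > > & joint )
      {
        const std::size_t n = joint.size();

        double total = 0.0;
        for( std::size_t a = 0; a < n; ++a )
          for( std::size_t b = 0; b < n; ++b )
            total += joint[a][b];

        std::vector< double > pa( n, 0.0 ), pb( n, 0.0 );
        double hab = 0.0;                                   // joint entropy H(A,B)
        for( std::size_t a = 0; a < n; ++a )
          for( std::size_t b = 0; b < n; ++b )
            {
            const double p = joint[a][b] / total;
            pa[a] += p;                                     // marginal p(a)
            pb[b] += p;                                     // marginal p(b)
            if( p > 0.0 ) { hab -= p * std::log( p ) / std::log( 2.0 ); }
            }

        double ha = 0.0, hb = 0.0;                          // marginal entropies
        for( std::size_t i = 0; i < n; ++i )
          {
          if( pa[i] > 0.0 ) { ha -= pa[i] * std::log( pa[i] ) / std::log( 2.0 ); }
          if( pb[i] > 0.0 ) { hb -= pb[i] * std::log( pb[i] ) / std::log( 2.0 ); }
          }

        return ha + hb - hab;        // MI(A,B) = H(A) + H(B) - H(A,B), in bits
      }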


   Note that if you use histograms with *fewer* bins
   than the actual digital encoding of the image,
   then your estimate of the Entropy is bounded by
   log2 of the number of bins in your histogram.
   For example, if you use a histogram with 20 bins
   instead of 256 in order to estimate Lena's Entropy,
   you will not get 7.44 bits, but only
                  3.99 bits
   (the upper bound with 20 bins is log2(20) = 4.32 bits).
   This reflects the fact that by quantizing the
   gray scale values into larger bins you lose information
   from the original image.
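
   A small illustration of that quantization step (plain C++,
   hypothetical helper, not part of ITK): merging a 256-bin
   histogram into a coarser one before estimating the entropy.

      #include <cstddef>
      #include <vector>

      // Illustrative helper: collapse a fine histogram (e.g. 256 bins)
      // into 'nbins' equal-width bins; the entropy of the result is
      // bounded by log2(nbins).
      std::vector< unsigned long > Rebin(
                  const std::vector< unsigned long > & fine, std::size_t nbins )
      {
        std::vector< unsigned long > coarse( nbins, 0 );
        for( std::size_t i = 0; i < fine.size(); ++i )
          {
          const std::size_t bin = ( i * nbins ) / fine.size();  // 0 .. nbins-1
          coarse[ bin ] += fine[ i ];
          }
        return coarse;
      }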


   For the particular case of Self-similarity that you
   are considering, the entropies H(A) and H(B) are
   expected to be pretty much the same. Their difference
   arises only from interpolation errors and from the
   eventual effect of one image having corners outside
   of the extent of the other (e.g. if the image is rotated).
   So, in general Mutual Information will give you
          MI(A,T(A)) = H(A) + H(T(A)) - H(A,T(A))
    where T(A) is the transformed version of A, e.g. under
    a translation, a rotation, or an affine transform.
    If T = Identity and the interpolator is not approximating,
    your measure of Mutual Information becomes
            MI(A,A) = 2 H(A) - H(A,A)
    and the joint entropy H(A,A) happens to be equal to
    the entropy of the single image H(A); therefore the
    expected value of Mutual Information is equal to the
    image Entropy (of course measured in bits).
                   MI(A,A) = H(A) bits
    That means that if you evaluate the Mutual Information
    measure between Lena and itself, you should get
                     7.44 bits


    Note that the reason why the measure of Mutual Information
    is reported as a negative number in ITK is that traditionally
    it has been used as a cost function for minimization.
    However, in principle, Mutual Information should be
    reported in the range [0, H(A)], where zero corresponds
    to two totally uncorrelated images and H(A) corresponds
    to perfectly correlated images, in which case H(A) = H(B).


    To summarize, note that ITK is not using log2() but
    just log(), and note that the measure is reported as
    a negative number.
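
    Under that convention (natural logarithms, sign flipped so the
    optimizer can minimize), recovering the positive, bit-valued
    quantity discussed above is just a change of sign and base;
    a minimal sketch:

      #include <cmath>

      // Illustrative helper: metricValue is the (negative) value
      // reported by the metric, in nats.
      double MutualInformationToBits( double metricValue )
      {
        return -metricValue / std::log( 2.0 );   // flip sign, nats -> bits
      }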


   We just added a simple example for computing the Entropy
   of the pixel value distribution of an image to the
   directory:
      Insight/Examples/Statistics/
                         ImageEntropy1.cxx
   You may find it interesting to play with this example.
   E.g. you should try the effect of changing the number
   of bins in the histogram.
   Note that you will have to update your CVS checkout
   in order to get this new file.
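
   The core of that example is roughly the pipeline sketched below
   (a paraphrase, not the file itself; check the class names and
   methods against ImageEntropy1.cxx in your checkout):

      // Sketch along the lines of ImageEntropy1.cxx; verify the API
      // against your ITK version before relying on it.
      #include "itkImage.h"
      #include "itkImageFileReader.h"
      #include "itkScalarImageToHistogramGenerator.h"
      #include <cmath>
      #include <iostream>

      int main( int argc, char * argv[] )
      {
        if( argc < 2 )
          {
          std::cerr << "Usage: ImageEntropy1 inputImage" << std::endl;
          return 1;
          }

        typedef unsigned char                     PixelType;
        typedef itk::Image< PixelType, 2 >        ImageType;
        typedef itk::ImageFileReader< ImageType > ReaderType;

        ReaderType::Pointer reader = ReaderType::New();
        reader->SetFileName( argv[1] );
        reader->Update();

        typedef itk::Statistics::ScalarImageToHistogramGenerator< ImageType >
                                                  GeneratorType;
        GeneratorType::Pointer generator = GeneratorType::New();
        generator->SetInput( reader->GetOutput() );
        generator->SetNumberOfBins( 256 );        // try 20 here and compare
        generator->SetMarginalScale( 10.0 );
        generator->Compute();

        typedef GeneratorType::HistogramType HistogramType;
        const HistogramType * histogram = generator->GetOutput();

        const double sum = histogram->GetTotalFrequency();
        double entropy = 0.0;
        HistogramType::ConstIterator itr = histogram->Begin();
        while( itr != histogram->End() )
          {
          const double p = itr.GetFrequency() / sum;
          if( p > 0.0 )
            {
            entropy -= p * std::log( p ) / std::log( 2.0 );  // bits
            }
          ++itr;
          }

        std::cout << "Image entropy = " << entropy << " bits" << std::endl;
        return 0;
      }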