[Insight-developers] Classical statistics

Miller, James V (Research) millerjv@crd.ge.com
Tue, 22 Oct 2002 09:55:58 -0400


Is anyone working on adding what I will call "classical statistics" to ITK?
 
The types of things that I am looking for are classical interval estimation and hypothesis testing.  I use
these types of tests in region merging/region splitting segmentation techniques.  There are a lot of
"ad-hoc" statistics being used inside of ITK, and our current algorithms would benefit from having
classical statistics available.  The types of hypothesis tests that I am looking for are

*	Does a sample mean equal a specified population mean?
*	Does a sample variance equal a specified population variance?
*	Does a (sample mean, sample variance) equal a specified (population mean, population
variance)?
*	Are two sample means equal?
*	Are two sample variances equal?
*	Are two (sample mean, sample variance) pairs equal?
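
As a rough sketch of the statistics behind the single-parameter questions above (not ITK code, and
not necessarily the formulation anyone on the list has in mind), the following uses only the Python
standard library.  The function names are made up for illustration, the two-sample t statistic
assumes equal population variances (pooled form), and the joint (mean, variance) tests are not
covered.

```python
import math
from statistics import mean, variance  # variance() is the unbiased sample variance


def one_sample_t(sample, mu0):
    """t statistic for H0: population mean == mu0 (df = n - 1)."""
    n = len(sample)
    return (mean(sample) - mu0) / math.sqrt(variance(sample) / n)


def one_sample_chi2(sample, sigma0_sq):
    """Chi-squared statistic for H0: population variance == sigma0_sq (df = n - 1)."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_sq


def two_sample_t(a, b):
    """Pooled-variance t statistic for H0: equal means (df = n_a + n_b - 2)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def two_sample_f(a, b):
    """F statistic for H0: equal variances (df = n_a - 1, n_b - 1)."""
    return variance(a) / variance(b)
```

Each statistic would then be compared against a critical value for the stated degrees of freedom.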

I am interested in the Student-t test, F-test, Chi-squared test, and Hotelling T^2 test.  The issue
with these tests is that you usually need a table of values.  For instance, for an F-test there is a
series of 2D tables.  There is a table for each "confidence" value (99%, 98%, 95%, 90%), and each 2D
table has the number of degrees of freedom in variable 1 down the rows and the number of degrees of
freedom in variable 2 across the columns.  The tables are usually sparse, and once the degrees of
freedom get large enough (>1000?) there are polynomial approximations.  I have built these tables in
the past and they usually work out to be about 25KB per 2D table.  I usually just build the F-tables
since you can use them for the F-test, Student-t (the square of a Student-t is an F statistic with one
of the degrees of freedom being 1), and Hotelling T^2 (which, after scaling, is an F statistic with
different degrees of freedom).
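
To illustrate how a single F table could serve all three tests, here is a minimal sketch.  The
table holds only four commonly tabulated upper 5% critical values, the names are hypothetical, and
the way sparseness is handled (falling back to the largest tabulated df2 not exceeding the
requested one, which gives a larger critical value in this range) is an assumption, not something
specified above.  The Hotelling conversion shown is the standard one-sample form.

```python
import bisect
import math

# Upper 5% critical values of F, indexed as table[df1][df2].
# Only a few entries are shown; a real table would be far larger.
F_TABLE_95 = {
    1: {10: 4.965, 20: 4.351},
    2: {10: 4.103, 20: 3.493},
}


def f_critical(df1, df2, table=F_TABLE_95):
    """Look up a critical value; df2 falls back to the nearest smaller tabulated entry."""
    row = table[df1]  # raises KeyError if df1 is not tabulated
    cols = sorted(row)
    i = bisect.bisect_right(cols, df2) - 1
    if i < 0:
        raise KeyError(f"df2={df2} is below the smallest tabulated value")
    return row[cols[i]]


def t_critical_two_sided(df, table=F_TABLE_95):
    """Two-sided Student-t critical value at the same level, using t^2 ~ F(1, df)."""
    return math.sqrt(f_critical(1, df, table))


def hotelling_to_f(t2, n, p):
    """Scale a one-sample Hotelling T^2 (n observations, p variables) to an F statistic.

    Returns the scaled statistic and its degrees of freedom (p, n - p).
    """
    return (n - p) / (p * (n - 1)) * t2, (p, n - p)
```

For example, t_critical_two_sided(10) recovers the familiar two-sided 5% Student-t value of about
2.228 from the F(1, 10) entry.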
 
On the technical side, where would we put these tables?  The appropriate table would need to be
loaded from disk when a statistical test was first used.
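
One possible shape for the load-on-first-use behavior is a cached loader, sketched below.  The
function name and the on-disk layout (CSV rows of df1, df2, critical value) are purely
hypothetical; where the files would live inside ITK is exactly the open question.

```python
import csv
from functools import lru_cache


@lru_cache(maxsize=None)
def load_f_table(path):
    """Read one 2D F table from disk; the cache ensures each file is read only once.

    Assumed (hypothetical) file layout: one "df1,df2,critical_value" row per entry.
    """
    table = {}
    with open(path, newline="") as fh:
        for df1, df2, value in csv.reader(fh):
            table.setdefault(int(df1), {})[int(df2)] = float(value)
    return table
```

The first test that needs a given table pays the disk cost; later calls with the same path get the
already-parsed table back.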
 
 
Here is a nice web site on the techniques...
 
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35.htm

and here is a snippet from the site

*	Location

1.	Measures of Location <http://www.itl.nist.gov/div898/handbook/eda/section3/eda351.htm>
2.	Confidence Limits for the Mean and One Sample t-Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm>
3.	Two Sample t-Test for Equal Means <http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm>
4.	One Factor Analysis of Variance <http://www.itl.nist.gov/div898/handbook/eda/section3/eda354.htm>
5.	Multi-Factor Analysis of Variance <http://www.itl.nist.gov/div898/handbook/eda/section3/eda355.htm>

*	Scale (or variability or spread)

1.	Measures of Scale <http://www.itl.nist.gov/div898/handbook/eda/section3/eda356.htm>
2.	Bartlett's Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda357.htm>
3.	Chi-Square Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda358.htm>
4.	F-Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda359.htm>
5.	Levene Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35a.htm>

*	Skewness and Kurtosis

1.	Measures of Skewness and Kurtosis <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm>

*	Randomness

1.	Autocorrelation <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35c.htm>
2.	Runs Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35d.htm>

*	Distributional Measures

1.	Anderson-Darling Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35e.htm>
2.	Chi-Square Goodness-of-Fit Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm>
3.	Kolmogorov-Smirnov Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm>

*	Outliers

1.	Grubbs Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm>

*	2-Level Factorial Designs

1.	Yates Analysis <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35i.htm>

Jim Miller 
_____________________________________
Visualization & Computer Vision
GE Research
Bldg. KW, Room C218B
P.O. Box 8, Schenectady NY 12301

millerjv@research.ge.com

james.miller@research.ge.com
(518) 387-4005, Dial Comm: 8*833-4005, 
Cell: (518) 505-7065, Fax: (518) 387-6981 
