[Insight-developers] Classical statistics
Miller, James V (Research)
millerjv@crd.ge.com
Tue, 22 Oct 2002 09:55:58 -0400
Is anyone working on adding what I will call "classical statistics" to ITK?
The types of things that I am looking for are classical interval estimation and hypothesis testing. I
use these types of tests in region merging/region splitting segmentation techniques. There are a lot
of "ad-hoc" statistics being used inside of ITK, and our current algorithms would benefit from having
classical statistics available. The types of hypothesis tests that I am looking for are
* Does a sample mean equal a specified population mean?
* Does a sample variance equal a specified population variance?
* Does a (sample mean, sample variance) equal a specified (population mean, population
variance)?
* Are two sample means equal?
* Are two sample variances equal?
* Are two (sample mean, sample variance) pairs equal?
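To make the questions above concrete, here is a rough sketch of the statistics behind three of them, using only the Python standard library. The function names are mine and purely illustrative; nothing like this exists in ITK as far as this message is concerned. Each function returns the statistic together with its degrees of freedom, which is what you would then compare against a table.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t statistic for H0: population mean == mu0, with n - 1 degrees of freedom."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t = (statistics.mean(sample) - mu0) / (s / math.sqrt(n))
    return t, n - 1


def one_sample_chi2(sample, sigma0_sq):
    """Chi-squared statistic for H0: population variance == sigma0_sq, n - 1 df."""
    n = len(sample)
    chi2 = (n - 1) * statistics.variance(sample) / sigma0_sq
    return chi2, n - 1


def variance_ratio_f(a, b):
    """F statistic for H0: the two population variances are equal, (n1 - 1, n2 - 1) df."""
    f = statistics.variance(a) / statistics.variance(b)
    return f, (len(a) - 1, len(b) - 1)
```

The decision step (reject or accept) is deliberately left out, since it needs the critical values discussed next.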
I am interested in the Student-t test, F-test, Chi-squared test, and Hotelling T^2 test. The issue
with these tests is that you usually need a table of values. For instance, for an F-test there is a
series of 2D tables. There is a table for each "confidence" value (99%, 98%, 95%, 90%), and each 2D
table has the number of degrees of freedom of variable 1 down the rows and the number of degrees of
freedom of variable 2 across the columns. The tables are usually sparse, and once the degrees of
freedom get large enough (>1000?) there are polynomial approximations. I have built these tables in
the past and they usually work out to be about 25KB per 2D table. I usually just build the F-tables,
since you can use them for the F-test, the Student-t test (the square of a Student-t is an F
statistic with one of the degrees of freedom being 1), and Hotelling T^2 (which, suitably scaled, is
an F statistic with different degrees of freedom).
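The Student-t/F relationship can be checked numerically. The sketch below (helper names are hypothetical, standard library only) computes the pooled two-sample t statistic and the one-way analysis-of-variance F statistic for the same two groups; with two groups the F statistic has (1, n1 + n2 - 2) degrees of freedom and equals t squared. That is why the df1 = 1 row or column of an F-table is enough for a two-sided t-test.

```python
import statistics


def pooled_t(a, b):
    """Pooled-variance two-sample t statistic (n1 + n2 - 2 degrees of freedom)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5


def one_way_f(groups):
    """One-way ANOVA F statistic with (k - 1, N - k) degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Mean square between groups and mean square within groups.
    between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    within = sum((len(g) - 1) * statistics.variance(g) for g in groups) / (n_total - k)
    return between / within


a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]
t_squared = pooled_t(a, b) ** 2
f_stat = one_way_f([a, b])  # agrees with t_squared up to rounding
```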
On the technical side, where would we put these tables? The appropriate table would need to be
loaded from disk when a statistical test was first used.
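One possible shape for the load-on-first-use idea, sketched in Python for brevity. Everything here is an assumption for illustration: the directory, the file naming (`f_table_<confidence>.json`), and the JSON layout mapping "df1,df2" keys to critical values are made up, not an existing ITK convention. The cache ensures each table is read from disk only once.

```python
import functools
import json
import os
import tempfile

# Hypothetical location for the table files; a real library would pick its own.
TABLE_DIR = tempfile.gettempdir()


def table_path(confidence):
    """Path of the (assumed) JSON file holding the F-table for one confidence level."""
    return os.path.join(TABLE_DIR, f"f_table_{confidence}.json")


@functools.lru_cache(maxsize=None)
def load_f_table(confidence):
    """Read one table from disk; cached, so the file is opened once per confidence level."""
    with open(table_path(confidence)) as fh:
        raw = json.load(fh)
    # Assumed layout: {"df1,df2": critical_value, ...}
    return {tuple(int(x) for x in key.split(",")): value for key, value in raw.items()}


def f_critical(confidence, df1, df2):
    """Look up a tabulated critical value; raises KeyError for untabulated entries."""
    return load_f_table(confidence)[(df1, df2)]
```

Because the tables are sparse, a real version would also need a policy for degrees of freedom that are not tabulated (interpolation, or the large-sample approximations mentioned above); this sketch simply raises `KeyError`.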
Here is a nice web site on the techniques...
<http://www.itl.nist.gov/div898/handbook/eda/section3/eda35.htm>
and here is a snippet from the site
* Location
1. Measures of Location <http://www.itl.nist.gov/div898/handbook/eda/section3/eda351.htm>
2. Confidence Limits for the Mean and One Sample t-Test
<http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm>
3. Two Sample t-Test for Equal Means <http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm>
4. One Factor Analysis of Variance <http://www.itl.nist.gov/div898/handbook/eda/section3/eda354.htm>
5. Multi-Factor Analysis of Variance <http://www.itl.nist.gov/div898/handbook/eda/section3/eda355.htm>
* Scale (or variability or spread)
1. Measures of Scale <http://www.itl.nist.gov/div898/handbook/eda/section3/eda356.htm>
2. Bartlett's Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda357.htm>
3. Chi-Square Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda358.htm>
4. F-Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda359.htm>
5. Levene Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35a.htm>
* Skewness and Kurtosis
1. Measures of Skewness and Kurtosis <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm>
* Randomness
1. Autocorrelation <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35c.htm>
2. Runs Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35d.htm>
* Distributional Measures
1. Anderson-Darling Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35e.htm>
2. Chi-Square Goodness-of-Fit Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm>
3. Kolmogorov-Smirnov Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm>
* Outliers
1. Grubbs Test <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm>
* 2-Level Factorial Designs
1. Yates Analysis <http://www.itl.nist.gov/div898/handbook/eda/section3/eda35i.htm>
Jim Miller
_____________________________________
Visualization & Computer Vision
GE Research
Bldg. KW, Room C218B
P.O. Box 8, Schenectady NY 12301
millerjv@research.ge.com
james.miller@research.ge.com
(518) 387-4005, Dial Comm: 8*833-4005,
Cell: (518) 505-7065, Fax: (518) 387-6981