[Insight-developers] Validation Directory

Jisung Kim bahrahm@yahoo.com
Wed, 18 Sep 2002 10:54:20 -0700 (PDT)


Hi Sayan and Josh.

I agree for now. Then the Validation directory will
look just like the Examples directory, in which each
example has its own subdirectory and everything
happens inside that subdirectory under the control of
each validation study, right?

If no further changes or suggestions come up before
the beginning of next week, I will start checking in
my stuff following your suggestions.

Thanks,
--- Sayan Pathak <spathak@insightful.com> wrote:
> Hi Josh,
> I agree with your opinion. That is the view I have
> too. Given the time frame that is the best we can
> achieve.
> 
> Sayan
> 
> > -----Original Message-----
> > From: Joshua Cates [mailto:cates@sci.utah.edu]
> > Sent: Wednesday, September 18, 2002 9:56 AM
> > To: Jisung Kim
> > Cc: Sayan Pathak; insight-dev-list
> > Subject: RE: [Insight-developers] Validation Directory
> > 
> > 
> > Hi Jisung, Sayan, Lydia,
> > 
> > I think the simplest course of action is just to follow the example
> > that Lydia has set with her validation work.  Relevant code goes in
> > the Validation directory, the write-up and results (i.e. tables,
> > figures, stats, etc.) go in InsightDocuments, and any large input or
> > output data files go in the Insight data ftp repository.
> > 
> > Organize your code and results under the appropriate directory in
> > whatever way makes sense for your particular study.
> > 
> > Whether or not we want Validation code on the dashboard right now is
> > an open question, but my vote is to leave it off until things have
> > stabilized.
> > 
> > Josh.
> > 
> > ______________________________
> >  Josh Cates			
> >  School of Computer Science	
> >  University of Utah
> >  Email: cates@sci.utah.edu
> >  Phone: (801) 587-7697
> >  URL:   www.cs.utk.edu/~cates
> > 
> > 
> > On Tue, 17 Sep 2002, Jisung Kim wrote:
> > 
> > > Hi Sayan.
> > > 
> > > I think we don't have much time left before the final release. I
> > > think the most important thing in coordinating validation studies
> > > is making the results easy to compare. What I mean is that studies
> > > from different groups that use similar or identical data for a
> > > similar purpose should be easy to compare with each other.
> > > 
> > > For example, your IBSR classification study includes
> > > GaussianClassifier, KmeansClassifier, MRFGaussianClassifier, and
> > > MRFKmeansClassifier. My validation study will include k-d tree
> > > based Kmeans, expectation-maximization mixture modelling, and
> > > goodness-of-fit mixture modelling. I plan to use BrainWeb data for
> > > multi-channel experiments and IBSR for single-channel experiments.
> > > Users might want to compare the results from your KmeansClassifier
> > > with those from my k-d tree based Kmeans clustering, or the results
> > > from your MRFKmeansClassifier with those from my EM stuff.
> > > 
> > > I think we should agree on at least two things for this. First,
> > > since you and I will both use the IBSR data, we can create a common
> > > data description that explains how to get the data and includes
> > > proper credentials. That prevents duplicate and confusing
> > > descriptions of the same data. I think it would be even better if
> > > we could also agree on the image file format. I prefer the
> > > meta-image format because it is simple and its header is easy to
> > > read in a text editor (at least, you can get basic information
> > > about the image from it). Second, it would be nice to have a
> > > consistent format for the experiments' output, so that users can
> > > read it into their favorite data analysis tools to summarize and
> > > plot the results and get more intuitive comparisons.
> > > 
> > > Here are some suggestions:
> > > 
> > > 1) directory structure
> > > * Validation/Data
> > >     has a subdirectory for each dataset, for example
> > >     Validation/Data/IBSR and Validation/Data/BrainWeb. The IBSR
> > >     directory may include a credentials or copyright statement for
> > >     the IBSR dataset, how-to-get-it documents, and meta image
> > >     headers (only headers).
> > > 
> > > * Validation/Utilities or Validation/Common
> > >     I found that plotting data before doing any real processing is
> > >     quite important for understanding the data and making proper
> > >     plans for data analysis. So I created a utility that samples
> > >     data points from the dataset and writes a table to a file that
> > >     I can use for plotting in a statistical package. I also have a
> > >     preprocessor that masks out some tissue classes from the data
> > >     using a class mask image and then normalizes the images using
> > >     the images' means and standard deviations. I believe this
> > >     normalizing process is quite common in multivariate data
> > >     analysis. I think some basic UI stuff can be placed here too.
> > > 
> > > * Validation/"Your Own Studies"
> > >     In this directory, we put the validation stuff specific to
> > >     each study. I like your idea of having at least the three
> > >     common subdirectories that you already have for your
> > >     validation stuff: "Code", "Inputs", and "Results".
> > > 
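The preprocessor described under the Utilities item above (mask out some
tissue classes, then normalize by mean and standard deviation) can be
sketched roughly as follows. This is only a minimal sketch in Python, not
the actual utility: the name mask_and_normalize, the flat pixel and label
lists, and the choice to compute the mean and population standard
deviation over the unmasked pixels only are all assumptions made for
illustration.

```python
from statistics import mean, pstdev


def mask_and_normalize(pixels, labels, excluded_classes):
    """Drop pixels whose class label is excluded, then z-score the rest.

    pixels and labels are parallel sequences (one label per pixel).
    Hypothetical helper; the statistics are computed after masking.
    """
    kept = [p for p, c in zip(pixels, labels) if c not in excluded_classes]
    if not kept:
        return []
    mu = mean(kept)
    sigma = pstdev(kept)  # population standard deviation (an assumption)
    if sigma == 0:
        # All remaining pixels are identical; nothing to scale.
        return [0.0 for _ in kept]
    return [(p - mu) / sigma for p in kept]


# Example: class 0 stands for a masked-out tissue class.
normalized = mask_and_normalize([10, 20, 30, 99], [1, 1, 1, 0], {0})
```

For multi-channel data, the same function would be applied to each
channel separately, so that every image is scaled by its own mean and
standard deviation.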
> > > 2) Experiments Output format
> > > * table with headers; each class's statistics are a record (a row)
> > > in the table.
> > > 
> > >    For example, a clustering algorithm produces 3 different
> > > Gaussian classes, and each class has a mean and a standard
> > > deviation as its parameters. In addition, each case has two common
> > > fields, "number of iterations" and "elapsed time". I ran it with
> > > two different sets of initial class parameters (say, two cases in
> > > the experiment). Then the output file would look like:
> > > 
> > > "case" "class" "mean" "standard deviation" "iterations" "elapsed time"
> > > 1 1 200 40 2000 20.35
> > > 1 2 300 20 2000 20.35
> > > 1 3 100 50 2000 20.35
> > > 2 1 202 38 2020 21.23
> > > 2 2 298 19 2020 21.23
> > > 2 3 98 47 2020 21.23
> > > 
> > > I would also like such output files and other table-like files
> > > (such as initial parameter files) to share the same file
> > > extension, such as .dat, for easy searching. :)
> > > 
> 
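The table format suggested above can be read back with a few lines of
code. A minimal sketch in Python, assuming the header occupies a single
line in the file, fields are separated by spaces, and every column is
numeric; read_results and mean_by_class are hypothetical helper names,
not part of Insight.

```python
import csv
import io

# The example table from the message, with the header on one line.
SAMPLE = '''"case" "class" "mean" "standard deviation" "iterations" "elapsed time"
1 1 200 40 2000 20.35
1 2 300 20 2000 20.35
1 3 100 50 2000 20.35
2 1 202 38 2020 21.23
2 2 298 19 2020 21.23
2 3 98 47 2020 21.23
'''


def read_results(stream):
    """Parse a space-separated table with a quoted header row."""
    reader = csv.DictReader(stream, delimiter=" ", skipinitialspace=True)
    return [{name: float(value) for name, value in row.items()}
            for row in reader]


def mean_by_class(records, field):
    """Average one field over all cases, grouped by class number."""
    groups = {}
    for rec in records:
        groups.setdefault(int(rec["class"]), []).append(rec[field])
    return {cls: sum(vals) / len(vals) for cls, vals in groups.items()}


records = read_results(io.StringIO(SAMPLE))
class_means = mean_by_class(records, "mean")
```

Here class_means maps each class number to its mean averaged over the
two cases, which is the kind of summary that makes results from
different studies easy to put side by side.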
=== message truncated ===


=====
Jisung Kim
bahrahm@yahoo.com
106 Mason Farm Rd.
129 Radiology Research Lab., CB# 7515
Univ. of North Carolina at Chapel Hill
Chapel Hill, NC 27599-7515
