[IGSTK-Developers] Crash recovery?

Mon Sep 12 22:28:04 EDT 2005

Hi Luis,
Thanks.   I admire your devotion to patients :)
I agree that having a state machine formalism helps define the behavior of a
component, and that seems to me the best argument for using it.  Restricting
the number of ways in which a class can be used is a reasonable way to
reduce the number of subsequent errors, and I'm sure that you have gone
through discussions of the tradeoff of having to manage the state
machine formalism overhead every time functionality is added.

Thanks for the rest of the treatise. I really don't know where you find the
time to work and write all these emails, but I appreciate it.

-Tina

>-----Original Message-----
>From: Luis Ibanez [mailto:luis.ibanez at kitware.com] 
>Sent: Monday, September 12, 2005 5:48 AM
>To: Tina Kapur
>Cc: 'David Gobbi'; 'Andinet Enquobahrie'; 'IGSTK Developers'
>Subject: Re: [IGSTK-Developers] Crash recovery?
>
>
>Hi Tina,
>
>A State Machine is the most formal model of an Algorithm.
>
>Every computer *is* a state machine, and every computer 
>program *is* a state machine, regardless of whether we design 
>it or not with a state machine formalism in mind.
>
>Computer code that is not designed with a State Machine 
>formalism ends up being poor-quality code, because it tends to 
>have large numbers of poorly defined states, and its execution 
>is very fragile because it is based on assumptions made by 
>developers that are often ignored by users.
>
>
>By designing every IGSTK class as a State Machine we achieve 
>the following goals:
>
>1) Clearly defined behavior.
>
>2) A restricted number of ways in which a class can be
>    used, which reduces the risk of misuse by developers.
>
>3) Prevention of error states, by introducing
>    preconditions that test whether it is safe or not to
>    enter a state before performing the corresponding
>    transition.
>
>4) Easier debugging, since it is possible to track a
>    class through the set of transitions that it follows.
>
>5) Easier logging and retrospective analysis of the
>    classes, again because a finite set of well-defined
>    states makes it possible for us to know what the class
>    should do at every interaction with other classes.
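
The goals listed above can be illustrated with a minimal sketch of a component driven by an explicit transition table. It is written in Python for brevity, whereas IGSTK itself is C++. The names used here (Tracker, State, Input, port_available) are hypothetical and are not part of the IGSTK API. The sketch only assumes that invalid requests are ignored and that every request is logged.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    PORT_OPEN = auto()
    TRACKING = auto()


class Input(Enum):
    OPEN = auto()
    START = auto()
    STOP = auto()
    CLOSE = auto()


class Tracker:
    """Hypothetical component; names are illustrative, not IGSTK's API."""

    # Every legal (state, input) pair is listed here; any other pair
    # leaves the state unchanged, which restricts how the class is used.
    TRANSITIONS = {
        (State.IDLE, Input.OPEN): State.PORT_OPEN,
        (State.PORT_OPEN, Input.START): State.TRACKING,
        (State.TRACKING, Input.STOP): State.PORT_OPEN,
        (State.PORT_OPEN, Input.CLOSE): State.IDLE,
    }

    def __init__(self, port_available=True):
        self.state = State.IDLE
        self.port_available = port_available  # assumed hardware condition
        self.log = []  # (from_state, input, to_state) for later analysis

    def _precondition_ok(self, inp):
        # Test whether the transition is safe before taking it.
        if inp is Input.OPEN:
            return self.port_available
        return True

    def process(self, inp):
        target = self.TRANSITIONS.get((self.state, inp))
        if target is None or not self._precondition_ok(inp):
            target = self.state  # invalid or unsafe request: stay put
        self.log.append((self.state, inp, target))
        self.state = target
        return self.state


if __name__ == "__main__":
    tracker = Tracker()
    tracker.process(Input.START)  # not valid from IDLE, so it is ignored
    tracker.process(Input.OPEN)
    tracker.process(Input.START)
    print(tracker.state.name)  # prints "TRACKING"
```

Because every request passes through process(), the log holds the full sequence of transitions, which is what makes the debugging and retrospective analysis in goals 4 and 5 practical.
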
>
>
>You are right about testing, the State Machine formalism does 
>not replace testing. But only the State Machine approach 
>makes it possible to implement full testing, both in the sense of 
>100% code coverage, and in the sense of 100% coverage of 
>execution paths.
>
>That is, thanks to the State Machines we are able to test the 
>behavior of a class when its methods are called in arbitrary 
>order, or as you suggest, when they are called repeatedly.
>
>A State Machine should be able to pass the "Monkey test",
>which is the most demanding test in software implementation.
>That is, the state machine should accept random input without 
>entering error states or ever crashing.
>
>The developers of the GNU compiler actually perform Monkey 
>tests on it, and it has proved to be extremely robust to random input.
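
A Monkey test of this kind can be sketched in a few lines. The example below is an assumption-laden illustration in Python, not IGSTK code. The Machine class, its states, and its inputs are hypothetical. The test feeds seeded random inputs to the machine and checks after each one that it is still in a declared state and that no exception escaped.

```python
import random
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    PORT_OPEN = auto()
    TRACKING = auto()


class Input(Enum):
    OPEN = auto()
    START = auto()
    STOP = auto()
    CLOSE = auto()


# Hypothetical transition table; unlisted pairs leave the state unchanged.
TRANSITIONS = {
    (State.IDLE, Input.OPEN): State.PORT_OPEN,
    (State.PORT_OPEN, Input.START): State.TRACKING,
    (State.TRACKING, Input.STOP): State.PORT_OPEN,
    (State.PORT_OPEN, Input.CLOSE): State.IDLE,
}


class Machine:
    def __init__(self):
        self.state = State.IDLE

    def process(self, inp):
        self.state = TRANSITIONS.get((self.state, inp), self.state)
        return self.state


def monkey_test(steps, seed):
    """Feed random inputs; return the set of states that were visited."""
    rng = random.Random(seed)  # seeded so that a failure can be replayed
    machine = Machine()
    inputs = list(Input)
    visited = set()
    for _ in range(steps):
        state = machine.process(rng.choice(inputs))
        # The machine must always remain in one of its declared states.
        assert isinstance(state, State)
        visited.add(state)
    return visited


if __name__ == "__main__":
    visited = monkey_test(steps=10_000, seed=2005)
    print(sorted(s.name for s in visited))
```

Seeding the random generator is a deliberate choice here: if a random sequence ever does break the machine, the same seed reproduces the exact sequence for debugging.
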
>
>
>That being said,...
>
>The State Machine is not a magical solution either.
>We still require a quality control system for the toolkit involving:
>
>     Nightly Builds
>     Continuous Builds
>     Code Coverage analysis (currently at 90%)
>     Dynamic run-time analysis (Valgrind)
>     Code Reviews
>
>This system is reinforced by the power of being Open Source, 
>since anybody can inspect the code and evaluate its quality 
>and correctness.
>
>The test suites will not quite determine the robustness; they 
>will allow us to measure the robustness. The actual quality of 
>the code is still the result of good design and thoughtful analysis.
>
>
>On the dark side, State Machines also have challenges that 
>we are trying to sort out. For example, it is not obvious how 
>to implement C++ hierarchies with State Machines in a 
>consistent way. It is also unclear how to keep them safe in 
>a multi-threaded environment.
>
>So, this is still an ongoing discussion, and we welcome any 
>ideas that will make this code safer for patients.
>
>
>    Regards,
>
>        Luis
>
>-----------------
>Tina Kapur wrote:
>> Thanks, Luis. 
>> 
>> Another question: why the state machine pattern? I've read a couple of
>> IGSTK papers and the discussion on the project wiki. I appreciate the
>> motivation to keep the code as robust as possible and to prevent
>> programmers from making mistakes. However, the choice of the
>> architecture typically most directly impacts extensibility and
>> maintainability of the code base rather than robustness. The quality
>> of programmers and test suites is what is going to determine the
>> robustness of IGSTK.  Did I miss the point of what the state machine
>> pattern buys here? Again, all your reputations precede you as
>> architects and engineers, and I can see this architecture choice being
>> a no-op in the worst case, but if there are any particular discussions
>> that point to the benefits of this pattern in the case of IGS, I'd
>> appreciate a pointer to increase my knowledge base in this area.
>> 
>> And thanks again to everyone for any time this discussion might cost
>> you...
>> 
>> Best,
>> -Tina
>> 
>> 
>>>-----Original Message-----
>>>From: Luis Ibanez [mailto:luis.ibanez at kitware.com]
>>>Sent: Friday, September 09, 2005 2:26 PM
>>>To: Tina Kapur
>>>Cc: 'David Gobbi'; 'Andinet Enquobahrie'; 'IGSTK Developers'
>>>Subject: Re: [IGSTK-Developers] Crash recovery?
>>>
>>>
>>>Tina,
>>>
>>>If we implement applications as C++ classes that derive from an
>>>igstk::Application class endowed with a State Machine, then testing of
>>>the application should be feasible with 100% code coverage and with
>>>full path combinations (calling its methods in arbitrary orders).
>>>
>>>Stress testing should also be possible, since it becomes a matter of 
>>>adding new test programs that rerun the application class as many 
>>>times as desired and in as many combinations of inputs as desired.
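
The idea of calling an application's methods in arbitrary orders can be sketched as an exhaustive test over all call sequences up to a given length. This is an illustrative Python sketch under stated assumptions. The Application class below is a hypothetical stand-in, not igstk::Application, and its state and method names are invented for the example.

```python
from itertools import product


class Application:
    """Hypothetical stand-in for an application driven by a state machine."""

    STATES = {"idle", "image_loaded", "registered", "tracking"}
    METHODS = ("load_image", "register", "start_tracking", "stop_tracking")

    # Legal (state, method) pairs; any other call leaves the state unchanged.
    TRANSITIONS = {
        ("idle", "load_image"): "image_loaded",
        ("image_loaded", "register"): "registered",
        ("registered", "start_tracking"): "tracking",
        ("tracking", "stop_tracking"): "registered",
    }

    def __init__(self):
        self.state = "idle"

    def call(self, method):
        self.state = self.TRANSITIONS.get((self.state, method), self.state)
        return self.state


def exhaustive_order_test(max_length):
    """Run every call sequence of length 1..max_length on a fresh instance."""
    checked = 0
    for length in range(1, max_length + 1):
        for sequence in product(Application.METHODS, repeat=length):
            app = Application()  # rerun the application from scratch
            for method in sequence:
                # No ordering of calls may leave the declared state set.
                assert app.call(method) in Application.STATES
            checked += 1
    return checked


if __name__ == "__main__":
    # 4 + 16 + 64 + 256 sequences for four methods and lengths 1 to 4.
    print(exhaustive_order_test(4))  # prints 340
```

The number of sequences grows exponentially with the length, so in practice an exhaustive test like this would cover short sequences only, with random testing used for the longer runs.
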
>>>
>>>The use of simulators, such as the tracker simulator that Hee-Su
>>>wrote, will help a lot to make this vision of full testing a practical
>>>reality.
>>>
>>>
>>>   Luis
>>>
>>>
>>>-----------------
>>>Tina Kapur wrote:
>>>
>>>>Speaking of reliability and testing, is testing of applications
>>>>considered outside the currently defined scope of the toolkit?  I am
>>>>sure that those of you who have worked with FDA requirements on
>>>>products have suffered through the mechanism of writing one or more
>>>>tests corresponding to each requirement in the req spec, and then
>>>>documenting that you ran the test and it passed.
>>>>
>>>>Other measures of robustness that I've also found useful are stress
>>>>testing in which we simulate multiple passes through the software
>>>>(which basically checks for memory leaks and re-entrant code).
>>>>Again, as we all know, testing applications is a pretty large task and
>>>>I don't want to use too many of the group's cycles on it if it is
>>>>currently outside the scope of the project...
>>>>
>>>>-Tina
>>>>
>>>> 
>>>>
>>>>
>>>>>-----Original Message-----
>>>>>From: David Gobbi [mailto:dgobbi at atamai.com]
>>>>>Sent: Thursday, September 08, 2005 5:31 PM
>>>>>To: Andinet Enquobahrie
>>>>>Cc: Tina Kapur; 'IGSTK Developers'
>>>>>Subject: Re: [IGSTK-Developers] Crash recovery?
>>>>>
>>>>>Andinet Enquobahrie wrote:
>>>>>
>>>>>>Tina Kapur wrote:
>>>>>>
>>>>>>>Hi,
>>>>>>>
>>>>>>>Does IGSTK have a mechanism for crash recovery?  I was making a wish
>>>>>>>list of what such a toolkit should have, and I can see how the
>>>>>>>logging mechanism could be used for recovery from a mid surgery
>>>>>>>crash, but just wanted to check if the feature was planned for
>>>>>>>anytime in the near future.
>>>>>>>
>>>>>>>Thanks.
>>>>>>>-Tina
>>>>>>
>>>>>>Hi Tina,
>>>>>>
>>>>>>We hope that the toolkit won't crash in the middle of a surgery...maybe
>>>>>>before or after :) But on a serious note, the main reason the state
>>>>>>machines were introduced in IGSTK is to make it "ideally" 100%
>>>>>>predictable. All error conditions and scenarios should be handled by
>>>>>>the state machines. IGSTK should be unrecoverable-condition proof. In
>>>>>>fact, we had a discussion this afternoon on the TCON about
>>>>>>introducing state machines into the applications themselves. Something
>>>>>>like an application class with state machines that every application
>>>>>>should be derived from. This design will tighten it up even more.
>>>>>>
>>>>>>cheers,
>>>>>>-Andinet
>>>>>
>>>>>I think as we design IGSTK, we should work with the assumption that it
>>>>>is never going to be perfectly crash-proof.  We need to be ready for the
>>>>>worst.  The Titanic wasn't an unsinkable ship, regardless of what its
>>>>>designers thought.
>>>>>
>>>>>One of the most important things for medical software is to be able to
>>>>>quantify its reliability.  I wonder if there is some way we can do
>>>>>this for IGSTK?  Usually reliability statistics are computed at the
>>>>>application level, by looking at failure rates or counting how often
>>>>>critical bugs are found.
>>>>>
>>>>>- David
>>>>>