ITK/Procedure for Contributing New Classes and Algorithms

Latest revision as of 20:18, 10 February 2012

Introduction

This page describes the procedure for contributing new algorithms and classes to the Insight Toolkit.

The fundamental idea of this procedure is to make the Insight Journal the entry gate for new classes and algorithms to the Insight Toolkit. This means that developers should not publish new classes into the Git repository unless they have already been posted as papers to the Insight Journal and have received positive reviews from the community.

Although this may appear to be a bureaucratic procedure, it should be quite agile in practice because the Insight Journal is not a typical journal. The time between submitting a paper and finding it posted online should be in the range of minutes for an average case, and a couple of hours in the worst case. The time difference will depend on how computing-intensive the testing for the code is.

As soon as a paper is posted online, the source code that must accompany the paper also becomes freely available online. So the time for sharing the contributions with the community should in all cases be less than 24 hours.

The following sections describe the rationale behind this procedure, and the technical details on how to prepare a submission and follow it through until the source code is published into the ITK Git repository.

The Rationale

The rationale behind this procedure is to pursue the following goals:

  1. Technical correctness of new contributions
  2. Avoid duplication of functionalities
  3. Maximize reuse of existing code
  4. Maximize generalization of the algorithm implementations
  5. Enforce validation, testing and code coverage
  6. Maximize maintainability
  7. Ensure that new algorithms are properly documented
  8. Gather feedback from the community
  9. Hold a continuously open forum where algorithmic, and performance issues are discussed.

Since some of these goals may conflict, it will be the prerogative of the Oversight Committee to rule on whether one criterion should be given more importance than another. These decisions will have to be made on a case-by-case basis.

Technical Correctness

The community should ponder whether the technical concepts behind a new algorithm are acceptable. Technical correctness requires the contributor to provide a background on the proposed algorithm. Some algorithms may be so widely known that a simple citation to a major paper describing the algorithm may be enough to satisfy the requirement of technical correctness. Less known algorithms would require more detailed descriptions in order to make the case for their technical correctness. There are no hard rules on how deep this description should be. The only clear-cut criterion is that it should be clear enough not to raise major objections from the community.

Avoid duplication of functionalities

Given the large number of classes existing in the toolkit and the fact that the development effort was distributed among multiple institutions, it is not trivial for a single developer to establish whether a particular algorithm is already implemented in the toolkit. Therefore, when it comes to adding new functionality, an opportunity should be created for other developers to point to existing code that may already provide such functionality or that may help to implement the suggested new functionality.

Maximize reuse of existing code

During the time that the submissions are exposed in the Journal, other developers and users may find that parts of the algorithm could be implemented using existing classes in the toolkit. By posting those comments in their reviews, they will help the authors to refactor their code in order to use those existing classes.

Maximize generalization of the algorithm implementations

Algorithms are commonly implemented in the context of a very specific problem. Authors will typically post the algorithms they have used to solve a specific problem. By opening the papers to public non-anonymous reviews, readers and reviewers may find the algorithm applicable to other problems, and may suggest ways of generalizing the algorithms. In this way a larger community will benefit from the insertion of a generalized algorithm, instead of restricting the benefit to those involved with the specific problem for which the algorithm was originally intended.

Enforce validation, testing and code coverage

It is fundamental to make sure that new algorithms work as advertised. The practical way of doing this is to provide a test with realistic input data and typical parameters for the algorithm, in such a way that it can be run by anybody. The test should also include the expected output, so that when it is executed by other users there is a baseline for comparison that makes it possible to evaluate whether the algorithm is actually producing the expected output or not.
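
The baseline comparison described above can be sketched as follows. This is only a minimal illustration in Python, not the testing machinery used by ITK or the Insight Journal; the function name matches_baseline, its tolerance parameters, and the sample pixel values are all hypothetical.

```python
def matches_baseline(output, baseline, tolerance=0.0, max_failures=0):
    """Return True if `output` agrees with the stored `baseline`.

    Both arguments are flat sequences of pixel values. A pixel counts as a
    failure when it differs from the baseline by more than `tolerance`;
    the comparison passes when at most `max_failures` pixels fail.
    (Names and defaults are illustrative assumptions.)
    """
    if len(output) != len(baseline):
        return False  # images of different size can never match
    failures = sum(
        1 for got, expected in zip(output, baseline)
        if abs(got - expected) > tolerance
    )
    return failures <= max_failures


# Hypothetical data: the expected output shipped with the test,
# and two outputs produced when other users run the algorithm.
baseline = [0, 10, 20, 30]
good_run = [0, 10, 21, 30]   # differs by one intensity level in one pixel
bad_run = [0, 50, 20, 30]    # one pixel is far from the baseline
```

With a tolerance of one intensity level, good_run would be accepted and bad_run rejected; how strict the tolerance should be depends on the algorithm and the platform differences one expects.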

Code coverage should also be brought as close to 100% as possible before the classes are contributed to the toolkit. The reason is that a passing test is only as significant as the code coverage of that test. In other words, a test that passes but only exercises 20% of the code in a class cannot be claimed to be a sufficient demonstration of the correctness of the implementation. Such a test only proves that 20% of the class works as advertised.

Failure to provide sufficient code coverage in the initial commit of a class is the most common cause of bugs going undetected in the toolkit for long periods. It is also the most common cause of classes breaking without being noticed when other changes in the code affect the untested sections.

Lack of code coverage breaks the basic assumptions of the quality control system based on the Dart Dashboard.

Maximize maintainability

Once any class is included in the toolkit, the developer community becomes engaged in maintaining this code for as long as the Toolkit is available. This can easily mean five to ten years of software maintenance. It is a well known fact that 80% of the cost and effort of software development is spent in maintenance and bug fixes. The bulk of this maintenance effort is the time spent by future developers in understanding the code. Also, once code is delivered in a release, the developers have a responsibility to maintain that code's API. See the Insight Consortium's Policy on Backward Compatibility (http://insightsoftwareconsortium.org/wiki/index.php/Administration-BackwardCompatibility).

Therefore it is quite important for any class contributed to the toolkit to be analyzed for maintainability.

Among the most important criteria for maintainability

Ensure that new algorithms are properly documented

Proper documentation of new algorithms is key to encouraging their use by the community. It is quite common for users on the mailing list to post questions regarding a paper where a particular ITK class is described.

Given that many of the algorithms may have been described already in published papers, the new classes may simply cite those papers.

The documentation of a new algorithm should also include guidance on how to use it. In particular, practical examples with realistic input data are the ideal way of presenting the algorithm usage to the community.

Gather feedback from the community

The Insight Journal uses an open public peer-review system, so anybody in the community can contribute reviews for the articles posted in the Journal. This open channel allows users and developers to share information about the papers in the Journal. In particular, it makes it easy to send corrections and suggestions for improvements, which the authors can use to improve their work (source code and documents) and submit subsequent versions of their contributions.

Hold a continuously open forum

Given that the reviews are non-anonymous and public, authors are free to have a two-way communication with the reviewers and constructively discuss the details of the proposed algorithms. This dialog gets recorded in the form of reviews and replies to reviews and is also shared with the community. Readers of the papers can benefit from reading these dialogs, since they give insight into the issues that may be raised by the article's content.

The Procedure

Note: the methods for making contributions in the ITK community will change with ITKv4 as made possible by the new modular architecture.

Life Cycle of a Submission

The procedure for contributing new classes and algorithms to ITK is the following.

  1. An Author will propose an algorithm to the developers list or to the weekly tcon.
    This will be an initial check to make sure that the algorithm is not already available in ITK, or that it cannot be constructed with components already existing in the toolkit.
  2. The Author will prepare a working prototype of source code and test the code with realistic data.
  3. The Author will submit a paper to the Insight Journal.
    The paper must include the following
    1. The source code of the prototype
    2. The source code of the test
    3. The realistic input data required by the test
    4. The full list of parameters required by the test
    5. The output data produced by the test
    6. A document (preferably a hyperlinked PDF) describing the algorithm and how to use the new classes
  4. The source code of the prototype and its test will be automatically compiled and executed by the testing system of the Insight Journal.
  5. The paper and its source code will receive reviews from the community.
  6. Once a paper has 4 reviews (3 by the community, 1 by the automated testing system), the submission is eligible for inclusion in ITK. While the rating from the reviewers will influence the decision on accepting a contribution for inclusion into ITK, there is currently no minimum rating requirement.
  7. Every week, at the tcon, the oversight committee will select papers from the eligible set and assign developers to shepherd the code into the ITK repository.
  8. Before each release, the oversight committee will select papers from the set that currently do not have the necessary number of reviews and assign additional reviewers.
  9. Before each release, the oversight committee will select papers from the eligible set and assign developers to shepherd the code into the ITK repository.
  10. The shepherds will request and verify that the authors have faxed a copyright transfer form (http://www.insightsoftwareconsortium.org/documents/policies/OSCertification-v2006-02-01.pdf) to Josh Cates at The University of Utah.
    1. Although copyright transfer isn't required for submission to the IJ, it is required for code to be distributed in ITK, using the ITK copyright.
    2. Authors of code will still be acknowledged as the author in the code's comments.
    3. Copyright transfer ensures the continued and consistent open-source licensing of ISC-supported software.
  11. The assignment will be added as a Feature Request in the bug tracker in order to ensure that it gets checked before the following release of the Toolkit.
  12. The code will be added to the Review directory and will go through code reviews according to the Code Review Check List.
  13. Once the code is in the ITK repository, further improvements to the code will be accompanied by short papers to the Insight Journal. The need for these papers will be limited to algorithmic improvements.
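
The required contents of a paper (step 3) and the review threshold (step 6) can be sketched as a simple check. This is a hypothetical illustration in Python, not a tool used by the Insight Journal; the names Submission, REQUIRED_ITEMS, missing_items and is_eligible are assumptions, and treating the counts as minimums ("at least 3 community reviews") is an interpretation of step 6.

```python
from dataclasses import dataclass, field

# The six items a paper must include, taken from step 3 above.
REQUIRED_ITEMS = frozenset({
    "prototype source code",
    "test source code",
    "input data",
    "test parameters",
    "output data",
    "document",
})


@dataclass
class Submission:
    """Hypothetical record of what a submission contains so far."""
    items: set = field(default_factory=set)
    community_reviews: int = 0
    automated_reviews: int = 0


def missing_items(submission):
    """Return the required items the submission does not yet provide."""
    return sorted(REQUIRED_ITEMS - submission.items)


def is_eligible(submission):
    """Eligible once complete and reviewed 3 times by the community
    and once by the automated testing system (assumed to be minimums)."""
    return (
        not missing_items(submission)
        and submission.community_reviews >= 3
        and submission.automated_reviews >= 1
    )
```

Note that eligibility only makes a paper a candidate: as step 6 states, there is no minimum rating requirement, and the oversight committee still decides which eligible papers are selected.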


It must be noted that papers to the Insight Journal are not the typical burdensome papers expected by traditional journals. Instead, these papers are a kind of technical report addressed to future developers, maintainers and users of the code. The goal of the papers is to provide enough technical information to make it possible to use the algorithms, and to maintain their code in the years to come. The papers will be focused on the reproducibility of the test, and on instructing users on how to adapt the algorithm parameters to other data scenarios.

The ITK Editorial Committee

ITK Developers will play the role of editors for the Journal in particular topics. In this role they will make sure that a paper that falls into their subject of competence gets reviewed and moves through the process described above.

Subject Matter

The following is the list of current editors and their subjects.

Subject Matter           Editor           Affiliation
-----------------------  ---------------  -----------
Registration             Daniel Blezek    GE
Registration             Luis Ibanez      Kitware
LevelSets                Jim Miller       GE
Mathematical Morphology  Jim Miller       GE
Meshes                   Alex Gouaillard  A*STAR

Toolkit Areas

Editors will also be in charge of specific areas of the toolkit, according to the subdirectory organization.

Current areas and their editors are listed in the following table.

Toolkit Area     Editor             Affiliation
---------------  -----------------  -----------
Common           Luis Ibanez        Kitware
BasicFilters     Daniel Blezek      GE
Algorithms       Jim Miller         GE
Statistics       Stephen Aylward    Kitware
Spatial Objects  Julien Jomier      Kitware
Wrapping         Brad King          Kitware
DICOM            Mathieu Malaterre  MMC

Template for a Submission

From the Subversion Repository

The template for submitting to the Insight Journal can be found at:

http://www.na-mic.org:8000/svn/NAMICSandBox/trunk/InsightJournal/SubmissionTemplate/

From Kitware's HTTP site

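The template for submitting to the Insight Journal can be found at http://public.kitware.com/pub/itk/InsightJournal/ as either of the following archives:

http://public.kitware.com/pub/itk/InsightJournal/InsightJournalSubmissionTemplate.tgz
http://public.kitware.com/pub/itk/InsightJournal/InsightJournalSubmissionTemplate.zip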

The Automatic Testing System of the Insight Journal

The infrastructure of the Insight Journal will automatically test the source code of a submitted paper.

A full description of the testing environment is available at the following link:

http://www.insight-journal.org/help

How to Prepare a Submission to the Insight Journal

The full description of the process on how to prepare a submission to the Insight Journal can be found at http://www.insight-journal.org/help/submission

Templates for papers and CMakeLists files are available at the same link.

For instructions on inserting code into your paper (using the LaTeX package "listings") see this Insight Users post: http://www.itk.org/pipermail/insight-users/2009-January/028763.html


