ITK/Release 4/Testing Data

Latest revision as of 15:59, 9 December 2011

The Challenge

  • Data used for testing can be large.
    • ITK has three types of data (Input, Baselines and "Large Data"). At this point, neither Baselines nor Input is very large (subjective remark). "Large Data" would benefit from the MIDAS solution.
  • Data should not be stored along with the source code
    • Lorensen comment: This is an opinion, not a fact.
  • Data files change along with the evolution of the code
    • So we have to track their specific versions.
  • Adding new files should be quick and easy, ideally with a single command line
    • This was possible with ITK 3.x.
  • There must be a way to pull down the testing data corpus as a whole. You must be able to run CTest without an Internet connection.

MIDAS Proposal

  • Store ITK Testing Data in MIDAS
    • Lorensen Proposal: Keep Input and Baseline data in the ITK repository. Store Large data in MIDAS.
  • Download it at run time during ITK testing.

Data on MIDAS

At: http://midas.kitware.com/community/view/7

MIDAS - CMake

The related CMake files can be found in the directory

   ITK/CMake

The file

   ExternalData.cmake 

contains CMake functions that will download specific files and put them in a given directory.

The file

   ITKExternalData.cmake

contains the default file URLs.

The itk_add_test function, which automatically resolves DATA{} references, is defined in ITK/CMakeLists.txt.

How to find Checksum files

  • Go to the MIDAS page of the image you are interested in
    • From: http://midas.kitware.com/community/view/7
      • For example, the image "cthead1.png" can be found at: http://midas.kitware.com/item/view/258
  • Click on the "Advanced View" checkbox at the top of the page.
    • The checksums of individual files will be displayed under the filenames
    • A "Download MD5 Key File" link will be shown in front of the checksum number
  • Click on "Download MD5 Key File" and save the file in the directory
    • ITK/Testing/MIDAS_Keys
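
As a cross-check on the steps above, a downloaded key can be compared against a local copy of the data file. The sketch below is only an illustration: the helper names make_md5_key and check_md5_key are hypothetical, it assumes GNU md5sum is installed, and it assumes a key file holds nothing but the hex digest.

```shell
#!/bin/sh
# Hypothetical helper: write "<file>.md5" holding only the MD5 hex digest.
# Assumption: a key file contains just the digest, nothing else.
make_md5_key() {
    file="$1"
    # md5sum prints "<digest>  -" when reading stdin; keep the digest only.
    md5sum < "$file" | cut -d ' ' -f 1 > "$file.md5"
}

# Hypothetical helper: succeed if the key file matches the data file.
check_md5_key() {
    file="$1"
    key="$2"
    expected=$(tr -d '[:space:]' < "$key")
    actual=$(md5sum < "$file" | cut -d ' ' -f 1)
    [ "$expected" = "$actual" ]
}

# Small demonstration on a throwaway file.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/sample.txt"
make_md5_key "$tmp/sample.txt"
if check_md5_key "$tmp/sample.txt" "$tmp/sample.txt.md5"; then
    echo "key matches"
fi
```

For a real image, the second argument to check_md5_key would be the key file saved under ITK/Testing/MIDAS_Keys.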

How to locally cache testing data

The easiest approach is to first download all data by running an ITK build with BUILD_TESTING set to ON. This downloads the required data into

 <build_dir>/ExternalData/Objects

Copy the testing data into the desired location, e.g.

 cp -r <build_dir>/ExternalData/Objects/MD5 /var/bigharddrive/

Or

 rsync -av <build_dir>/ExternalData/Objects/MD5 /var/bigharddrive/
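
After copying, it can be worth checking that the cache is intact. The sketch below assumes that each object under the MD5 directory is named after its own MD5 digest, which is what the %(algo)/%(hash) layout used further down suggests. The function name count_bad_objects is hypothetical, and GNU md5sum is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: print how many files under <cache>/MD5 have a name
# that differs from their own MD5 digest.
# Assumption: objects are stored as <cache>/MD5/<digest>.
count_bad_objects() {
    cache_dir="$1"
    bad=0
    for obj in "$cache_dir"/MD5/*; do
        # Skip the unexpanded glob (empty directory) and any subdirectories.
        [ -f "$obj" ] || continue
        actual=$(md5sum < "$obj" | cut -d ' ' -f 1)
        if [ "$actual" != "$(basename "$obj")" ]; then
            echo "mismatch: $obj" >&2
            bad=$((bad + 1))
        fi
    done
    echo "$bad"
}

# Example call (the path is the one used in the copy commands above):
#   count_bad_objects /var/bigharddrive
```

A result of 0 means every object in the cache matches its file name; mismatching files are listed on standard error.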

In new builds, point the build to the local cache by setting the CMake cache variable ExternalData_URL_TEMPLATES. The setting in this case would be

 file:///var/bigharddrive/%(algo)/%(hash)

Testing data found in the local cache is used first; if a file cannot be found there, the search falls back on the default URLs.
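
To make the template concrete, the sketch below shows how the %(algo) and %(hash) placeholders turn into a file location, and models the "local first" lookup in a simplified way. It is a rough illustration, not the code in ExternalData.cmake; the function names expand_url_template and first_local_hit are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: substitute %(algo) and %(hash) in a URL template.
expand_url_template() {
    template="$1"
    algo="$2"
    hash="$3"
    # In sed's basic regular expressions, "(" and ")" are literal characters.
    printf '%s\n' "$template" |
        sed -e "s|%(algo)|$algo|g" -e "s|%(hash)|$hash|g"
}

# Hypothetical helper: print the first file:// URL, in template order, whose
# target exists on disk. Returns 1 if none exists, which is the point where
# a real build would move on to the default (remote) URLs.
first_local_hit() {
    algo="$1"
    hash="$2"
    shift 2
    for template in "$@"; do
        url=$(expand_url_template "$template" "$algo" "$hash")
        case "$url" in
            file://*)
                path=${url#file://}
                if [ -f "$path" ]; then
                    echo "$url"
                    return 0
                fi
                ;;
        esac
    done
    return 1
}

# Show the location that the template from this page expands to for one object.
expand_url_template 'file:///var/bigharddrive/%(algo)/%(hash)' \
    MD5 b1946ac92492d2347c6235b4d2611184
```

When configuring a new build, the cache variable can be set on the command line, for example:

  cmake -DExternalData_URL_TEMPLATES="file:///var/bigharddrive/%(algo)/%(hash)" <source_dir>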