ITK/HDF5
Latest revision as of 08:29, 27 April 2011
HDF5 file format and library
HDF5 is both a file format and a library dedicated to reading and writing files in that format.
According to Wikipedia, HDF5 includes "only two major types of object:
- Datasets, which are multidimensional arrays of a homogenous type
- Groups, which are container structures which can hold datasets and other groups
This results in a truly hierarchical, filesystem-like data format. In fact, resources in an HDF5 file are even accessed using the POSIX-like syntax /path/to/resource. Metadata is stored in the form of user-defined, named attributes attached to groups and datasets. More complex storage APIs representing images and tables can then be built up using datasets, groups and attributes. In addition to these advances in the file format, HDF5 includes an improved type system, and dataspace objects which represent selections over dataset regions. The API is also object-oriented with respect to datasets, groups, attributes, types, dataspaces and property lists. Because it uses B-trees to index table objects, HDF5 works well for Time series data such as stock price series, network monitoring data, and 3D meteorological data. The bulk of the data goes into straightforward arrays (the table objects) that can be accessed much more quickly than the rows of a SQL database, but B-Tree access is available for non-array data. The HDF5 data storage mechanism can be simpler and faster than an SQL Star schema."
It is available under a BSD-like license.
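To illustrate the hierarchy described in the quote, here is a minimal sketch in plain Python that mimics groups, datasets and attributes with dictionaries and resolves a POSIX-like path. It does not use the HDF5 library; the helper names `make_group`, `make_dataset` and `resolve`, and the sample contents, are hypothetical and exist only for this example.

```python
def make_group(attrs=None):
    """A group holds named children (groups or datasets) and attributes."""
    return {"kind": "group", "attrs": dict(attrs or {}), "children": {}}


def make_dataset(data, attrs=None):
    """A dataset holds array data of a homogeneous type, plus attributes."""
    return {"kind": "dataset", "attrs": dict(attrs or {}), "data": list(data)}


def resolve(root, path):
    """Follow a POSIX-like path such as /path/to/resource from the root group."""
    node = root
    for name in path.strip("/").split("/"):
        if not name:
            continue  # "/" alone refers to the root group
        if node["kind"] != "group":
            raise KeyError(f"{name!r} requested below a dataset")
        node = node["children"][name]
    return node


# Hypothetical file content: one group containing one dataset.
root = make_group()
root["children"]["experiment"] = make_group({"operator": "example"})
root["children"]["experiment"]["children"]["series"] = make_dataset([1.0, 2.5, 4.0])

print(resolve(root, "/experiment/series")["data"])
```

Attributes sit on both groups and datasets, which is what lets the protocol further down this page attach metadata to any stored object.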
Use cases
ImageIO
(From Proposals:HDF5_ImageIO)
- Chunking (streaming)
- Multi-Resolution
- Multi-Channel images
- Large datasets (size > 4 GB)
- Single experiment images of size 1024 x 1024 x 75 (XYZ), 2 channels, 1000 time-points
- 8bit and 16bit
- Images stored as 2D PNGs with filenames giving location
- Need to support optimized reading (image streaming) of a sub-volume
- E.g., box filtering using a kernel of size 5x5x1x1x3
- Cyclic buffer optimization in the ITK reader that keeps overlapping data and only reads new data
- Multi-resolution images for hierarchical registration of multiple experimental sets
- Compression is not as important in the short term but will be needed in the long term
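To put the "large datasets" use case in numbers, the following sketch computes the uncompressed size of the single-experiment example listed above (1024 x 1024 x 75, 2 channels, 1000 time-points) at 8 and 16 bits. It assumes one sample per channel per voxel and no compression; the function name `raw_size_bytes` is hypothetical.

```python
def raw_size_bytes(shape, channels, time_points, bits_per_sample):
    """Uncompressed size in bytes of a multi-channel time series."""
    samples = channels * time_points
    for extent in shape:
        samples *= extent
    return samples * bits_per_sample // 8


GIB = 1024 ** 3

for bits in (8, 16):
    size = raw_size_bytes((1024, 1024, 75), channels=2, time_points=1000,
                          bits_per_sample=bits)
    print(f"{bits}-bit: {size} bytes = {size / GIB:.1f} GiB")
```

Under these assumptions the experiment takes roughly 146.5 GiB at 8 bits and 293.0 GiB at 16 bits, well beyond the 4 GB mark, which is why streaming a sub-volume matters more than reading the whole image.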
TransformIO
Protocol
Typing
With HDF5, everything is either a group or a dataset.
The actual ITK type must be stored as a string attribute called "Type" and must correspond to the RTTI typeinfo.
Managing versions
Each object must store its version in an attribute called "Version". The version is a string corresponding to the ITK version.
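As a rough sketch of the two required attributes, the snippet below builds and checks an attribute dictionary in plain Python. The helper names are hypothetical, and the values "itk::Index<3>" and "4.0.0" are placeholders: the string reported by C++ RTTI (`std::type_info::name()`) is implementation-defined, so its exact spelling is not fixed by this page.

```python
def itk_attributes(type_name, itk_version):
    """Build the two attributes the protocol requires on every stored object."""
    if not isinstance(type_name, str) or not type_name:
        raise ValueError("Type must be a non-empty string")
    if not isinstance(itk_version, str) or not itk_version:
        raise ValueError("Version must be a non-empty string")
    return {"Type": type_name, "Version": itk_version}


def check_attributes(attrs):
    """Return the required attributes that are missing or not strings."""
    return [key for key in ("Type", "Version")
            if not isinstance(attrs.get(key), str)]


# Placeholder values: the real Type string is whatever RTTI reports for the class.
attrs = itk_attributes("itk::Index<3>", "4.0.0")
print(attrs)
print(check_attributes({"Type": "itk::Size<2>"}))  # "Version" is missing here
```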
Atomic objects
Atomic objects are unbreakable basic types. They are (generally?) stored as datasets in the HDF5 file.
Index
This is stored as a 1D dataset of Dimension elements.
Size
This is stored as a 1D dataset of Dimension elements.
Point
This is stored as a 1D dataset of Dimension elements.
Matrix
This is stored as a 2D dataset with Dimension elements along each axis.
Vector
This is stored as a 1D dataset of Dimension elements.
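The dataset shapes of the atomic objects can be summarized with a small sketch. It assumes that a Matrix has Dimension elements along each of its two axes, which this page does not state explicitly; `ATOMIC_RANK` and `dataset_shape` are hypothetical names.

```python
# Rank (number of dataset axes) of each atomic object described above.
ATOMIC_RANK = {"Index": 1, "Size": 1, "Point": 1, "Vector": 1, "Matrix": 2}


def dataset_shape(atomic_type, dimension):
    """Shape of the dataset storing one atomic object of the given dimension."""
    rank = ATOMIC_RANK[atomic_type]
    # Assumption: a Matrix has Dimension elements along each of its two axes.
    return (dimension,) * rank


for name in ATOMIC_RANK:
    print(name, dataset_shape(name, 3))
```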
Composite objects
Composite objects are stored as groups in the HDF5 file and are made of one or more atomic or composite objects. Each object is named in the same way it is named in the ITK classes, without the leading "m_".
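The naming rule can be sketched as a one-line helper. The function name `stored_name` is hypothetical; the member names are only sample inputs.

```python
def stored_name(member_name):
    """Name used in the file for an ITK class member, e.g. m_Index -> Index."""
    prefix = "m_"
    if member_name.startswith(prefix):
        return member_name[len(prefix):]
    return member_name


for member in ("m_Index", "m_Size", "m_Spacing"):
    print(member, "->", stored_name(member))
```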
ImageRegion
This is the storage of the class ImageRegion.
Member | Type |
---|---|
Index | Index |
Size | Size |
The dimension of the Size and the Index must match.
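A minimal sketch of this constraint, modelling the ImageRegion group as a plain dictionary with its two members. The helper `make_image_region` is hypothetical and does not touch an HDF5 file.

```python
def make_image_region(index, size):
    """Model an ImageRegion group with its two members, Index and Size."""
    index, size = list(index), list(size)
    if len(index) != len(size):
        raise ValueError(
            f"Index has {len(index)} elements but Size has {len(size)}"
        )
    return {"Index": index, "Size": size}


region = make_image_region((0, 0, 0), (1024, 1024, 75))
print(region)

try:
    make_image_region((0, 0), (1024, 1024, 75))
except ValueError as error:
    print("rejected:", error)
```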
Histogram
TODO
ImageBase
This is the storage of the class ImageBase.
Member | Type | Comment |
---|---|---|
Region | ImageRegion | This is the largest possible region, with its name shortened, because the different regions in itk::Image don't really make sense in file storage. |
Spacing | Vector | |
Origin | Point | |
Direction | Matrix | |
Image
This is the storage of the class Image. Image inherits the members of ImageBase and adds its own members.
Member | Type | Comment |
---|---|---|
Pixels | TODO | Which type should be used: a dataset directly, or an atomic type? |
This is not a strict requirement, but images should be saved in chunks so that they can be streamed efficiently (for both reading and writing) and compressed. The chunk size should be 1 along every dimension except x and y.
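The chunking recommendation can be sketched as follows. The page only fixes the chunk extent along the non-x, non-y dimensions; this example additionally assumes that a chunk spans the full image extent along x and y, and that sizes are given in ITK order (x first). A real HDF5 file may list its dimensions in the opposite order. `chunk_shape` and `chunk_count` are hypothetical names.

```python
def chunk_shape(image_size):
    """Chunk extent per dimension: full x and y (assumed), 1 everywhere else."""
    return tuple(extent if axis < 2 else 1
                 for axis, extent in enumerate(image_size))


def chunk_count(image_size):
    """Number of chunks needed to cover the whole image."""
    count = 1
    for extent, chunk in zip(image_size, chunk_shape(image_size)):
        count *= -(-extent // chunk)  # ceiling division
    return count


size = (1024, 1024, 75)
print(chunk_shape(size), chunk_count(size))
```

With one chunk per xy slice, reading a sub-volume that covers a few z positions only touches the chunks of those slices.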
LabelObjectLine
This is the storage of the class LabelObjectLine.
Member | Type | Comment |
---|---|---|
Index | Index | |
Length | unsigned integer | How do we describe this type? |
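To show what the Index and Length pair encodes, here is a small sketch that expands a line into the pixel indexes it covers. It assumes the line runs along the first dimension (x) starting at Index, as a run-length encoding of consecutive pixels; the function name `line_pixels` is hypothetical.

```python
def line_pixels(index, length):
    """Indexes covered by a line starting at `index` and running along x."""
    first, rest = index[0], tuple(index[1:])
    # Assumption: only the first coordinate varies along the line.
    return [(first + offset,) + rest for offset in range(length)]


print(line_pixels((4, 2, 0), 3))
```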
LabelObject
This is the storage of the class LabelObject.
Member | Type | Comment |
---|---|---|
Label | integer | How do we describe this type? |
Lines | TODO | Which type should be used: a group directly, or a composite type? |
LabelMap
This is the storage of the class LabelMap. LabelMap inherits the members of ImageBase and adds its own members.
Member | Type | Comment |
---|---|---|
LabelObjects | TODO | Which type should be used: a group directly, or a composite type? |
Base path
By default, the object of interest is stored in /ITK, so it can be either an atomic object (HDF5 dataset) or a composite object (HDF5 group). It is also possible to access objects through another path or a longer one. Some classes in ITK may not provide a way to change the path of the object of interest (for example HDF5ImageIO).
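A short sketch of how member paths could be derived from the base path, using only the Python standard library. The helper `object_path` and the alternative base path "/study/image0" are hypothetical; only the default /ITK comes from this page.

```python
import posixpath

DEFAULT_BASE_PATH = "/ITK"


def object_path(*members, base_path=DEFAULT_BASE_PATH):
    """Path of a (possibly nested) member below the object of interest."""
    return posixpath.join(base_path, *members)


print(object_path())                                   # the object itself
print(object_path("Region", "Index"))                  # nested member of an image
print(object_path("Origin", base_path="/study/image0"))  # hypothetical other path
```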