Proposals:Explicit Instantiation
This document describes the motivation for explicit template instantiation support in ITK and derives its implementation.
Overview
Most of ITK is implemented using class templates in order to give users tremendous flexibility in the types of data that can be processed. Here we investigate the mechanism used by the build system to transform these templates into executable code.
Consider an example ITK class template defined in itkFoo.h:
    // itkFoo.h
    namespace itk
    {
      template <class T>
      class Foo
      {
      public:
        void MethodA();
      };
    }
The template member MethodA is defined in itkFoo.txx:
    // itkFoo.txx
    namespace itk
    {
      template <class T>
      void Foo<T>::MethodA()
      {
      }
    }
When the user writes code such as
    #include "itkFoo.h"
    int main()
    {
      itk::Foo<int> foo;
      foo.MethodA();
      return 0;
    }
a reference to the symbol "void itk::Foo<int>::MethodA()" is created. The basic problem we face is how to create the template instantiation providing this symbol.
Implicit Template Instantiation
One way to provide the symbol is called "implicit template instantiation". If the user adds the line
#include "itkFoo.txx"
then the compiler will have a definition of the template member MethodA and create a copy of the symbol in the same object file that contains main. In this case the method implementation is implicitly instantiated by the compiler because the definition is available and the method is called.
Implicit instantiation has the advantage that user code may create any instantiation of the template Foo that it needs without worrying about where the symbol definitions will reside. The drawback is that every source file that references the method will create its own copy of the symbol. Code for MethodA will be compiled and stored in every object file that needs it, but the linker will choose only one copy and throw out the rest. This means that the work the compiler did in most of the object files was wasted.
Explicit Template Instantiation
An alternative way to provide the symbol "void itk::Foo<int>::MethodA()" is through "explicit template instantiation". Assume the user did not include itkFoo.txx. When the compiler sees a reference to MethodA it does not know how to create the symbol so it instead places in the object file an unresolved symbol reference. In order to provide the symbol to the linker we need to explicitly create a copy of it somewhere. We can do this by creating an explicit instantiation in a separate source file. Consider a source file called "itkFoo+int-.cxx" that contains the following code:
    #include "itkFoo.h"
    #include "itkFoo.txx"
    namespace itk
    {
      template class Foo<int>; // explicit template instantiation
    }
When the compiler builds this source file it will have the definitions of all the members of Foo<int> because itkFoo.txx was included. The explicit template instantiation syntax tells the compiler to instantiate a copy of every symbol in the template. Now when the user builds a program referencing "void itk::Foo<int>::MethodA()" the linker will find it in the object file providing the explicit instantiation.
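The arrangement described above can be sketched in a single file. This is only an illustration: in ITK the three parts would live in itkFoo.h, itkFoo.txx, and itkFoo+int-.cxx, and MethodA returns void. The int return value and the function UseFooInt are assumptions added here so the result can be observed.

```cpp
// --- itkFoo.h: declaration only ---
namespace itk
{
template <class T>
class Foo
{
public:
  int MethodA(); // assumption: returns int here; the original returns void
};
}

// --- itkFoo.txx: member definition ---
namespace itk
{
template <class T>
int Foo<T>::MethodA()
{
  return static_cast<int>(sizeof(T)); // hypothetical body, for illustration
}
}

// --- itkFoo+int-.cxx: explicit instantiation definition ---
namespace itk
{
template class Foo<int>; // emits every member of Foo<int> in this object file
}

// --- user code: would need only itkFoo.h plus the object file above ---
int UseFooInt() // hypothetical helper name
{
  itk::Foo<int> foo;
  return foo.MethodA();
}
```

In a real build the user's source file would see only the declaration, and the linker would take itk::Foo<int>::MethodA from the object file that holds the explicit instantiation.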
Extern Template Instantiation
Now that we have provided the instantiation itk::Foo<int> explicitly, users may use this instantiation by including only itkFoo.h. However, suppose the user now wants to use itk::Foo<float> as well:
    #include "itkFoo.h"
    int main()
    {
      itk::Foo<int> fooI;
      fooI.MethodA();
      itk::Foo<float> fooF;
      fooF.MethodA();
      return 0;
    }
The compiler will not see the definition of MethodA and will produce an unresolved symbol reference. The linker will now fail to resolve the symbol because it is not provided explicitly. A user might fix this by including itkFoo.txx to get the definition. The problem is now the compiler will see the definition for the template MethodA and instantiate both itk::Foo<int>::MethodA and itk::Foo<float>::MethodA! The linker will now resolve itk::Foo<float>::MethodA but there will be two copies of itk::Foo<int>::MethodA and one will be thrown out. Again, the compiler has done work that is wasted.
How can this problem be avoided? One solution is to not allow the user to include itkFoo.txx directly but instead require that the user's project manually create an itkFoo+float-.cxx file containing the instantiation for itk::Foo<float>. While this may work, it will be confusing for many users and require a lot of extra work. Another solution is to take advantage of a compiler-specific extension known as "extern template instantiation" (since standardized in C++11 as the explicit instantiation declaration). This extension is provided on at least the MSVC, GCC, and Intel compilers. Consider the following code added to the bottom of itkFoo.h:
    namespace itk
    {
      extern template class Foo<int>; // extern template instantiation
    }
This instructs the compiler to NOT instantiate any members of itk::Foo<int> even if the definition is available. Now, if the user includes itkFoo.txx and uses both itk::Foo<int> and itk::Foo<float> the compiler will create symbols for itk::Foo<float> but leave those for itk::Foo<int> undefined. The linker will then come along and find one copy of each, and no work will have been wasted.
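A minimal single-file sketch of this behavior follows. It assumes a compiler that accepts the extern template syntax (standard since C++11). The int return type and the helper UseBothInstantiations are assumptions for illustration, and the explicit instantiation at the bottom stands in for the separate itkFoo+int-.cxx object file.

```cpp
// --- itkFoo.h (declaration) and itkFoo.txx (definition), both visible ---
namespace itk
{
template <class T>
class Foo
{
public:
  int MethodA(); // assumption: returns int here; the original returns void
};

template <class T>
int Foo<T>::MethodA()
{
  return static_cast<int>(sizeof(T)); // hypothetical body, for illustration
}

// Bottom of itkFoo.h: do not instantiate members of Foo<int> implicitly.
extern template class Foo<int>;
}

// --- user code ---
int UseBothInstantiations() // hypothetical helper name
{
  itk::Foo<int> fooI;   // members come from the explicit instantiation
  itk::Foo<float> fooF; // members are implicitly instantiated here
  return fooI.MethodA() + fooF.MethodA();
}

// --- stands in for itkFoo+int-.cxx, normally a separate object file ---
namespace itk
{
template class Foo<int>;
}
```

When the explicit instantiation definition shares a translation unit with the extern declaration, as it does in this sketch, it has to come after the declaration.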
A similar extension is provided by the SGI MIPSpro compiler. The above extern instantiation may be written on this compiler as
    namespace itk
    {
    #pragma do_not_instantiate class Foo<int>
    }
which tells it not to instantiate this template.
DLL Symbol Resolution on Windows
When an explicit template instantiation is provided by a shared library we must ensure that the symbols are available for use outside the library. On UNIX systems this is automatic. On Windows systems we need to explicitly tell the compiler that the symbols provided by an explicit template instantiation are to be exported from the DLL. This can be achieved by adding a dllexport decoration to the explicit template instantiation line:
    #include "itkFoo.h"
    #include "itkFoo.txx"
    namespace itk
    {
      template class __declspec(dllexport) Foo<int>;
    }
Similarly, when using an explicit instantiation from another source file we must tell the compiler that we wish to import the symbols from a DLL. This can be achieved by adding a dllimport decoration to the extern template instantiation line:
namespace itk { extern template class __declspec(dllimport) Foo<int>; }
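Libraries usually hide the two decorations behind one macro so that the same header text works for both the library and its clients. The sketch below shows that common pattern. DEMO_EXPORT and DEMO_BUILDING_LIBRARY are hypothetical names, not ITK macros; the real ITKCommon_EXPORT is produced by ITK's build configuration and may be defined differently.

```cpp
// Hypothetical: defined only while compiling the library's own sources.
#define DEMO_BUILDING_LIBRARY

#if defined(_WIN32)
#  if defined(DEMO_BUILDING_LIBRARY)
#    define DEMO_EXPORT __declspec(dllexport) // library side
#  else
#    define DEMO_EXPORT __declspec(dllimport) // client side
#  endif
#else
#  define DEMO_EXPORT // nothing to decorate on UNIX-like systems
#endif

namespace itk
{
template <class T>
class Foo
{
public:
  int MethodA(); // assumption: returns int here; the original returns void
};

template <class T>
int Foo<T>::MethodA()
{
  return 7; // hypothetical body, for illustration
}

// Library side. In client code the same macro would decorate the
// "extern template" line and select dllimport instead.
template class DEMO_EXPORT Foo<int>;
}

int UseExportedFoo() // hypothetical helper name
{
  itk::Foo<int> foo;
  return foo.MethodA();
}
```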
Instantiation Syntax is not Scalable
There are three ways that a class template instantiation may appear in order to export and import it to/from a library. We must make sure each appears at the proper time, and that each uses the appropriate DLL import/export macro for the library providing the instantiation. For example, if itk::Foo<int> were provided by ITKCommon (perhaps in a source called itkFoo+int-.cxx), we would need the following layout.
    // Bottom of itkFoo+int-.cxx
    namespace itk
    {
      template class ITKCommon_EXPORT Foo<int>;
    }

    // Bottom of itkFoo.h
    namespace itk
    {
    #if (...compiler supports extern instantiation...)
      extern template class ITKCommon_EXPORT Foo<int>;
    #elif (...compiler supports do_not_instantiate...)
    # pragma do_not_instantiate class ITKCommon_EXPORT Foo<int>
    #endif
    }
To make matters worse, some class templates have function templates that also need to be instantiated. For example, instantiating itk::Vector<double, 3> might look like this:
    namespace itk
    {
      template class Vector<double, 3>;
      template std::ostream& operator<<(std::ostream&, const Vector<double, 3>&);
      template std::istream& operator>>(std::istream&, Vector<double, 3>&);
    }
Using the above layout, these three lines would have to be duplicated three times with slight variations. The resulting nine lines would have to be duplicated again for each instantiation provided. This duplication is tedious and error-prone, but can be avoided by using a macro to specify the declarations to instantiate for a given template.
Developing Instantiation Macros
We now incrementally develop the macro-based instantiation design to be used in ITK.
Basic Methodology
In order to specify declarations to instantiate only once we must be able to both export and import using the same declaration. Consider the following macro definitions:
    #define ITK_TEMPLATE_EXPORT(X) template X;
    #define ITK_TEMPLATE_IMPORT(X) extern template X;
We can now export itk::Foo<int> by writing
namespace itk { ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Foo<int>) }
and import it by writing
namespace itk { ITK_TEMPLATE_IMPORT(class ITKCommon_EXPORT Foo<int>) }
Note that the argument to the macro is the same in both cases.
Per-Template Instantiation Macro
Now we can provide a macro for the template Foo that specifies the argument common to both export and import lines:
    #define ITK_TEMPLATE_Foo(_, T) namespace itk { \
      _(class ITKCommon_EXPORT Foo< T >) }
Exporting and importing itk::Foo<int> can now be done by the following two lines respectively.
    // Bottom of itkFoo+int-.cxx
    ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, int)

    // Bottom of itkFoo.h
    ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, int)
These will be expanded to
namespace itk { ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Foo< int >) }
and
namespace itk { ITK_TEMPLATE_IMPORT(class ITKCommon_EXPORT Foo< int >) }
and then to
namespace itk { template class ITKCommon_EXPORT Foo< int >; }
and
namespace itk { extern template class ITKCommon_EXPORT Foo< int >; }
This design avoids duplicating the declaration "class Foo<int>" and all supporting declarations (such as in the Vector case).
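The macros developed so far can be tried out in one file, as in the sketch below. It assumes ITKCommon_EXPORT expands to nothing (as it would outside a Windows DLL build) and that the compiler accepts extern template. The int return type and UseMacroInstantiatedFoo are additions for illustration.

```cpp
#define ITKCommon_EXPORT // assumption: empty outside a Windows DLL build

namespace itk
{
template <class T>
class Foo
{
public:
  int MethodA(); // assumption: returns int here; the original returns void
};

template <class T>
int Foo<T>::MethodA()
{
  return static_cast<int>(sizeof(T)); // hypothetical body, for illustration
}
}

// Low-level macros: the same argument X serves both directions.
#define ITK_TEMPLATE_EXPORT(X) template X;
#define ITK_TEMPLATE_IMPORT(X) extern template X;

// Per-template macro: "_" is replaced by one of the two macros above.
#define ITK_TEMPLATE_Foo(_, T) namespace itk { \
  _(class ITKCommon_EXPORT Foo< T >) }

// As at the bottom of itkFoo.h:
ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, int)

// As at the bottom of itkFoo+int-.cxx (normally a separate source file):
ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, int)

int UseMacroInstantiatedFoo() // hypothetical helper name
{
  itk::Foo<int> foo;
  return foo.MethodA();
}
```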
Supporting Multiple Libraries
So far our design allows instantiations to be provided only by ITKCommon. Instead we should allow the export macro to be specified as an argument:
    #define ITK_TEMPLATE_Foo(_, EXPORT, T) namespace itk { \
      _(class EXPORT Foo< T >) }
This leads to
    // Bottom of itkFoo+int-.cxx
    ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, ITKCommon_EXPORT, int)

    // Bottom of itkFoo.h
    ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, ITKCommon_EXPORT, int)
which is more flexible.
Concise Syntax
We can shorten the above export/import lines by providing some helper macros:
    #define ITK_EXPORT_TEMPLATE(EXPORT, c, T) \
      ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, T)
    #define ITK_IMPORT_TEMPLATE(EXPORT, c, T) \
      ITK_TEMPLATE_##c(ITK_TEMPLATE_IMPORT, EXPORT, T)
which leads to
    // Bottom of itkFoo+int-.cxx
    ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, Foo, int)

    // Bottom of itkFoo.h
    ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, Foo, int)
We can shorten this further by providing library-specific import/export macros:
    #define ITK_EXPORT_ITKCommon(c, T) ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, T)
    #define ITK_IMPORT_ITKCommon(c, T) ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, c, T)
which leads to
    // Bottom of itkFoo+int-.cxx
    ITK_EXPORT_ITKCommon(Foo, int)

    // Bottom of itkFoo.h
    ITK_IMPORT_ITKCommon(Foo, int)
This is pretty short, and is relatively nice looking.
Multiple Template Arguments Part 1
Unfortunately the current design works only for templates with one argument! The compilers we wish to support do not support variable-length macro argument lists. Consider what happens when we try to pass multiple template arguments to the inner-most instantiation macro:
namespace itk { ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Vector<double, 3>) }
The C preprocessor does not understand C++ template arguments, and will treat this as a call to ITK_TEMPLATE_EXPORT with two arguments. The first argument will contain "class ITKCommon_EXPORT Vector<double" and the second argument will contain " 3>". Since ITK_TEMPLATE_EXPORT was defined with only one argument this will produce a preprocessing error.
We need a way to pass multiple template arguments through only one macro argument. One way to shield a comma-separated list from being expanded as multiple macro arguments is to put it inside a nested level of parentheses. For example, invoking
ITK_TEMPLATE_EXPORT((class ITKCommon_EXPORT Vector<double, 3>))
will pass just one argument and will not be a preprocessing error. Unfortunately the resulting expansion will be
template (class ITKCommon_EXPORT Vector<double, 3>);
which is not valid. In order to remove the extra level of parentheses after the argument is passed, we use some helper macros:
    #define ITK_TEMPLATE_1(x1) x1
    #define ITK_TEMPLATE_2(x1,x2) x1,x2
    #define ITK_TEMPLATE_3(x1,x2,x3) x1,x2,x3
Consider the definition
#define ITK_TEMPLATE_EXPORT(x) template ITK_TEMPLATE_##x;
Now we can use the same macro for both Foo and Vector instantiations:
namespace itk { ITK_TEMPLATE_EXPORT(1(class ITKCommon_EXPORT Foo<int>)) ITK_TEMPLATE_EXPORT(2(class ITKCommon_EXPORT Vector<double, 3>)) }
The preprocessor will expand the Foo line first to
template ITK_TEMPLATE_1(class ITKCommon_EXPORT Foo<int>);
which will be recursively expanded to
template class ITKCommon_EXPORT Foo<int>;
Then the preprocessor will expand the Vector line first to
template ITK_TEMPLATE_2(class ITKCommon_EXPORT Vector<double, 3>);
which will be recursively expanded to
template class ITKCommon_EXPORT Vector<double, 3>;
These are exactly the lines we need. The ITK_TEMPLATE_IMPORT macro can be modified similarly.
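The numbered-helper technique can be checked with a small compilable sketch. The class bodies, the empty ITKCommon_EXPORT, and the helper CountFromInstantiations are assumptions for illustration. The Vector here is a stand-in with the same template parameter shape as itk::Vector, not the real class.

```cpp
#define ITKCommon_EXPORT // assumption: empty outside a Windows DLL build

namespace itk
{
template <class T>
class Foo
{
public:
  int MethodA(); // hypothetical member
};
template <class T>
int Foo<T>::MethodA() { return 1; }

// Stand-in for itk::Vector with the same template parameter shape.
template <class T, unsigned int N>
class Vector
{
public:
  unsigned int Size() const; // hypothetical member
};
template <class T, unsigned int N>
unsigned int Vector<T, N>::Size() const { return N; }
}

// Helpers that strip one level of parentheses from a list of known length.
#define ITK_TEMPLATE_1(x1) x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2

// The leading count in the argument selects the helper by token pasting.
#define ITK_TEMPLATE_EXPORT(x) template ITK_TEMPLATE_##x;

namespace itk
{
ITK_TEMPLATE_EXPORT(1(class ITKCommon_EXPORT Foo<int>))
ITK_TEMPLATE_EXPORT(2(class ITKCommon_EXPORT Vector<double, 3>))
}

int CountFromInstantiations() // hypothetical helper name
{
  itk::Foo<int> foo;
  itk::Vector<double, 3> vec;
  return foo.MethodA() + static_cast<int>(vec.Size());
}
```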
Multiple Template Arguments Part 2
In order to use these new low-level export/import macros properly we need to change our definition of ITK_TEMPLATE_Foo. Consider the new definition
    #define ITK_TEMPLATE_Foo(_, EXPORT, T) namespace itk { \
      _(1(class EXPORT Foo< T >)) }
This will produce a proper invocation of ITK_TEMPLATE_EXPORT and ITK_TEMPLATE_IMPORT for any template argument of Foo. However, this method cannot be used for templates like Vector that take multiple arguments. Consider an attempt to define the instantiation macro for Vector:
    #define ITK_TEMPLATE_Vector(_, EXPORT, T, D) namespace itk { \
      _(2(class EXPORT Vector< T,D >)) \
      _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                          const Vector< T,D >&))) \
      _(1(EXPORT std::istream& operator>>(std::istream&, \
                                          Vector< T,D >&))) \
      }
Note that the second two lines have a comma only nested inside a second level of parentheses, so the length of the argument list is only one. This is why the first line is preceded by a "2" while the other lines have only "1".
This instantiation macro will work if invoked directly, but we will not be able to use the short-hand interface provided by ITK_EXPORT_ITKCommon and ITK_IMPORT_ITKCommon. If we try to invoke one of them for Vector we will have too many arguments:
ITK_EXPORT_ITKCommon(Vector, double, 3) // too many arguments
The preprocessor will again give an error. The solution is to use the numbered helper macros to pass the template argument list:
    #define ITK_TEMPLATE_Foo(_, EXPORT, x) namespace itk { \
      _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) }
    #define ITK_TEMPLATE_Vector(_, EXPORT, x) namespace itk { \
      _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
      _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                          const Vector< ITK_TEMPLATE_2 x >&))) \
      _(1(EXPORT std::istream& operator>>(std::istream&, \
                                          Vector< ITK_TEMPLATE_2 x >&))) \
      }
In this case the instantiation macro knows how many template arguments are required, so it can directly invoke the proper helper macro to expand the arguments passed through "x". The user need not pass the leading argument count.
Finally, we can use ITK_EXPORT_ITKCommon or its related macros to export templates with any number of arguments:
    // Bottom of itkFoo+int-.cxx
    ITK_EXPORT_ITKCommon(Foo, (int))

    // Bottom of itkVector+double.3-.cxx
    ITK_EXPORT_ITKCommon(Vector, (double, 3))
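The short-hand invocations can be exercised with the reduced sketch below, which instantiates only the class templates and leaves the streaming operators out. One detail is an assumption of this sketch: ITK_TEMPLATE_EXPORT is routed through the one-level delay macro that appears in the final design. With a standard-conforming preprocessor the argument of a `##` operand is not expanded before pasting, so without the delay the nested `ITK_TEMPLATE_2 (double, 3)` would reach the outer ITK_TEMPLATE_2 as a single argument. The class bodies, the empty ITKCommon_EXPORT, and UseShorthandInstantiations are also illustrative.

```cpp
#define ITKCommon_EXPORT // assumption: empty outside a Windows DLL build

namespace itk
{
template <class T>
class Foo
{
public:
  int MethodA(); // hypothetical member
};
template <class T>
int Foo<T>::MethodA() { return 1; }

// Stand-in for itk::Vector with the same template parameter shape.
template <class T, unsigned int N>
class Vector
{
public:
  unsigned int Size() const; // hypothetical member
};
template <class T, unsigned int N>
unsigned int Vector<T, N>::Size() const { return N; }
}

#define ITK_TEMPLATE_1(x1) x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2

// Delay form: "x" is fully expanded before it is pasted onto ITK_TEMPLATE_.
#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;

// Each per-template macro knows its own argument count.
#define ITK_TEMPLATE_Foo(_, EXPORT, x) namespace itk { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) }
#define ITK_TEMPLATE_Vector(_, EXPORT, x) namespace itk { \
  _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) }

#define ITK_EXPORT_TEMPLATE(EXPORT, c, x) \
  ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, x)
#define ITK_EXPORT_ITKCommon(c, x) \
  ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, x)

ITK_EXPORT_ITKCommon(Foo, (int))
ITK_EXPORT_ITKCommon(Vector, (double, 3))

int UseShorthandInstantiations() // hypothetical helper name
{
  itk::Foo<int> foo;
  itk::Vector<double, 3> vec;
  return foo.MethodA() + static_cast<int>(vec.Size());
}
```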
Naming Template Instantiations
The above macros work well as long as the template arguments are simple types. In ITK, however, there are many class templates that are instantiated using other template instantiations as arguments. In our example, we might want to instantiate the type
itk::Foo< itk::Vector<double, 3> >
In order to do this we should first import our instantiation of Vector
ITK_IMPORT_ITKCommon(Vector, (double, 3))
and then export the instantiation of Foo
ITK_EXPORT_ITKCommon(Foo, (itk::Vector<double, 3>))
Unfortunately this passes an argument list of length two to the instantiation macro for Foo, which will be evaluated as
ITK_TEMPLATE_1(itk::Vector<double, 3>)
and will give a preprocessing error. Instead we need to use a typedef to avoid having a comma in the name of the Vector type:
typedef itk::Vector<double, 3> VectorD3; ITK_EXPORT_ITKCommon(Foo, (VectorD3))
This will work, but it is tedious. Consider what it would take to import this instantiation.
ITK_IMPORT_ITKCommon(Vector, (double, 3)) typedef itk::Vector<double, 3> VectorD3; ITK_IMPORT_ITKCommon(Foo, (VectorD3))
This requires that the author of the code importing the instantiation knows exactly how the Vector instantiation appears inside ITK_IMPORT_ITKCommon. Instead we can arrange things to have the Vector instantiation define the typedef automatically.
In order to provide this typedef automatically we need a way to produce a name for it. This can be done by adding another argument to the instantiation macros that provides a single preprocessing token corresponding to the template argument list which may be used to construct typedef names. For example, we might write
ITK_IMPORT_ITKCommon(Foo, (int), I) ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
where the "I" corresponds to the argument list "(int)" and the "D3" corresponds to the argument list "(double, 3)". These tokens may then be used by the instantiation macros to produce a unique name for each instantiation, such as "FooI" and "VectorD3". In order to avoid conflict with other names in the itk namespace these names can be placed in a sub-namespace called "itk::Templates". Importing the above instantiation of Foo is now simpler:
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3) ITK_IMPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)
We are now ready to define instantiation macros to achieve this syntax. The final version of the macros is given below.
Import Macro on SGI MIPSpro
There is one last detail required to support the do_not_instantiate pragma on the SGI MIPSpro compiler. We must find a way to define ITK_TEMPLATE_IMPORT(X) to produce the code
#pragma do_not_instantiate X
A macro expansion cannot contain a preprocessor directive, but fortunately the MIPSpro compiler provides a way around this: the _Pragma operator, which is part of C99 and was later adopted by C++11. It will transform the expression
_Pragma("some string literal")
into
#pragma some string literal
We can take advantage of this to define ITK_TEMPLATE_IMPORT as follows.
    #define ITK_TEMPLATE_IMPORT(x) ITK_TEMPLATE_IMPORT_DELAY(x)
    #define ITK_TEMPLATE_IMPORT_DELAY(x) \
      ITK_TEMPLATE_IMPORT_IMPL(do_not_instantiate ITK_TEMPLATE_##x)
    #define ITK_TEMPLATE_IMPORT_IMPL(x) _Pragma(#x)
This will transform the code
ITK_TEMPLATE_IMPORT(2(class Vector<double, 3>))
into
_Pragma("do_not_instantiate ITK_TEMPLATE_2(class Vector<double, 3>)")
and will be interpreted by the compiler as
#pragma do_not_instantiate class Vector<double, 3>
which is exactly what we need.
Final Instantiation Macros
We now give the final instantiation macro design used in ITK. These are defined and documented in the header itkMacro.h, except for the per-template instantiation macros.
The low-level instantiation macros are
    #define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
    #define ITK_TEMPLATE_IMPORT(x) ITK_TEMPLATE_IMPORT_DELAY(x)
    #define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;
    #define ITK_TEMPLATE_IMPORT_DELAY(x) extern template ITK_TEMPLATE_##x;
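One level of substitution delay is provided so that the argument "x" is fully macro-expanded before it is concatenated with "ITK_TEMPLATE_". This allows "x" to contain a macro computing the number of arguments in the paren-enclosed expression passed through the argument. It also expands a nested invocation such as "ITK_TEMPLATE_2 x" first, so that the comma it produces is visible when the outer ITK_TEMPLATE_2 collects its arguments.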
These macros are passed by name to per-template instantiation macros which in turn invoke them to produce specific instantiations. Here we give the final example per-template instantiation macros for Foo and Vector based on the above derivation.
    #define ITK_TEMPLATE_Foo(_, EXPORT, x, y) namespace itk { \
      _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
      namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; }\
      }
    #define ITK_TEMPLATE_Vector(_, EXPORT, x, y) namespace itk { \
      _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
      _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                          const Vector< ITK_TEMPLATE_2 x >&))) \
      _(1(EXPORT std::istream& operator>>(std::istream&, \
                                          Vector< ITK_TEMPLATE_2 x >&))) \
      namespace Templates { typedef Vector< ITK_TEMPLATE_2 x > Vector##y; } \
      }
These macros are typically defined in the header file of their corresponding class template just after the class template is defined. They should contain at least one invocation of the "_" macro passed as an argument plus one typedef in the Templates sub-namespace providing a name constructed from the class template name and the preprocessor token passed in the "y" argument.
The per-template instantiation macros are invoked by ITK_EXPORT_TEMPLATE and ITK_IMPORT_TEMPLATE, passing ITK_TEMPLATE_EXPORT and ITK_TEMPLATE_IMPORT respectively:
    #define ITK_EXPORT_TEMPLATE(EXPORT, c, x, y) \
      ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, x, y)
    #define ITK_IMPORT_TEMPLATE(EXPORT, c, x, y) \
      ITK_TEMPLATE_##c(ITK_TEMPLATE_IMPORT, EXPORT, x, y)
For these macros the "c" argument provides the class template name whose per-template instantiation macro is to be invoked. Finally, the library-specific export/import macros are
    #define ITK_EXPORT_ITKCommon(c, x, y) \
      ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)
    #define ITK_IMPORT_ITKCommon(c, x, y) \
      ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)
and provide a short-hand way to export and import instantiations from each library.
These macros can be used to export some example instantiations:
    // itkFoo+int-.cxx
    ITK_EXPORT_ITKCommon(Foo, (int), I)

    // itkVector+double.3-.cxx
    ITK_EXPORT_ITKCommon(Vector, (double, 3), D3)

    // itkFoo+itkVector+double.3--.cxx
    ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
    ITK_EXPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)
and also to import them
ITK_IMPORT_ITKCommon(Foo, (int), I) ITK_IMPORT_ITKCommon(Vector, (double, 3), D3) ITK_IMPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)
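The export side of the final macros can be assembled into one compilable sketch, shown below under several assumptions: ITKCommon_EXPORT is empty, Foo and Vector are small stand-ins rather than the real ITK classes, only operator<< is provided for Vector, and the helpers PrintVectorD3 and SizeSeenByFooVD3 exist purely to observe the result.

```cpp
#include <ostream>
#include <sstream>
#include <string>

#define ITKCommon_EXPORT // assumption: empty outside a Windows DLL build

namespace itk
{
template <class T>
class Foo
{
public:
  int MethodA(); // hypothetical member
};
template <class T>
int Foo<T>::MethodA() { return static_cast<int>(sizeof(T)); }

// Stand-in for itk::Vector: a plain aggregate with N elements.
template <class T, unsigned int N>
class Vector
{
public:
  T m_Data[N];
};

template <class T, unsigned int N>
std::ostream& operator<<(std::ostream& os, const Vector<T, N>& v)
{
  for (unsigned int i = 0; i < N; ++i) { os << v.m_Data[i] << ' '; }
  return os;
}
}

#define ITK_TEMPLATE_1(x1) x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2
#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;

#define ITK_TEMPLATE_Foo(_, EXPORT, x, y) namespace itk { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; } \
  }
#define ITK_TEMPLATE_Vector(_, EXPORT, x, y) namespace itk { \
  _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
  _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                      const Vector< ITK_TEMPLATE_2 x >&))) \
  namespace Templates { typedef Vector< ITK_TEMPLATE_2 x > Vector##y; } \
  }

#define ITK_EXPORT_TEMPLATE(EXPORT, c, x, y) \
  ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, x, y)
#define ITK_EXPORT_ITKCommon(c, x, y) \
  ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)

// The Vector line also defines itk::Templates::VectorD3, which the Foo
// line then uses as a comma-free template argument.
ITK_EXPORT_ITKCommon(Vector, (double, 3), D3)
ITK_EXPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)

std::string PrintVectorD3() // hypothetical helper name
{
  itk::Templates::VectorD3 v = {{1.0, 2.0, 3.0}};
  std::ostringstream os;
  os << v;
  return os.str();
}

int SizeSeenByFooVD3() // hypothetical helper name
{
  itk::Templates::FooVD3 foo;
  return foo.MethodA();
}
```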
Writing Instantiation Macros
All knowledge about how to instantiate a class template and its supporting function templates is encoded in its instantiation macro.
Macro Signature
An instantiation macro for a template is a function-style C preprocessor macro. It is defined with the signature
ITK_TEMPLATE_<name>(_, EXPORT, x, y)
where "<name>" is the name of the template. The arguments are as follows.
Argument | Meaning |
---|---|
"_" | A placeholder for a low-level instantiation export/import macro. It will be replaced by either ITK_TEMPLATE_EXPORT or ITK_TEMPLATE_IMPORT, which are each macros accepting a single argument. |
"EXPORT" | A placeholder for a Windows DLL-export/import macro such as ITKCommon_EXPORT. |
"x" | A placeholder for a paren-enclosed comma-separated list of template arguments. The length of the list is the number of template arguments for the template corresponding to the instantiation macro. For example, an instantiation macro for "itk::Vector<double, 3>" would be given the argument "(double, 3)". |
"y" | A placeholder for a C preprocessor token (identifier) that corresponds to the argument list given to "x". It may be appended to the name of the template to construct a name for the instantiation. For example, if "x" is given "(double, 3)", then "y" might be given "D3". |
Developing Example ITK_TEMPLATE_Foo
Consider the definition of ITK_TEMPLATE_Foo from the section above.
    #define ITK_TEMPLATE_Foo(_, EXPORT, x, y) namespace itk { \
      _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
      namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; }\
      }
At first glance this looks intimidating, but let's investigate how it was constructed. The goal is to produce an instantiation of Foo and a typedef naming the instantiation. If written out manually the code to instantiate "itk::Foo<int>" and name it "FooI" would be
namespace itk { template class ITKCommon_EXPORT Foo<int>; namespace Templates { typedef Foo<int> FooI; } }
It is easy to produce the namespace layout inside our macro:
    #define ITK_TEMPLATE_Foo(_, EXPORT, x, y) \
      namespace itk \
      { \
      [some explicit instantiation] \
      namespace Templates { [some typedef] }\
      }
Now we need to create the explicit instantiation and typedef portion. The macro ITK_TEMPLATE_Foo will be invoked as
ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, ITKCommon_EXPORT, (int), I)
which gives us the arguments
- _ = ITK_TEMPLATE_EXPORT
- EXPORT = ITKCommon_EXPORT
- x = (int)
- y = I
Both the explicit instantiation and typedef portion of the macro require that our template argument list "(int)" be expanded to "int" with no outer-level of parentheses. Since we know that Foo has only one template argument the list inside the parentheses will always be of length 1. We can use the macro ITK_TEMPLATE_1 to do the expansion. The macro invocation
ITK_TEMPLATE_1 (int)
will expand to just
int
Since "x" contains the "(int)" part we can write the macro invocation as
ITK_TEMPLATE_1 x
In order to produce the name "FooI" as a single C identifier we must concatenate the token "Foo" with the token "I" given to argument "y". This can be done using the C preprocessor token concatenation syntax and is written "Foo##y". Now we can write the typedef portion as
typedef Foo< ITK_TEMPLATE_1 x > Foo##y;
For the explicit instantiation portion, we can use the macro ITK_TEMPLATE_EXPORT. The macro invocation
ITK_TEMPLATE_EXPORT(1([some instantiation]))
will be expanded to
template ITK_TEMPLATE_1([some instantiation]);
and further to
template [some instantiation];
We need to produce the text "class ITKCommon_EXPORT Foo<int>" from our macro arguments. We can trivially produce "ITKCommon_EXPORT" using the argument "EXPORT". The "class" keyword and template name "Foo<>" are always the same no matter the arguments. We have already seen how to produce the "int" part from the "x" argument "(int)". The text can be produced by
class EXPORT Foo< ITK_TEMPLATE_1 x >
We substitute this text into the above invocation of ITK_TEMPLATE_EXPORT.
ITK_TEMPLATE_EXPORT(1(class EXPORT Foo< ITK_TEMPLATE_1 x >))
Finally, since the macro ITK_TEMPLATE_EXPORT is given to us by the argument "_" we can substitute the macro argument into the invocation.
_(1(class EXPORT Foo< ITK_TEMPLATE_1 x >))
Putting all the pieces together yields the final instantiation macro.
    #define ITK_TEMPLATE_Foo(_, EXPORT, x, y) \
      namespace itk \
      { \
      _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
      namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; }\
      }
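One way to check a derivation like this is to have the preprocessor stringize the fully expanded macro and look at the text. The sketch below does that. DEMO_STRINGIZE, DEMO_EXPAND_TO_STRING, StripSpaces, and ExpandedFooMacro are hypothetical helpers, the variadic macros assume C++11 or later, and ITKCommon_EXPORT is deliberately left undefined so that it shows up in the output.

```cpp
#include <string>

#define ITK_TEMPLATE_1(x1) x1
#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;

#define ITK_TEMPLATE_Foo(_, EXPORT, x, y) \
  namespace itk \
  { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; } \
  }

// Hypothetical inspection helpers: expand the argument, then stringize it.
#define DEMO_STRINGIZE(...) #__VA_ARGS__
#define DEMO_EXPAND_TO_STRING(...) DEMO_STRINGIZE(__VA_ARGS__)

// Returns the text that the instantiation macro expands to.
std::string ExpandedFooMacro()
{
  return DEMO_EXPAND_TO_STRING(
    ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, ITKCommon_EXPORT, (int), I));
}

// Removes spaces so the comparison does not depend on token spacing.
std::string StripSpaces(const std::string& s)
{
  std::string out;
  for (char c : s)
  {
    if (c != ' ') { out += c; }
  }
  return out;
}
```

Apart from spacing, the returned text should match the hand-written code from the start of this section: the explicit instantiation of Foo<int> followed by the FooI typedef in the Templates namespace.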
Developing Example ITK_TEMPLATE_Vector
Consider the definition of ITK_TEMPLATE_Vector from the section above.
    #define ITK_TEMPLATE_Vector(_, EXPORT, x, y) namespace itk { \
      _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
      _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                          const Vector< ITK_TEMPLATE_2 x >&))) \
      _(1(EXPORT std::istream& operator>>(std::istream&, \
                                          Vector< ITK_TEMPLATE_2 x >&))) \
      namespace Templates { typedef Vector< ITK_TEMPLATE_2 x > Vector##y; } \
      }
At first glance this looks intimidating, but let's investigate how it was constructed. The goal is to produce an instantiation of Vector, instantiations of its streaming operators, and a typedef naming the instantiation. If written out manually the code to instantiate "itk::Vector<double, 3>" and name it "VectorD3" would be
    namespace itk
    {
      template class ITKCommon_EXPORT Vector<double, 3>;
      template ITKCommon_EXPORT std::ostream& operator<<(std::ostream&, const Vector<double, 3>&);
      template ITKCommon_EXPORT std::istream& operator>>(std::istream&, Vector<double, 3>&);
      namespace Templates { typedef Vector<double, 3> VectorD3; }
    }
It is easy to produce the namespace layout inside our macro:
    #define ITK_TEMPLATE_Vector(_, EXPORT, x, y) \
      namespace itk \
      { \
      [some explicit instantiations] \
      namespace Templates { [some typedef] }\
      }
Now we need to create the explicit instantiations and typedef portion. The macro ITK_TEMPLATE_Vector will be invoked as
ITK_TEMPLATE_Vector(ITK_TEMPLATE_EXPORT, ITKCommon_EXPORT, (double, 3), D3)
which gives us the arguments
- _ = ITK_TEMPLATE_EXPORT
- EXPORT = ITKCommon_EXPORT
- x = (double, 3)
- y = D3
Both the explicit instantiations and typedef portion of the macro require that our template argument list "(double, 3)" be expanded to "double, 3" with no outer-level of parentheses. Since we know that Vector has two template arguments the list inside the parentheses will always be of length 2. We can use the macro ITK_TEMPLATE_2 to do the expansion. The macro invocation
ITK_TEMPLATE_2 (double, 3)
will expand to just
double, 3
Since "x" contains the "(double, 3)" part we can write the macro invocation as
ITK_TEMPLATE_2 x
In order to produce the name "VectorD3" as a single C identifier we must concatenate the token "Vector" with the token "D3" given to argument "y". This can be done using the C preprocessor token concatenation syntax and is written "Vector##y". Now we can write the typedef portion as
typedef Vector< ITK_TEMPLATE_2 x > Vector##y;
For the explicit instantiations portion, we can use the macro ITK_TEMPLATE_EXPORT. The macro invocation
ITK_TEMPLATE_EXPORT(2([some instantiation with one comma]))
will be expanded to
template ITK_TEMPLATE_2([some instantiation with one comma]);
and further to
template [some instantiation with one comma];
We need to produce the text "class ITKCommon_EXPORT Vector<double, 3>" from our macro arguments. We can trivially produce "ITKCommon_EXPORT" using the argument "EXPORT". The "class" keyword and template name "Vector<>" are always the same no matter the arguments. We have already seen how to produce the "double, 3" part from the "x" argument "(double, 3)". The text can be produced by
class EXPORT Vector< ITK_TEMPLATE_2 x >
We substitute this text into the above invocation of ITK_TEMPLATE_EXPORT.
ITK_TEMPLATE_EXPORT(2(class EXPORT Vector< ITK_TEMPLATE_2 x >))
Finally, since the macro ITK_TEMPLATE_EXPORT is given to us by the argument "_" we can substitute the macro argument into the invocation.
_(2(class EXPORT Vector< ITK_TEMPLATE_2 x >))
This takes care of the class template instantiation. Next we need to construct the function template instantiations for the streaming operators. For these, every comma in the declaration is protected by a nested layer of parentheses. Therefore we can use the form "_(1(...))" to specify the instantiations:
_(1(EXPORT std::ostream& operator<<(std::ostream&, const Vector< ITK_TEMPLATE_2 x >&))) _(1(EXPORT std::istream& operator>>(std::istream&, Vector< ITK_TEMPLATE_2 x >&)))
Putting all the pieces together yields the final instantiation macro.
    #define ITK_TEMPLATE_Vector(_, EXPORT, x, y) \
      namespace itk \
      { \
      _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
      _(1(EXPORT std::ostream& operator<<(std::ostream&, const Vector< ITK_TEMPLATE_2 x >&))) \
      _(1(EXPORT std::istream& operator>>(std::istream&, Vector< ITK_TEMPLATE_2 x >&))) \
      namespace Templates { typedef Vector< ITK_TEMPLATE_2 x > Vector##y; } \
      }
Maintaining the Macros
If a new supporting function or class template is added to a template's implementation it may also be added to the instantiation macro and all existing exported and imported instantiations of the template will automatically include the new code. For example, if we add the template function
template <class T> void FooHelper(Foo<T>);
to itkFoo.h we can add it to the instantiation macro
    #define ITK_TEMPLATE_Foo(_, EXPORT, x, y) \
      namespace itk \
      { \
      _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
      _(1(EXPORT void FooHelper(Foo< ITK_TEMPLATE_1 x >))) \
      namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; }\
      }
Now every place that exports an instantiation of Foo with a line such as
ITK_EXPORT_ITKCommon(Foo, (int), I)
will automatically include the new helper function in the instantiation.
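The sketch below shows the effect in a single file. It assumes an empty ITKCommon_EXPORT and gives FooHelper an int return value (the text above declares it void) so that the call can be checked. The bodies of MethodA and FooHelper and the function UseFooHelper are hypothetical.

```cpp
#define ITKCommon_EXPORT // assumption: empty outside a Windows DLL build

namespace itk
{
template <class T>
class Foo
{
public:
  int MethodA(); // hypothetical member
};
template <class T>
int Foo<T>::MethodA() { return static_cast<int>(sizeof(T)); }

// The newly added supporting function template.
// Assumption: returns int here so the call can be observed.
template <class T>
int FooHelper(Foo<T> foo) { return 2 * foo.MethodA(); }
}

#define ITK_TEMPLATE_1(x1) x1
#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;

// Only this macro changes when FooHelper is added.
#define ITK_TEMPLATE_Foo(_, EXPORT, x, y) \
  namespace itk \
  { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  _(1(EXPORT int FooHelper(Foo< ITK_TEMPLATE_1 x >))) \
  namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; } \
  }

#define ITK_EXPORT_TEMPLATE(EXPORT, c, x, y) \
  ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, x, y)
#define ITK_EXPORT_ITKCommon(c, x, y) \
  ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)

// The existing export line is unchanged, yet now instantiates FooHelper too.
ITK_EXPORT_ITKCommon(Foo, (int), I)

int UseFooHelper() // hypothetical helper name
{
  itk::Templates::FooI foo;
  return itk::FooHelper(foo);
}
```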
Organizing Source Files
The instantiation macros defined above provide a concise syntax to export and import template instantiations for any template in ITK. We need to carefully organize our source files to make sure the macros are invoked at the proper time.
One Instantiation per Object
Our design must take into account the way linkers work with object files. The granularity of storing symbols in a library or executable is at the object file level. If any symbol is needed from an object, the whole object file is included, which may then reference other symbols that bring in additional object files. This means that each object file should have as few symbols as possible. For explicit instantiation this means we should have at most one template instantiation per object file.
Since the purpose of an explicit instantiation source file is to provide symbols for that specific instantiation, we should avoid letting the compiler create any more symbols than necessary. If the template being instantiated explicitly refers to other templates in its implementation, the compiler will try to implicitly instantiate those other templates. We need to arrange our code so that the compiler never sees template definitions other than those for the template being instantiated, so that no extra code is compiled. We can do this by making sure that only the .txx file for the desired template is included. This means that all .h files must provide a way to block inclusion of their corresponding .txx file. Previously in ITK all header files guarded the inclusion of their corresponding .txx file with a test of ITK_MANUAL_INSTANTIATION:
// Bottom of itkArray.h
#ifndef ITK_MANUAL_INSTANTIATION
#include "itkArray.txx"
#endif
When converting a template to be available for explicit instantiation we will change the test to
// Bottom of itkFoo.h
#if ITK_TEMPLATE_TXX
# include "itkFoo.txx"
#endif
The macro ITK_TEMPLATE_TXX is defined by itkMacro.h as follows.
#if defined(ITK_MANUAL_INSTANTIATION)
# define ITK_TEMPLATE_TXX 0
#else
# define ITK_TEMPLATE_TXX !(ITK_TEMPLATE_CXX || ITK_TEMPLATE_TYPE)
#endif
This definition allows some flexibility while still letting users block the .txx files completely by defining ITK_MANUAL_INSTANTIATION. Specifically, it allows explicit instantiation sources to identify themselves by writing
#define ITK_TEMPLATE_CXX 1
at the top. The header itkMacro.h also uses this definition to block inclusion of .txx files from headers that have not been converted to be available for explicit instantiation:
#if ITK_TEMPLATE_CXX
# undef ITK_MANUAL_INSTANTIATION
# define ITK_MANUAL_INSTANTIATION
#endif
This means that any outside template instantiations needed by a specific explicit instantiation will also have to be provided by an explicit instantiation. No duplicate implicit instantiations across multiple explicit instantiation source files will be possible.
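How the definition of ITK_TEMPLATE_TXX reacts to ITK_TEMPLATE_CXX can be checked in isolation. The sketch below copies only the ITK_TEMPLATE_TXX fragment quoted above and evaluates it twice in one file, which is not how itkMacro.h is used in practice; the two constants that record the outcome are hypothetical names introduced for the demonstration. It relies on two preprocessor rules: an undefined identifier evaluates to 0 inside #if, and a macro body is expanded when it is used, not when it is defined.

```cpp
#include <cassert>

// Fragment quoted from the text.
#if defined(ITK_MANUAL_INSTANTIATION)
# define ITK_TEMPLATE_TXX 0
#else
# define ITK_TEMPLATE_TXX !(ITK_TEMPLATE_CXX || ITK_TEMPLATE_TYPE)
#endif

// Case 1: ordinary user code defines nothing, so the test is !(0 || 0).
#if ITK_TEMPLATE_TXX
const bool demoUserCodeGetsTxx = true;
#else
const bool demoUserCodeGetsTxx = false;
#endif

// Case 2: an explicit instantiation source identifies itself,
// so the same test becomes !(1 || 0).
#define ITK_TEMPLATE_CXX 1
#if ITK_TEMPLATE_TXX
const bool demoInstantiationSourceGetsTxx = true;
#else
const bool demoInstantiationSourceGetsTxx = false;
#endif

// Small usage check.
inline void DemoTemplateTxx()
{
  assert(demoUserCodeGetsTxx);             // headers pull in their .txx
  assert(!demoInstantiationSourceGetsTxx); // headers leave the .txx out
}
```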
Now that no .txx files are included by .h files inside an explicit instantiation source we need to make sure the .txx file providing the template definitions for that specific instantiation is available. This is achieved simply by including the one needed .txx file directly:
#include "itkFoo.txx"
This will work because the .txx file also includes its own header:
// itkFoo.txx
#ifndef itkFoo_txx
#define itkFoo_txx
#include "itkFoo.h"
namespace itk
{
template <class T> void Foo<T>::MethodA() {}
}
#endif
An instantiation source file for Foo will get the template definitions for Foo but no other template.
Instantiations as Classes
Each instantiation is effectively its own C++ class just as if it had been written as a non-template. This leads to the idea of having one header and one source file per instantiation just as we have for non-template classes. Since we already have names for the instantiations consisting of a single C identifier we can use that to construct a file name. For example, we could instantiate "itk::Foo<int>" with the name "FooI" by creating a header file
// Templates/itkFooI.h
#ifndef ITK_TEMPLATE_H_FooI
#define ITK_TEMPLATE_H_FooI
#include "itkFoo.h"
ITK_IMPORT_ITKCommon(Foo, (int), I)
#endif
and a source file
// Templates/itkFooI.cxx
#define ITK_TEMPLATE_CXX 1
#include "itkFoo.txx"
ITK_EXPORT_ITKCommon(Foo, (int), I)
The template is exported from the ITKCommon library by including Templates/itkFooI.cxx in the library source file list. Other code can import the template simply by including the header
#include "Templates/itkFooI.h"
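The division of labor between the import header and the export source can be imitated in a single file. This is only a sketch under stated assumptions: it uses the C++11 `extern template` syntax in place of the ITK_IMPORT_ and ITK_EXPORT_ macros, whose actual expansions are not reproduced here, and it places everything in a hypothetical namespace demo with an invented body for MethodA.

```cpp
#include <cassert>

namespace demo
{
// --- would live in itkFoo.h ---
template <class T>
class Foo
{
public:
  T MethodA(T v);
};

// --- would live in Templates/itkFooI.h (import) ---
// Tells the compiler not to instantiate Foo<int> implicitly here.
extern template class Foo<int>;
namespace Templates { typedef Foo<int> FooI; }

// --- would live in itkFoo.txx ---
template <class T>
T Foo<T>::MethodA(T v)
{
  return v + v;
}

// --- would live in Templates/itkFooI.cxx (export) ---
// The one place where the symbols for Foo<int> are generated.
template class Foo<int>;

// Small usage check: client code only needs the import part.
inline void DemoInstantiationAsClass()
{
  Templates::FooI foo;
  assert(foo.MethodA(21) == 42);
}
} // namespace demo
```

In a real build the last two sections would be compiled once into the library, while client code would see only the first two.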
Importing Standard Instantiations
When instantiations are provided natively by ITK we should automatically import them whenever the user may use them. For example, when the user includes "itkFoo.h" it is likely that some instantiation of Foo will be used. We can automatically import all instantiations provided by ITK by including the import headers:
// Bottom of itkFoo.h:
#include "Templates/itkFooI.h" // import itk::Foo<int> as Templates::FooI
#include "Templates/itkFooD.h" // import itk::Foo<double> as Templates::FooD
Unfortunately this means that every time a new instantiation is added we need to edit itkFoo.h. Instead we could use an intermediate header:
// Bottom of itkFoo.h:
#if ITK_TEMPLATE_EXPLICIT
# include "Templates/itkFoo+-.h"
#endif

// Templates/itkFoo+-.h
#include "Templates/itkFooI.h"
#include "Templates/itkFooD.h"
The macro ITK_TEMPLATE_EXPLICIT is defined in itkMacro.h as follows.
#if ITK_TEMPLATE_IMPORT_WORKS && defined(ITK_EXPLICIT_INSTANTIATION)
# define ITK_TEMPLATE_EXPLICIT !ITK_TEMPLATE_CXX
#else
# define ITK_TEMPLATE_EXPLICIT 0
#endif
where ITK_TEMPLATE_IMPORT_WORKS is true only on platforms supporting a template import syntax, and ITK_EXPLICIT_INSTANTIATION is defined when explicit instantiation support is enabled for the current build tree.
The intermediate header avoids the need to edit itkFoo.h for every instantiation and prevents inclusion of useless headers when explicit instantiation is not used. Such an intermediate import header may also be automatically generated.
Generating Instantiations
Instantiation headers and sources follow a specific pattern. It is important to maintain this pattern and use consistent instantiation names. Most instantiations for ITK could be generated automatically from a few tables specifying type and dimension arguments.
Much of the problem of defining and naming instantiations has already been solved by the ITK language wrapping process, and further by the WrapITK project.
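To make the pattern concrete, here is a rough sketch of a generator for the two files of one instantiation. It is not the WrapITK mechanism: the Instantiation struct, the function names, and the idea of a table row with four string fields are assumptions made for illustration. Only the generated text follows the itkFooI.h and itkFooI.cxx examples shown earlier.

```cpp
#include <cassert>
#include <string>

namespace demo
{
// One row of a hypothetical instantiation table.
struct Instantiation
{
  std::string library;   // e.g. "ITKCommon"
  std::string name;      // e.g. "Foo"
  std::string arguments; // e.g. "int"
  std::string suffix;    // e.g. "I"
};

// The macro name tail and arguments shared by the import and export lines.
inline std::string MacroArguments(const Instantiation& i)
{
  return i.library + "(" + i.name + ", (" + i.arguments + "), " + i.suffix + ")";
}

// Text of Templates/itk<name><suffix>.h
inline std::string ImportHeader(const Instantiation& i)
{
  const std::string guard = "ITK_TEMPLATE_H_" + i.name + i.suffix;
  return "#ifndef " + guard + "\n"
         "#define " + guard + "\n"
         "#include \"itk" + i.name + ".h\"\n"
         "ITK_IMPORT_" + MacroArguments(i) + "\n"
         "#endif\n";
}

// Text of Templates/itk<name><suffix>.cxx
inline std::string ExportSource(const Instantiation& i)
{
  return "#define ITK_TEMPLATE_CXX 1\n"
         "#include \"itk" + i.name + ".txx\"\n"
         "ITK_EXPORT_" + MacroArguments(i) + "\n";
}

// Small usage check for the Foo<int> row.
inline void DemoGenerate()
{
  const Instantiation row = { "ITKCommon", "Foo", "int", "I" };
  assert(ExportSource(row) ==
         "#define ITK_TEMPLATE_CXX 1\n"
         "#include \"itkFoo.txx\"\n"
         "ITK_EXPORT_ITKCommon(Foo, (int), I)\n");
}
} // namespace demo
```

Because both files are derived from the same row, the instantiation name and the macro arguments cannot drift apart between header and source.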
...utilize this work...integrate instantiation and wrapping...
TODO...
An alternative method: Speeding up builds using precompiled headers.
The method proposed above is undoubtedly the right way to do this. However, it will not be available for another few months. In the interim, it is possible to speed up builds using the ability of certain compilers to precompile headers. Even when object files are built with explicit instantiation, a precompiled header is likely to yield some small build speed gains. In the particular case that I have tested, which uses implicit instantiation of templated classes, the speed gain is considerable.
The concept is simple. A header file can be parsed once by the compiler and the result stored. Code that includes that header can then be compiled more quickly because the compiler starts from the pre-parsed header. This makes the most sense when a block of common includes is shared by many files, but I have achieved gains even when the block is used by a single file.
Precompiled headers are not yet supported by CMake (see bug 1260), so it is up to the implementer to support them through a creative CMakeLists.txt. Precompiled headers are supported by at least the Visual C++ compilers and by GCC from version 3.4 onward, but the details differ. The experience described below is only with the VC++ compiler; however, I briefly describe the GCC method as well.
Limitations
- Under both Visual C++ and GCC, only one precompiled header may be used. Therefore it makes sense to create one include file that includes most other include files, at least the ones that do not change often.
- Visual C++ ignores everything in the file up to the point where the precompiled header is loaded. GCC is slightly less restrictive: preprocessor directives may precede the include, but the precompiled header cannot be used once the first C token has been seen.
- Under Visual C++ the precompiled header is project specific. A typical ITK build (for Visual C++.net anyway) creates a solution file that contains many projects, one for each target. If you have one header file included by all your executables, you still have to build a separate precompiled header for each project. While somewhat discouraging, considerable build speedups are still possible.
Implementation
For GCC, the header is simply compiled with the compiler to yield a header.h.gch. If that file exists when header.h is included, then the precompiled version will be used.
For VC++, there will be at least one file that creates the precompiled header when it is compiled (using the compile flag /Yc"header.h"), and any number of files that use the precompiled header without creating it (using the compile flag /Yu"header.h"). Fortunately, despite what it says in the reference below, VC++7 is smart enough not to recreate the precompiled header unless its .h file or a .h file included in it has changed.
CMake
Here is an example of what I have in my CMakeLists.txt file. It seems to work under Visual C++ 7.NET and degrades to the standard method on other compilers. One anomaly is that, because the options are passed as raw compiler flags, the IDE is unaware that precompiled headers are in use, and its project options do not reflect them. In my experience so far, the first build takes about the same amount of time, but subsequent builds were about 6 times faster as long as the contents of common.h and its includes had not changed. Needless to say, this was worth the effort for me.
IF(MSVC)
  SET_SOURCE_FILES_PROPERTIES(example.cxx COMPILE_FLAGS /Yc"common_2d.h")
ENDIF(MSVC)
ADD_EXECUTABLE(example example.cxx another.cxx yetanother.cxx)
TARGET_LINK_LIBRARIES(example ${ITKLIBS} ${BUNCHOFLIBS})
This is for the case where common.h is only included in my main source file. If it were also included in my other files, I could add lines like the following so that the other files use the precompiled header but do not recreate it every time they are compiled.
IF(MSVC)
  SET_SOURCE_FILES_PROPERTIES(example.cxx COMPILE_FLAGS /Yc"common_2d.h")
  SET_SOURCE_FILES_PROPERTIES(another.cxx COMPILE_FLAGS /Yu"common_2d.h")
  SET_SOURCE_FILES_PROPERTIES(yetanother.cxx COMPILE_FLAGS /Yu"common_2d.h")
ENDIF(MSVC)
ADD_EXECUTABLE(example example.cxx another.cxx yetanother.cxx)
TARGET_LINK_LIBRARIES(example ${ITKLIBS} ${BUNCHOFLIBS})
References on precompiled headers
- [1] CMake Bug report 1260: Precompiled header support. http://www.cmake.org/Bug/bug.php?op=show&bugid=1260&pos=0
- [2] GNU compiler documentation, 3.20 Using Precompiled Headers. http://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html
- [3] Cygnus Software: Precompiled headers article. http://www.cygnus-software.com/papers/precompiledheaders.html
ITK Template Statistics
As of May 6, 2006, here is a summary of statistics regarding ITK templates. These were generated with a simple shell command run from the Code, Testing and Examples directories. The command was run on a Linux system.
find . -name \*.o -exec nm -A -C {} \; | grep \< | grep itk:: | cut -d":" -f4- | grep \>\$ | sort
The frequency counts were created by running the following gawk script on the results from each directory.
BEGIN {count=1; getline last}
{
  if ($0 == last) {
    count++
  } else {
    if (count > 0) {
      printf "%4d: %s\n", count, last
    }
    last=$0
    count=1
  }
}
END {
  if (count > 0) {
    printf "%4d: %s\n", count, last
  }
}
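For readers less familiar with awk, the counting step can be restated in C++. This is an equivalent sketch, not part of the original tooling; the function name CountRuns and the namespace demo are hypothetical. Like the script, it assumes the input lines are already sorted, so that identical symbols are adjacent.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace demo
{
// Collapse runs of identical adjacent lines into (count, line) pairs.
inline std::vector<std::pair<int, std::string> >
CountRuns(const std::vector<std::string>& sortedLines)
{
  std::vector<std::pair<int, std::string> > result;
  for (std::size_t i = 0; i < sortedLines.size(); ++i)
  {
    if (!result.empty() && result.back().second == sortedLines[i])
    {
      ++result.back().first; // same line as before: extend the run
    }
    else
    {
      result.push_back(std::make_pair(1, sortedLines[i])); // start a new run
    }
  }
  return result;
}

// Small usage check with made-up symbol names.
inline void DemoCountRuns()
{
  std::vector<std::string> lines;
  lines.push_back("itk::Foo<double>");
  lines.push_back("itk::Foo<int>");
  lines.push_back("itk::Foo<int>");
  const std::vector<std::pair<int, std::string> > runs = CountRuns(lines);
  assert(runs.size() == 2);
  assert(runs[0].first == 1);
  assert(runs[1].first == 2 && runs[1].second == "itk::Foo<int>");
}
} // namespace demo
```

The number of pairs returned corresponds to the "unique" column of the summary, and the sum of the counts to the total column.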
Summary for the three directories:
Directory | # of Templates | # of unique Templates |
---|---|---|
Code | 1277 | 446 |
Testing | 52234 | 3903 |
Examples | 36873 | 1734 |
Details for each directory, sorted by frequency and by name of the templates, are in
- Code By Frequency
- Code By Name
- Testing By Frequency
- Testing By Name
- Examples By Frequency
- Examples By Name
Some Results for Visual Studio 2003
The following experiment included 111 template files in addition to the files checked into Common/Templates. These files covered: ConstIterator, ConstRegionIterator and RegionIterator.
Directory | Explicit size | Implicit size |
---|---|---|
Code | 104 MB | 64 MB |
Testing | 598 MB | 740 MB |
Examples | 427 MB | 500 MB |
bin | 6.3 GB | 6.6 GB |
Total build tree | 7.5 GB | 8 GB |
Using WrapITK for Explicit Template Files Generation
WrapITK can be downloaded from the InsightJournal: http://www.insight-journal.org/dspace/handle/1926/188
The following modifications are needed to allow WrapITK to generate explicit template files:
- Modification to CMakeLists.txt to add option for building explicit templates and install scripts
- Modification to ConfigureWrapping.cmake to make CableSwig not required (not needed when only generating explicit templates)
- Add the newly created CreateExplicitInstantiations.cmake
- Add the newly created WrapITKTypesExplicit
- Add explicit_.h.in, explicit_.cxx.in and explicit_+-.h.in files in the ConfigurationInputs directory
- Modify top level CMakeLists.txt file to enable Explicit Instantiations:
IF(WRAP_ITK_EXPLICIT_INSTANTIATION)
  WRAPPER_LIBRARY_CREATE_EXPLICIT_INSTANTIATION_FILES("ITKCommon")
  WRAPPER_LIBRARY_CREATE_EXPLICIT_INSTANTIATION_LIBRARY()
ENDIF(WRAP_ITK_EXPLICIT_INSTANTIATION)
IF(CSWIG_REQUIRED)
  WRAPPER_LIBRARY_CREATE_WRAP_FILES()
  WRAPPER_LIBRARY_CREATE_LIBRARY()
ENDIF(CSWIG_REQUIRED)
- Current limitations:
  - Each wrap_*.cmake file should contain the wrapping for only one class. For instance, the current wrap_itkVector.cmake also performs the wrapping for itkCovariantVector; it would have to be split into two files.