[Insight-developers] Bio-Formats integration with VTK/ITK

Curtis Rueden ctrueden at wisc.edu
Thu Jan 8 16:11:41 EST 2009


Hi everyone,

This email is about integration of Bio-Formats with the VTK and/or ITK
toolkits. It is fairly long, so please ignore if uninterested.

My apologies for cross posting to multiple mailing lists, but I thought this
topic might be of interest to a variety of people, and I would rather err on
the side of inclusion. I am new to some lists, so I also apologize if I have
missed any relevant prior discussion. Also, the message was rejected the
first time due to "too many recipients" so I am resending -- please forgive
the duplicate if you already received this.

For those who don't know me, my name is Curtis Rueden and I am the project
manager for Bio-Formats (http://www.loci.wisc.edu/ome/formats.html).
Bio-Formats is a standalone Java library for reading and writing life
sciences image file formats. It is capable of parsing both pixels and
metadata for a large number of formats (64 at this writing), as well as
writing to several (11) formats.

---
BENEFITS

In recent months it has become apparent that a robust Bio-Formats C++
interface would significantly benefit the community. In particular,
integrating Bio-Formats with VTK/ITK would provide several advantages:

* Support for 60+ life sciences file formats within VTK/ITK, of course.

* Support for these formats in Badri Roysam's FARSIGHT project at RPI, which
currently uses ITK's ImageIOBase/ImageIOFactory mechanism to read images.

* Support for these formats in BioImageXD (http://www.bioimagexd.net/),
which I understand also uses ITK to read images.

* A cross-platform testbed for the Jace framework (
http://sourceforge.net/projects/jace/) for deploying Java APIs from within
native code. Background: Jace produces C++ proxies corresponding to a Java
API, allowing C++ code to call Java methods transparently via JNI
Invocation. This technique requires a Java Runtime Environment on the
machine, but is more performant than shared VM messaging solutions such as
Ice (http://www.zeroc.com/ice.html), CORBA or XML-RPC. If Bio-Formats had a
set of cross-platform Jace-driven C++ bindings, it would simplify calling
Bio-Formats from VTK/ITK. Once the bindings exist, our regular testing and
maintenance would strengthen the Jace project itself.

* It might be useful for MeVis Research's MeVisLab environment (
http://www.mevislab.de/), which has some ITK-related functionality --
although they have already implemented preliminary support for Bio-Formats
using vanilla JNI.

* Potential support from within other VTK/ITK-based applications (e.g.,
Mayavi2: http://code.enthought.com/projects/mayavi/)

---
PRIOR DISCUSSION

Back in July, Dan White briefly discussed a possible Bio-Formats integration
with VTK/ITK on the VTK mailing list:

http://www.vtk.org/pipermail/vtkusers/2008-July/096253.html
http://www.vtk.org/pipermail/vtkusers/2008-July/096255.html
http://www.vtk.org/pipermail/vtkusers/2008-July/096276.html
http://www.vtk.org/pipermail/vtkusers/2008-July/096278.html
http://www.vtk.org/pipermail/vtkusers/2008-July/096280.html
http://www.vtk.org/pipermail/vtkusers/2008-July/096286.html
http://www.vtk.org/pipermail/vtkusers/2008-July/096287.html
http://www.vtk.org/pipermail/vtkusers/2008-July/096291.html

---
LICENSING

First, I want to address the licensing concern raised by Karthik Krishnan
regarding use of Bio-Formats within ITK/VTK: Bio-Formats is GPL, while ITK
and VTK are BSD. Unfortunately, this means that a combined ITK/VTK +
Bio-Formats work becomes GPL -- in other words, Bio-Formats cannot be
distributed with ITK/VTK under the BSD license.

To be clear, LOCI and Glencoe Software strongly support open source
software, including BSD-licensed software. Ideally, our wish is for
Bio-Formats to be freely available for use with other open source projects
regardless of license. My concern is that a linking exception allowing
Bio-Formats to be distributed with VTK/ITK (e.g., using the LGPL instead of
the GPL) would also allow a combined work to be expropriated for use within
proprietary software, sidestepping the GPL.

One solution might be for ITK/VTK to provide a plugin infrastructure for use
by a Bio-Formats plugin module, which the end user would install separately
into ITK/VTK. We currently use this approach to provide a Bio-Formats plugin
for ImageJ even though ImageJ itself is public domain software.
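
To make the plugin idea a little more concrete, here is a minimal C++ sketch
of the separation it implies. Every name in it (ReaderPlugin, ReaderRegistry,
BioFormatsPlugin) is hypothetical and is not an actual ITK or VTK class.
Matching on file extension is only a stand-in for real format detection, and
a real plugin would forward to Bio-Formats rather than return a name. The
point is only that the host code compiles and ships without any knowledge of
the plugin.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical host-side interface (an assumption, not a real ITK/VTK API).
class ReaderPlugin {
public:
  virtual ~ReaderPlugin() = default;
  virtual std::string name() const = 0;
  // Returns true if this plugin claims the file (here: by extension only).
  virtual bool canRead(const std::string& path) const = 0;
};

// Host-side registry. The toolkit would ship this part; it has no
// compile-time dependency on any particular plugin.
class ReaderRegistry {
public:
  void registerPlugin(std::unique_ptr<ReaderPlugin> plugin) {
    assert(plugin != nullptr);
    plugins_.push_back(std::move(plugin));
  }

  // Returns the first registered plugin that claims the file, or nullptr.
  const ReaderPlugin* find(const std::string& path) const {
    for (const auto& p : plugins_) {
      if (p->canRead(path)) return p.get();
    }
    return nullptr;
  }

private:
  std::vector<std::unique_ptr<ReaderPlugin>> plugins_;
};

// Separately distributed plugin. In a real build this would call into
// Bio-Formats; here it only recognizes two example extensions.
class BioFormatsPlugin : public ReaderPlugin {
public:
  std::string name() const override { return "Bio-Formats"; }

  bool canRead(const std::string& path) const override {
    return hasSuffix(path, ".lsm") || hasSuffix(path, ".ome.tif");
  }

private:
  static bool hasSuffix(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
  }
};
```

An end user who installed the plugin would register it once, for example
with registry.registerPlugin(std::unique_ptr<ReaderPlugin>(new
BioFormatsPlugin())), after which registry.find("cells.lsm") returns the
plugin and unrecognized files return nullptr.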

This approach is a somewhat gray area; see:
http://www.gnu.org/licenses/gpl-faq.html#GPLPluginsInNF

The difference in this situation is that ITK/VTK is itself free software,
so the combined work would not violate the GPL, but it might be "infected"
by it.

If anyone with a better understanding of open source licensing has further
thoughts or corrections on this issue, I would greatly appreciate it.

---
METADATA

Also discussed in the prior thread are issues of metadata conversion, which
I am willing to discuss but frankly would prefer to postpone until an
initial integration is completed. Any C++ bindings for Bio-Formats would
include full access to its MetadataStore API, which would allow C++
applications to manipulate OME-XML metadata.

In response to the questions about OME-XML: the goal of the OME-XML schema
is to fully encapsulate the acquisition metadata from all supported
microscopy formats, with regular (at least two per year) releases to continuously
improve and update the schema. *The primary goal of Bio-Formats is to
standardize proprietary metadata into this common data model.* So I strongly
agree with Dan that we want to preserve that standard whenever possible. It
would likely be possible to express the OME-XML schema as part of the ITK
metadata dictionary, though to be sure I would need to understand more about
how ITK models metadata.

---
PERFORMANCE

I have done a lot of research on time and space performance of Java versus
C++, as well as performance when integrating Java code with C++ through
various means. Most of those results fall outside the scope of this email,
but suffice it to say that Java's I/O performance is comparable to that of C++, and
Bio-Formats is mostly I/O-bound. Any observed speed difference in the
Bio-Formats library compared to other implementations (e.g., Dan points out
that BioImageXD's LSM support is faster than reading an LSM file into ImageJ
using Bio-Formats) is most likely due to algorithm inefficiency in the
Bio-Formats code rather than a relative deficiency in the language itself.

Regardless, I think our best bet is to interface the Bio-Formats Java API
with C++ in the most performant way available. Any solution more involved
than that, such as translating the Bio-Formats source into C++, has its own
serious pitfalls, would require months of effort at best, and would be
prohibitive for us to maintain. And less sophisticated solutions, such as
conversion of life sciences formats into the open OME-TIFF standard, then
reading the result into ITK, are not performant enough for many
applications.

---
QUESTIONS AND DISCUSSION

My goal with this email is to kickstart some discussion about integrating
Bio-Formats with VTK/ITK. Specifically, my questions for the community are:

1) Do others agree that this form of integration is a good idea? Or is there
a better way to accomplish the bulleted goals above?

2) I don't know much about ITK or VTK yet. Where is the right integration
point? Within ITK, accessible using ImageIOBase/ImageIOFactory? Or
elsewhere? Does VTK/ITK have an appropriate plugin infrastructure?

3) Does anyone have experience using Jace cross-platform? Or does anyone
know an open source solution other than Jace for accessing a Java API from
C++? Unfortunately, WrapITK & CableSwig go the other way -- wrapping C++
code for access from Java -- which is not what we want here.

4) Any thoughts on the GPL/BSD licensing issue?

This project is currently my top priority, but I am not a C++ expert and I
would greatly appreciate help from anyone in these various communities.
Thanks in advance for any comments and replies!

Regards,
Curtis Rueden