Importer Manual / Version 2010
If transformations with regular expressions or XSLT are not sufficient, you can develop your own transformer in Java based on the Importer API. This section describes how to write such transformers. The Importer API is closely related to the Java API for XML Processing (JAXP), especially to the javax.xml.transform hierarchy. In some places, however, JAXP is not powerful enough or is too XSLT-specific for the requirements of CoreMedia, so CoreMedia had to define some extensions.
In accordance with both JAXP and the document generator, transformers are not specified directly, but rather indirectly via factories. Since javax.xml.transform.TransformerFactory is very XSL-specific, the Importer API defines a more general factory, the GeneralTransformerFactory.
Like the document generator, the transformer factories are instantiated with java.beans.Beans.instantiate and can be configured with properties in the sense of java.beans.BeanInfo. In the configuration file, such a property is entered with the numbered prefix of the transformer (for example import.transformer.10), the keyword property and the actual name of the property. For example, the style sheet is passed to the XsltTransformerFactory introduced above via such a property.
import.transformer.10.class=XsltTransformerFactory
import.transformer.10.name=My Stylesheet
import.transformer.10.property.stylesheet=path/to/stylesheet.xsl
Example 4.17. Bean Property
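The naming scheme of such property entries can be illustrated with a short sketch. It is not the importer's own code: the importer works with java.beans, whereas this sketch uses plain reflection to stay short. The classes StylesheetFactorySketch and PropertyConfigSketch are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.lang.reflect.Method;
import java.util.Properties;

// Hypothetical factory bean with one configurable property, "stylesheet".
class StylesheetFactorySketch {
    private String stylesheet;

    public void setStylesheet(String stylesheet) { this.stylesheet = stylesheet; }
    public String getStylesheet() { return stylesheet; }
}

// Hypothetical helper; an assumption for illustration, not part of the Importer API.
class PropertyConfigSketch {

    // Copies every "<prefix>property.<name>" entry onto the bean by calling
    // the matching public setter, here assumed to take a single String.
    static void applyProperties(Object bean, Properties config, String prefix)
            throws ReflectiveOperationException {
        String propertyPrefix = prefix + "property.";
        for (String key : config.stringPropertyNames()) {
            if (!key.startsWith(propertyPrefix)) {
                continue; // class, name and unrelated entries are skipped
            }
            String name = key.substring(propertyPrefix.length());
            if (name.isEmpty()) {
                continue;
            }
            String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            Method method = bean.getClass().getMethod(setter, String.class);
            method.invoke(bean, config.getProperty(key));
        }
    }

    // Loads the entries of Example 4.17 and applies them to the sketch bean.
    static StylesheetFactorySketch demo() throws IOException, ReflectiveOperationException {
        Properties config = new Properties();
        config.load(new StringReader(
                "import.transformer.10.class=XsltTransformerFactory\n"
              + "import.transformer.10.name=My Stylesheet\n"
              + "import.transformer.10.property.stylesheet=path/to/stylesheet.xsl\n"));
        StylesheetFactorySketch factory = new StylesheetFactorySketch();
        applyProperties(factory, config, "import.transformer.10.");
        return factory;
    }
}
```

Only the entry containing the keyword property reaches the bean; the class and name entries are left for the importer itself.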
The GeneralTransformerFactory interface consists of two methods, getTransformer and getFeature.
The method getFeature is used in the sense of JAXP to find out whether the transformers created by this factory support certain source and result formats. For example, if your factory returns "true" for both of the calls
factory.getFeature(DOMSource.FEATURE)
factory.getFeature(StreamSource.FEATURE)
Example 4.18. getFeature
this means that the transformers created with factory.getTransformer(name) accept both a StreamSource and a DOMSource as input documents. Therefore, if your transformer does not contain an XML parser, but is confined to input documents in DOM format, the factory should return "false" for getFeature(StreamSource.FEATURE).
On the other hand, if the transformer processes non-XML documents, the factory must return "false" for getFeature(DOMSource.FEATURE), because otherwise the importer tries to parse the supposed XML document, which naturally leads to an error.
The same applies to DOMResult.FEATURE and StreamResult.FEATURE.
If your transformer works internally with a DOM tree, it should return the result as a DOM tree and not as a stream. If the next transformer in the chain expects a DOM tree as input, this avoids parsing the document again.
In this version, SAX is supported neither on the source nor on the result side.
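The feature rules above can be sketched for a factory whose transformers work on DOM trees only. The interface SketchTransformerFactory is a hypothetical stand-in for GeneralTransformerFactory; its return types and exception declaration are assumptions, because the exact signatures are given in the Importer API rather than in this section. The JAXP identity transformer serves merely as a placeholder for a real transformer.

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;

// Hypothetical stand-in for GeneralTransformerFactory (signatures assumed).
interface SketchTransformerFactory {
    Transformer getTransformer(String name) throws TransformerConfigurationException;
    boolean getFeature(String feature);
}

// Factory for transformers that contain no XML parser and no serializer.
class DomOnlyFactory implements SketchTransformerFactory {

    @Override
    public boolean getFeature(String feature) {
        // "true" only for DOM input and DOM output; every other feature,
        // including StreamSource.FEATURE and StreamResult.FEATURE, is "false".
        return DOMSource.FEATURE.equals(feature) || DOMResult.FEATURE.equals(feature);
    }

    @Override
    public Transformer getTransformer(String name) throws TransformerConfigurationException {
        // Placeholder: the JAXP identity transformer. The name is ignored here;
        // a real factory could use it for log output.
        return TransformerFactory.newInstance().newTransformer();
    }
}
```

With such a factory, the transformer is only ever handed documents that are already parsed, and its DOM result can be passed to a following DOM-based transformer without reparsing.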
If a transformer should be called only once for the whole document set rather than for each source document individually, its factory must return "true" for getFeature(MultiSource.FEATURE). In contrast to DOMSource and StreamSource, MultiSource does not belong to JAXP, but is an extension by CoreMedia. If the transformer should return multiple documents, the factory must return "true" for getFeature(MultiResult.FEATURE). MultiResult has already been introduced in connection with the document generator.
The source and result formats of a transformer are completely independent from each other. For example, you can develop transformers which create a single Stream document from all source documents, or which produce a set of DOM trees from one Stream document.
To make the creation of factories easier, the Importer API contains the class GeneralTransformerFactoryImpl, which implements the GeneralTransformerFactory interface. GeneralTransformerFactoryImpl can be configured with three properties: transformerclass, sourceformat and resultformat. transformerclass sets the class of the actual transformer. This class must be a subclass of javax.xml.transform.Transformer (for more details see below) and must have a default constructor without parameters. sourceformat and resultformat give the source and result format. Valid values are stream, dom and multi.
(Note: these values do not match those of the corresponding FEATURE constants. The latter are opaque and are therefore not suitable for configuration via property files.)
The configuration of a transformer of the class MyTransformer, which should be called individually for each document, which processes the source documents as Stream and which produces multiple result documents, therefore appears as follows:
import.transformer.20.class=GeneralTransformerFactoryImpl
import.transformer.20.name=My special transformer
import.transformer.20.property.transformerclass=com.mycompany.MyTransformer
import.transformer.20.property.sourceformat=stream
import.transformer.20.property.resultformat=multi
Example 4.19. Configuration of a transformer
GeneralTransformerFactoryImpl has a further feature: the transformers instantiated with this class automatically receive some parameters without these having to be explicitly configured. In particular, these are:
- the name of the transformer (that is, the value of the import.transformer.xx.name property)
- a log object which the transformer can use for log outputs
- a CoreMedia object which enables access to the CoreMedia repository
Details of the classes of these objects can be found in the Importer API.
By calling getTransformer, the importer obtains an instance of the transformer from the factory. As the name argument, the importer passes to getTransformer the name entered in the configuration file for this transformer with the name property (in the example above, therefore, "My special transformer"). The factory can use this name, for example, for log outputs. However, the name is not intended to carry semantically more important information. Properties exist for that purpose.
The transformer itself is an object of the javax.xml.transform.Transformer class. The decisive abstract method of this class, which you must implement in a subclass of Transformer in order to realize your transformation, is transform. In addition, Transformer has a few other abstract methods whose function, however, is precisely specified by JAXP. To save you work, these methods are already implemented in the Importer API: if you derive your transformer from com.coremedia.publisher.importer.AbstractTransformer rather than directly from javax.xml.transform.Transformer, you only need to implement the transform method.
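The division of work described above can be sketched with plain JAXP classes. SketchTransformerBase is a hypothetical stand-in for com.coremedia.publisher.importer.AbstractTransformer, whose actual implementation is not shown in this section; it only supplies simple implementations of the remaining abstract methods of Transformer. UpperCaseTransformer is an illustrative example of a transformer for non-XML stream documents and assumes that the source provides a Reader and the result a Writer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in for AbstractTransformer: implements every abstract
// method of javax.xml.transform.Transformer except transform.
abstract class SketchTransformerBase extends Transformer {
    private final Map<String, Object> parameters = new HashMap<>();
    private final Properties outputProperties = new Properties();
    private URIResolver uriResolver;
    private ErrorListener errorListener;

    @Override public void setParameter(String name, Object value) { parameters.put(name, value); }
    @Override public Object getParameter(String name) { return parameters.get(name); }
    @Override public void clearParameters() { parameters.clear(); }
    @Override public void setURIResolver(URIResolver resolver) { uriResolver = resolver; }
    @Override public URIResolver getURIResolver() { return uriResolver; }
    @Override public void setOutputProperties(Properties format) {
        outputProperties.clear();
        if (format != null) {
            outputProperties.putAll(format);
        }
    }
    @Override public Properties getOutputProperties() { return (Properties) outputProperties.clone(); }
    @Override public void setOutputProperty(String name, String value) { outputProperties.setProperty(name, value); }
    @Override public String getOutputProperty(String name) { return outputProperties.getProperty(name); }
    @Override public void setErrorListener(ErrorListener listener) { errorListener = listener; }
    @Override public ErrorListener getErrorListener() { return errorListener; }
}

// Example transformer: copies a character stream and converts it to upper case.
class UpperCaseTransformer extends SketchTransformerBase {
    @Override
    public void transform(Source source, Result result) throws TransformerException {
        if (!(source instanceof StreamSource) || !(result instanceof StreamResult)) {
            throw new TransformerException("only stream source and stream result are supported");
        }
        Reader in = ((StreamSource) source).getReader();
        Writer out = ((StreamResult) result).getWriter();
        if (in == null || out == null) {
            throw new TransformerException("this sketch needs a Reader and a Writer");
        }
        try {
            int c;
            while ((c = in.read()) != -1) {
                out.write(Character.toUpperCase(c));
            }
            out.flush();
        } catch (IOException e) {
            throw new TransformerException(e);
        }
    }
}
```

A factory for this transformer would report "true" only for the stream features, since the input is not XML and must not be parsed by the importer.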
According to the getFeature information of the factory, the importer calls the transform method either individually for each source document in the form of a StreamSource or DOMSource, or once for all source documents in the form of a MultiSource. In both cases, the importer calls the getTransformer method of the factory only once and transforms all documents with the same instance of the transformer.
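The per-document calling pattern can be sketched as follows. This is not the importer's actual code, only an illustration under assumptions: the class PerDocumentLoopSketch is hypothetical, the JAXP identity transformer stands in for a configured transformer, and the documents are passed as strings. The MultiSource case is not shown, because MultiSource belongs to the Importer API and not to JAXP.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration of "one transformer instance, one call per document".
class PerDocumentLoopSketch {

    static List<String> transformAll(List<String> xmlDocuments) throws TransformerException {
        // The transformer is obtained once ...
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        List<String> results = new ArrayList<>();
        // ... and the same instance then transforms every source document.
        for (String xml : xmlDocuments) {
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            results.add(out.toString());
        }
        return results;
    }
}
```

Because the same instance is reused, any state that a transformer keeps in instance fields persists from one document to the next.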