Importer Manual / Version 2401
If transformations with regular expressions or XSLT are not sufficient, you can develop your
own transformer in Java based on the Importer API. This section describes how to program
such transformers. The Importer API is closely related to the Java API for
XML Processing (JAXP), especially to the javax.xml.transform hierarchy. In
some places, however, JAXP is not powerful enough or is too XSLT-specific for the requirements
of CoreMedia, so CoreMedia had to define some extensions.
In accordance with both JAXP and the document generator, transformers are not specified
directly, but rather indirectly via factories. Since the
javax.xml.transform.TransformerFactory is very XSL-specific, the Importer API
defines a more general factory, the
GeneralTransformerFactory.
Like the document generator, the transformer factories are instantiated with
java.beans.Beans.instantiate and can be configured with properties in the sense
of java.beans.BeanInfo. In the configuration file, such a property is entered
with the prefix of the transformer entry (including its number, for example
import.transformer.10), the keyword property and the actual name of the property.
For example, the style sheet is passed to the XsltTransformerFactory introduced
above via such a property.
import.transformer.10.class=XsltTransformerFactory
import.transformer.10.name=My Stylesheet
import.transformer.10.property.stylesheet=path/to/stylesheet.xsl
Example 4.17. Bean Property
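The mechanism behind such a property entry can be illustrated with plain JDK classes. The following is a minimal sketch, not the importer's actual code: DemoFactory is a hypothetical bean invented for this illustration, and the way the importer really reads the configuration file is not shown. The sketch only demonstrates how a bean is created with java.beans.Beans.instantiate and how a property name such as stylesheet can be mapped to a setter via java.beans.BeanInfo.

```java
import java.beans.BeanInfo;
import java.beans.Beans;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

class BeanPropertySketch {

    // Hypothetical factory bean, invented for this sketch (not part of the Importer API).
    public static class DemoFactory {
        private String stylesheet;

        public DemoFactory() { }

        public String getStylesheet() { return stylesheet; }

        public void setStylesheet(String stylesheet) { this.stylesheet = stylesheet; }
    }

    // Instantiates the bean by class name and sets a single property by its name.
    static Object configure(String className, String property, String value) {
        try {
            Object bean = Beans.instantiate(BeanPropertySketch.class.getClassLoader(), className);
            BeanInfo info = Introspector.getBeanInfo(bean.getClass());
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getName().equals(property) && pd.getWriteMethod() != null) {
                    // Equivalent to calling bean.setStylesheet(value) for property "stylesheet".
                    pd.getWriteMethod().invoke(bean, value);
                    return bean;
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not configure " + className, e);
        }
        throw new IllegalArgumentException("No writable property: " + property);
    }
}
```

Calling configure with the class name of DemoFactory, the property name stylesheet and the value path/to/stylesheet.xsl yields a bean whose getStylesheet method returns that path. A factory of your own therefore presumably only needs a public setter for each property you want to configure.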
The
GeneralTransformerFactory
interface consists of two methods, getTransformer and getFeature.
The method getFeature is used in the sense of JAXP to find out whether the
transformers created by this factory support certain source and result formats. For example,
if your factory returns "true" for the calls
factory.getFeature(DOMSource.FEATURE)
factory.getFeature(StreamSource.FEATURE)
Example 4.18. getFeature
this means that the transformers created with factory.getTransformer(name)
accept both a StreamSource and a DOMSource as input documents. Therefore, if your
transformer does not contain an XML parser, but is confined to input documents in DOM
format, the factory should return "false" for getFeature(StreamSource.FEATURE).
On the other hand, if the transformer processes non-XML documents, the factory must return
"false" for getFeature(DOMSource.FEATURE), because otherwise the importer tries
to parse the supposed XML document, naturally leading to an error.
The same applies to DOMResult.FEATURE and StreamResult.FEATURE.
If your transformer works internally with a DOM tree, it should return the tree as such and
not as a stream. If the next transformer in the chain expects a DOM tree as input, this avoids
parsing the document again.
In this version, SAX is supported neither on the source nor on the result side.
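The feature rules above can be sketched for a transformer that works on DOM trees only and contains no XML parser. Note that SketchTransformerFactory is a hypothetical stand-in declared here so that the example compiles with the JDK alone. The real GeneralTransformerFactory interface belongs to the Importer API, and its exact method signatures may differ from what is assumed below. The FEATURE constants themselves are the standard JAXP ones.

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;

// Hypothetical stand-in for the two-method factory interface; signatures are assumed.
interface SketchTransformerFactory {
    boolean getFeature(String feature);

    Transformer getTransformer(String name);
}

// Factory for a transformer that reads and writes DOM trees only.
class DomOnlyFactory implements SketchTransformerFactory {

    @Override
    public boolean getFeature(String feature) {
        // "true" only for DOM input and DOM output; stream, SAX and
        // any other feature are reported as unsupported.
        return DOMSource.FEATURE.equals(feature) || DOMResult.FEATURE.equals(feature);
    }

    @Override
    public Transformer getTransformer(String name) {
        // Creating the transformer is outside the scope of this sketch.
        throw new UnsupportedOperationException("omitted in this sketch");
    }
}
```

With such a factory, getFeature(StreamSource.FEATURE) and getFeature(SAXSource.FEATURE) both return false, so the transformer would only ever be handed documents that have already been parsed into a DOM tree.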
If a transformer should be called only once for the whole document set rather than for each
source document individually, its factory must return "true" for
getFeature(MultiSource.FEATURE). In contrast to DOMSource and
StreamSource,
MultiSource
does not belong to JAXP, but is an extension by CoreMedia. If the transformer should return
multiple documents, the factory must return "true" for
getFeature(MultiResult.FEATURE). MultiResult has already been introduced in
connection with the document generator.
The source and result formats of a transformer are completely independent from each other. For example, you can develop transformers which create a single Stream document from all source documents, or which produce a set of DOM trees from one Stream document.
To make the creation of factories easier, the Importer API contains the class
GeneralTransformerFactoryImpl,
which implements the
GeneralTransformerFactory
interface.
GeneralTransformerFactoryImpl
can be configured with three properties: transformerclass,
sourceformat and resultformat. transformerclass sets
the class of the actual transformer. This class must be a subclass of
javax.xml.transform.Transformer (for more details see below) and must have a
default constructor without parameters. sourceformat and resultformat
specify the source and result format. Valid values are stream, dom
and multi.
(Note: these values do not match those of the corresponding FEATURE constants. The latter
are opaque and are therefore not suitable for configuration via property files.)
The configuration of a transformer of the class MyTransformer, which should be
called individually for each document, which processes the source documents as Stream and
which produces multiple result documents, therefore appears as follows:
import.transformer.20.class=GeneralTransformerFactoryImpl
import.transformer.20.name=My special transformer
import.transformer.20.property.transformerclass=com.mycompany.MyTransformer
import.transformer.20.property.sourceformat=stream
import.transformer.20.property.resultformat=multi
Example 4.19. Configuration of a transformer
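To illustrate the note on format values above, the following sketch translates the values stream and dom into the corresponding JAXP FEATURE constants. It is an assumption about how such a mapping could look and is not taken from GeneralTransformerFactoryImpl. The value multi is deliberately left out, because the MultiSource and MultiResult constants belong to the Importer API and are not available in the JDK.

```java
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, invented for this sketch.
class FormatNames {

    // Translates a sourceformat value into the matching JAXP feature constant.
    static String sourceFeature(String format) {
        switch (format) {
            case "stream": return StreamSource.FEATURE;
            case "dom":    return DOMSource.FEATURE;
            default:
                throw new IllegalArgumentException("not covered in this sketch: " + format);
        }
    }

    // Translates a resultformat value into the matching JAXP feature constant.
    static String resultFeature(String format) {
        switch (format) {
            case "stream": return StreamResult.FEATURE;
            case "dom":    return DOMResult.FEATURE;
            default:
                throw new IllegalArgumentException("not covered in this sketch: " + format);
        }
    }
}
```

The constants are long URI-style strings rather than the short words used in the property file, which is why the configuration uses its own vocabulary.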
GeneralTransformerFactoryImpl has further features: the transformers instantiated with this class automatically receive some parameters without these having to be explicitly configured. In particular, these are
- the name of the transformer (that is, the value of the import.transformer.xx.name property)
- a log object which the transformer can use for log outputs
- a CoreMedia object which enables access to the CoreMedia repository
Details of the classes of these objects can be found in the Importer API.
The importer uses getTransformer to obtain an instance of the transformer
from the factory. As the name argument, the importer passes to getTransformer the name
entered in the configuration file for this transformer with the name property
(in the example above, therefore, "My special transformer"). The factory can use this name,
for example, for log outputs. However, the name is not intended to carry semantically
significant information. Use properties for that purpose.
The transformer itself is an object of the javax.xml.transform.Transformer class.
The key abstract method of this class is transform, which you must implement in a
subclass of Transformer to realize your transformation. In addition, Transformer has
a few other abstract methods whose behavior, however, is precisely specified by JAXP. To
save you work, these methods are already implemented in the Importer API: if you derive your
transformer from com.coremedia.publisher.importer.AbstractTransformer, rather
than directly from javax.xml.transform.Transformer, you only need to implement
the transform method.
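The division of work described above can be sketched with the JDK alone. SketchBaseTransformer below is a hypothetical stand-in for com.coremedia.publisher.importer.AbstractTransformer, whose actual implementation is not shown in this manual. It merely supplies the remaining abstract methods of javax.xml.transform.Transformer so that the concrete class only has to implement transform. UpperCaseTransformer is likewise invented for illustration and handles stream input and output only.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in for AbstractTransformer: implements the JAXP bookkeeping methods.
abstract class SketchBaseTransformer extends Transformer {
    private final Map<String, Object> parameters = new HashMap<>();
    private final Properties outputProperties = new Properties();
    private URIResolver resolver;
    private ErrorListener listener;

    @Override public void setParameter(String name, Object value) { parameters.put(name, value); }
    @Override public Object getParameter(String name) { return parameters.get(name); }
    @Override public void clearParameters() { parameters.clear(); }
    @Override public void setURIResolver(URIResolver r) { resolver = r; }
    @Override public URIResolver getURIResolver() { return resolver; }
    @Override public void setOutputProperties(Properties p) {
        outputProperties.clear();
        if (p != null) { outputProperties.putAll(p); }
    }
    @Override public Properties getOutputProperties() { return (Properties) outputProperties.clone(); }
    @Override public void setOutputProperty(String name, String value) { outputProperties.setProperty(name, value); }
    @Override public String getOutputProperty(String name) { return outputProperties.getProperty(name); }
    @Override public void setErrorListener(ErrorListener l) { listener = l; }
    @Override public ErrorListener getErrorListener() { return listener; }
}

// The concrete transformer only implements transform: it copies the input in upper case.
class UpperCaseTransformer extends SketchBaseTransformer {

    @Override
    public void transform(Source source, Result result) throws TransformerException {
        if (!(source instanceof StreamSource) || !(result instanceof StreamResult)) {
            throw new TransformerException("this sketch handles stream input and output only");
        }
        Reader in = ((StreamSource) source).getReader();
        Writer out = ((StreamResult) result).getWriter();
        if (in == null || out == null) {
            throw new TransformerException("a Reader and a Writer are expected");
        }
        try {
            char[] buffer = new char[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(new String(buffer, 0, n).toUpperCase(Locale.ROOT));
            }
            out.flush();
        } catch (IOException e) {
            throw new TransformerException(e);
        }
    }

    // Convenience method for trying the sketch on a string.
    static String apply(String text) {
        StringWriter out = new StringWriter();
        try {
            new UpperCaseTransformer().transform(
                    new StreamSource(new StringReader(text)), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

For example, UpperCaseTransformer.apply("<a>text</a>") returns "<A>TEXT</A>". A factory for such a transformer would report "true" for StreamSource.FEATURE and StreamResult.FEATURE only.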
According to the getFeature information of the factory, the importer calls the
transform method either individually for each source document in the form of a
StreamSource or DOMSource, or once for all source documents in the
form of a
MultiSource.
In both cases, the importer calls the getTransformer method of the factory only
once and transforms all documents with the same instance of the transformer.


