Importer Manual / Version 2401

4.3.3 User-defined Transformers

If transformations with regular expressions or XSLT are not sufficient, you can develop your own transformer in Java based on the Importer API. This section explains how to implement such transformers. The Importer API is closely related to the Java API for XML Processing (JAXP), especially to the javax.xml.transform hierarchy. In some places, however, JAXP is not powerful enough or is too XSLT-specific for CoreMedia's requirements, so CoreMedia defined some extensions.

In accordance with both JAXP and the document generator, transformers are not specified directly, but rather indirectly via factories. Since the javax.xml.transform.TransformerFactory is very XSL-specific, the Importer API defines a more general factory, the GeneralTransformerFactory.

Like the document generator, transformer factories are instantiated with java.beans.Beans.instantiate and can be configured with properties in the sense of java.beans.BeanInfo. In the configuration file, such a property is entered with the transformer prefix including its number (import.transformer.10 in the example below), the keyword property and the actual name of the property. For example, the style sheet is passed to the XsltTransformerFactory introduced above via such a property.

import.transformer.10.class=XsltTransformerFactory
import.transformer.10.name=My Stylesheet
import.transformer.10.property.stylesheet=path/to/stylesheet.xsl

Example 4.17. Bean Property
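The key convention can be illustrated with plain java.util.Properties. The following is a minimal sketch, not the importer's own configuration code: the class TransformerConfigSketch and its methods parse and beanProperties are hypothetical names chosen for this illustration. It only shows how the entries for one numbered transformer can be separated from the rest by their prefix.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Illustration only (hypothetical class): collects the bean properties of one transformer. */
class TransformerConfigSketch {

    /** Parses configuration text in Java properties syntax. */
    static Properties parse(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    /** Returns all "import.transformer.<number>.property.<name>" entries, keyed by <name>. */
    static Map<String, String> beanProperties(Properties config, int number) {
        String prefix = "import.transformer." + number + ".property.";
        Map<String, String> result = new TreeMap<>();
        for (String key : config.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                result.put(key.substring(prefix.length()), config.getProperty(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties config = parse(String.join("\n",
            "import.transformer.10.class=XsltTransformerFactory",
            "import.transformer.10.name=My Stylesheet",
            "import.transformer.10.property.stylesheet=path/to/stylesheet.xsl"));
        // Prints {stylesheet=path/to/stylesheet.xsl}
        System.out.println(beanProperties(config, 10));
    }
}
```

Only the keys that contain the property keyword end up as bean properties. The class and name entries configure the transformer itself rather than the factory bean.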


The GeneralTransformerFactory interface consists of two methods, getTransformer and getFeature.

The method getFeature is used, as in JAXP, to find out whether the transformers created by this factory support certain source and result formats. For example, if your factory returns "true" for both of the calls

factory.getFeature(DOMSource.FEATURE)
factory.getFeature(StreamSource.FEATURE)

Example 4.18. getFeature


this means that the transformers created with factory.getTransformer(name) accept both a StreamSource and a DOMSource as input documents. Therefore, if your transformer does not contain an XML parser and only accepts input documents in DOM format, the factory should return "false" for getFeature(StreamSource.FEATURE). Conversely, if the transformer processes non-XML documents, the factory must return "false" for getFeature(DOMSource.FEATURE), because otherwise the importer tries to parse the supposed XML document, which inevitably leads to an error.

The same applies to DOMResult.FEATURE and StreamResult.FEATURE. If your transformer works internally with a DOM tree, it should return the tree as a DOM result rather than serializing it to a stream. If the next transformer in the chain expects a DOM tree as input, this avoids parsing the document again.
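As a rough sketch of these rules, a factory for a transformer that works purely on DOM trees could answer feature queries as shown here. The interface SketchTransformerFactory is a hypothetical stand-in for GeneralTransformerFactory, whose exact signatures are documented in the Importer API and may differ. The FEATURE constants are the standard JAXP ones, and the JAXP identity transformer is used only as a placeholder for a real implementation.

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical stand-in for GeneralTransformerFactory; the real signatures may differ. */
interface SketchTransformerFactory {
    Transformer getTransformer(String name) throws TransformerConfigurationException;
    boolean getFeature(String feature);
}

/** Factory for transformers that take and return DOM trees only. */
class DomOnlyFactorySketch implements SketchTransformerFactory {

    @Override
    public Transformer getTransformer(String name) throws TransformerConfigurationException {
        // Placeholder: the JAXP identity transformer stands in for a custom DOM-based transformer.
        return TransformerFactory.newInstance().newTransformer();
    }

    @Override
    public boolean getFeature(String feature) {
        // Advertise DOM on the source and the result side; every other format is declined.
        return DOMSource.FEATURE.equals(feature) || DOMResult.FEATURE.equals(feature);
    }

    public static void main(String[] args) {
        SketchTransformerFactory factory = new DomOnlyFactorySketch();
        System.out.println("DOM source:    " + factory.getFeature(DOMSource.FEATURE));
        System.out.println("Stream source: " + factory.getFeature(StreamSource.FEATURE));
        System.out.println("DOM result:    " + factory.getFeature(DOMResult.FEATURE));
        System.out.println("Stream result: " + factory.getFeature(StreamResult.FEATURE));
    }
}
```

A transformer for non-XML input would do the opposite on the source side: accept StreamSource.FEATURE and decline DOMSource.FEATURE.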

In this version, SAX is supported neither on the source nor on the result side.

If a transformer should be called only once for the whole document set rather than for each source document individually, its factory must return "true" for getFeature(MultiSource.FEATURE). In contrast to DOMSource and StreamSource, MultiSource does not belong to JAXP, but is an extension by CoreMedia. If the transformer should return multiple documents, the factory must return "true" for getFeature(MultiResult.FEATURE). MultiResult has already been introduced in connection with the document generator.

The source and result formats of a transformer are completely independent from each other. For example, you can develop transformers which create a single Stream document from all source documents, or which produce a set of DOM trees from one Stream document.

To make the creation of factories easier, the Importer API contains the class GeneralTransformerFactoryImpl, which implements the GeneralTransformerFactory interface. GeneralTransformerFactoryImpl can be configured with three properties: transformerclass, sourceformat and resultformat. transformerclass sets the class of the actual transformer. This class must be a subclass of javax.xml.transform.Transformer (for more details see below), and must have a default constructor without parameters. sourceformat and resultformat specify the source and result format. Valid values are stream, dom and multi. (Note: these values do not match those of the corresponding FEATURE constants. The latter are opaque and are therefore not suitable for configuration via property files.)

The configuration of a transformer of the class MyTransformer, which should be called individually for each document, which processes the source documents as Stream and which produces multiple result documents, therefore appears as follows:

import.transformer.20.class=GeneralTransformerFactoryImpl
import.transformer.20.name=My special transformer
import.transformer.20.property.transformerclass=com.mycompany.MyTransformer
import.transformer.20.property.sourceformat=stream
import.transformer.20.property.resultformat=multi

Example 4.19. Configuration of a transformer


GeneralTransformerFactoryImpl has a further feature: the transformers instantiated by this class automatically receive some parameters without these having to be configured explicitly. In particular, these are

  • the name of the transformer (that is the value of the import.transformer.xx.name property)

  • a log object which the transformer can use for log outputs

  • a CoreMedia object which enables access to the CoreMedia repository

Details of the classes of these objects can be found in the Importer API.

The importer obtains an instance of the transformer from the factory by calling getTransformer. As the name argument, the importer passes the name entered in the configuration file with the name property of this transformer (in the example above, "My special transformer"). The factory can use this name, for example, for log output. The name is not intended to carry semantically significant information, however; use properties for that purpose.

The transformer itself is an object of the javax.xml.transform.Transformer class. The key abstract method of this class, which you must implement in your subclass of Transformer to realize your transformation, is transform. Transformer also has a few other abstract methods, whose behavior is precisely specified by JAXP. To save you work, these methods are already implemented in the Importer API: if you derive your transformer from com.coremedia.publisher.importer.AbstractTransformer rather than directly from javax.xml.transform.Transformer, you only need to implement the transform method.
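To illustrate how much of a transformer is bookkeeping, the following sketch derives directly from javax.xml.transform.Transformer, because AbstractTransformer is not available outside the Importer API. The class UpperCaseTransformerSketch and its helper apply are hypothetical, and the bookkeeping methods are kept minimal, so they do not enforce every rule of the JAXP contract. With AbstractTransformer as the base class, only the transform method should remain to be written.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch of a non-XML transformer: copies character data from a stream source in upper case. */
class UpperCaseTransformerSketch extends Transformer {
    private final Map<String, Object> parameters = new HashMap<>();
    private final Properties outputProperties = new Properties();
    private URIResolver uriResolver;
    private ErrorListener errorListener;

    @Override
    public void transform(Source source, Result result) throws TransformerException {
        if (!(source instanceof StreamSource) || !(result instanceof StreamResult)) {
            throw new TransformerException("This sketch handles stream sources and results only");
        }
        Reader in = ((StreamSource) source).getReader();
        Writer out = ((StreamResult) result).getWriter();
        if (in == null || out == null) {
            throw new TransformerException("This sketch expects a Reader and a Writer");
        }
        try {
            char[] buffer = new char[4096];
            for (int n = in.read(buffer); n != -1; n = in.read(buffer)) {
                out.write(new String(buffer, 0, n).toUpperCase(Locale.ROOT));
            }
            out.flush();
        } catch (IOException e) {
            throw new TransformerException(e);
        }
    }

    // The remaining abstract methods of javax.xml.transform.Transformer: plain bookkeeping.
    @Override public void setParameter(String name, Object value) { parameters.put(name, value); }
    @Override public Object getParameter(String name) { return parameters.get(name); }
    @Override public void clearParameters() { parameters.clear(); }
    @Override public void setURIResolver(URIResolver resolver) { uriResolver = resolver; }
    @Override public URIResolver getURIResolver() { return uriResolver; }
    @Override public void setOutputProperties(Properties format) {
        outputProperties.clear();
        if (format != null) {
            outputProperties.putAll(format);
        }
    }
    @Override public Properties getOutputProperties() { return (Properties) outputProperties.clone(); }
    @Override public void setOutputProperty(String name, String value) { outputProperties.setProperty(name, value); }
    @Override public String getOutputProperty(String name) { return outputProperties.getProperty(name); }
    @Override public void setErrorListener(ErrorListener listener) { errorListener = listener; }
    @Override public ErrorListener getErrorListener() { return errorListener; }

    /** Convenience helper for trying the sketch out on a string. */
    static String apply(String input) {
        StringWriter out = new StringWriter();
        try {
            new UpperCaseTransformerSketch().transform(
                new StreamSource(new StringReader(input)), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(apply("Hello, importer"));
    }
}
```

Because this transformer reads non-XML character data, its factory would have to decline DOMSource.FEATURE, as described earlier in this section.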

According to the getFeature information of the factory, the importer calls the transform method either individually for each source document in the form of a StreamSource or DOMSource, or once for all source documents in the form of a MultiSource. In both cases, the importer calls the getTransformer method of the factory only once and transforms all documents with the same instance of the transformer.
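The per-document calling pattern can be mimicked with plain JAXP. This is only a sketch of the behavior described above, not the importer's actual code: the class PerDocumentLoopSketch and its methods are hypothetical, and the JAXP identity transformer stands in for a user-defined transformer. The point is that a single instance is obtained once and then reused for every source document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Illustration only (hypothetical class): one transformer instance, one transform call per document. */
class PerDocumentLoopSketch {

    /** Stand-in for factory.getTransformer(name): returns the JAXP identity transformer. */
    static Transformer identity() {
        try {
            return TransformerFactory.newInstance().newTransformer();
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Calls transform once per source document, always on the same instance. */
    static List<String> transformAll(Transformer transformer, List<String> documents) {
        List<String> results = new ArrayList<>();
        for (String document : documents) {
            StringWriter out = new StringWriter();
            try {
                transformer.transform(
                    new StreamSource(new StringReader(document)), new StreamResult(out));
            } catch (TransformerException e) {
                throw new IllegalStateException(e);
            }
            results.add(out.toString());
        }
        return results;
    }

    public static void main(String[] args) {
        Transformer transformer = identity(); // obtained once, reused below
        for (String result : transformAll(transformer, Arrays.asList("<a>1</a>", "<b>2</b>"))) {
            System.out.println(result);
        }
    }
}
```

Since the same instance sees all documents, any state kept in fields of a transformer carries over from one transform call to the next.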
