Importer Manual / Version 2201

4.4.1 DOM Transformation

In the first step of the example, a special transformer is created that generates any missing <base> elements in the source documents. The DTD makes these elements optional, but the later transformation steps of this example depend on them. The following excerpt from an example document shows where the <base> element is located:

<?xml version="1.0"?>

<nitf>

<head>
<title>Snow, Freezing Rain Batter U.S. Northeast</title>
<base href="xmlnewssample.xml"/>
</head>

</nitf>

Example 4.20. <base> Element

In this example, the href attribute of the <base> element becomes the ID of the document. Therefore, all <base> elements generated within one importer process must have different href values. The transformer ensures this by combining a configurable prefix with a consecutive number. It processes all source documents in a single call. Here is the Java code of this transformer:

import javax.xml.transform.*;
import javax.xml.transform.dom.*;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.w3c.dom.*;

import com.coremedia.publisher.importer.*;

public class BaseMaker extends AbstractTransformer {

    private static final Logger LOG
        = LoggerFactory.getLogger(BaseMaker.class);

    // Returns the first child of parent that is an Element
    // with the specified tag name, or null if there is none.
    protected Element getChild(Node parent, String tag) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE &&
                tag.equals(child.getNodeName())) {
                return (Element) child;
            }
        }
        return null;
    }
    
    protected void makeBase(MultiSource src, MultiResult res) 
     throws Exception {
        // loop over the source documents
        for (int i=0; i<src.size(); i++) {
            // get a source document in DOM format
            Source domsrc = src.getSource(i, DOMSource.FEATURE);
            
            // check for a base Element
            Document doc = (Document)((DOMSource)domsrc).getNode();
            Element nitf = doc.getDocumentElement();
            Element head = getChild(nitf, "head");
            if (head == null) {
                // head is needed as the parent of base; fail with a
                // clear message instead of a NullPointerException
                throw new TransformerException(
                    "No <head> element in " + domsrc.getSystemId());
            }
            Element base = getChild(head, "base");

            if (base == null) {
                // no base Element yet, generate one
                base = doc.createElement("base");

                // the prefix parameter must be set by the factory
                base.setAttribute("href",
                    getParameter("prefix").toString() + i);
                head.appendChild(base);
            }
            
            // modified or not, append the document to the 
            // MultiResult
            String systemId = domsrc.getSystemId();
            Result domres = res.addNewResult(DOMResult.FEATURE,
                systemId);
            ((DOMResult)domres).setNode(doc);
        }
    }
    
    public void transform(Source src, Result res) throws 
     TransformerException {
        // Log the name this transformer was configured with
        String name =
         getParameter(TransformerParameters.NAME).toString();
        LOG.info(name);

        try {
            // The factory assures that you have a MultiSource and
            // a MultiResult
            makeBase((MultiSource)src, (MultiResult)res);
        } catch (TransformerException exc) {
            throw exc;
        } catch (Exception exc) {
            throw new TransformerException(exc);
        }
    }
    
}

Example 4.21. BaseMaker.java

The transform method first writes a log message containing the name of the transformer. The name is available as a parameter because the GeneralTransformerFactoryImpl factory, which you will use to instantiate this transformer, provides it. The actual transformation is delegated to the makeBase method. transform itself only catches exceptions and, if necessary, wraps them in a TransformerException.
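The DOM handling inside makeBase can be tried out independently of the Importer API. The following is a minimal sketch that uses only the JDK's XML classes; the class name BaseSketch and the helper methods ensureBase and parse are hypothetical names chosen for this illustration and are not part of the importer.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical stand-alone class, not part of the Importer API
public class BaseSketch {

    // Returns the first child element of parent with the given tag name, or null
    static Element getChild(Node parent, String tag) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Adds <base href="prefix + index"/> to <head> unless a <base> already exists.
    // Returns the href value the document has afterwards.
    static String ensureBase(Document doc, String prefix, int index) {
        Element head = getChild(doc.getDocumentElement(), "head");
        if (head == null) {
            throw new IllegalArgumentException("document has no <head> element");
        }
        Element base = getChild(head, "base");
        if (base == null) {
            base = doc.createElement("base");
            base.setAttribute("href", prefix + index);
            head.appendChild(base);
        }
        return base.getAttribute("href");
    }

    // Parses an XML string into a DOM document
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document withBase = parse(
            "<nitf><head><title>A</title><base href=\"xmlnewssample.xml\"/></head></nitf>");
        Document withoutBase = parse("<nitf><head><title>B</title></head></nitf>");
        System.out.println(ensureBase(withBase, "XmlNews", 0));    // existing href is kept
        System.out.println(ensureBase(withoutBase, "XmlNews", 1)); // href is generated
    }
}
```

An existing <base> element is left untouched, so its href remains the document ID; only documents without one receive a generated value.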

Instead of transforming all documents at once, it would also be possible to manage the consecutive numbers in a static variable and call the transformer individually for each document. This would make the loop over the source documents and the construction of the MultiResult unnecessary. However, this example is intended to demonstrate the use of the Importer API.
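The static-variable alternative could look roughly like the following sketch. HrefCounter is a hypothetical helper class, not part of the Importer API, and the sketch assumes that all transformer instances run in the same JVM, so that one shared counter is enough to keep the generated values distinct.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the Importer API
public class HrefCounter {

    // Shared by all instances, so numbers are not reused within one JVM
    private static final AtomicInteger NEXT = new AtomicInteger();

    private final String prefix;

    public HrefCounter(String prefix) {
        this.prefix = prefix;
    }

    // Returns the prefix extended with the next consecutive number
    public String nextHref() {
        return prefix + NEXT.getAndIncrement();
    }

    public static void main(String[] args) {
        // Two instances, as if the transformer were created once per document
        HrefCounter first = new HrefCounter("XmlNews");
        HrefCounter second = new HrefCounter("XmlNews");
        System.out.println(first.nextHref());
        System.out.println(second.nextHref());
    }
}
```

An AtomicInteger is used instead of a plain static int so that the numbering stays consistent even if transformers were ever called from several threads.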

Next, a factory is required that instantiates and configures the transformer. GeneralTransformerFactoryImpl alone is not sufficient, because it does not support passing the prefix to the transformer. The following extension adds this:

import javax.xml.transform.*;
import com.coremedia.publisher.importer.*;

public class BaseMakerFactory extends
 GeneralTransformerFactoryImpl {
    private String prefix = null;

    public BaseMakerFactory() throws ClassNotFoundException {
        super(MultiSource.FEATURE, MultiResult.FEATURE, "BaseMaker");
    }

    public void setPrefix(String prefix) {
        this.prefix = prefix;
    }
    
    public Transformer getTransformer(String name) 
    throws Exception {
        Transformer trf = super.getTransformer(name);
        trf.setParameter("prefix", prefix);
        return trf;
    }
}

Example 4.22. BaseMakerFactory.java

The transformer now only has to be entered in the configuration file of the importer. Since GeneralTransformerFactoryImpl had to be extended anyway to support the prefix, the class of the transformer (BaseMaker) and the source and result formats are also hard-coded in BaseMakerFactory. In the configuration file, therefore, only the prefix needs to be specified in addition to the factory class and the name.

import.transformer.10.class=BaseMakerFactory
import.transformer.10.name=Create missing base elements
import.transformer.10.property.prefix=XmlNews

Example 4.23. Configuration
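The property.prefix entry corresponds to the setPrefix method of BaseMakerFactory. How the importer performs this mapping internally is not shown here; the following sketch only illustrates the general idea of bean-style property mapping, under the assumption that each property.<name> key is passed to a matching set<Name>(String) method. PropertyMappingSketch, DemoFactory, applyProperties and load are hypothetical names for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.lang.reflect.Method;
import java.util.Properties;

// Hypothetical illustration, not the importer's actual implementation
public class PropertyMappingSketch {

    // Stand-in for BaseMakerFactory; only the setter matters here
    public static class DemoFactory {
        private String prefix;
        public void setPrefix(String prefix) { this.prefix = prefix; }
        public String getPrefix() { return prefix; }
    }

    // Calls set<Name>(String) on target for every "<base>property.<name>" entry
    static void applyProperties(Object target, Properties config, String base) {
        String marker = base + "property.";
        for (String key : config.stringPropertyNames()) {
            if (!key.startsWith(marker)) {
                continue;
            }
            String name = key.substring(marker.length());
            if (name.isEmpty()) {
                continue;
            }
            String setter = "set" + Character.toUpperCase(name.charAt(0))
                    + name.substring(1);
            try {
                Method method = target.getClass().getMethod(setter, String.class);
                method.invoke(target, config.getProperty(key));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot set property " + name, e);
            }
        }
    }

    // Reads configuration text in java.util.Properties format
    static Properties load(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    public static void main(String[] args) {
        Properties config = load(
            "import.transformer.10.class=BaseMakerFactory\n"
            + "import.transformer.10.name=Create missing base elements\n"
            + "import.transformer.10.property.prefix=XmlNews\n");
        DemoFactory factory = new DemoFactory();
        applyProperties(factory, config, "import.transformer.10.");
        System.out.println(factory.getPrefix()); // prints "XmlNews"
    }
}
```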
