Importer Manual / Version 2401

4.2.3 Document Sets

If it is not possible to supply the importer with source documents according to the principle described in the previous section, you can implement your own mechanism on the basis of the Importer API. Understanding this section requires knowledge of Java; in particular, the Importer API is based on JAXP.

You can find the Importer API on the CoreMedia documentation site at https://documentation.coremedia.com/cmcc-12.

Unless the package is explicitly given, the classes and interfaces mentioned in this section come from the javax.xml hierarchy or from com.coremedia.publisher.importer. There are no name conflicts between these packages.

Internally, the importer works with document sets rather than files. A document set contains the source documents to be imported in one operation. A source document can be represented in several ways: indirectly via a URL, or directly as a series of bytes or as a DOM tree.

The composition of these document sets is controlled by a class that implements the MultiResultGeneratorFactory interface. The name of this class is entered in the configuration file:

### The com.coremedia.publisher.importer.MultiResultGenerator
### interface implementation to use
import.multiResultGeneratorFactory.class=com.coremedia.publisher.importer.SubDirGeneratorFactory

Example 4.11. MultiResultGeneratorFactory


The class SubDirGeneratorFactory creates generators which implement the file logic described in the previous section. SubDirGeneratorFactory is included with the delivery of the importer and is preset in the example configuration file properties/corem/cm-xmlimport.properties. You can use your own MultiResultGeneratorFactory implementation instead.

The class is instantiated by the importer with java.beans.Beans.instantiate. If the class supports Properties in the sense of java.beans.BeanInfo (see Oracle JavaBeans API Specification), these properties can be specified in the configuration file and are then set by the importer. For example, SubDirGeneratorFactory requires the properties inbox and sleepingSeconds:

### Path to inbox (may be relative to $COREM_HOME):
import.multiResultGeneratorFactory.property.inbox = <my/inbox/directory>

### Seconds to sleep between importer runs
import.multiResultGeneratorFactory.property.sleepingSeconds = -1

Example 4.12. MultiResultGeneratorFactory


Such property entries have the format

import.multiResultGeneratorFactory.property.<propertyName> = <propertyValue>

and their meaning is specific to the particular MultiResultGeneratorFactory implementation. Therefore, when configuring your own factory, you are free to choose the names and the number of properties; you are not confined to inbox and sleepingSeconds.

All properties are set by the importer as strings. For example, the class SubDirGeneratorFactory must transform the value of sleepingSeconds into a number.
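The following is a minimal sketch of this mechanism, not the importer's actual code. It instantiates a bean with java.beans.Beans.instantiate and sets string-valued properties through the setters found by java.beans.Introspector. The class DemoFactory and the method configure are hypothetical names chosen for illustration; DemoFactory does not implement the real MultiResultGeneratorFactory interface.

```java
import java.beans.Beans;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;

class BeanPropertyDemo {

    // Hypothetical factory bean; it does not implement the real CoreMedia interface.
    public static class DemoFactory {
        private String inbox;
        private int sleepingSeconds;

        public void setInbox(String inbox) { this.inbox = inbox; }

        // The value arrives as a String, so the bean converts it itself.
        public void setSleepingSeconds(String value) {
            this.sleepingSeconds = Integer.parseInt(value.trim());
        }

        public String inbox() { return inbox; }
        public int sleepingSeconds() { return sleepingSeconds; }
    }

    // Instantiates className and passes each matching entry of props to its setter.
    static Object configure(String className, Map<String, String> props) {
        try {
            Object bean = Beans.instantiate(BeanPropertyDemo.class.getClassLoader(), className);
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(bean.getClass()).getPropertyDescriptors()) {
                String value = props.get(pd.getName());
                if (value != null && pd.getWriteMethod() != null) {
                    pd.getWriteMethod().invoke(bean, value); // always passed as a String
                }
            }
            return bean;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot configure " + className, e);
        }
    }

    public static void main(String[] args) {
        DemoFactory factory = (DemoFactory) configure(DemoFactory.class.getName(),
                Map.of("inbox", "my/inbox", "sleepingSeconds", "30"));
        System.out.println(factory.inbox() + " " + factory.sleepingSeconds());
    }
}
```

Because every value arrives as a string, any conversion (here Integer.parseInt) is the responsibility of the factory class.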

After the properties have been set, the importer obtains the actual generator for the document set from the factory with getMultiResultGenerator(). Using a factory here has the advantage that the factory can use the properties to configure the generator in any desired way.

getMultiResultGenerator() must return a MultiResultGenerator object which supports the methods fail, next and success. The importer calls next to obtain a new set of source documents. With success and fail, the importer informs the generator whether the import of the source documents delivered by the preceding call to next succeeded or failed. After calling success or fail, the importer no longer accesses those source documents.
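The calling sequence can be sketched as follows. The interface Generator is a hypothetical stand-in for MultiResultGenerator, a plain String stands in for a document set, and importSet is a placeholder for the real import. The sketch also assumes that the loop continues after a failed set, which this section does not state.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class GeneratorLoopDemo {

    // Hypothetical stand-in for MultiResultGenerator; a String stands in for a MultiResult.
    interface Generator {
        String next();   // null: nothing left, the loop ends
        void success();  // the set from the last next() was imported
        void fail();     // the set from the last next() could not be imported
    }

    // In-memory generator that records how each delivered set ended.
    static class ListGenerator implements Generator {
        private final Iterator<String> sets;
        private String current;
        final List<String> log = new ArrayList<>();

        ListGenerator(List<String> sets) { this.sets = sets.iterator(); }

        public String next() {
            current = sets.hasNext() ? sets.next() : null;
            return current;
        }
        public void success() { log.add(current + ":success"); }
        public void fail() { log.add(current + ":fail"); }
    }

    // Placeholder for the real import; sets named "broken..." fail on purpose.
    static void importSet(String set) {
        if (set.startsWith("broken")) {
            throw new IllegalArgumentException("cannot import " + set);
        }
    }

    // Calling sequence: next, then exactly one of success or fail, until next returns null.
    static void run(Generator generator) {
        String set;
        while ((set = generator.next()) != null) {
            try {
                importSet(set);
                generator.success();
            } catch (RuntimeException e) {
                generator.fail();
            }
        }
    }

    public static void main(String[] args) {
        ListGenerator generator = new ListGenerator(List.of("a", "broken-b", "c"));
        run(generator);
        System.out.println(generator.log);
    }
}
```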

For example, on each call to next, the generator created by SubDirGeneratorFactory delivers either a single file located directly in the inbox directory or all files of one subdirectory. After the complete contents of the inbox directory have been imported, the generator delays the next execution of next by the time set in the sleepingSeconds property, and then delivers the files that have arrived in the meantime. The success method moves the file or subdirectory to the bak directory; fail moves it to the err directory.
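A custom generator could handle finished entries in a similar way. The sketch below is not the SubDirGeneratorFactory implementation. The class InboxMoveDemo and its methods are hypothetical, and the bak and err locations are simply passed in as paths, which is an assumption about the layout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class InboxMoveDemo {
    private final Path bak;
    private final Path err;
    private Path current; // entry handed out by the last next() call

    // Assumption: bak and err are given explicitly; the real layout may differ.
    InboxMoveDemo(Path bak, Path err) {
        this.bak = bak;
        this.err = err;
    }

    void delivered(Path entry) { current = entry; }
    void success() { moveCurrentTo(bak); }
    void fail() { moveCurrentTo(err); }

    // Moves the current file or subdirectory into targetDir, creating it if needed.
    private void moveCurrentTo(Path targetDir) {
        try {
            Files.createDirectories(targetDir);
            Files.move(current, targetDir.resolve(current.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs the sketch in a temporary directory; true if both files ended up where expected.
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("importer-demo");
            Path inbox = Files.createDirectories(root.resolve("inbox"));
            Path good = Files.writeString(inbox.resolve("good.xml"), "<a/>");
            Path bad = Files.writeString(inbox.resolve("bad.xml"), "<a>");
            InboxMoveDemo mover = new InboxMoveDemo(root.resolve("bak"), root.resolve("err"));
            mover.delivered(good);
            mover.success();
            mover.delivered(bad);
            mover.fail();
            return Files.exists(root.resolve("bak").resolve("good.xml"))
                    && Files.exists(root.resolve("err").resolve("bad.xml"))
                    && Files.notExists(good) && Files.notExists(bad);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```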

If the next method of the generator returns null, the importer ends. In the normal case, however, it delivers a new document set in the form of a MultiResult. MultiResult implements the Result interface and therefore fits into the concept of JAXP next to StreamResult, SAXResult and DOMResult. A new empty MultiResult is created via MultiResultFactory.getInstance().getMultiResult(). New documents are added to MultiResult via addNewResult. There are two variants of this method:

void addNewResult(String systemId) throws Exception;

Result addNewResult(String format, String systemId) 
throws Exception;

Example 4.13. MultiResult.addNewResult


The first variant adds a document by reference. The systemId must be a URL that the importer can read as an input stream, for example a file path.

With the second variant, the document data can be entered directly. The method returns a Result in which the data can be deposited. The parameter format determines whether the result is a DOMResult, a StreamResult or again a MultiResult. (SAX is not yet supported in this version.) Valid values for format are StreamResult.FEATURE, DOMResult.FEATURE or MultiResult.FEATURE. In this variant, the systemId is not used as a data source but, according to the JAXP concept, only as the basis for resolving relative URLs. Depending on the result type, the generator can store the source document with DOMResult.setNode or StreamResult.getOutputStream().write, or construct a more deeply nested document hierarchy with MultiResult.addNewResult. (The nesting plays no role for the importer; it could at most be used by special transformers.)
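Depositing data in the two standard JAXP result types can be sketched as follows. To keep the sketch runnable with the JDK alone, the DOMResult and StreamResult are created directly instead of being obtained from addNewResult. It also assumes that a StreamResult returned by the importer already carries an output stream, which this section does not state explicitly. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

class ResultDepositDemo {

    // Stores a small DOM tree in a DOMResult via setNode.
    static DOMResult fillDom(DOMResult result) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("article"));
            result.setNode(doc);
            return result;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes the document bytes to the stream behind a StreamResult.
    static StreamResult fillStream(StreamResult result, String xml) {
        try {
            result.getOutputStream().write(xml.getBytes(StandardCharsets.UTF_8));
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In a real generator these results would come from addNewResult(format, systemId).
        DOMResult dom = fillDom(new DOMResult());
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        fillStream(new StreamResult(bytes), "<article/>");
        System.out.println(dom.getNode().getFirstChild().getNodeName());
        System.out.println(new String(bytes.toByteArray(), StandardCharsets.UTF_8));
    }
}
```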

A final example shows a simple next method which returns one file of a directory on each call.

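import java.io.File;
import com.coremedia.publisher.importer.MultiResult;
import com.coremedia.publisher.importer.MultiResultFactory;

// Fields and next method of a MultiResultGenerator implementation;
// fail and success are omitted here.
File inbox = new File("/tmp/inbox");
// listFiles() returns null if the inbox does not exist or is not a directory.
File[] files = inbox.listFiles();
int index = 0;

public MultiResult next() {
    try {
        if (files != null && index < files.length) {
            MultiResult mr = MultiResultFactory.getInstance().getMultiResult();
            // Add the next file by reference; the importer reads it itself.
            mr.addNewResult(files[index++].getAbsolutePath());
            return mr;
        }
    } catch (Exception e) {
        System.err.println("Could not create document set: " + e);
    }
    // Returning null ends the importer.
    return null;
}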

Example 4.14. next


If no transformers are entered in the configuration file, the sets of source documents delivered by next are imported directly. Of course, this only works if the documents are CoreMedia XML documents matching the content types of the CoreMedia CMS. Typically, however, a transformation of the documents is necessary to achieve the correct format. This is the subject of the next section.
