Importer Manual / Version 2401
If it is not possible to supply the importer with source documents according to the principle described in the previous section, you can implement your own mechanism on the basis of the Importer API. Understanding this section requires knowledge of Java; in particular, the Importer API is based on JAXP.
You can find the Importer API on the CoreMedia documentation site at https://documentation.coremedia.com/cmcc-12.
Unless the package is explicitly given, the classes and interfaces mentioned in this section come from the javax.xml hierarchy or from com.coremedia.publisher.importer. There are no name conflicts between these packages.
Internally, the importer works with document sets rather than files. A document set contains the source documents to be imported in one operation. A source document can be represented in various ways: indirectly via a URL, or directly as a series of bytes or as a DOM tree.
How these document sets are assembled is controlled by a class that implements the MultiResultGeneratorFactory interface. The name of this class is entered in the configuration file:
### The com.coremedia.publisher.importer.MultiResult
### Generator interface
### implementation to use
import.multiResultGeneratorFactory.class=com.coremedia.publisher.importer.SubDirGeneratorFactory
Example 4.11. MultiResultGeneratorFactory
The class SubDirGeneratorFactory creates generators which implement the file logic described in the previous section. SubDirGeneratorFactory is included with the delivery of the importer and is preset in the example configuration file properties/corem/cm-xmlimport.properties. Instead, you can use your own MultiResultGeneratorFactory.
The class is instantiated by the importer with java.beans.Beans.instantiate. If the class supports properties in the sense of java.beans.BeanInfo (see Oracle JavaBeans API Specification), these properties can be specified in the configuration file and are then set by the importer. For example, SubDirGeneratorFactory requires the properties inbox and sleepingSeconds:
### Path to inbox (may be relative to $COREM_HOME):
import.multiResultGeneratorFactory.property.inbox = <my/inbox/directory>
### Seconds to sleep between importer runs
import.multiResultGeneratorFactory.property.sleepingSeconds = -1
Example 4.12. MultiResultGeneratorFactory
Such property entries have the format import.multiResultGeneratorFactory.property.<propertyName> = <propertyValue>, and their meaning is specific to the particular MultiResultGeneratorFactory implementation. Therefore, when configuring your own factory, you are free to choose the names and the number of properties. You are not confined to inbox and sleepingSeconds.
The importer sets all properties as strings. The class SubDirGeneratorFactory, for example, must itself convert the value of sleepingSeconds into a number.
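The mechanism described above can be sketched with the JDK's java.beans package alone. This is a minimal illustration, not the importer's actual code: the classes BeanConfigDemo and DemoGeneratorFactory and the method instantiateAndConfigure are hypothetical names, and the demo bean deliberately does not implement the real MultiResultGeneratorFactory interface so that the example stays self-contained. It assumes that setters take a String, as stated above.

```java
import java.beans.BeanInfo;
import java.beans.Beans;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;

class BeanConfigDemo {

    // Hypothetical factory bean. Beans.instantiate needs a public class
    // with a public no-argument constructor.
    public static class DemoGeneratorFactory {
        private String inbox;
        private int sleepingSeconds;

        public DemoGeneratorFactory() {}

        public String getInbox() { return inbox; }
        public void setInbox(String inbox) { this.inbox = inbox; }

        // The value arrives as a string; the bean converts it to a number itself.
        public void setSleepingSeconds(String value) {
            this.sleepingSeconds = Integer.parseInt(value.trim());
        }

        public int sleepingSecondsAsInt() { return sleepingSeconds; }
    }

    // Instantiates the named bean and passes each configured string value
    // to the setter of the property with the same name.
    static Object instantiateAndConfigure(String className, Map<String, String> props) {
        try {
            Object bean = Beans.instantiate(BeanConfigDemo.class.getClassLoader(), className);
            BeanInfo info = Introspector.getBeanInfo(bean.getClass());
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                String value = props.get(pd.getName());
                if (value != null && pd.getWriteMethod() != null) {
                    pd.getWriteMethod().invoke(bean, value);
                }
            }
            return bean;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot configure " + className, e);
        }
    }
}
```

In this sketch, the map passed to instantiateAndConfigure plays the role of the import.multiResultGeneratorFactory.property.* entries with the prefix already stripped.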
After the properties have been set, the importer obtains the actual generator for the document set from the factory with getMultiResultGenerator(). Using a factory here has the advantage that the factory can use the properties to configure the generator in any desired way.
getMultiResultGenerator() must return a MultiResultGenerator object which supports the methods fail, next and success. next is called by the importer in order to create a new set of source documents. With success and fail, the importer informs the generator about the success or failure of the import of the source documents delivered by the previous next call. After calling success or fail, the importer no longer accesses the source documents.
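The calling protocol between importer and generator can be illustrated with a small stand-alone sketch. Everything in it is a stand-in: DemoGenerator only mimics the three methods named above (the real MultiResultGenerator interface may declare different signatures), document sets are reduced to lists of file names, and ImportLoopDemo.run is a hypothetical substitute for the importer's main loop, which ends when next returns null.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for MultiResultGenerator; signatures are assumptions.
interface DemoGenerator {
    List<String> next();   // a new document set, or null when nothing is left
    void success();        // the set from the previous next() was imported
    void fail();           // the set from the previous next() could not be imported
}

// Delivers prepared document sets one by one and records the feedback.
class QueueGenerator implements DemoGenerator {
    private final Deque<List<String>> pending = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();
    private List<String> current;

    QueueGenerator(List<List<String>> sets) { pending.addAll(sets); }

    public List<String> next() { current = pending.poll(); return current; }
    public void success() { log.add("success:" + current); }
    public void fail() { log.add("fail:" + current); }
}

class ImportLoopDemo {
    // Stand-in for the importer: fetch a set, import it, report the outcome.
    static void run(DemoGenerator generator) {
        List<String> documents;
        while ((documents = generator.next()) != null) {
            try {
                importDocuments(documents);
                generator.success();
            } catch (RuntimeException e) {
                generator.fail();
            }
        }
    }

    // Pretend import: names ending in ".bad" simulate a failing document.
    static void importDocuments(List<String> documents) {
        for (String name : documents) {
            if (name.endsWith(".bad")) {
                throw new IllegalArgumentException("Cannot import " + name);
            }
        }
    }
}
```

Exactly one of success or fail follows each non-null result of next, which is what allows a generator to clean up or archive the previous set before delivering the following one.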
For example, on each next call, the generator created by SubDirGeneratorFactory delivers either a single file located directly in the inbox directory or all files of one subdirectory. After the complete contents of the inbox directory have been imported, the generator delays the next execution of the next method by the time set in the property sleepingSeconds, and then delivers the files which have newly arrived in the meantime to the importer. The success method moves the file or subdirectory to the bak directory, fail moves it to the err directory.
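A custom generator that wants a similar archiving behavior could move processed sources with java.nio.file. The following is only a sketch of that idea and not the code of SubDirGeneratorFactory: the class ArchivingHelper is hypothetical, and the locations of the bak and err directories are passed in because this section does not say where they live.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper that a generator could call from success() and fail().
class ArchivingHelper {
    private final Path bakDir;
    private final Path errDir;

    ArchivingHelper(Path bakDir, Path errDir) {
        this.bakDir = bakDir;
        this.errDir = errDir;
    }

    // Called after a successful import: archive the source under bak.
    Path success(Path source) { return moveInto(bakDir, source); }

    // Called after a failed import: keep the source under err for inspection.
    Path fail(Path source) { return moveInto(errDir, source); }

    // Moves a file or directory into targetDir, keeping its name.
    private static Path moveInto(Path targetDir, Path source) {
        try {
            Files.createDirectories(targetDir);
            Path target = targetDir.resolve(source.getFileName());
            return Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Moving the source out of the inbox in both cases ensures that a later scan of the inbox does not deliver the same document again.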
If the next method of the generator returns null, the importer terminates. Normally, however, next delivers a new document set in the form of a MultiResult.
MultiResult implements the Result interface and therefore fits into the concept of JAXP next to StreamResult, SAXResult and DOMResult. A new empty MultiResult is created via MultiResultFactory.getInstance().getMultiResult(). New documents are added to MultiResult via addNewResult. There are two variants of this method:
void addNewResult(String systemId) throws Exception;
Result addNewResult(String format, String systemId) throws Exception;
Example 4.13. MultiResult.addNewResult
The first variant is used for entering a document via a reference. The systemId must be a URL which can be read by the importer as an input stream, for example a file path.
With the second variant, the document data can be entered directly. The method returns a Result in which the data can be deposited. The parameter format determines whether the result is a DOMResult, a StreamResult or again a MultiResult (SAX is not yet supported in this version). Valid values for format are StreamResult.FEATURE, DOMResult.FEATURE or MultiResult.FEATURE. In this variant, the systemId is not used as a data source but, according to the JAXP concept, only as the basis for the resolution of relative URLs. Depending on the result type, the generator can store the source document with DOMResult.setNode or StreamResult.setOutputStream().write, or construct a more deeply nested document hierarchy with MultiResult.addNewResult. (The nesting plays no role for the importer; it could at most be used by special transformers.)
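How data is deposited in the two standard JAXP result types can be shown with JDK classes alone. In the sketch below, ResultFiller and its methods are hypothetical; newResult merely imitates what the second addNewResult variant might hand back for a given format value and is not the CoreMedia implementation. Note that in standard JAXP, StreamResult.setOutputStream returns void, so the stream is set first and written to in a separate step.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Result;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class ResultFiller {

    // Hypothetical imitation of addNewResult(format, systemId):
    // picks the result type from the FEATURE constant and records the systemId.
    static Result newResult(String format, String systemId) {
        Result result;
        if (DOMResult.FEATURE.equals(format)) {
            result = new DOMResult();
        } else if (StreamResult.FEATURE.equals(format)) {
            result = new StreamResult(new ByteArrayOutputStream());
        } else {
            throw new IllegalArgumentException("Unsupported format: " + format);
        }
        result.setSystemId(systemId); // only a base for resolving relative URLs
        return result;
    }

    // Deposits a tiny <article> document in the given result.
    // The text is assumed to contain no XML special characters.
    static void fill(Result result, String text) {
        try {
            if (result instanceof DOMResult) {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder().newDocument();
                Element root = doc.createElement("article");
                root.setTextContent(text);
                doc.appendChild(root);
                ((DOMResult) result).setNode(doc);
            } else if (result instanceof StreamResult) {
                StreamResult stream = (StreamResult) result;
                if (stream.getOutputStream() == null) {
                    stream.setOutputStream(new ByteArrayOutputStream());
                }
                String xml = "<article>" + text + "</article>";
                stream.getOutputStream().write(xml.getBytes(StandardCharsets.UTF_8));
            } else {
                throw new IllegalArgumentException("Unsupported result type");
            }
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Cannot fill result", e);
        }
    }
}
```

A generator would typically choose the DOM variant when it already holds a parsed tree and the stream variant when it has raw bytes, for example from a database or a network source.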
A final example shows a simple next method which returns one file of a directory on each call.
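// Requires java.io.File and the Importer API classes.
File inbox = new File("/tmp/inbox");
// listFiles() returns null if the directory does not exist or cannot be read
File[] files = inbox.listFiles();
int index = 0;

public MultiResult next() {
  try {
    if (files != null && index < files.length) {
      // Create an empty document set and add the next file by reference
      MultiResult mr = MultiResultFactory.getInstance().getMultiResult();
      mr.addNewResult(files[index++].getAbsolutePath());
      return mr;
    }
  } catch (Exception e) {
    System.err.println("Something went wrong: " + e);
  }
  // null tells the importer that there is nothing left to import
  return null;
}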
Example 4.14. next
If no transformers are entered in the configuration file, the sets of source documents delivered by next are imported directly. Of course, this only works if the documents are CoreMedia XML documents matching the content types of the CoreMedia CMS. Typically, however, a transformation of the documents is necessary to achieve the correct format. This is the subject of the next section.