Importer Manual / Version 2412.0
The standard configuration of the importer assumes that the source documents exist as files in certain directories. These directories are set in the configuration file, separated by semicolons:
### Path to inbox (may be relative to $COREM_HOME):
import.multiResultGeneratorFactory.property.inbox = <my/inbox/directory1>;<my/inbox/directory2>; <my/inbox/directory3>
Example 4.9. Inbox directories
Files that lie directly in an inbox directory are imported individually. To import several source files in one operation, combine them in a subdirectory. Such subdirectories can have any name except "bak" and "err", which are reserved. The importer creates these two subdirectories and moves successfully imported source files and subdirectories to bak and failed ones to err.
At the time of the import, the files must be complete. In particular, all files that should be imported in one operation must already be present in their subdirectory. Therefore, under Unix, both individual files and complete subdirectories should only be moved to the inbox directories in one complete step with mv, not by means of successive copying or writing.
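Where files are delivered by a Java process rather than by a shell script, the single-step move can be sketched as follows. This is a minimal illustration and not part of the importer: the class InboxDelivery and its deliver method are hypothetical names. The sketch assumes that the staging location lies on the same file system as the inbox, since only then can the move be atomic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Set;

/** Hypothetical helper: hands a finished file or batch to the inbox in one step. */
final class InboxDelivery {

  // "bak" and "err" are reserved by the importer, so they are rejected here.
  private static final Set<String> RESERVED = Set.of("bak", "err");

  /**
   * Moves a completely written file or directory into the inbox.
   * ATOMIC_MOVE makes the JDK throw AtomicMoveNotSupportedException
   * instead of silently falling back to copy-and-delete, for example
   * when staging area and inbox are on different file systems.
   */
  static Path deliver(Path staged, Path inbox) throws IOException {
    String name = staged.getFileName().toString();
    if (RESERVED.contains(name)) {
      throw new IllegalArgumentException("Reserved inbox name: " + name);
    }
    return Files.move(staged, inbox.resolve(name), StandardCopyOption.ATOMIC_MOVE);
  }
}
```

A caller would first write all files of a batch into a staging directory outside the inbox and call deliver only once the batch is complete.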
When all files are processed or the inbox directories are empty, the importer exits. To check for new files continuously, the environment variable IMPORT_SLEEPING_SECONDS can be used in a Docker setup (see Section 3.4, “Deployment and Operation of an Importer in Docker”).
### Seconds to sleep between importer runs
import.multiResultGeneratorFactory.property.sleepingSeconds = -1
Example 4.10. Sleeping seconds
The inbox directories may contain files that should not be imported as source documents. A typical example is graphics that are referenced without a path by a blob property in a CoreMedia XML document. Such graphics must lie in the same directories so that the importer finds them, but they should not be imported as independent source documents. For such cases, a further property can be set in addition to inbox and sleepingSeconds: filenameFilterClass. The value of this property must be the name of a Java class that implements the java.io.FilenameFilter interface. If this property is specified, a file is only imported if its name is accepted by a FilenameFilter of this class. If your files have meaningful names, it is usually possible to decide by the filename extension whether a file is a source document or another file.
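A filter that decides by filename extension could look like the following sketch. The class name XmlOnlyFilenameFilter and the choice of ".xml" as the extension of source documents are assumptions made for illustration. The fully qualified class name would then be given as the value of filenameFilterClass, presumably under the same import.multiResultGeneratorFactory.property. prefix as inbox and sleepingSeconds.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Locale;

/**
 * Hypothetical filter: accepts only names ending in ".xml", so that
 * referenced graphics in the same directory are not imported as
 * independent source documents.
 * Assumption: since the importer is given only the class name, the class
 * probably has to be public, in its own file, with a no-argument constructor.
 */
class XmlOnlyFilenameFilter implements FilenameFilter {

  @Override
  public boolean accept(File dir, String name) {
    // Compare case-insensitively so that "ARTICLE.XML" is accepted as well.
    return name.toLowerCase(Locale.ROOT).endsWith(".xml");
  }
}
```

With this filter, a file such as article.xml would be accepted, while photo.jpg in the same directory would be skipped.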