Importer Manual / Version 2204
The standard configuration of the importer assumes that the source documents exist as files in certain directories. These directories are set in the configuration file, separated by semicolons:
### Path to inbox (may be relative to $COREM_HOME):
import.multiResultGeneratorFactory.property.inbox = <my/inbox/directory1>;<my/inbox/directory2>;<my/inbox/directory3>
Example 4.9. Inbox directories
Files which lie directly in an inbox directory are imported individually. If several source files should be imported in one operation, they must be combined in a subdirectory. Such subdirectories can have any name except "bak" and "err", which are reserved. The importer creates these two subdirectories itself; it moves successfully imported source files and subdirectories to bak and failed ones to err.
At the time of the import, the files must be complete. In particular, all files must
be present in subdirectories which should be imported in one operation. Therefore, both
individual files and complete subdirectories should (under Unix) only be moved to the inbox
directories in one complete step with mv, not by means of successive copying or writing.
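The same single-step delivery can be sketched in Java for programs that feed the inbox themselves. This is a minimal sketch, not part of the importer: the class InboxDelivery and the method deliver are hypothetical names, and the sketch assumes that the staging directory lies on the same file system as the inbox, so that the move is a plain rename.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the importer.
final class InboxDelivery {

    private InboxDelivery() {
    }

    /**
     * Moves a completely written file or subdirectory from a staging
     * location into the inbox in one step.
     *
     * Assumption: staging location and inbox are on the same file system.
     */
    static Path deliver(Path staged, Path inbox) throws IOException {
        Path target = inbox.resolve(staged.getFileName());
        // ATOMIC_MOVE throws AtomicMoveNotSupportedException instead of
        // falling back to copy-and-delete, so the importer never sees a
        // partially written file or subdirectory.
        return Files.move(staged, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

A caller would first write all files of a batch into a staging subdirectory and then call InboxDelivery.deliver(stagingSubdirectory, inboxDirectory) once for the whole subdirectory.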
If the inbox directories are empty, the importer goes to sleep. You can configure the
sleeping time in seconds, using the property
import.multiResultGeneratorFactory.property.sleepingSeconds. The special value
"-1" means that the importer does not wait for new files, but only imports the current
contents of the inbox directories once and then ends.
### Seconds to sleep between importer runs
import.multiResultGeneratorFactory.property.sleepingSeconds = -1
Example 4.10. Sleeping seconds
The inbox directories may contain files which should not be imported as source documents.
A typical example is graphics which are referenced without a path by a blob property in a
CoreMedia XML document. On the one hand, such graphics must lie in the same directories, so
that the importer finds them; on the other hand, they should not be imported as independent
source documents. For such cases, a further property can be set in addition to inbox and
sleepingSeconds: filenameFilterClass. The value of this property must be the name of a Java
class which implements the java.io.FilenameFilter interface. If this property is specified,
a file is only imported if its name is accepted by a FilenameFilter of this class. If your
files have meaningful names, it is usually possible to decide according to the filename
extension whether a file is a source document or another file.
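A filter that decides by filename extension could look like the following minimal sketch. The class name XmlSourceFilenameFilter and the choice of ".xml" as the extension of source documents are assumptions for illustration only. By analogy with the inbox and sleepingSeconds properties, the class would presumably be registered under import.multiResultGeneratorFactory.property.filenameFilterClass; this key is inferred here, not quoted from the configuration file.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Locale;

// Hypothetical filter class, not shipped with the importer.
// Assumption: for real use, the class is declared public in its own
// source file so that it can be instantiated by its configured name.
class XmlSourceFilenameFilter implements FilenameFilter {

    // Assumption: source documents carry the extension ".xml".
    private static final String SOURCE_EXTENSION = ".xml";

    @Override
    public boolean accept(File dir, String name) {
        // Compare case-insensitively, so that "ARTICLE.XML" is accepted too.
        return name.toLowerCase(Locale.ROOT).endsWith(SOURCE_EXTENSION);
    }
}
```

With such a filter, a file named article.xml would be accepted as a source document, while a graphic such as logo.gif in the same directory would be skipped and remain available for the blob reference.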