Studio Developer Manual / Version 2406.0
When a blob is uploaded into a property field, CoreMedia Studio detects an
appropriate MIME type based on the name and content of the uploaded file.
This is done with the help of the mimeTypeService bean, which is based on
Apache Tika. This service is able to detect many common file types.
If the file type is unknown, the MIME type suggested by the uploading browser
is used.
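The fallback described above can be sketched as follows. This is a minimal illustration, not the mimeTypeService API: the class MimeTypeFallbackSketch and its resolve method are hypothetical, and treating application/octet-stream (the generic type Tika reports for unrecognized content) as "unknown" is an assumption about how the fallback is triggered.

```java
/** Illustrative sketch only; not part of CoreMedia Studio or Apache Tika. */
final class MimeTypeFallbackSketch {

  /** Generic type assumed here to mean "detection was inconclusive". */
  static final String UNKNOWN = "application/octet-stream";

  /**
   * Prefers the detected MIME type and falls back to the type suggested
   * by the uploading browser when detection was inconclusive.
   */
  static String resolve(String detected, String browserSuggested) {
    boolean inconclusive =
        detected == null || detected.trim().isEmpty() || UNKNOWN.equals(detected);
    if (!inconclusive) {
      return detected;
    }
    if (browserSuggested != null && !browserSuggested.trim().isEmpty()) {
      return browserSuggested;
    }
    // Neither source produced a usable type.
    return UNKNOWN;
  }
}
```

With this sketch, a detected type always wins over the browser's suggestion, so a file detected as image/png keeps that type even if the browser reports something else.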
MIME Type Service Configuration
If you need to adapt the MIME Type Service configuration as proposed in the following paragraphs, you will find more details in Section “MIME Type Mappings” in the Content Application Developer Manual.
To add new file types, add the corresponding MIME type definitions to the file
shared/common/modules/shared/custom-mime-types/src/main/resources/org/apache/tika/mime/custom-mimetypes.xml.
The list of MIME type definition file names can be extended by setting
mimeTypeService.mimeTypesResourceNames. If you want to place your MIME type
definitions in a file other than org/apache/tika/mime/custom-mimetypes.xml,
create a corresponding file in
shared/common/modules/shared/custom-mime-types/src/main/resources/.
Then set mimeTypeService.mimeTypesResourceNames to include the predefined paths
plus the relative path of your new file. The following example shows how to add
a new resource file com/acme/project/acme-mimetypes.xml.
mimeTypeService.mimeTypesResourceNames=org/apache/tika/mime/coremedia-tika-mimetypes.xml,org/apache/tika/mime/custom-mimetypes.xml,com/acme/project/acme-mimetypes.xml
Example 9.80. Add Custom Resource to MIME Type Definitions
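Because a typo in one of the listed paths is easy to miss, a small check of the property value may help. The sketch below assumes that the value is a comma-separated list of classpath resource names, as in the example above, and that surrounding whitespace can be ignored. The class MimeTypeResourceCheck and its methods are hypothetical helpers, not part of CoreMedia or Tika.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not part of CoreMedia Studio or Apache Tika. */
final class MimeTypeResourceCheck {

  /** Splits the comma-separated property value into trimmed, non-empty resource names. */
  static List<String> parse(String propertyValue) {
    List<String> names = new ArrayList<>();
    for (String part : propertyValue.split(",")) {
      String name = part.trim();
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  /** Returns the resource names that the given class loader cannot resolve. */
  static List<String> missing(List<String> names, ClassLoader loader) {
    List<String> missing = new ArrayList<>();
    for (String name : names) {
      if (loader.getResource(name) == null) {
        missing.add(name);
      }
    }
    return missing;
  }
}
```

Run against the class loader of the module that contains the definition files, an empty result from missing(...) suggests that every listed file is at least on the classpath. It does not validate the content of the files.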
You will find an example of a MIME type definition in
Example 9.81, “Override *.exe MIME Type Detection”.
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
  <mime-type type="application/acme">
    <_comment>New MIME Type Mapping</_comment>
    <glob pattern="*.acme"/>
  </mime-type>
  <mime-type type="application/x-dosexec">
    <_comment>Override Tika Default</_comment>
    <sub-class-of type="application/x-msdownload"/>
    <glob pattern="*.exe" weight="100"/>
    <magic priority="100">
      <match value="MZ" type="string" offset="0"/>
    </magic>
  </mime-type>
</mime-info>
Example 9.81. Override *.exe MIME Type Detection
Details about the example:
The first entry adds a new MIME type for files with the acme extension.
The second entry overrides the default Tika configuration, enforcing that all *.exe files are mapped to application/x-dosexec.
While the default Tika configuration already maps *.exe to the MIME type application/x-dosexec, it adds subsequent overrides to application/x-msdownload with a format property, for example to distinguish 32-bit from 64-bit applications.
To override this behavior, you need to duplicate the <magic> pattern of the original definition and provide a higher priority than in Tika's default configuration. Valid priorities range from 0 to 100, where 50 is the default.
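A definition file can be sanity-checked against the priority range stated above before it is deployed. The following sketch uses only the JDK's DOM parser. MagicPriorityCheck is a hypothetical helper, and it checks nothing beyond the priority attribute of <magic> elements. In particular, it does not tell you whether your priority is higher than the one in Tika's default configuration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Illustrative sketch only; not part of CoreMedia Studio or Apache Tika. */
final class MagicPriorityCheck {

  /** Lists <magic> elements whose priority lies outside the valid range 0..100. */
  static List<String> invalidPriorities(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      List<String> problems = new ArrayList<>();
      NodeList magics = doc.getElementsByTagName("magic");
      for (int i = 0; i < magics.getLength(); i++) {
        Element magic = (Element) magics.item(i);
        // getAttribute returns "" when the attribute is absent; 50 is the documented default.
        String raw = magic.getAttribute("priority");
        int priority = raw.isEmpty() ? 50 : Integer.parseInt(raw.trim());
        if (priority < 0 || priority > 100) {
          Node parent = magic.getParentNode();
          String type = parent instanceof Element
              ? ((Element) parent).getAttribute("type")
              : "(unknown)";
          problems.add(type + ": priority " + priority);
        }
      }
      return problems;
    } catch (Exception e) {
      // Covers malformed XML as well as non-numeric priority values.
      throw new IllegalArgumentException("Cannot read MIME type definitions", e);
    }
  }
}
```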
For a reference of all elements and attributes in custom-mimetypes.xml,
have a look at the API documentation of org.apache.tika.mime.MimeTypesReader.
As stated in that documentation, the DTD is compliant with the
freedesktop MIME-info DTD. Note, though, that it only supports a subset of
the attributes and elements. Nevertheless, you may find more valuable
information in the official specification located at
freedesktop.org: Shared MIME Info Specification.
If you need to override existing mappings, the approach via custom-mimetypes.xml
may not be sufficient. In this case, you may need to set mimeTypeService.tikaConfig.
Note, though, that in contrast to custom-mimetypes.xml, this requires you to
define all MIME types yourself. As a starting point, you may want to take
tika-mimetypes.xml for reference, which can be found in the
Apache Tika GitHub Repository.
Example where overriding may fail:
You may struggle with Tika reporting duplicate definitions. For example,
take the re-mapping of *.exe above. If you omitted the <magic> element,
Tika would report a duplicate definition for *.exe without being able to
resolve the priorities. Thus, you need to tune your adaptations carefully
and have a deep understanding of the <mime-info> configuration. Moreover,
because Tika does not support <glob-deleteall> and <mime-deleteall> as
specified by the freedesktop MIME-info DTD, there is no straightforward way
to enforce your own MIME type detection while still benefiting from the
existing detection configuration for the types you want to keep as is.