Studio Developer Manual / Version 2304

9.20 Configuring MIME Types

When a blob is uploaded into a property field, CoreMedia Studio detects an appropriate MIME type based on the name and content of the uploaded file. Detection is performed by the mimeTypeService bean, which is based on Apache Tika and recognizes many common file types. If the file type is unknown, the MIME type suggested by the uploading browser is used.

Note

MIME Type Service Configuration

If you need to adapt the MIME Type Service Configuration, as proposed in the subsequent paragraphs, you can find more details in Section “MIME Type Mappings” in the Content Application Developer Manual.

In most cases, you can add new file types by adding a file org/apache/tika/mime/custom-mimetypes.xml to the classpath of the CoreMedia Studio Server. You can change this path by setting mimeTypeService.mimeTypesResourceNames accordingly.

Example 9.80, “Override *.exe MIME Type Detection” shows such a configuration.

<?xml version="1.0" encoding="UTF-8"?>

<mime-info>

  <mime-type type="application/acme">
    <_comment>New MIME Type Mapping</_comment>
    <glob pattern="*.acme"/>
  </mime-type>

  <mime-type type="application/x-dosexec">
    <_comment>Override Tika Default</_comment>
    <sub-class-of type="application/x-msdownload"/>
    <glob pattern="*.exe" weight="100"/>
    <magic priority="100">
      <match value="MZ" type="string" offset="0"/>
    </magic>
  </mime-type>

</mime-info>

Example 9.80. Override *.exe MIME Type Detection


Details about the example:

  • The first entry adds a new MIME type for files with the extension .acme.

  • The second entry overrides the default Tika configuration so that all *.exe files are mapped to application/x-dosexec.

    The default Tika configuration already maps *.exe to the MIME type application/x-dosexec, but it then adds overrides to application/x-msdownload with a format property, for example to distinguish 32-bit from 64-bit applications.

    To override this behavior, you need to duplicate the <magic> pattern of the original definition and give it a higher priority than in Tika's default configuration. Valid priorities range from 0 to 100, with 50 as the default.

For a reference of all elements and attributes in custom-mimetypes.xml, see the API documentation of org.apache.tika.mime.MimeTypesReader. As stated there, the DTD is compliant with the freedesktop MIME-info DTD, but it contains only a subset of its attributes and elements. You may still find more valuable information in the official specification at freedesktop.org: Shared MIME Info Specification.

If you need to override existing mappings, the approach via custom-mimetypes.xml may not be sufficient. In this case, you may need to set mimeTypeService.tikaConfig. Note, however, that in contrast to custom-mimetypes.xml, this requires you to define all MIME types yourself. As a starting point, you may want to use tika-mimetypes.xml as a reference, which can be found in the Apache Tika GitHub Repository.

Example where overriding may fail: You may struggle with Tika reporting duplicate definitions. Take the re-mapping of *.exe above: if you omitted the <magic> element, Tika would report a duplicate definition for *.exe and would not be able to resolve the priorities. You therefore need to tune your adaptations carefully and have a deep understanding of the <mime-info> configuration. Because Tika does not support <glob-deleteall> and <mime-deleteall> as specified by the freedesktop MIME-info DTD, there is no straightforward way to enforce your own MIME type detection while still benefiting from the existing detection configuration for types you want to keep as is.
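
To make this failure case concrete, the following sketch repeats the *.exe override from Example 9.80 with the <magic> element left out. According to the description above, Tika would report a duplicate definition for *.exe with a configuration like this, so treat it as an illustration of what to avoid rather than as a working setup. The XML comments are added here for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>

<mime-info>

  <!-- Glob-only override: the <magic> element from Example 9.80 is missing. -->
  <mime-type type="application/x-dosexec">
    <_comment>Incomplete override, for illustration only</_comment>
    <sub-class-of type="application/x-msdownload"/>
    <!-- Without a matching <magic> entry of higher priority, the glob alone
         is described above as leading to a duplicate definition for *.exe. -->
    <glob pattern="*.exe" weight="100"/>
  </mime-type>

</mime-info>
```

Compared with Example 9.80, the only difference is the missing <magic> block with priority="100", which is the part that lets the custom definition take precedence over Tika's default.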
