Multi-Site Manual / Version 2506.0

3.2.1 Defining Valid Language Tags

Locales are denoted in content by IETF BCP 47 (Internet Engineering Task Force Best Current Practice no. 47) language tags as specified by RFC 5646. A typical language tag consists of a language and a country element, such as en-US (English (United States)) or de-DE (German (Germany)). Valid language subtags are registered by IANA (Internet Assigned Numbers Authority) in the IANA Language Subtag Registry. The simplest IETF BCP 47 language tags consist of the primary language subtag only, for example, en (English).
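As a minimal sketch of how such tags map to Java, the following snippet parses language tags with Locale.forLanguageTag and reads back the language and region elements. The class name LanguageTagParts and the method describe are hypothetical and only serve this illustration.

```java
import java.util.Locale;

// Hypothetical helper class, used for illustration only.
class LanguageTagParts {

  // Parses a BCP 47 language tag and reports its language and region elements.
  static String describe(String languageTag) {
    Locale locale = Locale.forLanguageTag(languageTag);
    return "language=" + locale.getLanguage() + ", region=" + locale.getCountry();
  }

  public static void main(String[] args) {
    System.out.println(describe("en-US")); // language=en, region=US
    System.out.println(describe("de-DE")); // language=de, region=DE
    System.out.println(describe("en"));    // language=en, region=
  }
}
```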

In addition to the criteria mentioned in the specification, one more restriction applies to locales used in the context of CoreMedia Studio:

Locales must differ in their display name as generated by your JDK.

Thus, using so-called Private Use Subtags, which are introduced by the reserved single-character subtag 'x', is especially discouraged because they are not represented in the display name.
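The restriction can be checked up front with a small helper. The following is a sketch only: the class DisplayNameCollisions and the method findCollisions are hypothetical names, and the English display locale is an assumption made for this example. It groups candidate language tags by the display name your JDK generates and keeps only the groups with more than one tag.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class, used for illustration only.
class DisplayNameCollisions {

  // Maps each display name that is shared by several tags to those tags.
  // Assumption: display names are compared in English.
  static Map<String, List<String>> findCollisions(List<String> languageTags) {
    Map<String, List<String>> byDisplayName = new LinkedHashMap<>();
    for (String tag : languageTags) {
      String displayName = Locale.forLanguageTag(tag).getDisplayName(Locale.ENGLISH);
      byDisplayName.computeIfAbsent(displayName, key -> new ArrayList<>()).add(tag);
    }
    // Keep only display names that are used by more than one language tag.
    byDisplayName.values().removeIf(tags -> tags.size() < 2);
    return byDisplayName;
  }

  public static void main(String[] args) {
    // The private use subtag 'x-custom' is not expected to show up in the display name.
    System.out.println(findCollisions(List.of("en-US", "en-US-x-custom", "de-DE")));
  }
}
```

An empty result means that all candidate tags are distinguishable by display name on the JDK that executed the check.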

Note

Display Name Localization Is Not Customizable

CoreMedia Content Cloud does not provide means for customizing localization of Locale display names. The display name is localized solely by your JDK, which in turn eases introducing new locales to your system without providing custom translations.

Display Names of Locales

If you are going to define a custom locale, it is important to ensure that each locale offered to your editors has a distinct display name. This section recommends possible options for designing a locale so that customizations are visible in the display name.

In addition to the primary language and region subtags, the subtags listed in Table 3.1, “Subtags Represented in Display Name” are known to be part of the display name and may thus be taken into account when creating customized locales for special needs.

Type: Script

Description: Script subtags are defined in RFC 5646, Section 2.2.3.

A script subtag must consist of exactly four alphabetic characters. Script subtags are typically registered at IANA. The exception is the range of script subtags Qaaa through Qabx, which is reserved for private use.

Java only checks that script subtags are well-formed. Nevertheless, it is recommended to stick to the specification.

Example: en-Qaaa-US; displayed as English (Qaaa, United States)

Type: Variant

Description: Variant subtags are defined in RFC 5646, Section 2.2.5.

A language tag may contain multiple variants separated by dashes. Each variant may only contain alphanumeric characters. Current Java implementations accept variants of at most eight characters.

Just like scripts, variants are typically registered at IANA. There is no concept of private use variant subtags. Nevertheless, Java does not validate against registered variants.

Example: en-US-Variant; displayed as English (United States, Variant)

Type: Extension: Unicode Locale Keyword

Description: Extension subtags are defined in RFC 5646, Section 2.2.6. A special type of these extension subtags are Private Use Subtags, defined in RFC 5646, Section 2.2.7.

The single-character subtag that introduces an extension must be registered at IANA. The single-character subtag 'x' is used for private use subtags. While private use subtags provide the most freedom in choosing custom subtags, they are not displayed in the display name of the locale.

For Java, CoreMedia recommends using so-called Unicode Locale Keywords, which allow a key-value based approach. These key-value pairs are prefixed with 'u' within the language tag.

The following restrictions apply to the keyword: it must have a length of two characters and consist of alphanumeric characters.

The following restrictions apply to the value: it may contain dash-separated values, where each single value has to be alphanumeric and have a length of three to eight characters (inclusive).

Example: en-US-u-ky-value; displayed as English (United States, ky: value)

Table 3.1. Subtags Represented in Display Name


Disclaimer: The actual behavior relies on your type and version of JDK.
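To see how your own JDK renders the subtags from Table 3.1, you may print the display names of the example tags. The following sketch assumes an English display locale. The class name SubtagDisplayNames, the method displayName, and the private use tag en-US-x-custom are hypothetical and not taken from the product.

```java
import java.util.Locale;

// Hypothetical helper class, used for illustration only.
class SubtagDisplayNames {

  // Returns the display name generated by the JDK (assumption: English display locale).
  static String displayName(String languageTag) {
    return Locale.forLanguageTag(languageTag).getDisplayName(Locale.ENGLISH);
  }

  public static void main(String[] args) {
    String[] tags = {
      "en-US",            // baseline: language and region only
      "en-Qaaa-US",       // private use script subtag
      "en-US-Variant",    // variant subtag
      "en-US-u-ky-value", // Unicode locale keyword
      "en-US-x-custom"    // private use subtag, not expected in the display name
    };
    for (String tag : tags) {
      System.out.println(tag + " -> " + displayName(tag));
    }
  }
}
```

Any tag whose printed display name equals that of the baseline en-US would not be distinguishable for editors.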

Tip: Design Help Via Java's Locale

Defining a proper language tag may be a tedious task, as it requires reading and understanding the IETF BCP 47 specification and aligning it with your JDK's behavior. To ease this task, Java provides the class Locale.Builder, which may be used to create and validate language tags.

Using Locale.Builder for your custom locales will guide you through the process of creating a valid language tag that is understood by your JDK. Example 3.1, “Creating Locale with Locale.Builder” shows how to create a locale with Locale.Builder. You may execute the code in an independent project, as a temporary unit test, or within an online Java playground to eventually get your desired valid language tag.

        
        import java.util.Locale;

        String languageTag = new Locale.Builder()
          // Base on existing locale.
          .setLocale(Locale.US)
          // Private use extension: ! Not part of Display Name !
          .setExtension(Locale.PRIVATE_USE_EXTENSION, "myExt")
          // Script: registered by IANA; Qaaa - Qabx for private use.
          .setScript("Latf")
          // Unicode locale keyword: custom key-value pair.
          .setUnicodeLocaleKeyword("lK", "local-value")
          // Variant: registered by IANA.
          .setVariant("1994")
          .build()
          .toLanguageTag();

      

Example 3.1. Creating Locale with Locale.Builder
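Locale.Builder can also be used to check individual subtags, because its setters throw an IllformedLocaleException for ill-formed input. The following is a minimal sketch of that idea. The class SubtagValidation and the method accepted are hypothetical names chosen for this illustration.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;
import java.util.function.Consumer;

// Hypothetical helper class, used for illustration only.
class SubtagValidation {

  // Applies one builder step to a fresh builder based on en-US and
  // reports whether the JDK accepted the given input as well-formed.
  static boolean accepted(Consumer<Locale.Builder> step) {
    try {
      step.accept(new Locale.Builder().setLocale(Locale.US));
      return true;
    } catch (IllformedLocaleException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(accepted(b -> b.setScript("Latf")));   // true: four letters
    System.out.println(accepted(b -> b.setScript("Latin")));  // false: five letters
    System.out.println(accepted(b -> b.setVariant("1994")));  // true
    System.out.println(accepted(b -> b.setVariant("abc")));   // false: too short
    System.out.println(accepted(b -> b.setUnicodeLocaleKeyword("ky", "value")));  // true
    System.out.println(accepted(b -> b.setUnicodeLocaleKeyword("key", "value"))); // false: key too long
  }
}
```

Note that a subtag accepted here is only well-formed. As described above, Java does not check whether a script or variant is actually registered at IANA.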


Warning

Beware of Locale.toString()

Especially in the context of CoreMedia Multi-Site, you should not rely on the representation returned by Locale.toString(), which at first glance seems to be a language tag that just uses underscores, such as en_US. In contrast to IETF BCP 47 language tags, this representation has no strict specification and as such cannot be reliably parsed from its String representation back to a valid Locale. Use Locale.toLanguageTag() instead.
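The difference becomes visible as soon as a locale carries more than language and region. The following sketch compares both representations for a locale with a script subtag and parses the language tag back with Locale.forLanguageTag. The class name LanguageTagRoundTrip and the method roundTrip are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper class, used for illustration only.
class LanguageTagRoundTrip {

  // Serializes a locale as a BCP 47 language tag and parses it back.
  static Locale roundTrip(Locale locale) {
    return Locale.forLanguageTag(locale.toLanguageTag());
  }

  public static void main(String[] args) {
    Locale locale = new Locale.Builder()
      .setLanguage("en")
      .setScript("Latn")
      .setRegion("US")
      .build();

    System.out.println(locale.toString());      // en_US_#Latn
    System.out.println(locale.toLanguageTag()); // en-Latn-US
    System.out.println(roundTrip(locale).equals(locale)); // true
  }
}
```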
