3.5. Searching in Different Languages

The CoreMedia Search Engine lets you search documents written in many different languages. To do so, it performs the following processing steps:

  • Detecting the language of the text

  • Splitting the text into searchable words

  • Indexing the words into language-dependent fields

  • Searching in language-dependent fields

These steps are highly customizable. For standard western languages, such as English, German, and French, you do not necessarily need to change the configuration, because the standard configuration already handles these languages. If you use Chinese, Japanese, or Korean (known as CJK languages), you have to adapt the configuration, because these languages require different processing to extract searchable words.
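
The four steps can be illustrated with a small, self-contained Python sketch. It is not CoreMedia Search Engine code and does not reflect its configuration: every name in it (detect_language, tokenize, field_name, ToyIndex) is hypothetical, the language detection is a crude character-range check, and overlapping character bigrams merely stand in for whatever CJK word extraction is actually configured. It only shows why CJK text needs a different splitting strategy and how language-dependent fields keep the results apart.

```python
import re
from collections import defaultdict

# Illustrative only: all names below are hypothetical, not CoreMedia APIs.

# Hiragana/Katakana, common CJK ideographs, and Hangul syllables.
CJK_CHARS = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]")


def detect_language(text):
    """Step 1 (toy version): 'cjk' if any CJK character occurs, else 'western'."""
    return "cjk" if CJK_CHARS.search(text) else "western"


def tokenize(text, language):
    """Step 2: split text into searchable tokens, depending on the language."""
    if language == "cjk":
        # Assumption: overlapping character bigrams as a stand-in for real
        # CJK word extraction. Non-CJK characters are ignored in this sketch.
        chars = CJK_CHARS.findall(text)
        if len(chars) < 2:
            return chars
        return [a + b for a, b in zip(chars, chars[1:])]
    # Western text: lowercase and split on non-word characters.
    return re.findall(r"\w+", text.lower())


def field_name(language):
    """Map a language to its own index field, for example 'text_cjk'."""
    return "text_" + language


class ToyIndex:
    """In-memory inverted index with one field per detected language."""

    def __init__(self):
        # field name -> token -> set of document ids
        self.fields = defaultdict(lambda: defaultdict(set))

    def add(self, doc_id, text):
        """Step 3: index the tokens into the language-dependent field."""
        language = detect_language(text)
        for token in tokenize(text, language):
            self.fields[field_name(language)][token].add(doc_id)

    def search(self, query):
        """Step 4: search only the field that matches the query language."""
        language = detect_language(query)
        tokens = tokenize(query, language)
        if not tokens:
            return set()
        postings = self.fields[field_name(language)]
        result = set(postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= postings.get(token, set())
        return result


index = ToyIndex()
index.add("d1", "Searching in different languages")
index.add("d2", "全文検索エンジン")
western_hits = index.search("languages")
cjk_hits = index.search("検索")
```

In this sketch the western query is matched against whitespace-style tokens, while the CJK query is matched against bigrams in a separate field, so tokens produced by one strategy never mix with tokens produced by the other.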