4.4. Configure Search Suggestions for Studio

Note

Configuration not mandatory: Search suggestions in Studio work with the default configuration. This section describes how you can configure the index fields used for suggestions and how you can tune the performance of suggestions.

CoreMedia Studio shows autocomplete search suggestions when a user starts typing search queries in the library window. These suggestions are based on the indexed documents and computed by a special search component in Apache Solr, which can be configured in the Solr configuration file <solr-home>/configsets/content/conf/solrconfig.xml.

The configuration consists of:

  • Request handler parameters

    Studio uses the Solr request handler /editor for searching and getting search suggestions. Suggestions are configured with parameter suggest.spellcheck.dictionary as in the following example (the other parameters may vary in your configuration):

    <requestHandler name="/editor" class="solr.SearchHandler">
     <lst name="defaults">
      <str name="defType">cmdismax</str>
      <str name="echoParams">none</str>
      <float name="tie">0.1</float>
      <str name="qf">textbody name_tokenized^2 numericid^10</str>
      <str name="pf">textbody name_tokenized^2</str>
      <str name="mm">100%</str>
      <str name="q.alt">*:*</str>
      <str name="suggest.spellcheck.dictionary">textbody</str>
     </lst>
     ...
    

    The parameter suggest.spellcheck.dictionary references a Suggester dictionary from which suggestions are computed. This dictionary must also be configured in solrconfig.xml, as described further below. In the default configuration it is named after the index field textbody, but you can choose any dictionary name. You can also use multiple dictionaries to compute suggestions from the content of multiple document fields. To do so, repeat the element <str name="suggest.spellcheck.dictionary"> with a different value for each dictionary. Note that you must also configure multiple dictionaries if you want to suggest words from language-dependent fields. For example, if you have defined the fields textbody, textbody_en and textbody_de in the index schema as described in Section 3.5, “Searching in Different Languages”, you need three dictionaries to get suggestions from all of these fields.
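
    As a sketch of the multi-field case, the defaults of the /editor request handler might list one dictionary per field. The dictionary names textbody_en and textbody_de are assumptions for illustration (named after the example fields mentioned above); each name must match a dictionary that is actually configured in solrconfig.xml, and the other default parameters are omitted here:

    <requestHandler name="/editor" class="solr.SearchHandler">
     <lst name="defaults">
      <!-- other default parameters omitted -->
      <!-- one entry per Suggester dictionary; the _en and _de names are illustrative -->
      <str name="suggest.spellcheck.dictionary">textbody</str>
      <str name="suggest.spellcheck.dictionary">textbody_en</str>
      <str name="suggest.spellcheck.dictionary">textbody_de</str>
     </lst>
     <!-- remaining handler configuration omitted -->
    </requestHandler>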

  • Request handler components

    The same request handler /editor is configured to use the necessary search components for suggestions as shown below. These referenced components are configured as <searchComponent ...> elements in solrconfig.xml as well.

    <requestHandler name="/editor" class="solr.SearchHandler">
      <lst name="defaults">
        ...
      </lst>
      <arr name="last-components">
        <str>suggest</str>
        <str>spellcheck</str>
      </arr>
    </requestHandler>
    
  • SpellCheckComponent and dictionary configuration

    The above configuration references the search component named spellcheck with a dictionary textbody. Now it's time to look at the configuration of that component. The relevant part for suggestions looks as follows:

    <searchComponent name="spellcheck"
                     class="solr.SpellCheckComponent">
    
      <str name="queryAnalyzerFieldType">text_general</str>
    
      <lst name="spellchecker">
        <str name="name">textbody</str>
        <str name="classname">
          org.apache.solr.spelling.suggest.Suggester
        </str>
        <str name="lookupImpl">
          org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
        </str>
        <str name="field">textbody</str>
        <float name="threshold">0.0001</float>
      </lst>
    
    </searchComponent>

    If you choose different names for the spell check component or the dictionary, make sure that the configuration of the /editor request handler uses the same names.

    The element <lst name="spellchecker"> configures a dictionary for suggestions based on the content of the index field textbody. The parameter threshold configures the dictionary to consider only words that occur in at least the given fraction of documents. It takes a value between 0 and 1. A value of 0.01 means that a word must appear in at least 1% of the documents in that field; the value 0.0001 from the example above corresponds to 0.01%, that is, 10 out of 100,000 documents. Rarer words are ignored and not returned as suggestions. While you can set this value to 0 to include all words, doing so increases the size of the in-memory data structure and the time needed to build it. You can use the parameter to tune the suggestions: higher values lead to lower memory usage and better performance, while lower values provide more detailed suggestions.

    To define dictionaries for multiple index fields, you just need to repeat the <lst name="spellchecker"> section but use a different name for the dictionary in <str name="name"> and set the name of the index field in <str name="field">.
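
    A possible configuration with a second dictionary for a language-dependent field is sketched here. The dictionary and field name textbody_en are assumptions taken from the example fields mentioned earlier; the remaining values simply repeat the settings of the default dictionary and may need different tuning in your setup:

    <searchComponent name="spellcheck"
                     class="solr.SpellCheckComponent">

      <str name="queryAnalyzerFieldType">text_general</str>

      <!-- default dictionary for the index field textbody -->
      <lst name="spellchecker">
        <str name="name">textbody</str>
        <str name="classname">
          org.apache.solr.spelling.suggest.Suggester
        </str>
        <str name="lookupImpl">
          org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
        </str>
        <str name="field">textbody</str>
        <float name="threshold">0.0001</float>
      </lst>

      <!-- additional dictionary (illustrative) for the index field textbody_en -->
      <lst name="spellchecker">
        <str name="name">textbody_en</str>
        <str name="classname">
          org.apache.solr.spelling.suggest.Suggester
        </str>
        <str name="lookupImpl">
          org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
        </str>
        <str name="field">textbody_en</str>
        <float name="threshold">0.0001</float>
      </lst>

    </searchComponent>

    Every dictionary added this way also needs to be referenced in the /editor request handler and in the DictionaryRebuilder configuration, otherwise it is not queried or not kept up to date.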

  • Dictionary rebuilding configuration

    Suggester dictionaries are in-memory data structures that must be rebuilt after index changes to make new words appear in the suggestions. The search component DictionaryRebuilder, which is also configured in file solrconfig.xml, rebuilds all configured dictionaries after index updates. Its configuration takes the name of the spell check component with parameter spellCheckComponent and the names of the dictionaries with parameter dictionary. For multiple dictionaries you just need to repeat the <str name="dictionary"> element with different values.

    <searchComponent name="dictionaryRebuilder"
          class="com.coremedia.solr.suggest.DictionaryRebuilder">
      <str name="spellCheckComponent">spellcheck</str>
      <str name="dictionary">textbody</str>
      <long name="minimumIntervalSeconds">60</long>
    </searchComponent>
    

    With the default value of 60 for the parameter minimumIntervalSeconds, the dictionary is rebuilt at most once per minute, even if the index changes constantly.
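
    For a setup with several dictionaries, the rebuilder configuration might look like the following sketch. The dictionary names textbody_en and textbody_de are illustrative assumptions and must match dictionaries that are actually defined in the spell check component:

    <searchComponent name="dictionaryRebuilder"
          class="com.coremedia.solr.suggest.DictionaryRebuilder">
      <str name="spellCheckComponent">spellcheck</str>
      <!-- one entry per dictionary to rebuild; the _en and _de names are illustrative -->
      <str name="dictionary">textbody</str>
      <str name="dictionary">textbody_en</str>
      <str name="dictionary">textbody_de</str>
      <long name="minimumIntervalSeconds">60</long>
    </searchComponent>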

    Note that Solr itself provides another way to rebuild dictionaries after commits, which can be enabled with the parameter <str name="buildOnCommit">true</str> in the <lst name="spellchecker"> dictionary configuration. However, although it rebuilds the dictionary much like the DictionaryRebuilder does, it does so after every Solr commit, even if commits arrive in quick succession. It also delays the visibility of committed index changes in the search results until the dictionary has been rebuilt. Depending on the size of the dictionary (which is affected by the index size and the configured threshold parameter), rebuilding a suggestion dictionary may take several seconds. Use the DictionaryRebuilder instead of buildOnCommit to avoid such delays.