Search Manual / Version 2406.1
Note
Configuration not mandatory: Search suggestions in Studio work with the default configuration. This section describes how you can configure the index fields used for suggestions and how you can tune the performance of suggestions.
CoreMedia Studio shows autocomplete search suggestions when a user starts typing search queries in the library window. These suggestions are based on the indexed content and computed by a special search component in Apache Solr, which can be configured in the Solr configuration file <solr-home>/configsets/content/conf/solrconfig.xml.
The configuration consists of:
Request handler parameters
Studio uses the Solr request handler /editor for searching and getting search suggestions. Suggestions are configured with parameter suggest.spellcheck.dictionary as in the following example (the other parameters may vary in your configuration):
<requestHandler name="/editor" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">cmdismax</str>
    <str name="echoParams">none</str>
    <float name="tie">0.1</float>
    <str name="qf">textbody name^2 numericid^10</str>
    <str name="pf">textbody name^2</str>
    <str name="mm">100%</str>
    <str name="q.alt">*:*</str>
    <str name="suggest.spellcheck.dictionary">textbody</str>
  </lst>
  ...
The parameter suggest.spellcheck.dictionary references a Suggester dictionary to compute suggestions from. This dictionary must also be configured in solrconfig.xml, as described further below. In the default configuration it is named after the index field textbody, but you can use any dictionary name you like. You can also use multiple dictionaries to compute suggestions from the content of multiple index document fields. To do so, repeat the element <str name="suggest.spellcheck.dictionary"> with a different value for each dictionary. Note that you must also configure multiple dictionaries if you want to suggest words from language-dependent fields. For example, if you've defined the fields textbody, textbody_en and textbody_de in the index schema as described in Section 3.8, “Searching in Different Languages”, then you need to add three dictionaries to get suggestions from all of these fields.
Request handler components
The same request handler /editor is configured to use the necessary search components for suggestions as shown below. These referenced components are configured as <searchComponent ...> elements in solrconfig.xml as well.
<requestHandler name="/editor" class="solr.SearchHandler">
  <lst name="defaults">
    ...
  </lst>
  <arr name="last-components">
    <str>suggest</str>
    <str>spellcheck</str>
  </arr>
</requestHandler>
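As a sketch of how the two parts of the handler fit together for the language-dependent fields mentioned earlier, the following fragment lists three dictionaries and leaves the component chain unchanged. The dictionary names textbody_en and textbody_de are assumptions taken from the example field names, each of them still needs a matching dictionary definition in the spell check component, and the remaining defaults are left out.

```xml
<requestHandler name="/editor" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- other defaults (defType, qf, pf and so on) as in your configuration -->
    <!-- one entry per Suggester dictionary; the names are illustrative -->
    <str name="suggest.spellcheck.dictionary">textbody</str>
    <str name="suggest.spellcheck.dictionary">textbody_en</str>
    <str name="suggest.spellcheck.dictionary">textbody_de</str>
  </lst>
  <arr name="last-components">
    <str>suggest</str>
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

Naming each dictionary after its index field, as the default configuration does, makes it easier to keep the handler and the dictionary definitions in sync.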
SpellCheckComponent and dictionary configuration
The above configuration references the search component named spellcheck with a dictionary textbody. Now it's time to look at the configuration of that component. The relevant part for suggestions looks as follows:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_general</str>
  <lst name="spellchecker">
    <str name="name">textbody</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
    <str name="field">textbody</str>
    <float name="threshold">0.0005</float>
  </lst>
</searchComponent>
If you choose different names for the spell check component or the dictionary, make sure that you use the correct names in the configuration of the /editor request handler.
The element <lst name="spellchecker"> configures a dictionary for suggestions based on the content of the index field textbody. The parameter threshold configures the dictionary to consider only words that occur in at least the given fraction of index documents. It can take a value between 0 and 1. A value of 0.01 means that a word must appear in at least 1% of the documents in that field. Rarer words are ignored and not returned as suggestions. While you can set this value to 0 to include all words, this would increase the size of the in-memory data structure and the time needed to build it. You can use the parameter to tune the suggestions: higher values lead to smaller memory usage and better performance, while smaller values provide more detailed suggestions.
To define dictionaries for multiple index fields, repeat the <lst name="spellchecker"> section, but use a different name for the dictionary in <str name="name"> and set the name of the index field in <str name="field">.
Dictionary rebuilding configuration
Suggester dictionaries are in-memory data structures that must be rebuilt after index changes to make new words appear in the suggestions. The search component DictionaryRebuilder, which is also configured in file solrconfig.xml, rebuilds all configured dictionaries after index updates. Its configuration takes the name of the spell check component with parameter spellCheckComponent and the names of the dictionaries with parameter dictionary. For multiple dictionaries you just need to repeat the <str name="dictionary"> element with different values.
<searchComponent name="dictionaryRebuilder" class="com.coremedia.solr.suggest.DictionaryRebuilder">
  <str name="spellCheckComponent">spellcheck</str>
  <str name="dictionary">textbody</str>
  <long name="minimumIntervalSeconds">60</long>
</searchComponent>
With the default value of 60 for parameter minimumIntervalSeconds, the dictionary is rebuilt at most once per minute, even if the index changes constantly.
Note that Solr already provides a different method to rebuild dictionaries after commits, which can be enabled with parameter <str name="buildOnCommit">true</str> in the <lst name="spellchecker"> dictionary configuration. However, while it rebuilds the dictionary similarly to the DictionaryRebuilder, it does so after every Solr commit, even if commits arrive in quick succession. It also delays the visibility of the committed index changes in the search results while the dictionary is being built. Depending on the size of the dictionary (affected by index size and the configured threshold parameter), it may take some seconds to rebuild a suggestion dictionary. Use the DictionaryRebuilder and not buildOnCommit to avoid such delays.
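To illustrate the multi-dictionary case from the dictionary definition through to rebuilding, here is a minimal sketch with a second dictionary for an English field. It assumes that a field textbody_en exists in the index schema and that the dictionary is named after it. The threshold value is simply copied from the default dictionary and may need different tuning in your installation. A dictionary for textbody_de would follow the same pattern.

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_general</str>
  <!-- default dictionary, unchanged -->
  <lst name="spellchecker">
    <str name="name">textbody</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
    <str name="field">textbody</str>
    <float name="threshold">0.0005</float>
  </lst>
  <!-- additional dictionary; name and field are illustrative assumptions -->
  <lst name="spellchecker">
    <str name="name">textbody_en</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
    <str name="field">textbody_en</str>
    <float name="threshold">0.0005</float>
  </lst>
</searchComponent>

<searchComponent name="dictionaryRebuilder" class="com.coremedia.solr.suggest.DictionaryRebuilder">
  <str name="spellCheckComponent">spellcheck</str>
  <!-- list every dictionary that should be rebuilt after index updates -->
  <str name="dictionary">textbody</str>
  <str name="dictionary">textbody_en</str>
  <long name="minimumIntervalSeconds">60</long>
</searchComponent>
```

The sketch leaves out buildOnCommit and relies on the DictionaryRebuilder alone. Every dictionary name used here must also be referenced with suggest.spellcheck.dictionary in the /editor request handler to contribute suggestions.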