Search Manual / Version 2107
The Content Feeder can index content issues that are reported by content validators. For details about content validators, see Section 7.18.1, “Validators” in Studio Developer Manual. Validators need to be registered as Spring beans in the application context of the Content Feeder to make their reported issues available in the search index. Once content issues are indexed, content items with errors or warnings can be found in the search view of the Studio library.
In the Solr index, issues are represented as nested index documents of their corresponding content.
These nested issue documents contain the value NESTED in the index field feederstate, and data about the issue in several index fields like issueCode and issueSeverity.
For details about index fields, see the field definitions in the file schema.xml of the Content Feeder index.
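To look at the nested issue documents directly, a native Solr query can filter on the feederstate value described above. The following is a minimal sketch that only builds the request URL. The helper name issue_documents_url, the host, and the core name studio are assumptions for illustration and are not taken from this manual; adjust them to your deployment.

```python
from urllib.parse import urlencode


def issue_documents_url(core_url: str, rows: int = 10) -> str:
    """Build a Solr select URL that lists nested issue documents.

    core_url is the base URL of the Content Feeder core (hypothetical here).
    """
    params = {
        # Nested issue documents carry the value NESTED in feederstate.
        "q": "feederstate:NESTED",
        # Return only the issue fields named in this section.
        "fl": "issueCode,issueSeverity",
        "rows": rows,
    }
    return f"{core_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Host and core name are placeholders, not values from the manual.
    print(issue_documents_url("http://localhost:8983/solr/studio", rows=5))
```

The resulting URL can be opened in a browser or passed to any HTTP client. Further fields from schema.xml can be added to the fl parameter as needed.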
In the default configuration, issue indexing is disabled. To enable it, set the Content Feeder configuration property feeder.content.issues.index to true. In addition, the Solr schema must be adjusted to support nested documents: the Solr fields _root_ and _nest_path_ must be declared in the Solr configuration file schema.xml for the Content Feeder index.
The file schema.xml from the Blueprint already contains these fields as comments, so that they can easily be added. However, you must not add these fields to an existing index, because this would lead to index inconsistencies.
Whenever you add or remove these fields, you must recreate the Solr index from scratch and let the Content Feeder index all content items. It is not sufficient to trigger reindexing of content items in an existing index.
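Before recreating the index, it can be useful to confirm that both fields are really declared in schema.xml and not still commented out. The following is a minimal sketch, assuming that the fields are declared as field elements with a name attribute, as is usual for Solr schema files. The helper name missing_nested_fields is hypothetical. Python's standard XML parser skips comments, so fields that are still commented out are reported as missing.

```python
import xml.etree.ElementTree as ET

# Fields that this section requires for nested issue documents.
REQUIRED_FIELDS = {"_root_", "_nest_path_"}


def missing_nested_fields(schema_xml: str) -> set:
    """Return the required field names that schema_xml does not declare.

    Commented-out declarations are ignored by the parser and therefore
    count as missing.
    """
    root = ET.fromstring(schema_xml)
    declared = {field.get("name") for field in root.iter("field")}
    return REQUIRED_FIELDS - declared


if __name__ == "__main__":
    # Tiny made-up schema for illustration, not the Blueprint schema.
    sample = """
    <schema name="sample" version="1.6">
      <field name="id" type="string"/>
      <!-- <field name="_root_" type="string"/> -->
    </schema>
    """
    print(sorted(missing_nested_fields(sample)))
```

An empty result means that both fields are declared. It does not tell whether the existing index was created with them, so the index must still be recreated as described above.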
If enabled, issue computation and indexing cause additional work for the Content Feeder and can reduce its throughput. Even with issue feeding enabled, content issues are not computed during initial feeding of an empty index, so that initial feeding is not delayed. Instead, the Content Feeder indexes issues for all content items immediately after the index has been initialized. This happens with lower priority and does not block feeding of editorial changes.
Note that indexed issues are not always up to date. Issues are recomputed and reindexed immediately when the properties of the corresponding content item have changed. Issues are not updated immediately if other content items have changed or, for example, if a content item was just renamed without a change to its properties. So that search results eventually contain correct issues, the Content Feeder periodically recomputes and reindexes the issues of all content items with a configurable delay. For details, see the configuration properties starting with feeder.content.issues in Section 4.9.1, “Content Feeder Properties” in Deployment Manual.
Periodic issue reindexing happens with lower priority in the background and does not block feeding of editorial changes.
Section 6.2, “Content Feeder Metrics” describes some metrics that may be helpful to understand
Content Feeder performance in general and the impact of issue feeding.
Furthermore, you can query Solr directly to check how up to date indexed issues really are. The Solr field issuesUpdated of an indexed content item contains the date when indexed issues were last computed for that content item. The Solr Stats Component can be used in a Solr query to check the maximum age of issues in the index. For example, a native Solr query could be extended with stats=true&stats.field=issuesUpdated to get the minimum and maximum date values, or with stats=true&stats.field={!func}ms(NOW,issuesUpdated) to get the minimum and maximum age in milliseconds. The Solr Stats Component is described in the Solr Reference Guide at https://solr.apache.org/guide/8_6/the-stats-component.html.
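The stats parameters contain characters such as braces and parentheses that must be URL-encoded when the query is sent over HTTP. The following is a minimal sketch that builds such a request and reads the minimum and maximum from a JSON response. The helper names, the match-all query, the host, and the core name studio are assumptions for illustration. The response layout (stats, then stats_fields, then the field name) follows the Stats Component's usual JSON output.

```python
from urllib.parse import urlencode


def issue_stats_url(core_url: str, by_age: bool = False) -> str:
    """Build a Solr select URL that requests statistics on issuesUpdated.

    With by_age=True, the age in milliseconds is requested instead of
    the date values.
    """
    stats_field = "{!func}ms(NOW,issuesUpdated)" if by_age else "issuesUpdated"
    params = {
        "q": "*:*",   # assumption: compute statistics over all documents
        "rows": 0,    # only the statistics are of interest, not the hits
        "stats": "true",
        "stats.field": stats_field,
    }
    # urlencode percent-encodes the braces, parentheses, and comma.
    return f"{core_url}/select?{urlencode(params)}"


def min_max(response: dict, stats_key: str = "issuesUpdated") -> tuple:
    """Extract (min, max) for stats_key from a parsed JSON stats response."""
    stats = response["stats"]["stats_fields"][stats_key]
    return stats["min"], stats["max"]


if __name__ == "__main__":
    # Host and core name are placeholders, not values from the manual.
    print(issue_stats_url("http://localhost:8983/solr/studio"))
    print(issue_stats_url("http://localhost:8983/solr/studio", by_age=True))
```

The minimum date, or equivalently the maximum age, identifies the content item whose issues were computed longest ago. Comparing this value with the configured reindexing delay gives a rough indication of whether periodic issue reindexing keeps up.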