Analytics Connectors Manual / Version 2404

4. Retrieval

Data aggregated by analytics service providers can be used not only to generate tables and diagrams but also to build "top-n-lists" for the delivery side of a content management environment. To build "top-n-lists" of content items based on their rank within an analytics report, report data is gathered and cached using the CoreMedia Elastic Core infrastructure.

The following components play key roles in this setting:

  • CMALXBaseList content objects

    Instances of this content type serve as configuration objects for a retrieval task that fetches the corresponding data. The content beans are also used at rendering time to retrieve the content objects corresponding to the tracking data cached using CoreMedia Elastic Core.

  • AnalyticsServiceProvider

    Implementations access the third-party analytics service provider and gather data. Data is persisted using the CMALXBaseListModelService. This model service retrieves and stores objects of type ReportModel, which hold the current configuration of the report, the preprocessed report data, and a timestamp that represents the report's freshness.

  • FetchReportsTask

    An elastic worker task that iterates over the "top-n-list" content items of a tenant and uses all AnalyticsServiceProvider implementations available in the current Spring context to retrieve data for them. First, the task gets all root navigation items and "top-n-list" items of a tenant and runs for each "top-n-list" content item together with the root content item of the same site. Then the task checks whether the corresponding ReportModel is too old or differs in its configuration. This ensures that configuration changes trigger a new retrieval of data almost immediately. If retrieval is due, data is fetched and passed to the CMALXBaseList instance, which preprocesses the result list. Then the report model is saved.

    There are different types of time intervals involved, which can be confusing:

    • Interval of the FetchReportsTask - the task is executed quite often, for example every minute, but only fetches data if necessary.
    • Interval for retrieving data from a specific analytics provider - the interval at which data is actually retrieved if the configuration has not changed, for example every 180 minutes. It is configured per top-n-list instance using the interval property (see Table 4.1, “Generic Retrieval Configuration Options”).
    • Time range of the fetched data - usually you only retrieve data for a certain time range, for example the report data of the last week.
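
The check that FetchReportsTask performs for each "top-n-list" can be sketched as follows. This is a minimal illustration only: the class RetrievalCheck, the method isRetrievalDue, and the representation of the configuration as a map are assumptions made for this example and are not part of the CoreMedia API. Fetching when no report is cached yet, and treating a report exactly as old as the interval as due, are assumptions as well.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

// Hypothetical helper, not CoreMedia API: models the decision whether
// report data for one top-n-list has to be fetched again.
final class RetrievalCheck {

    private RetrievalCheck() {
    }

    /**
     * Returns true if no report is cached yet (assumption), if the cached
     * configuration differs from the current one, or if the cached report
     * is at least as old as the configured retrieval interval.
     */
    static boolean isRetrievalDue(Instant lastSaved,
                                  Map<String, Object> cachedConfig,
                                  Map<String, Object> currentConfig,
                                  int intervalMinutes,
                                  Instant now) {
        if (lastSaved == null) {
            return true; // nothing cached so far
        }
        if (!currentConfig.equals(cachedConfig)) {
            return true; // a configuration change triggers a new retrieval
        }
        Duration age = Duration.between(lastSaved, now);
        return age.compareTo(Duration.ofMinutes(intervalMinutes)) >= 0;
    }
}
```

With a task interval of one minute and a retrieval interval of 180 minutes, such a check would return false for most task runs and true roughly every three hours, or earlier when the configuration changes.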
Note

FetchReportsTask assumes that data is fetched synchronously. If the analytics service provider offers asynchronous access only, you have to set up additional tasks that fetch the report data. In that case, an implementation of ElasticAnalyticsServiceProvider should store the information needed by the additional tasks (for subsequent calls to the analytics service provider) in the report model and return an empty list itself.

Top-n-lists (CMALXBaseList instances) provide an analyticsProvider property that determines the analytics service provider to use.

Retrieval configuration is stored in the settings of the content item (or one of its channels). Note that settings defined on a content item override settings defined on its channel. For a Page content bean named page, the Google Analytics configuration, for example, is stored under the property path page.settings.googleAnalytics.
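
The override rule can be illustrated with a small sketch. The class SettingsLookup and its method are hypothetical names chosen for this example, and plain maps stand in for the settings of one provider. Whether the real lookup merges per key, as shown here, is an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not CoreMedia API: combines the provider settings
// of a channel with those of a content item.
final class SettingsLookup {

    private SettingsLookup() {
    }

    /**
     * Returns the effective settings. Entries defined on the content item
     * replace entries with the same key defined on its channel
     * (per-key merging is an assumption of this sketch).
     */
    static Map<String, Object> effectiveSettings(Map<String, Object> channelSettings,
                                                 Map<String, Object> contentSettings) {
        Map<String, Object> merged = new HashMap<>(channelSettings);
        merged.putAll(contentSettings); // content item wins over channel
        return Collections.unmodifiableMap(merged);
    }
}
```

In this sketch, a channel that sets interval to 1440 and a content item that sets interval to 180 result in an effective interval of 180, while keys defined only on the channel remain visible.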

The following configuration properties apply to all analytics service provider implementations:

| Technical Variable Name | Description/Value | Required |
|---|---|---|
| analyticsProvider | The service key of the analytics provider to use. If not set, your list will be empty, even if data is cached. | false |
| maxLength | The 'n' of a top-n-list (its maximum length). Default is 10. | false |
| interval | The interval in minutes to fetch report data. Default is 24 hours. | false |
| limit | The maximum number of records of a fetched report. Default is 1000. | false |

Table 4.1. Generic Retrieval Configuration Options


Note

Retrieval can be temporarily disabled (even for a particular page) by setting the service's interval property, for example googleAnalytics.interval, to 0.
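
The defaults from Table 4.1 and the rule that an interval of 0 disables retrieval can be sketched as follows. The class RetrievalConfig is a hypothetical illustration, not a CoreMedia type. The sketch assumes that numeric settings are stored as numbers and that the 24-hour default corresponds to 1440 minutes.

```java
import java.util.Map;

// Hypothetical value class, not CoreMedia API: resolves the generic
// retrieval options and applies the documented defaults.
final class RetrievalConfig {

    static final int DEFAULT_MAX_LENGTH = 10;
    static final int DEFAULT_INTERVAL_MINUTES = 24 * 60; // 24 hours
    static final int DEFAULT_LIMIT = 1000;

    final String analyticsProvider; // null if not set; the list then stays empty
    final int maxLength;
    final int intervalMinutes;
    final int limit;

    private RetrievalConfig(String analyticsProvider, int maxLength,
                            int intervalMinutes, int limit) {
        this.analyticsProvider = analyticsProvider;
        this.maxLength = maxLength;
        this.intervalMinutes = intervalMinutes;
        this.limit = limit;
    }

    /** Reads the options from a settings map, falling back to the defaults. */
    static RetrievalConfig fromSettings(Map<String, Object> settings) {
        Object provider = settings.get("analyticsProvider");
        return new RetrievalConfig(
                provider instanceof String ? (String) provider : null,
                intValue(settings, "maxLength", DEFAULT_MAX_LENGTH),
                intValue(settings, "interval", DEFAULT_INTERVAL_MINUTES),
                intValue(settings, "limit", DEFAULT_LIMIT));
    }

    private static int intValue(Map<String, Object> settings, String key, int fallback) {
        Object value = settings.get(key);
        return value instanceof Number ? ((Number) value).intValue() : fallback;
    }

    /**
     * An interval of 0 disables retrieval. Treating negative values the
     * same way is an assumption of this sketch.
     */
    boolean isRetrievalEnabled() {
        return intervalMinutes > 0;
    }
}
```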

Subtypes of CMALXBaseList define report types to fetch. The following examples are provided by CoreMedia Analytics Connectors:

  • CMALXPageList

    Instances of this content type refer to a report containing page views. The generic properties used at retrieval time are documentType and baseChannel, which limit the items displayed at rendering time. Hence, the "top-n-list" is made up of content objects of type documentType below the channel baseChannel. The property defaultContent defines a list of content objects to be displayed if no report data is available.

  • CMALXEventList

    Instances of this content type refer to a report containing tracked events. Generic event properties are category and action. The category is the name you supply for the group of objects to track. The action is a string identifying the type of user interaction to be tracked. The pair of category and action should uniquely identify the event.
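
How the CMALXPageList properties described above could shape the rendered list is sketched below. All names (TopNList, Item, build) are hypothetical and not CoreMedia API. The sketch assumes that channels are represented as path strings, that document types are compared by exact name, and that defaultContent is used only when the report contains no data at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not CoreMedia API: turns ranked report items into
// a top-n-list restricted by document type and base channel.
final class TopNList {

    /** Hypothetical content item with an id, a type and a channel path. */
    static final class Item {
        final String id;
        final String documentType;
        final String channel;

        Item(String id, String documentType, String channel) {
            this.id = id;
            this.documentType = documentType;
            this.channel = channel;
        }
    }

    private TopNList() {
    }

    /**
     * Keeps the report order, filters by type and channel, and stops after
     * maxLength items. Returns defaultContent if the report is empty.
     */
    static List<Item> build(List<Item> rankedReportItems, String documentType,
                            String baseChannel, int maxLength,
                            List<Item> defaultContent) {
        if (rankedReportItems.isEmpty()) {
            return defaultContent; // no report data available
        }
        List<Item> result = new ArrayList<>();
        for (Item item : rankedReportItems) {
            if (result.size() >= maxLength) {
                break;
            }
            boolean belowChannel = item.channel.equals(baseChannel)
                    || item.channel.startsWith(baseChannel + "/");
            if (item.documentType.equals(documentType) && belowChannel) {
                result.add(item);
            }
        }
        return result;
    }
}
```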

On the retrieval side, only implementations of AnalyticsServiceProvider are specific to an analytics service provider. In the next sections, the existing service provider implementations are presented.

Note

Cached report models are cleared if their data has not been updated within the last 30 days.

Caution

The tables in the following sections use the technical names of configuration options. Look them up in the resource bundles of the corresponding Studio extension modules to find out the localized names of the properties.

Caution

The analytics service providers restrict usage in different ways (with respect to request frequency, request count per time unit or response size in terms of record count). Ensure that your configuration matches those limitations.

Caution

Some settings can contain secret information and should not be published.

Therefore, these internal settings should be kept separate from settings that are required on live systems.

Internal analytics settings can be configured for a site in [site]/Options/Settings/Internal/InternalAnalyticsSettings.xml.

Some specific rules apply to the InternalAnalyticsSettings document:

  • It should not be published.
  • It should not be linked in other documents.
  • If a setting is configured in linked/local settings and in InternalAnalyticsSettings, the value of linked/local settings is applied.
  • Retrieval tasks: InternalAnalyticsSettings are only applied if tracking is configured (linked/local settings contain analyticsProvider and provider-specific properties).
  • Headless Server: The containing folder [site]/Options/Settings/Internal should be configured in the blocklist to prevent delivery of the folder content in headless environments.
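
The precedence rules listed above can be sketched as follows. The class InternalSettings and its method are hypothetical and not CoreMedia API. The sketch simplifies "tracking is configured" to mean that an analyticsProvider is set and the linked/local provider settings are not empty, which is an assumption. The setting name secretKey used in the usage note is also made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not CoreMedia API: combines linked/local settings
// with the settings from InternalAnalyticsSettings.
final class InternalSettings {

    private InternalSettings() {
    }

    /**
     * Internal settings are only considered if tracking is configured.
     * If a key exists in both sources, the linked/local value is applied.
     */
    static Map<String, Object> effective(String analyticsProvider,
                                         Map<String, Object> linkedOrLocal,
                                         Map<String, Object> internal) {
        // Simplified notion of "tracking is configured" (assumption).
        boolean trackingConfigured = analyticsProvider != null && !linkedOrLocal.isEmpty();
        Map<String, Object> merged = new HashMap<>();
        if (trackingConfigured) {
            merged.putAll(internal);
        }
        merged.putAll(linkedOrLocal); // linked/local settings take precedence
        return merged;
    }
}
```

In this sketch, an internal secretKey is added to the effective settings only when the linked/local settings already configure tracking, and an interval defined in both places keeps its linked/local value.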
