4.8. CM Watchdog/Probedog

The watchdog supports the following tasks:

  • Applications of the CoreMedia system can be monitored to check whether they are functioning.

  • Applications can be stopped and/or restarted locally.

  • Depending on the state of the system, further actions can be triggered.

The watchdog is delivered in two flavors:

  • watchdog

    The watchdog is an independent monitoring process that regularly tests whether applications are functioning and restarts them if an error occurs. The watchdog is installed as a web application.

  • probedog

    The probedog is a process that checks the status of an application once. This variant is used when the CoreMedia system is integrated into a high-availability cluster. The probedog can also be used as a diagnostic tool to query the current status during operation. It delivers a return code which can be evaluated by a shell script. It is deployed as part of the standalone server applications and the server tools.
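Because the probedog reports its result as a return code, a calling script only has to run it once and branch on the exit status. The following Python sketch illustrates that pattern. The real probedog command line and the meaning of the individual return codes are not given in this section, so `probe_command` is a placeholder and the convention "0 means healthy" is an assumption.

```python
import subprocess
import sys


def run_probe(probe_command, timeout_seconds=30):
    """Run a one-time probe and return its exit code.

    probe_command is a placeholder argument list; substitute the real
    probedog invocation of your installation.
    """
    completed = subprocess.run(probe_command, timeout=timeout_seconds)
    return completed.returncode


def is_healthy(return_code):
    # Assumption: 0 signals a healthy application, anything else a problem.
    return return_code == 0


if __name__ == "__main__":
    # Stand-in command that simply exits with code 0.
    demo_command = [sys.executable, "-c", "import sys; sys.exit(0)"]
    code = run_probe(demo_command)
    print("healthy" if is_healthy(code) else f"unhealthy (return code {code})")
```

In a high-availability setup, the cluster software would typically call such a wrapper and decide on failover based on the result.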

The watchdog's main configuration file is watchdog.xml, which defines the applications and actions. Use the following two properties to configure the path of this file and to select which of the defined applications are watched:

# the file where the applications are defined
watchdog.config=properties/corem/watchdog.xml

# a space-separated list of applications that should be watched by this watchdog
watchdog.components=WatchContentServer

  • watchdog.config: The location of watchdog.xml. The default is properties/corem/watchdog.xml.

  • watchdog.components: A space-separated list of the applications that should be watched.
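To make the format of the two properties concrete, the following Python sketch reads them from a properties text, falls back to the documented default path, and splits the component list on whitespace. It only illustrates how the values are structured and is not the watchdog's own parser. The second component name, `WatchWorkflowServer`, is a hypothetical example.

```python
DEFAULT_CONFIG = "properties/corem/watchdog.xml"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        properties[key.strip()] = value.strip()
    return properties


def watchdog_settings(properties):
    """Return the config path and the list of watched component names."""
    config_path = properties.get("watchdog.config", DEFAULT_CONFIG)
    components = properties.get("watchdog.components", "").split()
    return config_path, components


# WatchWorkflowServer is a hypothetical name used only for illustration.
sample = """\
# the file where the applications are defined
watchdog.config=properties/corem/watchdog.xml

# a space-separated list of applications that should be watched
watchdog.components=WatchContentServer WatchWorkflowServer
"""

print(watchdog_settings(parse_properties(sample)))
```

Each name in watchdog.components must correspond to an application defined in watchdog.xml.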

The following table shows the actions you can use to monitor a specific application. In addition, you can use the <Script> and <Custom> actions to define your own monitoring. See Section 4.8.3.2, “Action Elements” for a detailed description of the actions.

Application        Action
-----------------  ----------------------------------------------------------
Content Server     • <ServerQuery>: Checks the overall status
                   • <ServiceStatus>: Checks the status of a specific service
                   • <ServerMode>: Checks the runlevel

Workflow Server    • <WorkflowServerQuery>: Checks the overall status of a Workflow Server
                   • <ProcessStatus>: Checks if a specified number of processes is running

Database           • <DB>: Checks the status of the database

CAE                • <Http>: Checks if the response code is "200" and if the response matches a given regular expression

Table 4.10. Actions for checking specific applications
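The logic of the <Http> check described in the table (response code "200" plus a regular expression match on the response) can be sketched in Python as follows. This illustrates the idea only and is not the watchdog's implementation. The function names are hypothetical, and whether the pattern must match the whole response or only part of it is an assumption; this sketch accepts a match anywhere in the body.

```python
import re
import urllib.request


def response_ok(status_code, body, pattern):
    """Return True if the status is 200 and the body matches the pattern."""
    # Assumption: the pattern may match anywhere in the body (re.search).
    return status_code == 200 and re.search(pattern, body) is not None


def check_url(url, pattern, timeout_seconds=10):
    """Fetch url and apply response_ok; connection errors count as failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response_ok(response.status, body, pattern)
    except OSError:
        # Covers URLError, HTTPError (non-2xx responses) and timeouts.
        return False
```

Keeping the decision in `response_ok` separate from the network call makes the rule easy to verify without a running CAE.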