Content Server Manual / Version 2401

3.13.2.4 Content UUID Migration and Transfer

This section describes three tools for migrating content UUIDs. They are intended for setups in which a Content Management Server runs in several environments, for example a development and a production stage, and the same contents should carry the same UUIDs in each environment. A second application is the transfer of UUIDs from a Content Management Server to its Master Live Server and Replication Live Servers. For details on UUID support in CoreMedia Content Cloud see Section 5.2, “UUIDs” in Unified API Developer Manual.

Finishing the process requires a server downtime, so plan your upgrade scenario carefully. Find more details in Section “Preparation”.

Purpose

The tool set has the following applications:

Multi-Environment Migration: share the same UUIDs for contents between servers in different environments.
Live Servers Transfer: batch transfer of UUIDs from a Content Management Server to its Master Live Server and Replication Live Servers.

To keep UUIDs synchronized between two environments after an initial migration step, consider using serverexport and serverimport with enabled UUID export. For details see Section 3.13.2.17, “Serverimport/Serverexport”. If you forgot to include UUIDs during content transfer with serverexport and serverimport, you may repeat synchronization of UUIDs with the tools described here.

UUIDs of contents will be synchronized to Live Servers automatically upon publication. Therefore, the tools typically need to be applied only once within a single environment.

Example Scenario: Multi-Environment Migration

For the following example, this setup is expected:

Development Environment: Instance of Content Management Server used by developers.
Production Environment: Instance of Content Management Server used by editors. This system is assumed to have received contents from the development environment at some earlier point.

Typical Usage

The following is a typical flow of actions when migrating UUIDs. Assuming that your Development Environment is hosted at dev.host and your Production Environment at prod.host, run the commands in the order shown. Remember to shut down the server in the Production Environment before running the last command:

$ cm content-uuid-export \
    --user admin \
    --url https://dev.host/coremedia/ior \
    --query "BELOW PATH '/Sites'" \
    --dburl jdbc:mysql://dev.host:3306/cm_management \
    --dbuser cm_management \
    --dbpassword ... \
    --output dev-export.csv

$ cm generate-content-uuid-map \
    --dburl jdbc:mysql://prod.host:3306/cm_management \
    ...
    --input dev-export.csv \
    --output prod-map.csv

$ cm content-uuid-import \
    --dburl jdbc:mysql://prod.host:3306/cm_management \
    ...
    --input prod-map.csv

Example 3.19. Typical Usage Example

Note that you not only have to shut down your server, but also restart all of its clients. This applies to clients with the ability to reconnect as well, as they will not be notified of changed UUIDs.

Preparation

To finish the UUID synchronization, you need a server downtime for the Production Environment when running the last of the three tools. For a rough estimate of possible downtimes see Section “Reference Times for Downtime during Import”. With careful preparation you may reduce the impact of the server downtime by migrating incrementally. For details see Section “Incremental Migration”.

UUIDs on Live Servers

Content UUIDs will be transferred to live servers upon publication automatically. Publishing newly imported content with UUIDs will thus transfer these UUIDs to the live servers as well. In a UUID migration between environments, you can apply the import file used for the Content Management Server to all live servers, too. See import guidelines for live servers in Section “Typical Usage” for details.

The first two tools, content-uuid-export and generate-content-uuid-map, do not require a downtime. Nevertheless, it is recommended to execute these tools during periods of limited editorial activity. This reduces a possible performance impact of content-uuid-export, and the generated data is likely to be more consistent. Otherwise, the data may contain, for example, undesirable references to contents that were deleted concurrently.

Unlike content-uuid-export, generate-content-uuid-map can be executed without a running Content Management Server. This may be convenient if you want to run generate-content-uuid-map and content-uuid-import on the Production Environment without interruption.

The last of these tools, content-uuid-import, requires a downtime of the Content Management Server in the Production Environment. If you have any clients running with automatic reconnect, these need to be shut down as well, as the change to UUIDs will not be propagated to them until a restart.

All three tools require direct access to the database. The tools use the configuration available in properties/corem/sql.properties. If these properties do not match the database to access, you may specify the database connection options via command line parameters like --dburl. Thus, if applicable, ensure that you have the following properties at hand:

sql.store.url,
sql.store.driver (guessed from URL if not given),
sql.store.user,
sql.store.password, and possibly
sql.store.login-user-name (required for some databases)

See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual for details on these properties.
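The correspondence between these properties and the command line parameters can be sketched in a few lines of Python. This is only an illustration and not part of the tool set: the helper names `parse_properties` and `database_options` are hypothetical, the parser handles only plain `key=value` lines rather than the full Java properties syntax, and the option names are those listed in the command reference of this section.

```python
# Mapping of sql.properties keys to the command line options that override them.
PROPERTY_TO_OPTION = {
    "sql.store.url": "--dburl",
    "sql.store.driver": "--dbdriver",
    "sql.store.user": "--dbuser",
    "sql.store.password": "--dbpassword",
    "sql.store.login-user-name": "--dbloginname",
}


def parse_properties(text):
    """Parse simple key=value lines (assumption: no escapes, no continuations)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, separator, value = line.partition("=")
        if separator:
            props[key.strip()] = value.strip()
    return props


def database_options(props):
    """Return the database options as an argument list for the cm tools."""
    args = []
    for key, option in PROPERTY_TO_OPTION.items():
        if props.get(key):
            args += [option, props[key]]
    return args
```

The resulting list can be appended to any of the three commands when the configured `sql.properties` does not point to the database you want to access.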

Example Scenario: Live Servers Transfer

For the following example, a single CoreMedia Content Cloud instance with Content Management Server, Master Live Server, and one or more Replication Live Servers is expected.

Typical Usage

The following is a typical flow of actions when transferring UUIDs to live servers. Assuming that your Content Management Server is hosted at cms.host and server databases are hosted on cms.db.host, mls.db.host and rls.db.host, the following shows the order of commands. Repeat the last command for each Replication Live Server in your instance and adjust database connection accordingly. Remember that you need to shut down each live server before running the corresponding import command:

$ cm content-uuid-export \
    --user admin \
    --url https://cms.host/coremedia/ior \
    --query "BELOW PATH '/Sites' AND isPublished" \
    --dburl jdbc:mysql://cms.db.host:3306/cm_management \
    --dbuser cm_management \
    --dbpassword ... \
    --toimportformat \
    --output cms-export.csv

$ cm content-uuid-import \
    --dburl jdbc:mysql://mls.db.host:3306/cm_master \
    ...
    --input cms-export.csv

$ cm content-uuid-import \
    --dburl jdbc:mysql://rls.db.host:3306/cm_replication \
    ...
    --input cms-export.csv

Example 3.20. Typical Usage Example

Be extra careful when importing the file to any live server to make sure that you are only importing to a live server which has published content from the very Content Management Server that you created the export from. Importing to a live server which received content from another Content Management Server will introduce wrong UUIDs. When you discover that this happened accidentally, simply repeat the export on the correct Content Management Server and fix your damaged live server content with it by importing this correct export file.

Note that you not only have to shut down your servers, but also restart all of their clients. This applies to clients with the ability to reconnect as well, as they will not be notified of changed UUIDs.

Preparation

To finish the UUID synchronization, you need a server downtime for the live servers when running the import tool. For a rough estimate of possible downtimes see Section “Reference Times for Downtime during Import”. With careful preparation you may reduce the impact of the server downtime by migrating incrementally. For details see Section “Incremental Migration”.

The first tool, content-uuid-export, does not require a downtime. Nevertheless, it is recommended to execute the tool during periods of limited editorial activity. This reduces a possible performance impact, and the generated data is likely to be more consistent. Otherwise, the data may contain, for example, undesirable references to contents that were unpublished concurrently.

The second tool, content-uuid-import, requires a downtime of the corresponding Master Live Server or Replication Live Server. If you have any clients running with automatic reconnect, these need to be shut down as well, as the change to UUIDs will not be propagated to them until a restart.

Both tools require direct access to the database. The tools use the configuration available in properties/corem/sql.properties. If these properties do not match the database to access, you may specify the database connection options via command line parameters like --dburl. Thus, if applicable, ensure that you have the following properties at hand:

sql.store.url,
sql.store.driver (guessed from URL if not given),
sql.store.user,
sql.store.password, and possibly
sql.store.login-user-name (required for some databases)

See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual for details on these properties.

Incremental Migration

An incremental migration increases the overall effort but shortens the downtime of servers and clients per iteration. There are several ways to work in increments; choose the option from the following sections that suits you best, or combine them.

The strategies for incremental application apply to both scenarios alike (multi-environment migration and live servers transfer).

Incremental Export and Mapping

While content-uuid-export retrieves the content data by direct database access (for performance reasons), you select the content to export with Unified API queries. For syntax and details see Section 5.7, “Query Service” in Unified API Developer Manual.

Split by Content Queries: It is recommended to split your migration tasks by paths. You may for example start with the root folder of your master site. Or you may start with everything outside of your sites folder. Using content queries provides a rich mechanism of partitioning your contents. For some examples on queries, consider calling content-uuid-export with --query-help.
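As a rough illustration of splitting by paths, the following Python sketch builds one content-uuid-export call per folder. It only assembles argument lists and does not run anything. The folder paths, the output file naming, and the helper names are hypothetical; the query form `BELOW PATH '...'` is the one used in the usage examples of this section, and connection and database options are left out here just as they are abbreviated there.

```python
def below_path_query(path):
    """Build a query of the form BELOW PATH '<path>'."""
    if "'" in path:
        # Quoting rules for apostrophes are not covered by this sketch.
        raise ValueError("path must not contain an apostrophe")
    return f"BELOW PATH '{path}'"


def export_commands(paths):
    """Return one content-uuid-export argument list per path."""
    commands = []
    for number, path in enumerate(paths, start=1):
        commands.append([
            "cm", "content-uuid-export",
            # connection and database options omitted in this sketch
            "--query", below_path_query(path),
            "--output", f"export-{number:02d}.csv",
        ])
    return commands
```

Each resulting export file can then be mapped and imported on its own, which keeps the individual downtime windows short.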

Only in a multi-environment migration scenario do you run generate-content-uuid-map with the results from content-uuid-export.

Incremental Import

The content-uuid-import provides the options --from and --to. These limit the processed elements provided by generate-content-uuid-map. --from specifies the first element to import. --to specifies the last element to import. If you run content-uuid-import with --from 10 and --to 20, elements 10 to 20 (inclusive) will be processed.
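To plan several import windows, the element range can be cut into consecutive `--from`/`--to` pairs. The Python sketch below shows one way to do this, assuming you already know the number of elements in the mapping file. The helper names are hypothetical, and the database options are omitted.

```python
def chunk_ranges(total, chunk_size):
    """Return (first, last) pairs, 1-based and inclusive like --from and --to."""
    if total < 0 or chunk_size < 1:
        raise ValueError("total must be >= 0 and chunk_size >= 1")
    ranges = []
    first = 1
    while first <= total:
        last = min(first + chunk_size - 1, total)
        ranges.append((first, last))
        first = last + 1
    return ranges


def import_command(input_file, first, last):
    """Build the argument list for one partial import (database options omitted)."""
    return [
        "cm", "content-uuid-import",
        "--input", input_file,
        "--from", str(first),
        "--to", str(last),
    ]
```

For 25 elements and a chunk size of 10, this yields the ranges 1 to 10, 11 to 20, and 21 to 25, so that every element is covered exactly once.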

How can I ensure that all UUIDs have been imported correctly in a multi-environment migration scenario? The key to this is that generate-content-uuid-map will only output those mappings which require the UUID of a content to be changed. Thus, it ignores especially UUIDs which already match at Production Environment (for example from previous migration runs). Therefore, you could run generate-content-uuid-map again after your import. If this results in an empty mapping/file, all your UUIDs were mapped.
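The check for an empty mapping can be automated. The following Python sketch counts the remaining mapping rows in the output of generate-content-uuid-map. It rests on assumptions that you should verify against your actual files: lines written by `--meta` are taken to start with `#`, and the file is assumed to have no header row unless you pass `has_header=True`. The function name is hypothetical.

```python
import csv


def count_mappings(lines, comment_prefix="#", has_header=False):
    """Count data rows in a semicolon-separated mapping file.

    Assumptions: meta lines start with comment_prefix, and there is
    no header row unless has_header is set.
    """
    count = 0
    for row in csv.reader(lines, delimiter=";"):
        if not row or not "".join(row).strip():
            continue  # blank line
        if row[0].lstrip().startswith(comment_prefix):
            continue  # meta information written as line comment
        count += 1
    if has_header and count > 0:
        count -= 1
    return count
```

A result of 0 after a repeated run of generate-content-uuid-map indicates that no UUID in the Production Environment still needs to be changed.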

How do I recover from an aborted import? content-uuid-import can safely be aborted at any time. This also applies to unexpected aborts, for example due to a lost database connection. To recover, you have several options, the easiest of which is to start content-uuid-import again: updating a UUID twice (to the same value) does no harm to the system. To skip already processed elements, you may either run generate-content-uuid-map again prior to starting content-uuid-import, or adjust the parameters --from and --to accordingly.

Reference Times for Downtime during Import

As stated before, while executing content-uuid-import the Content Server and related clients have to be stopped. This section will give you a rough overview on possible downtimes to expect for the last step.

Test Setup: A million contents were updated during the test. All databases were set up locally within Docker containers, each having memory of 5 GB and 8 CPUs. content-uuid-import was executed with --threads 8 and --batchsize 1000 (default).

Database	Duration in [s]
PostgreSQL 9.6	145
MySQL Database 5.7	728
MS SQL Server 2017-latest	353
Oracle 19.3.0-se2	32

Table 3.26. Reference times for content-uuid-import for a million contents
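For a first guess at your own downtime, the reference times can be scaled to your content count. The Python sketch below assumes that the duration grows linearly with the number of contents, which is a simplification: hardware, network latency, and the `--threads` and `--batchsize` settings will shift the real numbers. The function name is hypothetical; the figures are those from the table above.

```python
# Reference durations in seconds for one million contents (from the table above).
REFERENCE_SECONDS_PER_MILLION = {
    "PostgreSQL 9.6": 145,
    "MySQL Database 5.7": 728,
    "MS SQL Server 2017-latest": 353,
    "Oracle 19.3.0-se2": 32,
}


def estimate_seconds(database, content_count):
    """Scale the reference time linearly to content_count (assumption)."""
    reference = REFERENCE_SECONDS_PER_MILLION[database]
    return reference * content_count / 1_000_000
```

Under this assumption, 2.5 million contents on MySQL 5.7 would take about 1820 seconds, or roughly half an hour. Treat such numbers as an order of magnitude only and measure on a copy of your own database where possible.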

Command Reference

In the following you will get detailed help on the specific commands in the order of their execution during the migration process.

content-uuid-export

content-uuid-export is the first tool to be executed. It generates a CSV file (more specifically a semicolon-separated file, which many office applications can open directly). Without the option --toimportformat, the CSV file will contain the following data:

Path: The path of the content.
Type: The content type of the content.
UUID: The UUID of the content.

Both path and type will be used to identify the content in Production Environment.
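If you want to inspect an export before mapping it, the file can be read with any CSV library that supports a semicolon delimiter. The following Python sketch is one way to do that. It assumes the column order Path, Type, UUID, no header row, and `#` as prefix for lines written by `--meta`; verify these assumptions against a real export. The names `ExportRow`, `parse_export`, and `index_by_path_and_type` are hypothetical.

```python
import csv
import uuid
from typing import NamedTuple


class ExportRow(NamedTuple):
    path: str
    content_type: str
    content_uuid: uuid.UUID


def parse_export(lines, comment_prefix="#"):
    """Parse semicolon-separated export lines into ExportRow tuples."""
    rows = []
    for row in csv.reader(lines, delimiter=";"):
        if not row or row[0].lstrip().startswith(comment_prefix):
            continue  # skip blank lines and assumed meta comments
        path, content_type, raw_uuid = row[0], row[1], row[2]
        rows.append(ExportRow(path, content_type, uuid.UUID(raw_uuid)))
    return rows


def index_by_path_and_type(rows):
    """Key each row by (path, type), the pair used to identify a content."""
    return {(row.path, row.content_type): row for row in rows}
```

Parsing the UUID column with `uuid.UUID` makes malformed values fail early, before the file reaches generate-content-uuid-map.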

With option --toimportformat, the data contained in the CSV file will be in the format as the output of generate-content-uuid-map. See Section “generate-content-uuid-map” for details.

For exported contents that don't have a UUID yet, a new UUID will be created and persisted in the database. Creating a lot of missing UUIDs may severely slow down the export.

cm content-uuid-export [(1) connection options]
[(2) database options]
[(3) query options]
[ -? | --help ]
[ { -bs | --batchsize } count ]
[ { -E | --encoding } encoding ]
[ -M | --meta ]
[ { -o | --output } file ]
[ --toimportformat ]
[ -v | --verbose ]

(1) { -u | --user } user [ { -d | --domain } domain ] [ { -p | --password } password ] [--url IOR URL]
(2) [--dbdriver class] [--dbloginname name] [--dbpassword password] [--dburl JDBC URL] [--dbuser name]
(3) { -q | --query | -qf | --queryfile } query or file [ { -l | --limit } limit ] [ { -qh | --query-help } ]

Example 3.21. Usage of content-uuid-export

The options have the following meaning:

Parameter	Description
`-?` \| `--help`	Will output usage information.
{ `-bs` \| `--batchsize` } `count`	The batch size defines how many elements are processed in one SQL statement. A larger number decreases the number of queries against the database and therefore increases performance. However, if the value is too high, the SQL statement could become too large for the database, resulting in an I/O Error while connecting to the database. The default value is 1,000.
`--dbdriver` `class`	The database driver class. Overrides `sql.store.driver`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbloginname` `name`	The login username (needed for PostgreSQL on Azure for example). Overrides `sql.store.login-user-name`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbpassword` `password`	The database password. Overrides `sql.store.password`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dburl` `JDBC URL`	The JDBC URL for the database. Overrides `sql.store.url`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbuser` `user`	The database user name. Overrides `sql.store.user`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
{ `-E` \| `--encoding` } `encoding`	Defines the encoding for the output. Default is UTF-8.
{ `-l` \| `--limit` } `limit`	Query parameter, which limits the number of returned contents.
`-M` \| `--meta`	Will add meta information as line comments to the output. Meta information includes, for example, the IOR URL, start and end time as well as contents processed. Note that using this option typically prevents opening the CSV file in third-party applications, as line comments are not part of the CSV standard.
{ `-o` \| `--output` } `file`	Will output the CSV to the given file. The file will be created relative to your current working directory. If not set, output will be printed to console instead.
`--toimportformat`	Will output the CSV in a format that can be imported to a connected live server without further mapping by generate-content-uuid-map.
{ `-q` \| `--query` } `query`	The query to select contents to export. Either this option or `--queryfile` must be given. For details on the query syntax see Section 5.7, “Query Service” in Unified API Developer Manual.
{ `-qf` \| `--queryfile` } `file`	A file containing the query to execute. Convenient if your operating system limits the length of a command line. Either this option or `--query` must be given. For details on the query syntax see Section 5.7, “Query Service” in Unified API Developer Manual.
`-qh` \| `--query-help`	Provides some examples for Unified API queries. If given, the tool will quit as soon as examples have been shown.
`-v` \| `--verbose`	Toggle verbose output.

Table 3.27. Parameters of content-uuid-export

generate-content-uuid-map

generate-content-uuid-map is the tool to be executed in a multi-environment migration scenario. It generates a CSV file for import (more specifically a semicolon-separated file, which many office applications can open directly). The CSV file will contain the following information, which is required to continue with the next task:

Content ID: The numeric ID of the content, which should be set to the given UUID.
UUID: The UUID to be set for content identified by the content ID.

The ID will be used to identify the content in the database in Production Environment during the next task.

cm generate-content-uuid-map [(1) database options]
[ -? | --help ]
[ { -bs | --batchsize } count ]
[ { -E | --encoding } encoding ]
[ { -i | --input } file ]
[ -M | --meta ]
[ { -o | --output } file ]
[ -v | --verbose ]

(1) [--dbdriver class] [--dbloginname name] [--dbpassword password] [--dburl JDBC URL] [--dbuser name]

Example 3.22. Usage of generate-content-uuid-map

Parameter	Description
`-?` \| `--help`	Will output usage information.
{ `-bs` \| `--batchsize` } `count`	The batch size defines how many elements are processed in one SQL statement. A larger number decreases the number of queries against the database and therefore increases performance. However, if the value is too high, the SQL statement could become too large for the database, resulting in an I/O Error while connecting to the database. The default value is 1,000.
`--dbdriver` `class`	The database driver class. Overrides `sql.store.driver`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbloginname` `name`	The login username (needed for PostgreSQL on Azure for example). Overrides `sql.store.login-user-name`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbpassword` `password`	The database password. Overrides `sql.store.password`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dburl` `JDBC URL`	The JDBC URL for the database. Overrides `sql.store.url`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbuser` `user`	The database user name. Overrides `sql.store.user`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
{ `-E` \| `--encoding` } `encoding`	Defines the encoding for input and output. Default is UTF-8.
{ `-i` \| `--input` } `file`	Input file generated by content-uuid-export to be processed. The file is relative to your current working directory. If unset, defaults to standard input instead.
`-M` \| `--meta`	Will add meta information as line comments to the output. Meta information includes, for example, the IOR URL, start and end time as well as contents processed. Note that using this option typically prevents opening the CSV file in third-party applications, as line comments are not part of the CSV standard.
{ `-o` \| `--output` } `file`	Will output the CSV to the given file. The file will be created relative to your current working directory. If not set, output will be printed to console instead.
`-v` \| `--verbose`	Toggle verbose output.

Table 3.28. Parameters of generate-content-uuid-map

content-uuid-import

content-uuid-import is the last tool to be executed. It will update the contents of a Content Server to have the corresponding new UUIDs. In a multi-environment migration scenario, its input file will be the output of generate-content-uuid-map. In a live servers transfer scenario, its input file will be the output of content-uuid-export (called with option --toimportformat).

Prior to starting this tool, you need to shut down the target Content Management Server, Master Live Server, or Replication Live Server. To reduce downtimes, you may want to read Section “Incremental Migration”.

cm content-uuid-import [(1) database options]
[ -? | --help ]
[ { -bs | --batchsize } count ]
[ { -E | --encoding } encoding ]
[ { -F | --from } count ]
[ { -i | --input } file ]
[ { -T | --to } count ]
[ { -t | --threads } threads ]
[ -v | --verbose ]

(1) [--dbdriver class] [--dbloginname name] [--dbpassword password] [--dburl JDBC URL] [--dbuser name]

Example 3.23. Usage of content-uuid-import

Parameter	Description
`-?` \| `--help`	Will output usage information.
{ `-bs` \| `--batchsize` } `count`	The batch size defines how many elements are processed in one SQL statement. A larger number decreases the number of queries against the database and therefore increases performance. However, if the value is too high, the SQL statement could become too large for the database, resulting in an I/O Error while connecting to the database. The default value is 1,000.
`--dbdriver` `class`	The database driver class. Overrides `sql.store.driver`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbloginname` `name`	The login username (needed for PostgreSQL on Azure for example). Overrides `sql.store.login-user-name`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbpassword` `password`	The database password. Overrides `sql.store.password`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dburl` `JDBC URL`	The JDBC URL for the database. Overrides `sql.store.url`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbuser` `user`	The database user name. Overrides `sql.store.user`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
{ `-E` \| `--encoding` } `encoding`	Defines the encoding for input. Default is UTF-8.
{ `-F` \| `--from` } `count`	Position of the first entry to process (inclusive, counting from 1). Combine with `--to` to import UUIDs partially.
{ `-i` \| `--input` } `file`	Input file with content ID and UUID data to be processed. The file is relative to your current working directory. If unset, defaults to standard input instead.
{ `-t` \| `--threads` } `threads`	To increase performance, this program runs multi-threaded. This parameter defines how many threads are used to process the database queries. The default value is 4. If your system has more threads available, it is recommended to set this parameter higher, as this will increase performance. The number of threads also determines the number of database connections opened at the same time.
{ `-T` \| `--to` } `count`	Position of the last entry to process (inclusive, counting from 1). Combine with `--from` to import UUIDs partially.
`-v` \| `--verbose`	Toggle verbose output.

Table 3.29. Parameters of content-uuid-import
