Content Server Manual / Version 2204

3.13.2.4 Content-UUID-Migration

This section describes three tools for upgrade scenarios of the Content Management Server in a multi-environment setup, for example with a development and a production stage. The tools deal with the UUIDs of contents. For details on UUID support in CoreMedia Content Cloud see Section 5.2, “UUIDs” in Unified API Developer Manual.

Finishing the process requires a server downtime, so plan your upgrade scenario carefully. Find more details in Section “Preparation”.

Purpose

The goal of these tools is sharing the same UUIDs for contents between two servers in different environments.

To keep the UUID synchronized after this migration step, consider using serverexport and serverimport with enabled UUID export. For details see Section 3.13.2.17, “Serverimport/Serverexport”.

Although the tools are dedicated to this upgrade scenario, you may also use them to synchronize UUIDs if you forgot to include the UUIDs during a transfer with serverexport and serverimport.

Example Scenario

The descriptions below assume the following example setup:

Development Environment: Instance of Content Management Server used by developers.
Production Environment: Instance of Content Management Server used by editors. This system is assumed to receive contents from the development environment from time to time.

Typical Usage

The following is a typical flow of actions when migrating UUIDs. Assuming that your Development Environment is hosted at dev.host and your Production Environment is hosted at prod.host, the commands are executed in the order shown. Remember that you need to shut down the server in the Production Environment before running the last command:

$ cm content-uuid-export \
    --user admin \
    --url https://dev.host/coremedia/ior \
    --query "BELOW PATH '/Sites'" \
    --dburl jdbc:mysql://dev.host:3306/cm_management \
    --dbuser cm_management \
    --dbpassword ... \
    --output dev-export.csv

$ cm generate-content-uuid-map \
    --dburl jdbc:mysql://prod.host:3306/cm_management \
    ...
    --input dev-export.csv \
    --output prod-map.csv

$ cm content-uuid-import \
    --dburl jdbc:mysql://prod.host:3306/cm_management \
    ...
    --input prod-map.csv

Example 3.19. Typical Usage Example

Note that you not only have to shut down your server, but also have to restart any client. This also applies to clients with the ability to reconnect, as they will not get notified of changed UUIDs.

Preparation

To finish the process of UUID synchronization you will require a server downtime for Production Environment when using the last of the three tools. For a rough estimation on possible downtimes see Section “Reference times for downtime during import”. With a careful preparation you may reduce the impact of server downtime by incremental migration. For details see Section “Incremental Migration”.

Live Servers not affected

Because live servers (master live server and replication live server) are unaware of content UUIDs, they require neither a shutdown nor a restart. The same applies to all clients connected to live servers.

For details on support for UUIDs for contents see Section 5.2, “UUIDs” in Unified API Developer Manual.

The first two tools content-uuid-export and generate-content-uuid-map do not require a downtime. Nevertheless, it is recommended to execute these tools during limited editorial activity. This will reduce a possible performance impact of content-uuid-export and the generated data may be more consistent. Otherwise, data may contain for example undesirable references to contents deleted concurrently.

Unlike content-uuid-export, generate-content-uuid-map can be executed while the Content Management Server is not running. This may be convenient if you want to run generate-content-uuid-map and content-uuid-import on the Production Environment without interruption.

The last of these tools, content-uuid-import, requires a downtime of the Content Management Server in the Production Environment. If you have any clients running with automatic reconnect, these need to be shut down as well, as the change to UUIDs will not be propagated to them until a restart.

All three tools require direct access to the database. The tools use the configuration available in properties/corem/sql.properties. If these properties do not match the database to access, you may specify the database connection options via command line parameters like --dburl. Thus, if applicable, ensure that you have the following properties at hand:

sql.store.url,
sql.store.driver (guessed from URL if not given),
sql.store.user,
sql.store.password, and possibly
sql.store.login-user-name (required for some databases)

See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual for details on these properties.
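
As a minimal sketch of how these properties relate to the command line options, the following shell function reads a value from a properties file. The helper prop_value and the file name prod-sql.properties are hypothetical and not part of the tools. The sketch assumes plain key=value lines without spaces around the equals sign.

```shell
#!/bin/sh
# prop_value FILE KEY
# Prints the value of the first "KEY=value" line in FILE.
# Assumption: plain key=value lines, no spaces around "=",
# no line continuations, file ends with a newline.
prop_value() {
  while IFS='=' read -r key value; do
    if [ "$key" = "$2" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done < "$1"
  return 1
}

# Usage (not executed here): pass the values from a hypothetical file
# prod-sql.properties as database options.
#   cm generate-content-uuid-map \
#       --dburl "$(prop_value prod-sql.properties sql.store.url)" \
#       --dbuser "$(prop_value prod-sql.properties sql.store.user)" \
#       --dbpassword "$(prop_value prod-sql.properties sql.store.password)" \
#       --input dev-export.csv \
#       --output prod-map.csv
```

Only the options --dburl, --dbuser and --dbpassword shown in the usage comment come from this section. Everything else is illustrative.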

Incremental Migration

An incremental migration approach increases the overall effort, but decreases the downtime of the servers and clients per iteration. There are several ways to work with increments. Choose whichever of the options described in the following sections suits you best, or combine them.

Incremental Export and Mapping

While content-uuid-export retrieves the content data by direct database access (for performance reasons), you select the content to export with Unified API queries. For syntax and details see Section 5.7, “Query Service” in Unified API Developer Manual.

Split by Content Queries: It is recommended to split your migration tasks by paths. You may for example start with the root folder of your master site. Or you may start with everything outside of your sites folder. Using content queries provides a rich mechanism of partitioning your contents. For some examples on queries, consider calling content-uuid-export with --query-help.

After you exported your content, you can run generate-content-uuid-map with the results from content-uuid-export.
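
Splitting the export by paths could be sketched as follows. The function print_export_commands is a hypothetical helper that only prints one export command per query (a dry run) and does not call the tool. The paths '/Sites/SiteA' and '/Settings' and the output file names are assumptions for illustration. Database options are omitted, so the printed commands would rely on the configured sql.properties.

```shell
#!/bin/sh
# print_export_commands QUERY...
# Dry run: prints one content-uuid-export command per query,
# writing each partition to its own numbered output file.
print_export_commands() {
  n=1
  for query in "$@"; do
    printf 'cm content-uuid-export --user admin --url https://dev.host/coremedia/ior --query "%s" --output dev-export-%d.csv\n' \
      "$query" "$n"
    n=$((n + 1))
  done
}

# Hypothetical partitioning into two paths.
print_export_commands "BELOW PATH '/Sites/SiteA'" "BELOW PATH '/Settings'"
```

Each resulting file can then be passed separately to generate-content-uuid-map via --input.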

Incremental Import

The content-uuid-import provides the options --from and --to. These limit the processed elements provided by generate-content-uuid-map. --from specifies the first element to import. --to specifies the last element to import. If you run content-uuid-import with --from 10 and --to 20, elements 10 to 20 (inclusive) will be processed.
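
The following sketch shows one way to derive consecutive --from and --to values for importing in chunks. The helpers chunk_ranges and print_import_commands are hypothetical. The commands are only printed, not executed, and the total number of entries has to be determined by you. How entries are counted in the mapping file (for example whether a header line exists) is not covered here and should be checked first.

```shell
#!/bin/sh
# chunk_ranges TOTAL SIZE
# Prints one "FROM TO" pair per line covering entries 1..TOTAL.
# Both bounds are inclusive and start at 1, matching --from and --to.
chunk_ranges() {
  total=$1
  size=$2
  from=1
  while [ "$from" -le "$total" ]; do
    to=$((from + size - 1))
    if [ "$to" -gt "$total" ]; then
      to=$total
    fi
    echo "$from $to"
    from=$((to + 1))
  done
}

# print_import_commands TOTAL SIZE
# Dry run: prints one content-uuid-import command per chunk.
# Further database options (user, password) are omitted here.
print_import_commands() {
  chunk_ranges "$1" "$2" | while read -r from to; do
    echo "cm content-uuid-import --dburl jdbc:mysql://prod.host:3306/cm_management --input prod-map.csv --from $from --to $to"
  done
}

# 25 entries in chunks of 10 yield the ranges 1-10, 11-20 and 21-25.
print_import_commands 25 10
```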

How can I ensure that all UUIDs have been imported correctly? The key is that generate-content-uuid-map only outputs those mappings that require the UUID of a content to be changed. In particular, it ignores UUIDs that already match in the Production Environment (for example from previous migration runs). Therefore, you can run generate-content-uuid-map again after your import. If this results in an empty mapping/file, all your UUIDs were mapped.
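
Such a check could be scripted as in the following minimal sketch. The helper mapping_is_empty and the file name prod-map-check.csv are hypothetical. The sketch assumes that a mapping without entries is written as a file of zero bytes, that is, without a header line and without --meta comments. Verify this against your actual output before relying on it.

```shell
#!/bin/sh
# mapping_is_empty FILE
# Succeeds (exit status 0) if FILE does not exist or has zero bytes.
# Assumption: an empty mapping contains no header and no comment lines.
mapping_is_empty() {
  [ ! -s "$1" ]
}

# Usage (not executed here):
#   cm generate-content-uuid-map \
#       --dburl jdbc:mysql://prod.host:3306/cm_management \
#       --input dev-export.csv \
#       --output prod-map-check.csv
#   if mapping_is_empty prod-map-check.csv; then
#     echo "No UUID changes left"
#   else
#     echo "Some UUIDs still differ"
#   fi
```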

How to recover from an aborted import? content-uuid-import can safely be aborted at any time. This also applies to unexpected aborts, for example due to a lost database connection. To recover, you have several options. The easiest one is to start content-uuid-import again, because updating a UUID twice (to the same value) does no harm to the system. To skip already processed elements, you may either run generate-content-uuid-map again prior to starting content-uuid-import, or adjust the parameters --from and --to accordingly.

Reference times for downtime during import

As stated before, while executing content-uuid-import the Content Management Server and related clients have to be stopped. This section will give you a rough overview on possible downtimes to expect for the last step.

Test Setup: A million contents were updated during the test. All databases were set up locally within Docker containers, each having memory of 5 GB and 8 CPUs. content-uuid-import was executed with --threads 8 and --batchsize 1000 (default).

Database	Duration in [s]
PostgreSQL 9.6	145
MySQL Database 5.7	728
MS SQL Server 2017-latest	353
Oracle 19.3.0-se2	32

Table 3.25. Reference times for content-uuid-import for a million contents
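
For a first rough estimate, the reference times can be scaled to your own number of contents. The following sketch assumes that the duration grows linearly with the number of contents, which is an assumption and not a statement from the measurements above. Hardware, database configuration, --threads and --batchsize will all influence the real duration. The helper estimate_seconds is hypothetical.

```shell
#!/bin/sh
# estimate_seconds CONTENTS REFERENCE_SECONDS
# Scales a reference time measured for one million contents linearly
# to CONTENTS and rounds up to full seconds.
# Assumption: linear scaling, which real systems may not follow.
estimate_seconds() {
  echo $(( ($1 * $2 + 999999) / 1000000 ))
}

# 2.5 million contents with the PostgreSQL reference time of 145 s:
# 2.5 * 145 = 362.5, rounded up.
estimate_seconds 2500000 145   # prints 363
```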

Command Reference

In the following you will get detailed help on the specific commands in the order of their execution during the migration process.

content-uuid-export

content-uuid-export is the first tool to be executed. It will generate a CSV file (more specifically a semicolon-separated file, so that you may open it directly in several office applications, for example). The CSV file will contain the following information, which is required to continue with the next tasks:

Path: The path of the content.
Type: The content type of the content.
UUID: The UUID of the content.

Both, path and type, will be used to identify the content in Production Environment.

cm content-uuid-export [(1) connection options]
[(2) database options]
[(3) query options]
[ -? | --help ]
[ { -bs | --batchsize } count ]
[ { -E | --encoding } encoding ]
[ -M | --meta ]
[ { -o | --output } file ]
[ -v | --verbose ]
(1) { -u | --user } user [ { -d | --domain } domain ] [ { -p | --password } password ] [--url IOR URL]
(2) [--dbdriver class] [--dbloginname name] [--dbpassword password] [--dburl JDBC URL] [--dbuser name]
(3) { -q | --query | -qf | --queryfile } query or file [ { -l | --limit } limit ] [ { -e | --versions } ] [ { -qh | --query-help } ]

Example 3.20. Usage of content-uuid-export

The options have the following meaning:

Parameter	Description
`-?` \| `--help`	Will output usage information.
{ `-bs` \| `--batchsize` } `count`	The batch size defines how many elements are processed in one SQL statement. A larger number decreases the number of queries against the database, and therefore increases the performance. However, if too high, the SQL statement could become too large for the database, resulting in an I/O Error while connecting to the database. The default value is 1,000.
`--dbdriver` `class`	The database driver class. Overrides `sql.store.driver`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbloginname` `name`	The login username (needed for PostgreSQL on Azure for example). Overrides `sql.store.login-user-name`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbpassword` `password`	The database password. Overrides `sql.store.password`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dburl` `JDBC URL`	The JDBC URL for the database. Overrides `sql.store.url`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbuser` `user`	The database user name. Overrides `sql.store.user`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
{ `-E` \| `--encoding` } `encoding`	Defines the encoding for the output. Default is UTF-8.
{ `-l` \| `--limit` } `limit`	Query parameter, which limits the number of returned contents.
`-M` \| `--meta`	Will add meta information as line comments to the output. Meta information includes, for example, the IOR URL, start and end time as well as contents processed. Note that using this option typically prohibits opening the CSV file in third-party applications, as line comments are not part of the CSV standard.
{ `-o` \| `--output` } `file`	Will output the CSV to the given file. The file will be created relative to your current working directory. If not set, output will be printed to console instead.
{ `-q` \| `--query` } `query`	The query to select contents to export. Either this option or `--queryfile` must be given. For details on queries see Section 5.7, “Query Service” in Unified API Developer Manual.
{ `-qf` \| `--queryfile` } `file`	A file containing the query to execute. Convenient if your operating system limits the character length of your command. Either this option or `--query` must be given. For details on queries see Section 5.7, “Query Service” in Unified API Developer Manual.
`-qh` \| `--query-help`	Provides some examples for Unified API queries. If given, the tool will quit as soon as examples have been shown.
`-v` \| `--verbose`	Toggle verbose output.

Table 3.26. Parameters of content-uuid-export

generate-content-uuid-map

generate-content-uuid-map is the second tool to be executed. It will generate a CSV file (more specifically a semicolon separated file, so that you may open it for example in several office applications directly). The CSV file will contain the following information, which are required to continue with the next task:

Content ID: The numeric ID of the content, which should be set to the given UUID.
UUID: The UUID to be set for content identified by the content ID.

The ID will be used to identify the content in the database in Production Environment during the next task.

cm generate-content-uuid-map [(1) database options]
[ -? | --help ]
[ { -bs | --batchsize } count ]
[ { -E | --encoding } encoding ]
[ { -i | --input } file ]
[ -M | --meta ]
[ { -o | --output } file ]
[ -v | --verbose ]
(1) [--dbdriver class] [--dbloginname name] [--dbpassword password] [--dburl JDBC URL] [--dbuser name]

Example 3.21. Usage of generate-content-uuid-map

Parameter	Description
`-?` \| `--help`	Will output usage information.
{ `-bs` \| `--batchsize` } `count`	The batch size defines how many elements are processed in one SQL statement. A larger number decreases the amount of queries against the database, and therefore increases the performance. However, if too high the SQL statement could become too large for the database, resulting in an I/O Error while connecting to the database. The default value is 1,000.
`--dbdriver` `class`	The database driver class. Overrides `sql.store.driver`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbloginname` `name`	The login username (needed for PostgreSQL on Azure for example). Overrides `sql.store.login-user-name`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbpassword` `password`	The database password. Overrides `sql.store.password`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dburl` `JDBC URL`	The JDBC URL for the database. Overrides `sql.store.url`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbuser` `user`	The database user name. Overrides `sql.store.user`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
{ `-E` \| `--encoding` } `encoding`	Defines the encoding for input and output. Default is UTF-8.
{ `-i` \| `--input` } `file`	Input file generated by content-uuid-export to be processed. The file is relative to your current working directory. If unset, defaults to standard input instead.
`-M` \| `--meta`	Will add meta information as line comments to the output. Meta information includes, for example, the IOR URL, start and end time as well as contents processed. Note that using this option typically prohibits opening the CSV file in third-party applications, as line comments are not part of the CSV standard.
{ `-o` \| `--output` } `file`	Will output the CSV to the given file. The file will be created relative to your current working directory. If not set, output will be printed to console instead.
`-v` \| `--verbose`	Toggle verbose output.

Table 3.27. Parameters of generate-content-uuid-map

content-uuid-import

content-uuid-import is the third and last tool to be executed. It will set the corresponding new UUID on each content listed in the mapping created by generate-content-uuid-map.

Prior to starting this tool, you need to shut down the Content Management Server. To reduce downtimes, you may want to read Section “Incremental Migration”.

cm content-uuid-import [(1) database options]
[ -? | --help ]
[ { -bs | --batchsize } count ]
[ { -E | --encoding } encoding ]
[ { -F | --from } count ]
[ { -i | --input } file ]
[ { -T | --to } count ]
[ { -t | --threads } threads ]
[ -v | --verbose ]
(1) [--dbdriver class] [--dbloginname name] [--dbpassword password] [--dburl JDBC URL] [--dbuser name]

Example 3.22. Usage of content-uuid-import

Parameter	Description
`-?` \| `--help`	Will output usage information.
{ `-bs` \| `--batchsize` } `count`	The batch size defines how many elements are processed in one SQL statement. A larger number decreases the number of queries against the database, and therefore increases the performance. However, if too high, the SQL statement could become too large for the database, resulting in an I/O Error while connecting to the database. The default value is 1,000.
`--dbdriver` `class`	The database driver class. Overrides `sql.store.driver`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbloginname` `name`	The login username (needed for PostgreSQL on Azure for example). Overrides `sql.store.login-user-name`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbpassword` `password`	The database password. Overrides `sql.store.password`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dburl` `JDBC URL`	The JDBC URL for the database. Overrides `sql.store.url`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
`--dbuser` `user`	The database user name. Overrides `sql.store.user`. See Section 3.2.4, “Properties for the Connection to the Database” in Deployment Manual.
{ `-E` \| `--encoding` } `encoding`	Defines the encoding for input. Default is UTF-8.
{ `-F` \| `--from` } `count`	Position of the first entry to be processed (inclusive, counting starts with 1). Combine with `--to` to import UUIDs partially.
{ `-i` \| `--input` } `file`	Input file generated by generate-content-uuid-map to be processed. The file is relative to your current working directory. If unset, defaults to standard input instead.
{ `-t` \| `--threads` } `threads`	To increase performance, this program runs multi-threaded. This parameter defines how many threads are used to process the database queries. The default value is 4. If your system has more threads available, it is recommended to set this parameter higher, as this will increase performance. The number of threads also determines the number of database connections opened at the same time.
{ `-T` \| `--to` } `count`	Position of the last entry to be processed (inclusive, counting starts with 1). Combine with `--from` to import UUIDs partially.
`-v` \| `--verbose`	Toggle verbose output.

Table 3.28. Parameters of content-uuid-import
