Skip to content

mo_cdc Data Synchronization

CDC (Change Data Capture) is a technology that captures real-time changes in a database, recording insert, update, and delete operations. By monitoring database changes, it enables real-time data synchronization and incremental processing, ensuring consistency across different systems. CDC is suitable for scenarios such as real-time data synchronization, data migration, disaster recovery, and audit tracking. It reads transaction logs to reduce the pressure of full data replication and improves system performance and efficiency. Its advantages include low latency, high real-time capability, flexible support for multiple databases and systems, and adaptability to evolving large-scale data environments.

Before performing CDC synchronization, it is necessary to first establish PITR (Point-in-Time Recovery) capabilities covering the synchronization scope, with a recommended coverage of at least 2 hours of changes. This ensures that if the synchronization task is interrupted or encounters an exception, the system can backtrack and re-read the changed data, avoiding data loss or inconsistency.

MatrixOne supports data synchronization at the tenant/database/table level through the mo_cdc utility. This section introduces the usage of mo_cdc.

Note

mo_cdc is an enterprise-level data synchronization tool. You need to contact your MatrixOne account manager to obtain the download path.

Command Reference Guide

help - Print the reference guide.

(base) admin@admindeMBP mo-backup % ./mo_cdc help
This command allows you to manage CDC Task, including task create, task show, task pause, task resume, task restart, and task drop.

Usage:
  mo_cdc [flags]
  mo_cdc [command]

Available Commands:
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command
  task        Manage Task

Flags:
  -h, --help   help for mo_cdc

Use "mo_cdc [command] --help" for more information about a command.

Create a Task

Syntax

mo_cdc task create
    --task-name 
    --source-uri 
    --sink-type 
    --sink-uri 
    --level 
      account|database|table
    --databases
    --tables 
    --no-full 
    --start-ts
    --end-ts
    --start-ts 
    --end-ts
    --send-sql-timeout 
    --max-sql-length 
    --exclude
    --error-handle-option

Parameter Description

Parameter Description
task-name Synchronization task name
source-uri Source (MatrixOne) connection string
sink-type Downstream type, currently supports mysql and matrixone
level Synchronization scope: account, database, or table
databases Optional, required when the scope is database-level
tables Optional, required when the scope is table-level
no-full Optional, enables full synchronization by default; adding this parameter disables it
start-ts Optional, starts pulling data from a specific timestamp in the database (must be earlier than the current time)
end-ts Optional, stops pulling data at a specified timestamp (must be later than start-ts if specified)
max-sql-length Optional, limits the length of a single SQL statement (defaults to the smaller of 4MB or the downstream max_packet_size variable)
exclude Optional, specifies objects to exclude (supports regex)
error-handle-option Optional, stop or ignore. Controls behavior when encountering errors during synchronization (default: stop; ignore skips the error and continues)

Examples

>./mo_cdc task create --task-name "ms_task1" --source-uri "mysql://root:111@127.0.0.1:6001" --sink-uri "mysql://root:111@127.0.0.1:3306" --sink-type "mysql" --level table --tables "db1.t1:db1.t1"  

>./mo_cdc task create --task-name "ms_task2" --source-uri "mysql://acc1:admin:111@127.0.0.1:6001" --sink-uri "mysql://root:111@127.0.0.1:3306" --sink-type "mysql" --level "account" 

>./mo_cdc task create --task-name "mo_task1" --source-uri "mysql://root:111@127.0.0.1:6001" --sink-uri "mysql://root:111@10.222.xx.xx:6001" --sink-type "matrixone" --level database --databases "db1:db2" 

View Tasks

Only tasks created by the current connected user can be viewed.

Syntax

mo_cdc task show
    --source-uri 
    --all 
    --task-name 

Parameter Description

Parameter Description
source-uri Source server address
all View all synchronization tasks
task-name Synchronization task name

Response Fields

Field Description
task-id Task ID
task-name Task name
source-uri Source server address
sink-uri Downstream resource identifier
state Task status (running or stopped)
checkpoint Synchronization progress
timestamp Current timestamp

Examples

# View all synchronization tasks
> ./mo_cdc task show --source-uri "mysql://root:111@127.0.0.1:6001"  --all
[
  {
    "task-id": "0195db8d-1a36-73d0-9fa3-e37839638b4b",
    "task-name": "mo_task1",
    "source-uri": "mysql://root:******@127.0.0.1:6001",
    "sink-uri": "mysql://root:******@10.222.xx.xx:6001",
    "state": "running",
    "err-msg": "",
    "checkpoint": "{\n  \"db1.t1\": 2025-03-28 15:00:35.790209 +0800 CST,\n}",
    "timestamp": "2025-03-28 15:00:36.207296 +0800 CST"
  },
  {
    "task-id": "0195db5c-6406-73d8-bbf6-25fb8b9dd45d",
    "task-name": "task1",
    "source-uri": "mysql://root:******@127.0.0.1:6001",
    "sink-uri": "mysql://root:******@127.0.0.1:3306",
    "state": "running",
    "err-msg": "",
    "checkpoint": "{\n  \"source_db.orders\": 2025-03-28 15:00:35.620173 +0800 CST,\n}",
    "timestamp": "2025-03-28 15:00:36.207296 +0800 CST"
  },
  {
    "task-id": "0195db82-7d6f-7f2a-a6d0-24cbe6ae8896",
    "task-name": "ms_task1",
    "source-uri": "mysql://root:******@127.0.0.1:6001",
    "sink-uri": "mysql://root:******@127.0.0.1:3306",
    "state": "running",
    "err-msg": "",
    "checkpoint": "{\n  \"db1.t1\": 2025-03-28 15:00:35.632194 +0800 CST,\n}",
    "timestamp": "2025-03-28 15:00:36.207296 +0800 CST"
  }
]


# View a specific task
>./mo_cdc task show --source-uri "mysql://acc1:admin:111@127.0.0.1:6001"   --task-name "ms_task2"
[
  {
    "task-id": "0195db8c-c15a-742e-8d0d-598529ab3f1e",
    "task-name": "ms_task2",
    "source-uri": "mysql://acc1:admin:******@127.0.0.1:6001",
    "sink-uri": "mysql://root:******@127.0.0.1:3306",
    "state": "running",
    "err-msg": "",
    "checkpoint": "{\n  \"db1.t1\": 2025-03-28 15:01:44.030821 +0800 CST,\n  \"db1.table1\": 2025-03-28 15:01:43.998759 +0800 CST,\n}",
    "timestamp": "2025-03-28 15:01:44.908341 +0800 CST"
  }
]

Pause a Task

Syntax

mo_cdc task pause
    --source-uri 
    --all 
    --task-name 

Parameter Description

Parameter Description
source-uri Source server address
all Pause all synchronization tasks
task-name Synchronization task name

Examples

# Pause a specific task
./mo_cdc task pause  --source-uri "mysql://acc1:admin:111@127.0.0.1:6001" --task-name "ms_task2"

# Pause all tasks
./mo_cdc task pause --source-uri "mysql://root:111@127.0.0.1:6001"  --all

Resume a Task

A task can only be resumed if its status is stopped. The resumption process supports checkpoint recovery. If the pause duration exceeds the GC retention period, operations during that time will not be synchronized, and only the final data state will be synced.

Syntax

mo_cdc task resume
    --source-uri 
    --task-name 

Parameter Description

Parameter Description
source-uri Source server address
task-name Synchronization task name

Examples

./mo_cdc task resume  --source-uri "mysql://acc1:admin:111@127.0.0.1:6001" --task-name "ms_task2"

Restart a Task

Restarting a CDC task ignores previous synchronization progress and starts from the beginning.

Syntax

mo_cdc task restart
    --source-uri 
    --task-name 

Parameter Description

Parameter Description
source-uri Source server address
task-name Synchronization task name

Examples

./mo_cdc task restart  --source-uri "mysql://acc1:admin:111@127.0.0.1:6001" --task-name "ms_task2"

Delete a Task

Syntax

mo_cdc task drop
    --source-uri 
    --all
    --task-name 

Parameter Description

Parameter Description
source-uri Source server address
all Delete all synchronization tasks
task-name Delete a specific task

Examples

# Delete a specific task
./mo_cdc task drop  --source-uri "mysql://acc1:admin:111@127.0.0.1:6001" --task-name "ms_task2"

# Delete all tasks
./mo_cdc task drop  --source-uri  "mysql://root:111@127.0.0.1:6001" --all