Skip to content

MatrixOne to MySQL CDC function

Scene description

An online retail company uses MatrixOne as the production database for its order management system to store order data. In order to support the real-time analysis needs of the business (such as order quantity, sales trends, customer behavior, etc.), order data needs to be synchronized from MatrixOne to the MySQL analysis database in real time for use by the data analysis team and business systems. Through the mo_cdc tool, real-time synchronization of order data can be efficiently achieved, allowing the analysis system to obtain the latest order information at any time.

-Source database (production database): The orders table in MatrixOne contains order data and records the details of each order, including order ID, customer ID, order time, order amount and status. -Target database (analysis database): orders_backup table in MySQL, used for real-time statistics and analysis of order information. Ensure that the sales team can grasp sales dynamics in real time. -Synchronization requirements: Use mo_cdc to synchronize data in MatrixOne's orders table to MySQL's orders_backup table in real time to ensure that the analysis system data is consistent with the production system.

Operation process

Create table structure

Ensure that the table structures in the source database MatrixOne and the target database MySQL are identical to facilitate seamless data synchronization.

  • orders table in MatrixOne:
CREATE TABLE source_db.orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATETIME,
    amount DECIMAL(10, 2),
    status VARCHAR(20)
);
INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
 (1, 101, '2024-01-15 14:30:00', 99.99, 'Shipped'),
 (2, 102, '2024-02-10 10:00:00', 149.50, 'Delivered'),
 (3, 103, '2024-03-05 16:45:00', 75.00, 'Processing'),
 (4, 104, '2024-04-20 09:15:00', 200.00, 'Shipped'),
 (5, 105, '2024-05-12 14:00:00', 49.99, 'Delivered');
  • orders_backup table in MySQL:
CREATE TABLE analytics_db.orders_backup (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATETIME,
    amount DECIMAL(10, 2),
    status VARCHAR(20)
);

Create mo_cdc synchronization task

Create a synchronization task through the mo_cdc tool to push MatrixOne's order data to MySQL in real time.

>./mo_cdc task create \
       --task-name "task1" \
       --source-uri "mysql://root:111@127.0.0.1:6001" \
       --sink-type "mysql" \
       --sink-uri "mysql://root:111@127.0.0.1:3306" \
       --tables "source_db.orders:analytics_db.orders_backup" \
       --level "account" \
       --account "sys"

View task status

> ./mo_cdc task show \
       --task-name "task1" \
       --source-uri "mysql://root:111@127.0.0.1:6001"
[
  {
    "task-id": "0192d76f-d89a-70b3-a60d-615c5f2fd33d",
    "task-name": "task1",
    "source-uri": "mysql://root:******@127.0.0.1:6001",
    "sink-uri": "mysql://root:******@127.0.0.1:3306",
    "state": "running",
    "checkpoint": "{\n  \"source_db.orders\": 2024-10-29 16:43:00.318404 +0800 CST,\n}",
    "timestamp": "2024-10-29 16:43:01.299298 +0800 CST"
  }
] 

Connect to downstream mysql to view full data synchronization status

mysql> select * from analytics_db.orders_backup;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date          | amount | status     |
+----------+-------------+---------------------+--------+------------+
|        1 |         101 | 2024-01-15 14:30:00 |  99.99 | Shipped    |
|        2 |         102 | 2024-02-10 10:00:00 | 149.50 | Delivered  |
|        3 |         103 | 2024-03-05 16:45:00 |  75.00 | Processing |
|        4 |         104 | 2024-04-20 09:15:00 | 200.00 | Shipped    |
|        5 |         105 | 2024-05-12 14:00:00 |  49.99 | Delivered  |
+----------+-------------+---------------------+--------+------------+
5 rows in set (0.01 sec)

Incremental synchronization task

After the task is established, perform data change operations on the upstream MatrixOne

INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(6, 106, '2024-10-29 12:00:00', 150.00, 'New');
DELETE FROM source_db.orders WHERE order_id = 6;
UPDATE source_db.orders SET status = 'Delivered' WHERE order_id = 4;

mysql> select * from source_db.orders;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date          | amount | status     |
+----------+-------------+---------------------+--------+------------+
|        4 |         104 | 2024-04-20 09:15:00 | 200.00 | Delivered  |
|        1 |         101 | 2024-01-15 14:30:00 |  99.99 | Shipped    |
|        2 |         102 | 2024-02-10 10:00:00 | 149.50 | Delivered  |
|        3 |         103 | 2024-03-05 16:45:00 |  75.00 | Processing |
|        5 |         105 | 2024-05-12 14:00:00 |  49.99 | Delivered  |
+----------+-------------+---------------------+--------+------------+
5 rows in set (0.00 sec)

Connect to downstream mysql to view incremental data synchronization status

mysql> select * from analytics_db.orders_backup;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date          | amount | status     |
+----------+-------------+---------------------+--------+------------+
|        1 |         101 | 2024-01-15 14:30:00 |  99.99 | Shipped    |
|        2 |         102 | 2024-02-10 10:00:00 | 149.50 | Delivered  |
|        3 |         103 | 2024-03-05 16:45:00 |  75.00 | Processing |
|        4 |         104 | 2024-04-20 09:15:00 | 200.00 | Delivered  |
|        5 |         105 | 2024-05-12 14:00:00 |  49.99 | Delivered  |
+----------+-------------+---------------------+--------+------------+
5 rows in set (0.00 sec)

Resume upload from breakpoint

Now the mission is interrupted due to an accident.

> ./mo_cdc task pause \
       --task-name "task1" \
       --source-uri "mysql://root:111@127.0.0.1:6001"

During the task interruption, data continues to be inserted into the upstream MatrixOne.

INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(11, 111, '2024-06-15 08:30:00', 250.75, 'Processing');
INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(12, 112, '2024-07-22 15:45:00', 399.99, 'Shipped');
INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(13, 113, '2024-08-30 10:20:00', 599.99, 'Delivered');

mysql> select * from source_db.orders;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date          | amount | status     |
+----------+-------------+---------------------+--------+------------+
|        1 |         101 | 2024-01-15 14:30:00 |  99.99 | Shipped    |
|        2 |         102 | 2024-02-10 10:00:00 | 149.50 | Delivered  |
|        3 |         103 | 2024-03-05 16:45:00 |  75.00 | Processing |
|        4 |         104 | 2024-04-20 09:15:00 | 200.00 | Delivered  |
|        5 |         105 | 2024-05-12 14:00:00 |  49.99 | Delivered  |
|       11 |         111 | 2024-06-15 08:30:00 | 250.75 | Processing |
|       12 |         112 | 2024-07-22 15:45:00 | 399.99 | Shipped    |
|       13 |         113 | 2024-08-30 10:20:00 | 599.99 | Delivered  |
+----------+-------------+---------------------+--------+------------+
8 rows in set (0.01 sec)

Manual recovery tasks.

> ./mo_cdc task resume \
       --task-name "task1" \
       --source-uri "mysql://root:111@127.0.0.1:6001"

Connect to the downstream mysql to check the resumed transmission status.

mysql> select * from analytics_db.orders_backup;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date          | amount | status     |
+----------+-------------+---------------------+--------+------------+
|        1 |         101 | 2024-01-15 14:30:00 |  99.99 | Shipped    |
|        2 |         102 | 2024-02-10 10:00:00 | 149.50 | Delivered  |
|        3 |         103 | 2024-03-05 16:45:00 |  75.00 | Processing |
|        4 |         104 | 2024-04-20 09:15:00 | 200.00 | Delivered  |
|        5 |         105 | 2024-05-12 14:00:00 |  49.99 | Delivered  |
|       11 |         111 | 2024-06-15 08:30:00 | 250.75 | Processing |
|       12 |         112 | 2024-07-22 15:45:00 | 399.99 | Shipped    |
|       13 |         113 | 2024-08-30 10:20:00 | 599.99 | Delivered  |
+----------+-------------+---------------------+--------+------------+
8 rows in set (0.00 sec)

Apply effects

Through this solution, retail companies can synchronize order data to the analysis database in real time to implement application scenarios such as order statistics, sales trend analysis, and customer behavior insights to support business decisions. At the same time, breakpoint resumption ensures data consistency when network delays or task interruptions occur, so that the data analysis system always maintains an accurate and reliable data source.