Flink CDC in Practice

Posted by 宝哥大数据

This article is based on Flink 1.12.

I. Hands-on

1.1 Create the tables in MySQL

-- MySQL
CREATE DATABASE mydb;
USE mydb;
CREATE TABLE products (
  id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  description VARCHAR(512)
);
ALTER TABLE products AUTO_INCREMENT = 101;

INSERT INTO products
VALUES (default,"scooter","Small 2-wheel scooter"),
       (default,"car battery","12V car battery"),
       (default,"12-pack drill bits","12-pack of drill bits with sizes ranging from #40 to #3"),
       (default,"hammer","12oz carpenter's hammer"),
       (default,"hammer","14oz carpenter's hammer"),
       (default,"hammer","16oz carpenter's hammer"),
       (default,"rocks","box of assorted rocks"),
       (default,"jacket","water resistent black wind breaker"),
       (default,"spare tire","24 inch spare tire");

CREATE TABLE orders (
  order_id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
  order_date DATETIME NOT NULL,
  customer_name VARCHAR(255) NOT NULL,
  price DECIMAL(10, 5) NOT NULL,
  product_id INTEGER NOT NULL,
  order_status BOOLEAN NOT NULL -- whether the order has been placed
) AUTO_INCREMENT = 10001;

INSERT INTO orders
VALUES (default, '2020-07-30 10:08:22', 'Jark', 50.50, 102, false),
       (default, '2020-07-30 10:11:09', 'Sally', 15.00, 105, false),
       (default, '2020-07-30 12:00:30', 'Edward', 25.25, 106, false);


CREATE TABLE shipments (
  shipment_id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
  order_id INTEGER NOT NULL,
  origin VARCHAR(255) NOT NULL,
  destination VARCHAR(255) NOT NULL,
  is_arrived BOOLEAN NOT NULL
);

ALTER TABLE shipments AUTO_INCREMENT = 101;

INSERT INTO shipments
VALUES (default,10001,'Beijing','Shanghai',false),
       (default,10002,'Hangzhou','Shanghai',false),
       (default,10003,'Shanghai','Hangzhou',false);

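1.2 JDBC Connector

1.2.0 Download the following jar into <FLINK_HOME>/lib/:

flink-sql-connector-mysql-cdc-1.0.0.jar

Note: the JDBC tables in 1.2.1 also need the Flink JDBC connector jar and a MySQL JDBC driver in <FLINK_HOME>/lib/; neither ships with the Flink binary distribution.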

1.2.1 Create the tables in Flink SQL

Start the Flink cluster, then start the SQL CLI.

-- Flink SQL
CREATE TABLE products (
  id INT,
  name STRING,
  description STRING
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/mydb',
  'username' = 'root',
  'password' = '123456',
  'table-name' = 'products'
);




CREATE TABLE orders (
  order_id INT,
  order_date TIMESTAMP(0),
  customer_name STRING,
  price DECIMAL(10, 5),
  product_id INT,
  order_status BOOLEAN
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/mydb',
  'username' = 'root',
  'password' = '123456',
  'table-name' = 'orders'
);

CREATE TABLE shipments (
  shipment_id INT,
  order_id INT,
  origin STRING,
  destination STRING,
  is_arrived BOOLEAN
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/mydb',
  'username' = 'root',
  'password' = '123456',
  'table-name' = 'shipments'
);

1.2.2 Test

1. Multi-table join

SELECT o.*, p.name, p.description, s.shipment_id, s.origin, s.destination, s.is_arrived
FROM orders AS o
LEFT JOIN products AS p ON o.product_id = p.id
LEFT JOIN shipments AS s ON o.order_id = s.order_id;

2. Insert new rows into the orders and shipments tables

-- MySQL
INSERT INTO orders
VALUES (default, '2020-07-30 15:22:00', 'Jark', 29.71, 104, false);

-- MySQL
INSERT INTO shipments
VALUES (default,10004,'Shanghai','Beijing',false);

3. Update the order status

-- MySQL
UPDATE orders SET order_status = true WHERE order_id = 10004;

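The result of the multi-table join does not change. The Flink SQL JDBC connector reads each table as a one-time bounded scan, so changes made in MySQL afterwards are not reflected in a result that has already been computed.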
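One way to confirm this is to submit a query again in the SQL CLI after the UPDATE. This is a sketch that assumes each JDBC query performs a fresh bounded scan of the MySQL table: a newly submitted query should see the new status, while a result computed earlier stays as it was.

```sql
-- Flink SQL (JDBC tables): re-run after the UPDATE in MySQL.
-- Each submission scans the current contents of the table once and then finishes.
SELECT order_id, customer_name, order_status
FROM orders
WHERE order_id = 10004;
```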

1.3 Flink CDC

To capture database changes in real time and solve the problem above, use the mysql-cdc connector.
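The mysql-cdc connector reads changes from the MySQL binary log, so the binlog has to be enabled in row-based format on the server. A quick check, as a sketch (the values in the comments are what a CDC-ready server is expected to report, not output captured from this setup):

```sql
-- MySQL: check that the binary log is enabled and row-based
SHOW VARIABLES LIKE 'log_bin';          -- should be ON
SHOW VARIABLES LIKE 'binlog_format';    -- should be ROW
SHOW VARIABLES LIKE 'binlog_row_image'; -- FULL is generally recommended
```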

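1.3.1 Create the tables in the Flink SQL CLI

-- Flink SQL (if the JDBC tables from 1.2.1 still exist in this session, DROP them first)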
CREATE TABLE products (
  id INT,
  name STRING,
  description STRING
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'root',
  'password' = '123456',
  'database-name' = 'mydb',
  'table-name' = 'products'
);

CREATE TABLE orders (
  order_id INT,
  order_date TIMESTAMP(0),
  customer_name STRING,
  price DECIMAL(10, 5),
  product_id INT,
  order_status BOOLEAN
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'root',
  'password' = '123456',
  'database-name' = 'mydb',
  'table-name' = 'orders'
);

CREATE TABLE shipments (
  shipment_id INT,
  order_id INT,
  origin STRING,
  destination STRING,
  is_arrived BOOLEAN
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'root',
  'password' = '123456',
  'database-name' = 'mydb',
  'table-name' = 'shipments'
);
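With the CDC tables defined, the same join used in the JDBC test can be submitted again. The sketch below assumes the mysql-cdc tables above were created successfully in the current SQL CLI session. The UPDATE that follows is a hypothetical follow-up change for illustration, not a step from the original walkthrough.

```sql
-- Flink SQL: continuous query over the CDC tables
SELECT o.*, p.name, p.description, s.shipment_id, s.origin, s.destination, s.is_arrived
FROM orders AS o
LEFT JOIN products AS p ON o.product_id = p.id
LEFT JOIN shipments AS s ON o.order_id = s.order_id;

-- MySQL: run in a separate MySQL client while the query above is still running
UPDATE shipments SET is_arrived = true WHERE shipment_id = 101;
```

Because the source is a changelog rather than a one-time scan, the running result should reflect the change shortly after it is committed in MySQL.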

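II. Problems

1. [ERROR] Could not execute SQL statement. Reason: java.lang.ClassNotFoundException: com.alibaba.ververica.cdc.debezium.DebeziumSourceFunction
Cause: a version compatibility problem between Flink and the mysql-cdc connector. Use a connector jar built for the Flink version in use (check the connector project's compatibility notes), then restart the cluster and the SQL CLI so that the new jar is loaded.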
