Hudi Basics -- Spark SQL DDL

Posted 2023-03-02 墨砚

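Key parameters for Spark CREATE TABLE:

Parameter	Description	Optional: default
primaryKey	Primary key field(s) of the table; separate the fields of a composite key with commas.	(Optional) : id
type	Table type, 'cow' (copy-on-write) or 'mor' (merge-on-read); defaults to cow.	(Optional) : cow
preCombineField	When records share the same primary key, this field decides whether the existing record for that key is updated. If it is not specified, the latest record is kept.	(Optional) : ts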

1. Create a COW table

-- create a non-primary key table
create table if not exists hudi_table2(
  id int,
  name string,
  price double
) using hudi
options (
  type = 'cow'
);

2. Create a COW table with id as the primary key

-- create a managed cow table
create table if not exists hudi_table0 (
  id int,
  name string,
  price double
) using hudi
options (
  type = 'cow',
  primaryKey = 'id'
);

3. Create a MOR table that is updated by primary key

create table if not exists hudi_table1 (
  id int,
  name string,
  price double,
  ts bigint
) using hudi
options (
  type = 'mor',
  primaryKey = 'id,name',
  preCombineField = 'ts'
);
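
As a rough illustration of how primaryKey and preCombineField come into play once the table exists, the sketch below writes one row into hudi_table1 and then updates it. The row values are made up for the example. How a second insert with the same key is handled (upsert or plain insert) depends on the Hudi version and write configuration, so only an explicit update is shown.

```sql
-- write one row; the values follow the column order id, name, price, ts
insert into hudi_table1 select 1, 'a1', 20, 1000;

-- update that row; ts (the preCombineField) is bumped together with price
update hudi_table1 set price = price * 2, ts = 1001 where id = 1;

-- read the row back
select id, name, price, ts from hudi_table1 where id = 1;
```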

4. Create a partitioned table

create table if not exists hudi_table_p0 (
  id bigint,
  name string,
  -- ts must exist as a column because preCombineField refers to it
  ts bigint,
  dt string,
  hh string
) using hudi
options (
  type = 'cow',
  primaryKey = 'id',
  preCombineField = 'ts'
)
partitioned by (dt, hh);
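
One possible way to populate a table partitioned by dt and hh is sketched below. The table hudi_table_p1 and all row values are hypothetical. The sketch also assumes a Hudi version that supports SHOW PARTITIONS. It shows both a dynamic-partition and a static-partition insert.

```sql
-- hypothetical table for this sketch, with a ts column backing preCombineField
create table if not exists hudi_table_p1 (
  id bigint,
  name string,
  ts bigint,
  dt string,
  hh string
) using hudi
options (
  type = 'cow',
  primaryKey = 'id',
  preCombineField = 'ts'
)
partitioned by (dt, hh);

-- dynamic partitioning: dt and hh are taken from the selected row
insert into hudi_table_p1 partition (dt, hh)
select 1 as id, 'a1' as name, 1000 as ts, '2021-12-09' as dt, '10' as hh;

-- static partitioning: dt and hh are fixed in the partition clause
insert into hudi_table_p1 partition (dt = '2021-12-09', hh = '11')
select 2, 'a2', 1000;

-- list the partitions that now exist
show partitions hudi_table_p1;
```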

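5. Create a table with CTAS (Create Table As Select)

Hudi supports creating tables with CTAS (Create Table As Select): the table is created and populated from the result of a query in a single statement.

5.1 Create a non-partitioned table with CTAS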

create table h3 using hudi
as
select 1 as id, 'a1' as name, 10 as price;

5.2 Create a partitioned table with a primary key using CTAS

create table h2 using hudi
options (type = 'cow', primaryKey = 'id')
partitioned by (dt)
as
select 1 as id, 'a1' as name, 10 as price, 1000 as dt;
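
CTAS can also carry a preCombineField, as long as the query produces that column. The following is only a sketch: the table name h4, the ts column, and the literal values are assumptions made for illustration.

```sql
-- hypothetical table h4: CTAS with a primary key, a preCombine field,
-- and a string-valued partition column
create table h4 using hudi
options (type = 'cow', primaryKey = 'id', preCombineField = 'ts')
partitioned by (dt)
as
select 1 as id, 'a1' as name, 10 as price, 1000 as ts, '2021-12-01' as dt;

-- check the result
select id, name, price, ts, dt from h4;
```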

For more DDL statements, see the "SQL DDL" page of the Apache Hudi documentation.
