Hive 的正则应用,用会了没?
Posted 有关SQL
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Hive 的正则应用,用会了没?相关的知识,希望对你有一定的参考价值。
环境:
Hive: 2.7.7
Oracle SQL Developer
Cloudera JDBC Driver
案例 - 1 : regexp_replace(s,regex,replacement)
select regexp_replace(name,'ali','alibaba') as clientName
from default.employee ;
select regexp_replace(name,'[ali]','alibaba') as clientName
from default.employee ;
结果:
将 name 等于 'ali' 替换成 'alibaba', 但不更改表级别的数据
注意:正则表达式一定是符合 Java 正则规范的,因为 Hive 是 Java 写的嘛
案例 - 2 : regexp_extract(subject,regex_pattern,index)
select regexp_extract(name,'[ali]([a-z])*',0)
,regexp_extract(name,'[ali]([a-z])*',1)
from default.employee ;
结果:
注意:index 的选择很重要。它是extract group 的位置
案例 - 3 : where name RLike '.(a|l|i)'
select * from default.employee
where name RLike '.*(a|l|i)+'
数据部署:
insert into default.employee (name,salary,subordinates,deductions,address)
select 'liton'
,345
,array('liton','acai','ayun') as sub
, map('liton',1,'acai',2,'ayun',7) as ded
,named_struct('street',"zhejiang", 'city',"hangzhou", 'state',"hubin", 'zip',"201210")
from student
limit 1
insert into default.employee (name,salary,subordinates,deductions,address)
select 'tencent'
,543
,array('tencent','acai','ayun') as sub
, map('tencent',1,'acai',2,'ayun',7) as ded
,named_struct('street',"zhejiang", 'city',"hangzhou", 'state',"hubin", 'zip',"201210")
from student
limit 1
结果:
往期回顾:
以上是关于Hive 的正则应用,用会了没?的主要内容,如果未能解决你的问题,请参考以下文章