Impala Basics: Part 1
Posted by 桃桃琪的学习日常
Introduction to Impala:
Table: table_a1
user_id | age | amount | interest |
zhangsan | 23 | 400 | 10 |
lisi | 28 | 2500 | 40 |
wangwu | 18 | 100 | 4 |
wangwu | 18 | 60 | 1 |
Table: table_a2
user_id | log_in | id |
zhangsan | 1 | 上海 |
lisi | 3 | 北京 |
liuer | 5 | 广州 |
SELECT <attributes>
FROM <one or more relations>
WHERE <conditions>
1. The SELECT statement
# Query specific columns and display the column names
select user_id,age
from table_a1 as a;
user_id age
zhangsan 23
lisi 28
wangwu 18
wangwu 18
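# Expressions, arithmetic operators (+, -, *, /) and functions can be used in the select list
# upper() converts all letters to upper case, lower() converts all letters to lower case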
select user_id,(amount+interest) as '金额'
from table_a1;
user_id 金额
zhangsan 410
lisi 2540
wangwu 104
wangwu 61
select upper(user_id) as '名字'
from table_a1;
名字
ZHANGSAN
LISI
WANGWU
WANGWU
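The two queries above can be tried without a cluster. The sketch below is only an approximation: it assumes Python's built-in sqlite3 module as a stand-in for Impala and loads the same four rows as table_a1. The ASCII aliases total and name, and the ORDER BY clauses that make the output deterministic, are additions for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; the rows mirror table_a1.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_a1 (user_id text, age int, amount int, interest int)")
conn.executemany(
    "insert into table_a1 values (?, ?, ?, ?)",
    [("zhangsan", 23, 400, 10), ("lisi", 28, 2500, 40),
     ("wangwu", 18, 100, 4), ("wangwu", 18, 60, 1)],
)

# Arithmetic on two columns, exposed under an alias.
totals = conn.execute(
    "select user_id, amount + interest as total "
    "from table_a1 order by total desc"
).fetchall()
print(totals)

# A scalar function applied to every row.
names = conn.execute(
    "select upper(user_id) as name from table_a1 order by name"
).fetchall()
print(names)
```

The sums match the result listed above (410, 2540, 104, 61), only in a different order because of the added ORDER BY.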
# Use WHERE to restrict the rows a query returns
select user_id,age,amount
from table_a1
where amount >100;
user_id age amount
zhangsan 23 400
lisi 28 2500
# Common operators in a WHERE condition: =, !=, <>, like, between, and/or
# BETWEEN is inclusive at both ends
select user_id,amount
from table_a1
where interest between 1 and 5 or amount = 2500;
user_id amount
lisi 2500
wangwu 100
wangwu 60
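To check that BETWEEN really includes both endpoints, the filter can be run against a local copy of the data. This is a minimal sketch that assumes SQLite (through Python's sqlite3 module) behaves like Impala for these standard operators; the ORDER BY is added only to fix the output order.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; the rows mirror table_a1.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_a1 (user_id text, age int, amount int, interest int)")
conn.executemany(
    "insert into table_a1 values (?, ?, ?, ?)",
    [("zhangsan", 23, 400, 10), ("lisi", 28, 2500, 40),
     ("wangwu", 18, 100, 4), ("wangwu", 18, 60, 1)],
)

# The same filter as above: either condition is enough for a row to pass.
filtered = conn.execute(
    "select user_id, amount from table_a1 "
    "where interest between 1 and 5 or amount = 2500 "
    "order by amount desc"
).fetchall()
print(filtered)

# Both endpoints count: interest = 1 and interest = 4 are both matched.
inclusive_count = conn.execute(
    "select count(*) from table_a1 where interest between 1 and 4"
).fetchone()[0]
print(inclusive_count)
```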
# GROUP BY
# sum() total, count() row count, max() maximum, min() minimum, avg() average
select user_id,sum(amount) as '金额'
from table_a1
group by user_id;
-- the two wangwu rows are summed into one
user_id 金额
zhangsan 400
lisi 2500
wangwu 160
select user_id,max(interest) as interest_max
from table_a1
group by 1;
user_id interest_max
zhangsan 10
lisi 40
wangwu 4
# HAVING: filters directly on the result of an aggregate function
select user_id,sum(amount) as '金额'
from table_a1
group by 1
having sum(amount) >=400;
user_id 金额
zhangsan 400
lisi 2500
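The difference between HAVING and WHERE is easy to miss: WHERE filters rows before they are grouped, while HAVING filters the groups after aggregation. The sketch below illustrates this on the same data. It assumes SQLite via Python's sqlite3 module as a stand-in for Impala, and the alias total is made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; the rows mirror table_a1.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_a1 (user_id text, age int, amount int, interest int)")
conn.executemany(
    "insert into table_a1 values (?, ?, ?, ?)",
    [("zhangsan", 23, 400, 10), ("lisi", 28, 2500, 40),
     ("wangwu", 18, 100, 4), ("wangwu", 18, 60, 1)],
)

# HAVING sees the per-user sums: wangwu (100 + 60 = 160) is dropped.
having_rows = conn.execute(
    "select user_id, sum(amount) as total from table_a1 "
    "group by user_id having sum(amount) >= 400 order by total"
).fetchall()
print(having_rows)

# WHERE sees individual rows: only wangwu's 60 row is dropped before summing.
where_rows = conn.execute(
    "select user_id, sum(amount) as total from table_a1 "
    "where amount >= 100 group by user_id order by total"
).fetchall()
print(where_rows)
```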
# Rows are stored in no particular order; ORDER BY sorts ascending (asc) by default, desc sorts descending
select user_id,age
from table_a1
order by age desc;
user_id age
lisi 28
zhangsan 23
wangwu 18
wangwu 18
# DISTINCT: removes duplicate rows, comparing all selected columns together (not only the first one)
select distinct user_id,age
from table_a1;
user_id age
lisi 28
zhangsan 23
wangwu 18
# Grouping by the same columns with GROUP BY deduplicates in the same way
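How DISTINCT treats several columns can be verified directly. The following sketch assumes SQLite via Python's sqlite3 module as a stand-in for Impala. It compares DISTINCT over two columns, DISTINCT over three columns, and the equivalent GROUP BY.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; the rows mirror table_a1.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_a1 (user_id text, age int, amount int, interest int)")
conn.executemany(
    "insert into table_a1 values (?, ?, ?, ?)",
    [("zhangsan", 23, 400, 10), ("lisi", 28, 2500, 40),
     ("wangwu", 18, 100, 4), ("wangwu", 18, 60, 1)],
)

# The two wangwu rows agree on (user_id, age), so they collapse into one.
distinct_pairs = conn.execute(
    "select distinct user_id, age from table_a1 order by age desc"
).fetchall()
print(distinct_pairs)

# Adding amount makes the wangwu rows differ, so nothing is removed.
distinct_triples = conn.execute(
    "select distinct user_id, age, amount from table_a1"
).fetchall()
print(len(distinct_triples))

# GROUP BY on the same columns gives the same rows as DISTINCT.
grouped = conn.execute(
    "select user_id, age from table_a1 group by user_id, age order by age desc"
).fetchall()
print(grouped)
```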
6. Subqueries
# Subquery in the FROM clause
select distinct user_id
from
(select user_id,age
from table_a1
where amount >0) as a -- a subquery in FROM must be given an alias
where age >20;
user_id
zhangsan
lisi
# Subquery in the WHERE clause
select distinct user_id
from table_a1
where user_id not in
(select user_id
from table_a1
where amount>2000);
user_id
zhangsan
wangwu
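Both subquery forms can be run locally as a sanity check. This is a sketch under the assumption that SQLite (via Python's sqlite3 module) is an acceptable stand-in for Impala here. The ORDER BY clauses are added only so the output order is fixed.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; the rows mirror table_a1.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_a1 (user_id text, age int, amount int, interest int)")
conn.executemany(
    "insert into table_a1 values (?, ?, ?, ?)",
    [("zhangsan", 23, 400, 10), ("lisi", 28, 2500, 40),
     ("wangwu", 18, 100, 4), ("wangwu", 18, 60, 1)],
)

# Subquery in FROM: the inner result is treated as a table named a.
from_subquery = conn.execute(
    "select distinct user_id from "
    "(select user_id, age from table_a1 where amount > 0) as a "
    "where age > 20 order by user_id"
).fetchall()
print(from_subquery)

# Subquery in WHERE: exclude every user that has an amount above 2000.
where_subquery = conn.execute(
    "select distinct user_id from table_a1 "
    "where user_id not in "
    "(select user_id from table_a1 where amount > 2000) "
    "order by user_id"
).fetchall()
print(where_subquery)
```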
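II. JOIN: combining tables
•INNER JOIN: a row appears in the result only when it has a match in both the left and the right table;
•LEFT OUTER JOIN: every row of the left table appears in the result; where the right table has no matching row, the columns taken from the right table are NULL;
•RIGHT OUTER JOIN: the mirror image of LEFT OUTER JOIN, keeping every row of the right table;
•FULL OUTER JOIN: every row of both tables appears in the result; where the other table has no matching row, its columns are NULL;
•ON: specifies the join condition; without an ON clause, or with a non-equi join condition, the join can degenerate into a Cartesian product, which should be avoided;
# INNER JOIN returns the intersection (only rows whose key exists in both tables)
select a.user_id,a.amount,b.id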
from table_a1 as a
inner join table_a2 as b
on a.user_id = b.user_id;
user_id amount id
zhangsan 400 上海
lisi 2500 北京
# LEFT JOIN: keep every row of the left table and attach the matching right-table columns
select a.user_id,a.amount,b.id
from table_a1 as a
left join table_a2 as b
on a.user_id = b.user_id;
user_id amount id
zhangsan 400 上海
lisi 2500 北京
wangwu 100 null
wangwu 60 null
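The join types are easiest to compare side by side on the two sample tables. The sketch below assumes SQLite via Python's sqlite3 module as a stand-in for Impala; NULL shows up as None in Python. The column list is qualified with the table aliases because user_id exists in both tables, and the final query shows the size of the Cartesian product that results when no join condition is given.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; rows mirror table_a1 and table_a2.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_a1 (user_id text, age int, amount int, interest int)")
conn.execute("create table table_a2 (user_id text, log_in int, id text)")
conn.executemany(
    "insert into table_a1 values (?, ?, ?, ?)",
    [("zhangsan", 23, 400, 10), ("lisi", 28, 2500, 40),
     ("wangwu", 18, 100, 4), ("wangwu", 18, 60, 1)],
)
conn.executemany(
    "insert into table_a2 values (?, ?, ?)",
    [("zhangsan", 1, "上海"), ("lisi", 3, "北京"), ("liuer", 5, "广州")],
)

query = (
    "select a.user_id, a.amount, b.id from table_a1 as a "
    "{kind} join table_a2 as b on a.user_id = b.user_id "
    "order by a.amount desc"
)

# Only users present in both tables survive an inner join.
inner_rows = conn.execute(query.format(kind="inner")).fetchall()
print(inner_rows)

# A left join keeps wangwu, with None (NULL) for the missing right side.
left_rows = conn.execute(query.format(kind="left")).fetchall()
print(left_rows)

# No join condition: every left row pairs with every right row (4 * 3).
cartesian_size = conn.execute(
    "select count(*) from table_a1 cross join table_a2"
).fetchone()[0]
print(cartesian_size)
```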
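Function | Purpose | Example |
substr() | Extract a substring | substr('2019-09-29',9,2) : 29 |
sum/avg()... | Aggregate functions | -- |
cast( as ) | Convert an expression to another type | cast('5' as int) : 5 (string to int) |
nvl() | Replace a NULL with a default value | nvl(null,0) : 0 (NULL becomes 0) |
isnull() | Replace a NULL with a default value | isnull(null,0) : 0 (NULL becomes 0) |
concat() | Concatenate strings | concat('x','x') : xx |
like | Pattern matching (an operator, not a function) | like '%xx%' : matches every value containing xx |
date_add() | Add days to a date | date_add(to_date(now()),1) : one day later |
date_sub() | Subtract days from a date | date_sub(to_date(now()),1) : one day earlier |
datediff() | Number of days between two dates | datediff('2019-10-10','2019-10-01') : 9 |
from_unixtime() | Convert seconds since the epoch to a date string | from_unixtime(1556108753,'yyyy-MM-dd') |
case when | Conditional branching | see below |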
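A few of these examples can be checked locally, with one important caveat: the sketch below assumes SQLite (via Python's sqlite3 module) as a stand-in, and SQLite does not provide nvl(), concat() or datediff() under those names. The calls ifnull(), the || operator and julianday() are swapped in as rough equivalents, so this only illustrates the behaviour and is not Impala syntax. The substr() call also shows that positions are counted from 1: the day part of '2019-09-29' starts at position 9, while position 8 is the hyphen.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; several functions are swapped
# for SQLite equivalents (ifnull for nvl, || for concat, julianday for datediff).
conn = sqlite3.connect(":memory:")

row = conn.execute(
    "select substr('2019-09-29', 9, 2),"    # day part, positions 9 and 10
    " substr('2019-09-29', 8, 2),"          # starts on the hyphen instead
    " cast('5' as integer),"                # string to integer
    " ifnull(null, 0),"                     # NULL replaced by a default
    " 'x' || 'x',"                          # string concatenation
    " julianday('2019-10-10') - julianday('2019-10-01')"  # days between dates
).fetchone()
print(row)
```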
# CASE WHEN
select user_id
,case when age <=18 then 1
when age >18 and age <=23 then 2
when age >23 then 3 else 0 end as amount_new
from table_a1;
user_id amount_new
zhangsan 2
lisi 3
wangwu 1
wangwu 1
# CASE WHEN combined with an aggregate function
select
sum(case when amount >500 then 1 else 0 end) as renshu
from table_a1;
-- counts the rows with amount > 500 (each matching row adds 1)
renshu
1
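One subtlety of the sum(case when ...) pattern is that it counts rows, not people. The sketch below, which assumes SQLite via Python's sqlite3 module as a stand-in for Impala, uses a lower threshold of 50 (chosen for illustration) so that both wangwu rows match. It compares the row count with a count of distinct users.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; the rows mirror table_a1.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_a1 (user_id text, age int, amount int, interest int)")
conn.executemany(
    "insert into table_a1 values (?, ?, ?, ?)",
    [("zhangsan", 23, 400, 10), ("lisi", 28, 2500, 40),
     ("wangwu", 18, 100, 4), ("wangwu", 18, 60, 1)],
)

# Bucket each row by age; the first matching WHEN branch wins.
age_groups = conn.execute(
    "select user_id, "
    "case when age <= 18 then 1 "
    "when age > 18 and age <= 23 then 2 "
    "when age > 23 then 3 else 0 end as age_group "
    "from table_a1 order by age_group"
).fetchall()
print(age_groups)

# sum(case ...) counts matching rows; count(distinct case ...) counts users.
matching_rows, matching_users = conn.execute(
    "select sum(case when amount > 50 then 1 else 0 end), "
    "count(distinct case when amount > 50 then user_id end) "
    "from table_a1"
).fetchone()
print(matching_rows, matching_users)
```

With the original threshold of 500 the two numbers coincide, because only one row matches.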
DDL: Data Definition Language
1. Common data types
Category | Data type | Example |
Numeric | TINYINT (1 byte) | 123 |
Numeric | INT (4 bytes) | 12324 |
Numeric | BIGINT (8 bytes) | 123465 |
Numeric | BOOLEAN | TRUE |
Numeric | DECIMAL(p, s) | 124.20 (typically decimal(38,2)) |
Numeric | DOUBLE | 124.2 |
String | STRING | 'abc' |
String | CHAR(n) | 'abc ' |
String | VARCHAR(n) | 'abc ' |
Date/time | TIMESTAMP | '2015-04-07 15:43:02.892' |
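2. Creating and dropping tables
# Drop a table (IF EXISTS avoids an error when the table does not exist)
drop table if exists test.wwq_191010_a1;
drop table test.wwq_191010_a1;
# Create a table from a query result (a SELECT statement must follow AS)
create table test.wwq_191010_a1 as
# Create an external table
create external table test.wwq_191010_a1
(userid string
,username string
) row format delimited fields terminated by ',' stored as textfile location '/external/wwq';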
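The drop and create-as-select statements can be rehearsed locally. This sketch assumes SQLite via Python's sqlite3 module as a stand-in for Impala. The table name wwq_demo and the SELECT used to fill it are made up for the example, and external tables are left out because SQLite has no equivalent.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; wwq_demo is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_a1 (user_id text, age int, amount int, interest int)")
conn.executemany(
    "insert into table_a1 values (?, ?, ?, ?)",
    [("zhangsan", 23, 400, 10), ("lisi", 28, 2500, 40),
     ("wangwu", 18, 100, 4), ("wangwu", 18, 60, 1)],
)

# With IF EXISTS, dropping a missing table is silently accepted.
conn.execute("drop table if exists wwq_demo")

# Without it, dropping a missing table raises an error.
try:
    conn.execute("drop table wwq_demo")
    plain_drop_failed = False
except sqlite3.OperationalError:
    plain_drop_failed = True
print(plain_drop_failed)

# CREATE TABLE ... AS needs a SELECT that supplies columns and rows.
conn.execute(
    "create table wwq_demo as "
    "select user_id, sum(amount) as total from table_a1 group by user_id"
)
created_rows = conn.execute(
    "select user_id, total from wwq_demo order by total"
).fetchall()
print(created_rows)
```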
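3. Inspecting tables
# Show the columns and their descriptions
describe test.wwq_191010_a1;
# List every table whose name contains wwq
show tables in test like '*wwq*';
# Show the table's partitions
show partitions test.wwq_191010_a1;
# Refresh metadata
invalidate metadata table_name;
refresh table_name;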