❤️手撕这十道HiveSQL题还不能吊打面试官,却能保你不被吊打❤️推荐收藏
Posted Lansonli
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了❤️手撕这十道HiveSQL题还不能吊打面试官,却能保你不被吊打❤️推荐收藏相关的知识,希望对你有一定的参考价值。
全网最详细的大数据Hive文章系列,强烈建议收藏加关注!
新文章都已经列出历史文章目录,帮助大家回顾前面的知识重点。
目录
系列历史文章
2021年大数据Hive(五):Hive的内置函数(数学、字符串、日期、条件、转换、行转列)
2021年大数据Hive(三):手把手教你如何吃透Hive数据库和表操作(学会秒变数仓大佬)
2021年大数据Hive(二):Hive的三种安装模式和MySQL搭配使用
前言
2021大数据领域优质创作博客,带你从入门到精通,该博客每天更新,逐渐完善大数据各个知识体系的文章,帮助大家更高效学习。
下面是基础的十道HiveSQL题,必须都能手撕出来,不然面试得凉凉了,离吊打面试官你还差100道HiveSQL题,先别想这么多,给自己一个小目标,搞定这十道题!
HiveSQL十题
第一题
1、需求
userId
| visitDate | visitCount |
u01
|
2021/1/21
| 5 |
u02
| 2021/1/23 | 6 |
u03
| 2021/1/22 | 8 |
u04
| 2021/1/20 | 3 |
u01
| 2021/1/23 | 6 |
u01
| 2021/2/21 | 8 |
u02
| 2021/1/23 | 6 |
u01 | 2021/2/22 | 4 |
用户id | 月份 | 小计 | 累计 |
u01 | 2021-01 | 11 | 11 |
u01 | 2021-02 | 12 | 23 |
u02 | 2021-01 | 12 | 12 |
u03 | 2021-01 | 8 | 8 |
u04 | 2021-01 | 3 | 3 |
2、数据准备
CREATE TABLE test_sql.test1 ( userId string, visitDate string, visitCount INT ) ROW format delimited FIELDS TERMINATED BY "\\t";
INSERT INTO TABLE test_sql.test1
VALUES
( 'u01', '2021/1/21', 5 ),
( 'u02', '2021/1/23', 6 ),
( 'u03', '2021/1/22', 8 ),
( 'u04', '2021/1/20', 3 ),
( 'u01', '2021/1/23', 6 ),
( 'u01', '2021/2/21', 8 ),
( 'u02', '2021/1/23', 6 ),
( 'u01', '2021/2/22', 4 );
3、查询SQL
SELECT
t2.userid,
t2.visitmonth,
subtotal_visit_cnt,
sum( subtotal_visit_cnt ) over ( PARTITION BY userid ORDER BY visitmonth ) AS total_visit_cnt
FROM
(
SELECT
userid,
visitmonth,
sum( visitcount ) AS subtotal_visit_cnt
FROM
( SELECT userid, date_format( regexp_replace ( visitdate, '/', '-' ), 'yyyy-MM' ) AS visitmonth, visitcount FROM test_sql.test1 ) t1
GROUP BY
userid,
visitmonth
) t2
ORDER BY
t2.userid,
t2.visitmonth;
4、执行结果
第二题
1、需求
user_id | shop |
u1 | a |
u2 | b |
u1 | b |
u1 | a |
u3 | c |
u4 | b |
u1 | a |
u2 | c |
u5 | b |
u4 | b |
u6 | c |
u2 | c |
u1 | b |
u2 | a |
u2 | a |
u3 | a |
u5 | a |
u5 | a |
u5 | a |
请统计:
2、数据准备
CREATE TABLE test_sql.test2 ( user_id string, shop string ) ROW format delimited FIELDS TERMINATED BY '\\t';
INSERT INTO TABLE test_sql.test2
VALUES
( 'u1', 'a' ),
( 'u2', 'b' ),
( 'u1', 'b' ),
( 'u1', 'a' ),
( 'u3', 'c' ),
( 'u4', 'b' ),
( 'u1', 'a' ),
( 'u2', 'c' ),
( 'u5', 'b' ),
( 'u4', 'b' ),
( 'u6', 'c' ),
( 'u2', 'c' ),
( 'u1', 'b' ),
( 'u2', 'a' ),
( 'u2', 'a' ),
( 'u3', 'a' ),
( 'u5', 'a' ),
( 'u5', 'a' ),
( 'u5', 'a' );
3、查询SQL实现
# 每个店铺的UV(访客数)
SELECT shop,count(DISTINCT user_id) FROM test_sql.test2 GROUP BY shop
方式2:
#每个店铺的UV(访客数)
SELECT
t.shop,
count(*)
FROM
( SELECT user_id, shop FROM test_sql.test2 GROUP BY user_id, shop ) t
GROUP BY
t.shop;
(2)
#每个店铺访问次数top3的访客信息。输出店铺名称、访客id、访问次数
SELECT
t2.shop,
t2.user_id,
t2.cnt
FROM
(
SELECT
t1.*,
row_number() over ( PARTITION BY t1.shop ORDER BY t1.cnt DESC ) rank
FROM
( SELECT user_id, shop, count(*) AS cnt FROM test_sql.test2 GROUP BY user_id, shop ) t1
) t2
WHERE
rank <= 3;
4、执行结果
(1)
(2)
第三题
1、需求
2、数据准备
CREATE TABLE test_sql.test3 ( dt string, order_id string, user_id string, amount DECIMAL ( 10, 2 ) ) ROW format delimited FIELDS TERMINATED BY '\\t';
INSERT INTO TABLE test_sql.test3
VALUES
( '2021-01-01', '10029028', '1000003251', 33.57 );
INSERT INTO TABLE test_sql.test3
VALUES
( '2021-01-01', '10029029', '1000003251', 33.57 );
INSERT INTO TABLE test_sql.test3
VALUES
( '2021-01-01', '100290288', '1000003252', 33.57 );
INSERT INTO TABLE test_sql.test3
VALUES
( '2021-02-02', '10029088', '1000003251', 33.57 );
INSERT INTO TABLE test_sql.test3
VALUES
( '2021-02-02', '100290281', '1000003251', 33.57 );
INSERT INTO TABLE test_sql.test3
VALUES
( '2021-02-02', '100290282', '1000003253', 33.57 );
INSERT INTO TABLE test_sql.test3
VALUES
( '2021-11-02', '10290282', '100003253', 234 );
INSERT INTO TABLE test_sql.test3
VALUES
( '2018-11-02', '10290284', '100003243', 234 );
3、查询SQL
SELECT
t1.mon,
count( t1.order_id ) AS order_cnt,
count( DISTINCT t1.user_id ) AS user_cnt,
sum( amount ) AS total_amount
FROM
(
SELECT
order_id,
user_id,
amount,
date_format( dt, 'yyyy-MM' ) mon
FROM
test_sql.test3
WHERE
date_format( dt, 'yyyy' ) = '2021'
) t1
GROUP BY
t1.mon;
SELECT
count( user_id )
FROM
test_sql.test3
GROUP BY
user_id
HAVING
date_format( min( dt ), 'yyyy-MM' )= '2021-11';
4、 执行结果
(1)
(2)
第四题
1、需求
2、数据准备
CREATE TABLE test_sql.test4user ( user_id string, NAME string, age INT );
CREATE TABLE test_sql.test4log ( user_id string, url string );
INSERT INTO TABLE test_sql.test4user
VALUES
( '001', 'u1', 10 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '002', 'u2', 15 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '003', 'u3', 15 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '004', 'u4', 20 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '005', 'u5', 25 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '006', 'u6', 35 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '007', 'u7', 40 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '008', 'u8', 45 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '009', 'u9', 50 );
INSERT INTO TABLE test_sql.test4user
VALUES
( '0010', 'u10', 65 );
INSERT INTO TABLE test_sql.test4log
VALUES
( '001', 'url1' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '002', 'url1' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '003', 'url2' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '004', 'url3' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '005', 'url3' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '006', 'url1' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '007', 'url5' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '008', 'url7' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '009', 'url5' );
INSERT INTO TABLE test_sql.test4log
VALUES
( '0010', 'url1' );
3、查询SQL
方式1
SELECT
t2.age_phase,
sum( t1.cnt ) AS view_cnt
FROM
( SELECT user_id, count(*) cnt FROM test_sql.test4log GROUP BY user_id ) t1
JOIN (
SELECT
user_id,
CASE
WHEN age <= 10 AND age > 0 THEN
'0-10'
WHEN age <= 20 AND age > 10 THEN
'10-20'
WHEN age > 20
AND age <= 30 THEN '20-30' WHEN age > 30
AND age <= 40 THEN '30-40' WHEN age > 40
AND age <= 50 THEN '40-50' WHEN age > 50
AND age <= 60 THEN '50-60' WHEN age > 60
AND age <= 70 THEN
'60-70' ELSE '70以上'
END AS age_phase
FROM
test_sql.test4user
) t2 ON t1.user_id = t2.user_id
GROUP BY
t2.age_phase;
方式2
SELECT
concat( phase - 10, '-', phase ),
sum( cnt ) sum_movies
FROM
(
SELECT
*,
ceil( age / 10 ) * 10 phase
FROM
test4user a
JOIN ( SELECT user_id, count( url ) cnt FROM test4log GROUP BY user_id ) b ON a.user_id = b.user_id
) c
GROUP BY
c.phase;
4、执行结果
方式1
方式2
第五题
1、需求
日期 用户 年龄
2019-02-11,test_1,23
2019-02-11,test_2,19
2019-02-11,test_3,39
2019-02-11,test_1,23
2019-02-11,test_3,39
2019-02-11,test_1,23
2019-02-12,test_2,19
2019-02-13,test_1,23
2019-02-15,test_2,19
2019-02-16,test_2,19
|
2、数据准备
CREATE TABLE test5 ( dt string, user_id string, age INT ) ROW format delimited FIELDS TERMINATED BY ',';
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-11', 'test_1', 23 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-11', 'test_2', 19 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-11', 'test_3', 39 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-11', 'test_1', 23 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-11', 'test_3', 39 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-11', 'test_1', 23 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-12', 'test_2', 19 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-13', 'test_1', 23 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-15', 'test_2', 19 );
INSERT INTO TABLE test_sql.test5
VALUES
( '2019-02-16', 'test_2', 19 );
3、查询SQL
方式1
SELECT
sum( total_user_cnt ) total_user_cnt,
sum( total_user_avg_age ) total_user_avg_age,
sum( two_days_cnt ) two_days_cnt,
sum( avg_age ) avg_age
FROM
(
SELECT
0 total_user_cnt,
0 total_user_avg_age,
count(*) AS two_days_cnt,
cast(
sum( age ) / count(*) AS DECIMAL ( 5, 2 )) AS avg_age
FROM
(
SELECT
user_id,
max( age ) age
FROM
(
SELECT
user_id,
max( age ) age
FROM
(
SELECT
user_id,
age,
date_sub( dt, rank ) flag
FROM
(
SELECT
dt,
user_id,
max( age ) age,
row_number() over ( PARTITION BY user_id ORDER BY dt ) rank
FROM
test_sql.test5
GROUP BY
dt,
user_id
) t1
) t2
GROUP BY
user_id,
flag
HAVING
count(*) >= 2
) t3
GROUP BY
user_id
) t4 UNION ALL
SELECT
count(*) total_user_cnt,
cast(
sum( age ) / count(*) AS DECIMAL ( 5, 2 )) total_user_avg_age,
0 two_days_cnt,
0 avg_age
FROM
( SELECT user_id, max( age ) age FROM test_sql.test5 GROUP BY user_id ) t5
) t6;
方式2
SELECT
*
FROM
(
SELECT
count( user_id ) total_cnt_users,
avg( age ) total_avg_age
FROM
( SELECT user_id, max( age ) age FROM test5 GROUP BY user_id ) g
) h
CROSS JOIN (
SELECT
count( user_id ) hot_users_count,
avg( age ) hot_age_avg
FROM
(
SELECT
d.user_id,
max( d.age ) age
FROM
(
SELECT
user_id,
max( age ) age,
count( 1 ) cnt
FROM
(
SELECT
*,
date_sub( dt, rank ) dt2
FROM
(
SELECT
user_id,
dt,
max( age ) age,
ROW_NUMBER() over ( PARTITION BY a.user_id ORDER BY a.dt ) rank
FROM
( SELECT DISTINCT dt, age, user_id FROM test5 ) a
GROUP BY
a.user_id,
a.dt
) b
) c
GROUP BY
c.user_id,
c.dt2
HAVING
cnt > 1
) d
GROUP BY
d.user_id
) e
) f ON 1 = 1;
4、执行结果
方式1
方式2
第六题
1、需求
2、数据准备
CREATE TABLE test_sql.test6 ( userid string, money DECIMAL ( 10, 2 ), paymenttime string, orderid string );
INSERT INTO TABLE test_sql.test6
VALUES
( '001', 100, '2021-10-01', '123' );
INSERT INTO TABLE test_sql.test6
VALUES
( '001', 200, '2021-10-02', '124' );
INSERT INTO TABLE test_sql.test6
VALUES
( '002', 500, '2021-10-01', '125' );
INSERT INTO TABLE test_sql.test6
VALUES
( '001', 100, '2021-11-01', '126' );
3、查询SQL
SELECT
userid,
paymenttime,
money,
orderid
FROM
(
SELECT
userid,
money,
paymenttime,
orderid,
row_number() over ( PARTITION BY userid ORDER BY paymenttime ) rank
FROM
test_sql.test6
WHERE
date_format( paymenttime, 'yyyy-MM' ) = '2021-10'
) t
WHERE
rank = 1;
4、执行结果
第七题
1、需求
序号 | 字段名称 | 字段描述 | 字段类型 |
1 | BOOK_ID | 总编号 | 文本 |
2 | SORT | 分类号 | 文本 |
3 | BOOK_NAME | 书名 | 文本 |
4 | WRITER | 作者 | 文本 |
5 | OUTPUT | 出版单位 | 文本 |
6 | PRICE | 单位 | 数值(保留小数点后2位) |
读者(数据表名:READER)
序号 | 字段名称 | 字段描述 | 字段类型 |
1 | READER_ID | 借书证号 | 文本 |
2 | COMPANY | 单位 | 文本 |
3 | NAME | 姓名 | 文本 |
4 | SEX | 性别 | 文本 |
5 | GRADE | 职称 | 文本 |
6 | ADDR | 地址 | 文本 |
序号 | 字段名称 | 字段描述 | 字段类型 |
1 | READER_ID | 借书证号 | 文本 |
2 | BOOK_ID | 总编号 | 文本 |
3 | BORROW_DATE | 借书日期 | 日期 |
2、数据准备
CREATE TABLE test_sql.book (
book_id string,
`SORT` string,
book_name string,
writer string,
OUTPUT string,
price DECIMAL ( 10, 2 ));
INSERT INTO TABLE test_sql.book
VALUES
( '001', 'TP391', '信息处理', 'author1', '机械工业出版社', '20' );
INSERT INTO TABLE test_sql.book
VALUES
( '002', 'TP392', '数据库', 'author12', '科学出版社', '15' );
INSERT INTO TABLE test_sql.book
VALUES
( '003', 'TP393', '计算机网络', 'author3', '机械工业出版社', '29' );
INSERT INTO TABLE test_sql.book
VALUES
( '004', 'TP399', '微机原理', 'author4', '科学出版社', '39' );
INSERT INTO TABLE test_sql.book
VALUES
( '005', 'C931', '管理信息系统', 'author5', '机械工业出版社', '40' );
INSERT INTO TABLE test_sql.book
VALUES
( '006', 'C932', '运筹学', 'author6', '科学出版社', '55' );
INSERT INTO TABLE test_sql.book
VALUES
( '007', 'C939', '大数据平台架构与原型实现', 'author7', '高等教育出版社', '66' );
(2)创建读者表reader
CREATE TABLE test_sql.reader ( reader_id string, company string, NAME string, sex string, grade string, addr string );
INSERT INTO TABLE test_sql.reader
VALUES
( '0001', '阿里巴巴', 'jack', '男', 'vp', 'addr1' );
INSERT INTO TABLE test_sql.reader
VALUES
( '0002', '百度', 'robin', '男', 'vp', 'addr2' );
INSERT INTO TABLE test_sql.reader
VALUES
( '0003', '腾讯', 'tony', '男', 'vp', 'addr3' );
INSERT INTO TABLE test_sql.reader
VALUES
( '0004', '京东', 'jasper', '男', 'cfo', 'addr4' );
INSERT INTO TABLE test_sql.reader
VALUES
( '0005', '网易', 'zhangsan', '女', 'ceo', 'addr5' );
INSERT INTO TABLE test_sql.reader
VALUES
( '0006', '搜狐', 'lisi', '女', 'ceo', 'addr6' );
INSERT INTO TABLE test_sql.reader
VALUES
( '0007', '美团', '李哥', '男', '大数据开发', 'addr7' );
(3)创建借阅记录表borrow_log
CREATE TABLE test_sql.borrow_log ( reader_id string, book_id string, borrow_date string );
INSERT INTO TABLE test_sql.borrow_log
VALUES
( '0001', '002', '2021-10-14' );
INSERT INTO TABLE test_sql.borrow_log
VALUES
( '0002', '001', '2021-10-13' );
INSERT INTO TABLE test_sql.borrow_log
VALUES
( '0003', '005', '2021-09-14' );
INSERT INTO TABLE test_sql.borrow_log
VALUES
( '0004', '006', '2021-08-15' );
INSERT INTO TABLE test_sql.borrow_log
VALUES
( '0005', '003', '2021-10-10' );
INSERT INTO TABLE test_sql.borrow_log
VALUES
( '0006', '004', '2021-12-13' );
INSERT INTO TABLE test_sql.borrow_log
VALUES
( '0007', '003', '2021-10-16' );
INSERT INTO TABLE test_sql.borrow_log
VALUES
( '0007', '008', '2021-10-16' );
3、查询SQL
(1)创建图书管理库的图书、读者和借阅三个基本表的表结构。请写出建表语句。
CREATE TABLE test_sql.book (
book_id string,
`SORT` string,
book_name string,
writer string,
OUTPUT string,
price DECIMAL ( 10, 2 ));
CREATE TABLE test_sql.reader ( reader_id string, company string, NAME string, sex string, grade string, addr string );
CREATE TABLE test_sql.borrow_log ( reader_id string, book_id string, borrow_date string );
(2) 找出姓李的读者姓名(NAME)和所在单位(COMPANY)。
SELECT name, company FROM test_sql.reader WHERE name LIKE '李%';
(3)查找“高等教育出版社”的所有图书名称(BOOK_NAME)及单价(PRICE),结果按单价降序排序。
SELECT
book_name,
price
FROM
test_sql.book
WHERE
OUTPUT = "高等教育出版社"
ORDER BY
price DESC;
(4) 查找价格介于10元和20元之间的图书种类(SORT)出版单位(OUTPUT)和单价(PRICE),结 果按出版单位(OUTPUT)和单价(PRICE)升序排序。
方式1
SELECT
sort,
output,
price
FROM
test_sql.book
WHERE
price >= 10
AND price <= 20
ORDER BY
output,
price;
方式2
SELECT
sort,
output,
price
FROM
book
WHERE
price BETWEEN 10
AND 20
ORDER BY
output ASC,
price ASC;
(5)查找所有借了书的读者的姓名(NAME)及所在单位(COMPANY)。
SELECT
b.NAME,
b.company
FROM
test_sql.borrow_log a
JOIN test_sql.reader b ON a.reader_id = b.reader_id;
(6)求”科学出版社”图书的最高单价、最低单价、平均单价。
SELECT
max( price ),
min( price ),
avg( price )
FROM
test_sql.book
WHERE
OUTPUT = '科学出版社';
(7)找出当前至少借阅了2本图书(大于等于2本)的读者姓名及其所在单位。
SELECT
b.NAME,
b.company
FROM
( SELECT reader_id FROM test_sql.borrow_log GROUP BY reader_id HAVING count(*) >= 2 ) a
JOIN test_sql.reader b ON a.reader_id = b.reader_id;
(8)考虑到数据安全的需要,需定时将“借阅记录”中数据进行备份,请使用一条SQL语句,在备份用 户bak下创建与“借阅记录”表结构完全一致的数据表BORROW_LOG_BAK.井且将“借阅记录”中现有 数据全部复制到BORROW_L0G_ BAK中。
CREATE TABLE test_sql.borrow_log_bak AS SELECT
*
FROM
test_sql.borrow_log;
(9)现在需要将原Oracle数据库中数据迁移至Hive仓库,请写出“图书”在Hive中的建表语句(Hive实现,提示:列分隔符|;数据表数据需要外部导入:分区分别以month_part、day_part 命名)
CREATE TABLE book_hive
( book_id string, SORT string, book_name string, writer string, OUTPUT string, price DECIMAL ( 10, 2 ) )
partitioned BY ( month_part string, day_part string )
ROW format delimited FIELDS TERMINATED BY '\\|' stored AS textfile;
(10)Hive中有表A,现在需要将表A的月分区 202106 中 user_id为20000的user_dinner字段更新为bonc8920,其他用户user_dinner字段数据不变,请列出更新的方法步骤。(Hive实现,提示:Hlive中无update语法,请通过其他办法进行数据更新)
方式1:配置hive支持事务操作,分桶表,orc存储格式方式2:第一步找到要更新的数据,将要更改的字段替换为新的值,第二步找到不需要更新的数据,第三步将上两步的数据插入一张新表中。
4、执行结果
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
第八题
1、需求
2、数据准备
CREATE TABLE test_sql.test8 ( `date` string, interface string, ip string );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 11:22:05', '/api/user/login', '110.23.5.23' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 11:23:10', '/api/user/detail', '57.3.2.16' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 23:59:40', '/api/user/login', '200.6.5.166' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 11:14:23', '/api/user/login', '136.79.47.70' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 11:15:23', '/api/user/detail', '94.144.143.141' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 11:16:23', '/api/user/login', '197.161.8.206' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 12:14:23', '/api/user/detail', '240.227.107.145' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 13:14:23', '/api/user/login', '79.130.122.205' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 14:14:23', '/api/user/detail', '65.228.251.189' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 14:15:23', '/api/user/detail', '245.23.122.44' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 14:17:23', '/api/user/detail', '22.74.142.137' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 14:19:23', '/api/user/detail', '54.93.212.87' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 14:20:23', '/api/user/detail', '218.15.167.248' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 14:24:23', '/api/user/detail', '20.117.19.75' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 15:14:23', '/api/user/login', '183.162.66.97' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 16:14:23', '/api/user/login', '108.181.245.147' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 14:17:23', '/api/user/login', '22.74.142.137' );
INSERT INTO TABLE test_sql.test8
VALUES
( '2016-11-09 14:19:23', '/api/user/login', '22.74.142.137' );
3、查询SQL
SELECT
ip,
count(*) AS cnt
FROM
test_sql.test8
WHERE
date_format( `date`, 'yyyy-MM-dd HH' ) >= '2016-11-09 14'
AND date_format( `date`, 'yyyy-MM-dd HH' ) < '2016-11-09 15'
AND interface = '/api/user/login'
GROUP BY
ip
ORDER BY
cnt DESC
LIMIT 10;
4、执行结果
第九题
1、需求
`dist_id` int '区组id',`account` string '账号',`money` int '充值金额',`create_time` string '订单时间'
2、数据准备
CREATE TABLE test_sql.test9 ( dist_id string COMMENT '区组id', account string COMMENT '账号', `money` DECIMAL ( 10, 2 ) COMMENT '充值金额', create_time string COMMENT '订单时间' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '11', 100006, '2021-01-02 13:00:01' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '22', 110000, '2021-01-02 13:00:02' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '33', 102000, '2021-01-02 13:00:03' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '44', 100300, '2021-01-02 13:00:04' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '55', 100040, '2021-01-02 13:00:05' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '66', 100005, '2021-01-02 13:00:06' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '77', 180000, '2021-01-03 13:00:07' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '88', 106000, '2021-01-02 13:00:08' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '99', 100400, '2021-01-02 13:00:09' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '12', 100030, '2021-01-02 13:00:10' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '13', 100003, '2021-01-02 13:00:20' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '14', 100020, '2021-01-02 13:00:30' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '15', 100500, '2021-01-02 13:00:40' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '16', 106000, '2021-01-02 13:00:50' );
INSERT INTO TABLE test_sql.test9
VALUES
( '1', '17', 100800, '2021-01-02 13:00:59' );
INSERT INTO TABLE test_sql.test9
VALUES
( '2', '18', 100800, '2021-01-02 13:00:11' );
INSERT INTO TABLE test_sql.test9
VALUES
( '2', '19', 100030, '2021-01-02 13:00:12' );
INSERT INTO TABLE test_sql.test9
VALUES
( '2', '10', 100000, '2021-01-02 13:00:13' );
INSERT INTO TABLE test_sql.test9
VALUES
( '2', '45', 100010, '2021-01-02 13:00:14' );
INSERT INTO TABLE test_sql.test9
VALUES
( '2', '78', 100070, '2021-01-02 13:00:15' );
3、查询SQL
WITH TEMP AS (
SELECT
dist_id,
account,
sum( `money` ) sum_money
FROM
test_sql.test9
WHERE
date_format( create_time, 'yyyy-MM-dd' ) = '2021-01-02'
GROUP BY
dist_id,
account
) SELECT
t1.dist_id,
t1.account,
t1.sum_money
FROM
(
SELECT
temp.dist_id,
temp.account,
temp.sum_money,
rank() over ( PARTITION BY temp.dist_id ORDER BY temp.sum_money DESC ) ranks
FROM
TEMP
) t1
WHERE
ranks = 1;
4、执行结果
第十题
1、需求
dist_id string '区组id',account string '账号',gold int '金币'
2、数据准备
CREATE TABLE test_sql.test10 ( `dist_id` string COMMENT '区组id', `account` string COMMENT '账号', `gold` INT COMMENT '金币' );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '77', 18 );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '88', 106 );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '99', 10 );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '12', 13 );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '13', 14 );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '14', 25 );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '15', 36 );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '16', 12 );
INSERT INTO TABLE test_sql.test10
VALUES
( '1', '17', 158 );
INSERT INTO TABLE test_sql.test10
VALUES
( '2', '18', 12 );
INSERT INTO TABLE test_sql.test10
VALUES
( '2', '19', 44 );
INSERT INTO TABLE test_sql.test10
VALUES
( '2', '10', 66 );
INSERT INTO TABLE test_sql.test10
VALUES
( '2', '45', 80 );
INSERT INTO TABLE test_sql.test10
VALUES
( '2', '78', 98 );
3、 查询SQL
SELECT
dist_id,
account,
gold
FROM
( SELECT dist_id, account, gold, row_number () over ( PARTITION BY dist_id ORDER BY gold DESC ) rank FROM test_sql.test10 ) t
WHERE
rank <= 10;
4、执行结果
最后
以上十道HiveSQL熟练掌握可以有效阻止被面试官的吊打,但是离吊打面试官的实力,还得需要经典100道Hive题加训练,后续会继续创作下一篇经典100道Hive题吊打面试官系列,强烈建议收藏加关注!
- 📢博客主页:https://lansonli.blog.csdn.net
- 📢欢迎点赞 👍 收藏 ⭐留言 📝 如有错误敬请指正!
- 📢本文由 Lansonli 原创,首发于 CSDN博客🙉
- 📢大数据系列文章会每天更新,停下休息的时候不要忘了别人还
以上是关于❤️手撕这十道HiveSQL题还不能吊打面试官,却能保你不被吊打❤️推荐收藏的主要内容,如果未能解决你的问题,请参考以下文章
❤️爆肝!整理了一周的Spring面试大全含答案,吊打Java面试官建议收藏!❤️