MYSQL/HIVESQL笔试题:HIVESQL

Posted 秋华

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了MYSQL/HIVESQL笔试题:HIVESQL相关的知识,希望对你有一定的参考价值。

4 手写HQL 4

已知一个表STG.ORDER,有如下字段:DateOrder_idUser_idamount。请给出sql进行统计:数据样例:2017-01-01,10029028,1000003251,33.57

1)给出 2017年每个月的订单数、用户数、总成交金额。

2)给出201711月的新客数(指在11月才有第一笔订单)

建表

create table order_tab(dt string,order_id string,user_id string,amount decimal(10,2)) row format delimited fields terminated by \'\\t\';

1)给出 2017年每个月的订单数、用户数、总成交金额。

 

select
   date_format(dt,\'yyyy-MM\'),
   count(order_id),
   count(distinct user_id),
   sum(amount)
from
   order_tab
where
   date_format(dt,\'yyyy\')=\'2017\'
group by
   date_format(dt,\'yyyy-MM\');

 

2)给出201711月的新客数(指在11月才有第一笔订单)

 

select
   count(user_id)
from
   order_tab
group by
   user_id
having
   date_format(min(dt),\'yyyy-MM\')=\'2017-11\';

 

5 手写HQL 5

有日志如下,请写出代码求得所有用户和活跃用户的总数及平均年龄。(活跃用户指连续两天都有访问记录的用户)日期 用户 年龄

数据集

2019-02-11,test_1,23
2019-02-11,test_2,19
2019-02-11,test_3,39
2019-02-11,test_1,23
2019-02-11,test_3,39
2019-02-11,test_1,23
2019-02-12,test_2,19
2019-02-13,test_1,23
2019-02-15,test_2,19
2019-02-16,test_2,19

1)建表

create table user_age(dt string,user_id string,age int)row format delimited fields terminated by \',\';

2)按照日期以及用户分组,按照日期排序并给出排名

select
   dt,
   user_id,
   min(age) age,
   rank() over(partition by user_id order by dt) rk
from
   user_age
group by
   dt,user_id;t1

3)计算日期及排名的差值

select
   user_id,
   age,
   date_sub(dt,rk) flag
from
   t1;t2

4)过滤出差值大于等于2的,即为连续两天活跃的用户

 

select
   user_id,
   min(age) age
from
   t2
group by
   user_id,flag
having
   count(*)>=2;t3

 

5)对数据进行去重处理(一个用户可以在两个不同的时间点连续登录),例如:a用户在110111号以及120号和1214天登录。

 

select
   user_id,
   min(age) age
from
   t3
group by
   user_id;t4

 

6)计算活跃用户(两天连续有访问)的人数以及平均年龄

select
   count(*) ct,
   cast(sum(age)/count(*) as decimal(10,2))
from t4;

7)对全量数据集进行按照用户去重

select
   user_id,
   min(age) age 
from
   user_age 
group by 
   user_id;t5

8)计算所有用户的数量以及平均年龄

select
   count(*) user_count,
   cast((sum(age)/count(*)) as decimal(10,1)) 
from 
   t5;

9)将第5步以及第7步两个数据集进行union all操作

 

select
   0 user_total_count,
   0 user_total_avg_age,
   count(*) twice_count,
   cast(sum(age)/count(*) as decimal(10,2)) twice_count_avg_age
from 
(
   select
   user_id,
   min(age) age
from
   (select
   user_id,
   min(age) age
from
   (
    select
   user_id,
   age,
   date_sub(dt,rk) flag
from
   (
    select
       dt,
       user_id,
       min(age) age,
       rank() over(partition by user_id order by dt) rk
    from
       user_age
    group by
       dt,user_id
    )t1
    )t2
group by
   user_id,flag
having
   count(*)>=2)t3
group by
   user_id 
)t4

union all

select
   count(*) user_total_count,
   cast((sum(age)/count(*)) as decimal(10,1)),
   0 twice_count,
   0 twice_count_avg_age
from 
   (
      select
         user_id,
         min(age) age 
      from 
         user_age 
      group by 
         user_id
   )t5;t6

 

10)求和并拼接为最终SQL

 

select 
    sum(user_total_count),
    sum(user_total_avg_age),
    sum(twice_count),
    sum(twice_count_avg_age)
from 
(select
   0 user_total_count,
   0 user_total_avg_age,
   count(*) twice_count,
   cast(sum(age)/count(*) as decimal(10,2)) twice_count_avg_age
from 
(
   select
   user_id,
   min(age) age
from
   (select
   user_id,
   min(age) age
from
   (
    select
   user_id,
   age,
   date_sub(dt,rk) flag
from
   (
    select
       dt,
       user_id,
       min(age) age,
       rank() over(partition by user_id order by dt) rk
    from
       user_age
    group by
       dt,user_id
    )t1
    )t2
group by
   user_id,flag
having
   count(*)>=2)t3
group by
   user_id 
)t4

union all

select
   count(*) user_total_count,
   cast((sum(age)/count(*)) as decimal(10,1)),
   0 twice_count,
   0 twice_count_avg_age
from 
   (
      select
         user_id,
         min(age) age 
      from 
         user_age 
      group by 
         user_id
   )t5)t6;

 

6 手写HQL 6

请用sql写出所有用户中在今年10月份第一次购买商品的金额,表ordertable字段(购买用户:userid,金额:money,购买时间:paymenttime(格式:2017-10-01),订单idorderid

1)建表

 

create table ordertable(
    userid string,
    money int,
    paymenttime string,
    orderid string)
row format delimited fields terminated by \'\\t\';

 

2)查询出

 

select
   userid,
   min(paymenttime) paymenttime
from
   ordertable
where
   date_format(paymenttime,\'yyyy-MM\')=\'2017-10\'
group by
   userid;t1

 

select
   t1.userid,
   t1.paymenttime,
   od.money
from
   t1
join
   ordertable od
on
   t1.userid=od.userid
   and
   t1.paymenttime=od.paymenttime;
select
   t1.userid,
   t1.paymenttime,
   od.money
from
   (select
   userid,
   min(paymenttime) paymenttime
from
   ordertable
where
   date_format(paymenttime,\'yyyy-MM\')=\'2017-10\'
group by
   userid)t1
join
   ordertable od
on
   t1.userid=od.userid
   and
   t1.paymenttime=od.paymenttime;

 

以上是关于MYSQL/HIVESQL笔试题:HIVESQL的主要内容,如果未能解决你的问题,请参考以下文章

MYSQL/HIVESQL笔试题:HIVESQL

MYSQL/HIVESQL笔试题:HIVESQL

MYSQL/HIVESQL笔试题:HIVESQL分组求TopN/行转列/列转行

经典PHP笔试题

留学生编程代写,留学生编程作业代写(微信leechanx)

笔试题Java笔试题知识点