Big Data Warehouse Technology Practical Training: Task 4

Posted by 陈希瑞



Background:

Student(Sid,Sname,Sbirth,Ssex): student table

Sid: student ID

Sname: student name

Sbirth: student date of birth

Ssex: student gender


Course(Cid,Cname,Tid): course table

Cid: course ID

Cname: course name

Tid: teacher ID


SC(Sid,Cid,score): score table

Sid: student ID

Cid: course ID

score: score


Teacher(Tid,Tname): teacher table

Tid: teacher ID

Tname: teacher name


Student table data:

01,zhaolei,1990-01-01,nan

02,qiandian,1990-12-21,nan

03,sunfeng,1990-05-20,nv

04,liyun,1990-08-06,nan

05,zhoumei,1991-12-01,nv

06,wulan,1992-03-01,nv

07,zhengzhu,1989-07-01,nv

08,wangju,1990-01-20,nv


Course table data:

01,yuwen,02

02,shuxue,01

03,yingyu,03


Teacher table data:

01,zhangsan

02,lisi

03,wangwu


Score table data:

01,01,80

01,02,90

01,03,99

02,01,70

02,02,60

02,03,80

03,01,80

03,02,80

03,03,80

04,01,50

04,02,30

04,03,20

05,01,76

05,02,87

06,01,31

06,03,34

07,02,89

07,03,98

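Hint: wangju (Sid 08) has no rows in the score table, which makes her a special case in some of the questions. Don't leave her out.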


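1. Create the four tables from the background section and map the data files to them (prepare the structured data files yourself, either with vim or by creating them on Windows and uploading them)

  • Create the tables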
-- Create the student table
create table Student(
    Sid int,
    Sname string,
    Sbirth date,
    Ssex string
)
row format delimited
fields terminated by ',';

-- Create the course table
create table Course(
    Cid int,
    Cname string,
    Tid int
)
row format delimited
fields terminated by ',';

-- Create the score table
create table SC(
    Sid int,
    Cid int,
    score int
)
row format delimited
fields terminated by ',';

-- Create the teacher table
create table Teacher(
    Tid int,
    Tname string
)
row format delimited
fields terminated by ',';

  • Load the data into the tables
-- The table names are qualified with the darcy database, so the four tables must have been created there.
load data local inpath "/root/hivedata/任务四-数据/学生表.txt" into table darcy.Student;
load data local inpath "/root/hivedata/任务四-数据/课程表.txt" into table darcy.Course;
load data local inpath "/root/hivedata/任务四-数据/成绩表.txt" into table darcy.SC;
load data local inpath "/root/hivedata/任务四-数据/教师表.txt" into table darcy.Teacher;

2. Query the four tables to check that the data was mapped correctly

select * from Student;
select * from Course;
select * from SC;
select * from Teacher;
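
Besides reading the `select *` output by eye, the load can be checked by comparing row counts with the sample data listed earlier (8 students, 3 courses, 18 scores, 3 teachers) and by looking for unexpected NULLs, which is how Hive typically surfaces a wrong delimiter or an unparseable field. A minimal sketch, assuming the four tables are in the current database:

```sql
-- Row counts; the expected values are taken from the sample data in this post.
select count(*) from Student;  -- expect 8
select count(*) from Course;   -- expect 3
select count(*) from SC;       -- expect 18
select count(*) from Teacher;  -- expect 3

-- Fields that could not be parsed are stored as NULL, so each of these
-- counts should be 0 if the files were mapped correctly.
select count(*) from Student where Sid is null or Sname is null or Sbirth is null or Ssex is null;
select count(*) from SC where Sid is null or Cid is null or score is null;
```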

3. Find the Sid of every student whose score in course 01 is higher than their score in course 02

select a.Sid,a.score,b.score from 
(select * from SC where SC.Cid=01) a, (select * from SC where SC.Cid=02) b
where a.Sid=b.Sid and a.score>b.score;
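
The same question can also be answered in a single pass over SC with conditional aggregation instead of a self-join. This is only an alternative sketch, not the original solution; the column aliases `score_01` and `score_02` are made up for illustration:

```sql
-- Pivot each student's scores for courses 01 and 02 into one row, then compare.
-- A student missing either course gets NULL on that side, so the comparison
-- is not true and the student is filtered out, matching the self-join result.
select Sid,
       max(case when Cid = 1 then score end) as score_01,
       max(case when Cid = 2 then score end) as score_02
from SC
group by Sid
having max(case when Cid = 1 then score end) > max(case when Cid = 2 then score end);
```

With the sample data this should return students 02 (70 vs 60) and 04 (50 vs 30).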

4. Find the Sid and average score of every student whose average score is greater than 60

select Sid, avg(score) avg_score from SC group by Sid having avg(score)>60;

5. Find the Sid, name, number of courses taken, and total score of every student

select stu.Sid, stu.Sname, count(c.Cid) course_nums, sum(c.score) sum_score
from Student stu left join SC c on stu.Sid=c.Sid
group by stu.Sid, stu.Sname;
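
Because of the left join, wangju (Sid 08) appears in the result with a course count of 0, but her total score is NULL, since `sum` over no non-NULL values is NULL. If a 0 is preferred, one option is to wrap the sum in `coalesce`. A small sketch of that variant:

```sql
-- Same query as above, but students with no rows in SC report a total of 0
-- instead of NULL.
select stu.Sid,
       stu.Sname,
       count(c.Cid) as course_nums,
       coalesce(sum(c.score), 0) as sum_score
from Student stu
left join SC c on stu.Sid = c.Sid
group by stu.Sid, stu.Sname;
```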

6. Count the teachers whose surname is "li"

select count(Tid) from teacher where Tname like 'li%';

7. Find the information of the students who have taken a course taught by "zhangsan"

select st.* from student st 
left join sc on sc.Sid=st.Sid 
left join course c on c.Cid=sc.Cid
left join teacher t on t.Tid=c.Tid
where t.Tname="zhangsan";

8. Find the information of the students who have not taken any course taught by "zhangsan"

select s.* from student s where s.Sid NOT IN(
select 
st.Sid
from student st
left join sc ON sc.Sid=st.Sid
left join course c ON c.Cid=sc.Cid
left join teacher t ON t.Tid=c.Tid
where t.Tname="zhangsan"
);
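
An alternative to `NOT IN` is a left join against the set of students who did take one of zhangsan's courses, keeping only the rows with no match (an anti-join). This is a sketch of that idea rather than the original solution; the alias `z` is arbitrary:

```sql
-- z holds the Sid of every student who took a course taught by zhangsan.
-- Students with no matching row in z never took such a course.
select s.*
from Student s
left join (
    select distinct sc.Sid
    from SC sc
    join Course c on c.Cid = sc.Cid
    join Teacher t on t.Tid = c.Tid
    where t.Tname = 'zhangsan'
) z on z.Sid = s.Sid
where z.Sid is null;
```

With the sample data, zhangsan teaches only course 02, so this should return wulan (06) and wangju (08). wangju is included even though she has no rows in SC at all.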

9. Find the information of the students who have taken both course "01" and course "02"

-- Method 1
select a.* from Student a,SC b,SC c
where a.Sid=b.Sid and a.Sid=c.Sid 
and b.Cid=01 and c.Cid=02;

-- Method 2
select s.*,sc1.score,sc2.score from student s
left join (select * from sc where cid = '01') sc1 on s.sid = sc1.sid
left join (select * from sc where cid = '02') sc2 on s.sid = sc2.sid
where sc1.cid = '01' and sc2.cid = '02';

10. Find the information of the students who have taken course "01" but not course "02"

select s.*,sc1.score,sc2.score from student s
left join (select * from sc where cid = '01') sc1 on s.sid = sc1.sid
left join (select * from sc where cid = '02') sc2 on s.sid = sc2.sid
where sc1.cid = '01' and sc2.cid is null;

11. Find the information of the students who have not taken every course (try solving this in more than one way):

select st.sname,st.sid,st.Sbirth from student st left join SC on st.Sid=SC.Sid
group by st.sid,st.sname,st.Sbirth having count(SC.Cid)<3;
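
The query above hard-codes the number of courses as 3. As one different approach, the total can be read from the Course table instead, so the query keeps working if courses are added. A sketch under that assumption; the aliases `t`, `c`, `taken`, and `total` are illustrative, and some Hive configurations reject Cartesian products in strict mode, in which case the cross join would need to be allowed:

```sql
-- t: number of distinct courses each student has a score for.
-- c: a single row holding the total number of courses.
select st.Sid, st.Sname, st.Sbirth, st.Ssex
from Student st
left join (
    select Sid, count(distinct Cid) as taken
    from SC
    group by Sid
) t on st.Sid = t.Sid
cross join (
    select count(*) as total from Course
) c
where coalesce(t.taken, 0) < c.total;
```

With the sample data this should return students 05, 06, 07, and 08; `coalesce` is what keeps wangju (no rows in SC) in the result.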

12. Count the male and female students:

select Ssex, count(Sid) from student group by Ssex;

13. Find the Sid, name, and average score of every student whose average score is greater than or equal to 85:

select st.Sid, st.Sname, avg(SC.score) avg_score from student st join SC on SC.Sid=st.Sid
group by st.Sid, st.Sname having avg(SC.score)>=85;

14. Count the students enrolled in each course

select cour.Cname, count(SC.Cid) stu_nums from Course cour left join SC on cour.Cid=SC.Cid
group by cour.Cname;

15. List each student's total score, sorted from highest to lowest

select Sid, sum(score) sum_score from SC group by Sid order by sum_score desc;

16. Find the students who scored below 60 in course "01", sorted by score in descending order

-- Method 1
select stu.Sid, stu.Sname, stu.Sbirth, stu.Ssex, SC.score
from Student stu left join SC on stu.Sid=SC.Sid
group by stu.Sid, stu.Sname, stu.Sbirth, SC.Cid, stu.Ssex, SC.score
having SC.Cid=01 and SC.score<60
order by SC.score desc;

-- Method 2 (more concise)
select stu.*, SC.score from Student stu 
left join SC on SC.Sid=stu.Sid
where SC.Cid=01 and SC.score<60
order by SC.score desc;

17. Find the average score of each course taught by a teacher surnamed "zhang"

select tname,cid,cname,tid,avg(score) avg_score from (select t2.Tname tname, c.*,t1.score score from Course c
left join (select * from SC ) t1 on c.Cid =t1.Cid
left join (select * from Teacher) t2 on c.Tid=t2.Tid) as k
group by tname,cid,cname,tid having tname like 'zhang%';

18. Find the information of the students who failed a course

select c.Cname, info.* from (select stu.*, SC.Cid, SC.score from Student stu 
left join SC on SC.Sid=stu.Sid
where SC.score<60) as info left join Course c on info.Cid=c.Cid;
