Big Data Warehouse Technology Lab: Task 4
Posted 陈希瑞
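Background:
Student(Sid, Sname, Sbirth, Ssex): student table
Sid: student ID
Sname: student name
Sbirth: date of birth
Ssex: gender
Course(Cid, Cname, Tid): course table
Cid: course ID
Cname: course name
Tid: teacher ID
SC(Sid, Cid, score): score table
Sid: student ID
Cid: course ID
score: score
Teacher(Tid, Tname): teacher table
Tid: teacher ID
Tname: teacher name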
Student table data:
01,zhaolei,1990-01-01,nan
02,qiandian,1990-12-21,nan
03,sunfeng,1990-05-20,nv
04,liyun,1990-08-06,nan
05,zhoumei,1991-12-01,nv
06,wulan,1992-03-01,nv
07,zhengzhu,1989-07-01,nv
08,wangju,1990-01-20,nv
Course table data:
01,yuwen,02
02,shuxue,01
03,yingyu,03
Teacher table data:
01,zhangsan
02,lisi
03,wangwu
Score table data:
01,01,80
01,02,90
01,03,99
02,01,70
02,02,60
02,03,80
03,01,80
03,02,80
03,03,80
04,01,50
04,02,30
04,03,20
05,01,76
05,02,87
06,01,31
06,03,34
07,02,89
07,03,98
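Hint: wangju is a special case in some of the exercises (she has no rows in the score table), so take care not to leave her out.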
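1. Create the four tables described in the background section and load the data into them (prepare the structured data files yourself, either with vim or by creating them on Windows and uploading them)
- Create the tables
-- Create the student table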
create table Student(
Sid int,
Sname string,
Sbirth date,
Ssex string
)
row format delimited
fields terminated by ',';
-- Create the course table
create table Course(
Cid int,
Cname string,
Tid int
)
row format delimited
fields terminated by ',';
-- Create the score table
create table SC(
Sid int,
Cid int,
score int
)
row format delimited
fields terminated by ',';
-- Create the teacher table
create table Teacher(
Tid int,
Tname string
)
row format delimited
fields terminated by ',';
- Load the data into the tables
load data local inpath "/root/hivedata/任务四-数据/学生表.txt" into table darcy.Student;
load data local inpath "/root/hivedata/任务四-数据/课程表.txt" into table darcy.course;
load data local inpath "/root/hivedata/任务四-数据/成绩表.txt" into table darcy.SC;
load data local inpath "/root/hivedata/任务四-数据/教师表.txt" into table darcy.Teacher;
2. Query all four tables to check that the data was loaded correctly
select * from Student;
select * from Course;
select * from SC;
select * from Teacher;
3. Find the student IDs of all students whose score in course 01 is higher than their score in course 02
select a.Sid,a.score,b.score from
(select * from SC where SC.Cid=01) a, (select * from SC where SC.Cid=02) b
where a.Sid=b.Sid and a.score>b.score;
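The same comparison can also be written without the self-join. The sketch below is an alternative, not the original author's method: it groups SC by student and compares the two course scores with conditional aggregation. It assumes each student has at most one row per course, as in the sample data. A student missing either course yields a NULL on one side of the comparison and is filtered out, which matches the behavior of the join version.

```sql
-- Alternative sketch: one pass over SC using conditional aggregation.
-- Assumes at most one (Sid, Cid) row per student and course.
select Sid
from SC
group by Sid
having max(case when Cid = 1 then score end)
     > max(case when Cid = 2 then score end);
```

With the sample data this should return students 02 (70 vs. 60) and 04 (50 vs. 30).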
4. Find the student ID and average score of every student whose average score is above 60
select Sid, avg(score) avg_score from SC group by Sid having avg(score)>60;
5. List the student ID, name, number of courses taken, and total score for every student
Select stu.Sid,stu.Sname,count(c.Cid) course_nums, sum(c.Score) sum_score from Student stu left join SC c on stu.Sid=c.Sid
group by stu.Sid,stu.Sname;
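The left join in this query is what keeps students without any score rows in the result (their course count is 0 and their total is NULL). As a quick check of which students are affected, the following sketch lists students that have no matching row in SC. It is an illustrative query added here, not part of the original exercise set.

```sql
-- Sketch: list students with no rows in SC (anti-join via LEFT JOIN ... IS NULL).
select stu.Sid, stu.Sname
from Student stu
left join SC on SC.Sid = stu.Sid
where SC.Sid is null;
```

With the sample data the only such student is 08, wangju, which is why an inner join in query 5 would silently drop her.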
6. Count the teachers whose surname is "li"
select count(Tid) from teacher where Tname like 'li%';
7. Find the students who have taken a course taught by "zhangsan"
select st.* from student st
join sc on sc.Sid=st.Sid
join course c on c.Cid=sc.Cid
join teacher t on t.Tid=c.Tid
where t.Tname='zhangsan';
8. Find the students who have not taken any course taught by "zhangsan"
select s.* from student s where s.Sid NOT IN(
select
st.Sid
from student st
left join sc ON sc.Sid=st.Sid
left join course c ON c.Cid=sc.Cid
left join teacher t ON t.Tid=c.Tid
where t.Tname="zhangsan"
);
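If the Hive version in use does not accept a NOT IN subquery, the same result can be obtained with an anti-join. The sketch below is an alternative under that assumption, not the original solution: it builds the set of students who did take a course from zhangsan (the alias `taught` is chosen here for illustration) and keeps only the students with no match in that set.

```sql
-- Alternative sketch: anti-join instead of NOT IN.
select s.*
from Student s
left join (
  -- Students who took at least one course taught by zhangsan
  select distinct sc.Sid
  from SC sc
  join Course c on c.Cid = sc.Cid
  join Teacher t on t.Tid = c.Tid
  where t.Tname = 'zhangsan'
) taught on taught.Sid = s.Sid
where taught.Sid is null;
```

With the sample data, zhangsan teaches only course 02, so this should return students 06 (wulan) and 08 (wangju).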
9. Find the students who have taken both course "01" and course "02"
-- Method 1
select a.* from Student a,SC b,SC c
where a.Sid=b.Sid and a.Sid=c.Sid
and b.Cid=01 and c.Cid=02;
-- Method 2
select s.*,sc1.score,sc2.score from student s
left join (select * from sc where cid = '01') sc1 on s.sid = sc1.sid
left join (select * from sc where cid = '02') sc2 on s.sid = sc2.sid
where sc1.cid = '01' and sc2.cid = '02';
10. Find the students who have taken course "01" but not course "02"
select s.*,sc1.score,sc2.score from student s
left join (select * from sc where cid = '01') sc1 on s.sid = sc1.sid
left join (select * from sc where cid = '02') sc2 on s.sid = sc2.sid
where sc1.cid = '01' and sc2.cid is null;
11. Find the students who have not taken every course (try solving this in more than one way):
select st.sname,st.sid,st.Sbirth from student st left join SC on st.Sid=SC.Sid
group by st.sid,st.sname,st.Sbirth having count(SC.Cid)<3;
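The query above hard-codes 3 as the number of courses. As one of the "different approaches" the exercise asks for, the sketch below takes the total from the Course table instead, so it stays correct if courses are added. It is an alternative under stated assumptions: the aliases `t`, `c`, `taken`, and `total` are introduced here for illustration, and the cross join against a one-row subquery assumes the Hive session permits Cartesian products (strict-mode checks may reject it).

```sql
-- Alternative sketch: compare each student's course count with the
-- number of rows in Course rather than a hard-coded constant.
select st.Sid, st.Sname, st.Sbirth, st.Ssex
from Student st
left join (
  -- Number of distinct courses each student has a score for
  select Sid, count(distinct Cid) as taken
  from SC
  group by Sid
) t on t.Sid = st.Sid
cross join (
  -- Single-row subquery holding the total number of courses
  select count(*) as total from Course
) c
where coalesce(t.taken, 0) < c.total;
```

The `coalesce` call turns the NULL count of a student with no scores into 0, so wangju is still reported. With the sample data this should return students 05, 06, 07, and 08.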
12. Count the male and female students:
select Ssex, count(Sid) from student group by Ssex;
13. Find the student ID, name, and average score of every student whose average score is at least 85:
select st.Sid, st.Sname, avg(SC.score) avg_score from student st join SC on SC.Sid=st.Sid
group by st.Sid, st.Sname having avg(SC.score)>=85;
14. Count the students enrolled in each course
select cour.Cname, count(SC.Cid) stu_nums from Course cour left join SC on cour.Cid=SC.Cid
group by cour.Cname;
15. List each student's total score, sorted from highest to lowest
select Sid, sum(score) sum_score from SC group by Sid order by sum_score desc;
16. Find the students who scored below 60 in course "01", sorted by score in descending order
-- Method 1
select stu.Sid, stu.Sname, stu.Sbirth, stu.Ssex, SC.score
from Student stu
left join SC on stu.Sid=SC.Sid
group by stu.Sid, stu.Sname, stu.Sbirth, SC.Cid, stu.Ssex, SC.score
having SC.Cid=01 and SC.score<60
order by SC.score desc;
-- Method 2 (more concise)
select stu.*, SC.score from Student stu
left join SC on SC.Sid=stu.Sid
where SC.Cid=01 and SC.score<60
order by SC.score desc;
17. Find the average score of each course taught by the teacher surnamed "zhang"
select t.Tname, c.Cid, c.Cname, c.Tid, avg(sc.score) avg_score
from Course c
join Teacher t on t.Tid=c.Tid
left join SC sc on sc.Cid=c.Cid
where t.Tname like 'zhang%'
group by t.Tname, c.Cid, c.Cname, c.Tid;
18. Find the students who failed a course (score below 60), together with the course name
select c.Cname, info.* from (select stu.*, SC.Cid, SC.score from Student stu
left join SC on SC.Sid=stu.Sid
where SC.score<60) as info left join Course c on info.Cid=c.Cid;