数据库的查询操作是重点中的重点,最核心的内容就是它!
在mysql管理软件中,可以通过SQL语句中的DML语言来实现数据的操作,包括
- 使用INSERT实现数据的插入
- UPDATE实现数据的更新
- 使用DELETE实现数据的删除
- 使用SELECT查询数据以及。
insert 使用方法:
1. 插入完整数据(顺序插入) 语法一: INSERT INTO 表名(字段1,字段2,字段3…字段n) VALUES(值1,值2,值3…值n); 语法二: INSERT INTO 表名 VALUES (值1,值2,值3…值n); 2. 指定字段插入数据 语法: INSERT INTO 表名(字段1,字段2,字段3…) VALUES (值1,值2,值3…); 3. 插入多条记录 语法: INSERT INTO 表名 VALUES (值1,值2,值3…值n), (值1,值2,值3…值n), (值1,值2,值3…值n); 4. 插入查询结果 语法: INSERT INTO 表名(字段1,字段2,字段3…字段n) SELECT (字段1,字段2,字段3…字段n) FROM 表2 WHERE …;
update使用方法:
语法: UPDATE 表名 SET 字段1=值1, 字段2=值2, WHERE CONDITION; 示例: UPDATE mysql.user SET password=password(‘123’) where user=’root’ and host=’localhost’;
delete 使用方法:
语法:
DELETE FROM 表名
WHERE CONITION;
示例:
DELETE FROM mysql.user
WHERE password=’’;
单表操作的方法:
简单语法要求:
SELECT 字段1,字段2... FROM 表名 WHERE 条件 GROUP BY field HAVING 筛选 ORDER BY field LIMIT 限制条数 重点中的重点:关键字的执行优先级 from where group by having select distinct order by limit 执行顺序: 1.找到表:from 2.拿着where指定的约束条件,去文件/表中取出一条条记录 3.将取出的一条条记录进行分组group by,如果没有group by,则整体作为一组 4.将分组的结果进行having过滤 5.执行select 6.去重 7.将结果按条件排序:order by 8.限制结果的显示条数
select简单方法:
SELECT DISTINCT <select_list> FROM <left_table> <join_type> JOIN <right_table> ON <join_condition> WHERE <where_condition> GROUP BY <group_by_list> HAVING <having_condition> ORDER BY <order_by_condition> LIMIT <limit_number> (7) SELECT (8) DISTINCT <select_list> (1) FROM <left_table> (3) <join_type> JOIN <right_table> (2) ON <join_condition> (4) WHERE <where_condition> (5) GROUP BY <group_by_list> (6) HAVING <having_condition> (9) ORDER BY <order_by_condition> (10) LIMIT <limit_number>
示例:
company.employee 员工id id int 姓名 emp_name varchar 性别 sex enum 年龄 age int 入职日期 hire_date date 岗位 post varchar 职位描述 post_comment varchar 薪水 salary double 办公室 office int 部门编号 depart_id int #创建表 create table employee( id int not null unique auto_increment, name varchar(20) not null, sex enum(‘male‘,‘female‘) not null default ‘male‘, #大部分是男的 age int(3) unsigned not null default 28, hire_date date not null, post varchar(50), post_comment varchar(100), salary double(15,2), office int, #一个部门一个屋子 depart_id int ); #查看表结构 mysql> desc employee; +--------------+-----------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------------+-----------------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | name | varchar(20) | NO | | NULL | | | sex | enum(‘male‘,‘female‘) | NO | | male | | | age | int(3) unsigned | NO | | 28 | | | hire_date | date | NO | | NULL | | | post | varchar(50) | YES | | NULL | | | post_comment | varchar(100) | YES | | NULL | | | salary | double(15,2) | YES | | NULL | | | office | int(11) | YES | | NULL | | | depart_id | int(11) | YES | | NULL | | +--------------+-----------------------+------+-----+---------+----------------+ #插入记录 #三个部门:教学,销售,运营 insert into employee(name,sex,age,hire_date,post,salary,office,depart_id) values (‘egon‘,‘male‘,18,‘20170301‘,‘老男孩驻沙河办事处外交大使‘,7300.33,401,1), #以下是教学部 (‘alex‘,‘male‘,78,‘20150302‘,‘teacher‘,1000000.31,401,1), (‘wupeiqi‘,‘male‘,81,‘20130305‘,‘teacher‘,8300,401,1), (‘yuanhao‘,‘male‘,73,‘20140701‘,‘teacher‘,3500,401,1), (‘liwenzhou‘,‘male‘,28,‘20121101‘,‘teacher‘,2100,401,1), (‘jingliyang‘,‘female‘,18,‘20110211‘,‘teacher‘,9000,401,1), (‘jinxin‘,‘male‘,18,‘19000301‘,‘teacher‘,30000,401,1), (‘成龙‘,‘male‘,48,‘20101111‘,‘teacher‘,10000,401,1), (‘歪歪‘,‘female‘,48,‘20150311‘,‘sale‘,3000.13,402,2),#以下是销售部门 (‘丫丫‘,‘female‘,38,‘20101101‘,‘sale‘,2000.35,402,2), (‘丁丁‘,‘female‘,18,‘20110312‘,‘sale‘,1000.37,402,2), (‘星星‘,‘female‘,18,‘20160513‘,‘sale‘,3000.29,402,2), (‘格格‘,‘female‘,28,‘20170127‘,‘sale‘,4000.33,402,2), (‘张野‘,‘male‘,28,‘20160311‘,‘operation‘,10000.13,403,3), #以下是运营部门 (‘程咬金‘,‘male‘,18,‘19970312‘,‘operation‘,20000,403,3), (‘程咬银‘,‘female‘,18,‘20130311‘,‘operation‘,19000,403,3), (‘程咬铜‘,‘male‘,18,‘20150411‘,‘operation‘,18000,403,3), (‘程咬铁‘,‘female‘,18,‘20140512‘,‘operation‘,17000,403,3) ; #ps:如果在windows系统中,插入中文字符,select的结果为空白,可以将所有字符编码统一设置成gbk 准备表和记录
#简单查询 SELECT id,name,sex,age,hire_date,post,post_comment,salary,office,depart_id FROM employee; SELECT * FROM employee; SELECT name,salary FROM employee; #避免重复DISTINCT SELECT DISTINCT post FROM employee; #通过四则运算查询 SELECT name, salary*12 FROM employee; SELECT name, salary*12 AS Annual_salary FROM employee; SELECT name, salary*12 Annual_salary FROM employee; #定义显示格式 CONCAT() 函数用于连接字符串 SELECT CONCAT(‘姓名: ‘,name,‘ 年薪: ‘, salary*12) AS Annual_salary FROM employee; CONCAT_WS() 第一个参数为分隔符 SELECT CONCAT_WS(‘:‘,name,salary*12) AS Annual_salary FROM employee;
小练习:
1 查出所有员工的名字,薪资,格式为 <名字:egon> <薪资:3000> 2 查出所有的岗位(去掉重复) 3 查出所有员工名字,以及他们的年薪,年薪的字段名为annual_year select concat(‘<名字:‘,name,‘> ‘,‘<薪资:‘,salary,‘>‘) from employee; select distinct depart_id from employee; select name,salary*12 annual_salary from employee;
where条件约束查询:
#1:单条件查询 SELECT name FROM employee WHERE post=‘sale‘; #2:多条件查询 SELECT name,salary FROM employee WHERE post=‘teacher‘ AND salary>10000; #3:关键字BETWEEN AND SELECT name,salary FROM employee WHERE salary BETWEEN 10000 AND 20000; SELECT name,salary FROM employee WHERE salary NOT BETWEEN 10000 AND 20000; #4:关键字IS NULL(判断某个字段是否为NULL不能用等号,需要用IS) SELECT name,post_comment FROM employee WHERE post_comment IS NULL; SELECT name,post_comment FROM employee WHERE post_comment IS NOT NULL; SELECT name,post_comment FROM employee WHERE post_comment=‘‘; 注意‘‘是空字符串,不是null ps: 执行 update employee set post_comment=‘‘ where id=2; 再用上条查看,就会有结果了 #5:关键字IN集合查询 SELECT name,salary FROM employee WHERE salary=3000 OR salary=3500 OR salary=4000 OR salary=9000 ; SELECT name,salary FROM employee WHERE salary IN (3000,3500,4000,9000) ; SELECT name,salary FROM employee WHERE salary NOT IN (3000,3500,4000,9000) ; #6:关键字LIKE模糊查询 通配符’%’ SELECT * FROM employee WHERE name LIKE ‘eg%‘; 通配符’_’ SELECT * FROM employee WHERE name LIKE ‘al__‘;
小练习:
1. 查看岗位是teacher的员工姓名、年龄 2. 查看岗位是teacher且年龄大于30岁的员工姓名、年龄 3. 查看岗位是teacher且薪资在9000-1000范围内的员工姓名、年龄、薪资 4. 查看岗位描述不为NULL的员工信息 5. 查看岗位是teacher且薪资是10000或9000或30000的员工姓名、年龄、薪资 6. 查看岗位是teacher且薪资不是10000或9000或30000的员工姓名、年龄、薪资 7. 查看岗位是teacher且名字是jin开头的员工姓名、年薪 复制代码 复制代码 select name,age from employee where post = ‘teacher‘; select name,age from employee where post=‘teacher‘ and age > 30; select name,age,salary from employee where post=‘teacher‘ and salary between 9000 and 10000; select * from employee where post_comment is not null; select name,age,salary from employee where post=‘teacher‘ and salary in (10000,9000,30000); select name,age,salary from employee where post=‘teacher‘ and salary not in (10000,9000,30000); select name,salary*12 from employee where post=‘teacher‘ and name like ‘jin%‘; 复制代码
分组group by方法:
#1、首先明确一点:分组发生在where之后,即分组是基于where之后得到的记录而进行的 #2、分组指的是:将所有记录按照某个相同字段进行归类,比如针对员工信息表的职位分组,或者按照性别进行分组等 #3、为何要分组呢? 取每个部门的最高工资 取每个部门的员工数 取男人数和女人数 小窍门:‘每’这个字后面的字段,就是我们分组的依据 #4、大前提: 可以按照任意字段分组,但是分组完毕后,比如group by post,只能查看post字段,如果想查看组内信息,需要借助于聚合函数
单独使用GROUP BY关键字分组 SELECT post FROM employee GROUP BY post; 注意:我们按照post字段分组,那么select查询的字段只能是post,想要获取组内的其他相关信息,需要借助函数 GROUP BY关键字和GROUP_CONCAT()函数一起使用 SELECT post,GROUP_CONCAT(name) FROM employee GROUP BY post;#按照岗位分组,并查看组内成员名 SELECT post,GROUP_CONCAT(name) as emp_members FROM employee GROUP BY post; GROUP BY与聚合函数一起使用 select post,count(id) as count from employee group by post;#按照岗位分组,并查看每个组有多少人
强调:
如果我们用unique的字段作为分组的依据,则每一条记录自成一组,这种分组没有意义 多条记录之间的某个字段值相同,该字段通常用来作为分组的依据
聚合函数:
#强调:聚合函数聚合的是组的内容,若是没有分组,则默认一组 示例: SELECT COUNT(*) FROM employee; SELECT COUNT(*) FROM employee WHERE depart_id=1; SELECT MAX(salary) FROM employee; SELECT MIN(salary) FROM employee; SELECT AVG(salary) FROM employee; SELECT SUM(salary) FROM employee; SELECT SUM(salary) FROM employee WHERE depart_id=3;
练习题:
1. 查询岗位名以及岗位包含的所有员工名字 2. 查询岗位名以及各岗位内包含的员工个数 3. 查询公司内男员工和女员工的个数 4. 查询岗位名以及各岗位的平均薪资 5. 查询岗位名以及各岗位的最高薪资 6. 查询岗位名以及各岗位的最低薪资 7. 查询男员工与男员工的平均薪资,女员工与女员工的平均薪资
#题1:分组 mysql> select post,group_concat(name) from employee group by post; +-----------------------------------------+---------------------------------------------------------+ | post | group_concat(name) | +-----------------------------------------+---------------------------------------------------------+ | operation | 张野,程咬金,程咬银,程咬铜,程咬铁 | | sale | 歪歪,丫丫,丁丁,星星,格格 | | teacher | alex,wupeiqi,yuanhao,liwenzhou,jingliyang,jinxin,成龙 | | 老男孩驻沙河办事处外交大使 | egon | +-----------------------------------------+---------------------------------------------------------+ #题目2: mysql> select post,count(id) from employee group by post; +-----------------------------------------+-----------+ | post | count(id) | +-----------------------------------------+-----------+ | operation | 5 | | sale | 5 | | teacher | 7 | | 老男孩驻沙河办事处外交大使 | 1 | +-----------------------------------------+-----------+ #题目3: mysql> select sex,count(id) from employee group by sex; +--------+-----------+ | sex | count(id) | +--------+-----------+ | male | 10 | | female | 8 | +--------+-----------+ #题目4: mysql> select post,avg(salary) from employee group by post; +-----------------------------------------+---------------+ | post | avg(salary) | +-----------------------------------------+---------------+ | operation | 16800.026000 | | sale | 2600.294000 | | teacher | 151842.901429 | | 老男孩驻沙河办事处外交大使 | 7300.330000 | +-----------------------------------------+---------------+ #题目5 mysql> select post,max(salary) from employee group by post; +-----------------------------------------+-------------+ | post | max(salary) | +-----------------------------------------+-------------+ | operation | 20000.00 | | sale | 4000.33 | | teacher | 1000000.31 | | 老男孩驻沙河办事处外交大使 | 7300.33 | +-----------------------------------------+-------------+ #题目6 mysql> select post,min(salary) from employee group by post; +-----------------------------------------+-------------+ | post | min(salary) | +-----------------------------------------+-------------+ | operation | 10000.13 | | sale | 1000.37 | | teacher | 2100.00 | | 老男孩驻沙河办事处外交大使 | 7300.33 | +-----------------------------------------+-------------+ #题目七 mysql> select sex,avg(salary) from employee group by sex; +--------+---------------+ | sex | avg(salary) | +--------+---------------+ | male | 110920.077000 | | female | 7250.183750 | +--------+---------------+
六 HAVING过滤
HAVING与WHERE不一样的地方在于!!!!!!
#!!!执行优先级从高到低:where > group by > having #1. Where 发生在分组group by之前,因而Where中可以有任意字段,但是绝对不能使用聚合函数。 #2. Having发生在分组group by之后,因而Having中可以使用分组的字段,无法直接取到其他字段,可以使用聚合函数
查询排序:
按单列排序 SELECT * FROM employee ORDER BY salary; SELECT * FROM employee ORDER BY salary ASC; SELECT * FROM employee ORDER BY salary DESC; 按多列排序:先按照age排序,如果年纪相同,则按照薪资排序 SELECT * from employee ORDER BY age, salary DESC;
1. 查询所有员工信息,先按照age升序排序,如果age相同则按照hire_date降序排序 2. 查询各岗位平均薪资大于10000的岗位名、平均工资,结果按平均薪资升序排列 3. 查询各岗位平均薪资大于10000的岗位名、平均工资,结果按平均薪资降序排列
#题目1 mysql> select * from employee ORDER BY age asc,hire_date desc; #题目2 mysql> select post,avg(salary) from employee group by post having avg(salary) > 10000 order by avg(salary) asc; +-----------+---------------+ | post | avg(salary) | +-----------+---------------+ | operation | 16800.026000 | | teacher | 151842.901429 | +-----------+---------------+ #题目3 mysql> select post,avg(salary) from employee group by post having avg(salary) > 10000 order by avg(salary) desc; +-----------+---------------+ | post | avg(salary) | +-----------+---------------+ | teacher | 151842.901429 | | operation | 16800.026000 | +-----------+---------------+
限制查询的记录数:
示例: SELECT * FROM employee ORDER BY salary DESC LIMIT 3; #默认初始位置为0 SELECT * FROM employee ORDER BY salary DESC LIMIT 0,5; #从第0开始,即先查询出第一条,然后包含这一条在内往后查5条 SELECT * FROM employee ORDER BY salary DESC LIMIT 5,5; #从第5开始,即先查询出第6条,然后包含这一条在内往后查5条
mysql> select * from employee limit 0,5; +----+-----------+------+-----+------------+-----------------------------------------+--------------+------------+--------+-----------+ | id | name | sex | age | hire_date | post | post_comment | salary | office | depart_id | +----+-----------+------+-----+------------+-----------------------------------------+--------------+------------+--------+-----------+ | 1 | egon | male | 18 | 2017-03-01 | 老男孩驻沙河办事处外交大使 | NULL | 7300.33 | 401 | 1 | | 2 | alex | male | 78 | 2015-03-02 | teacher | | 1000000.31 | 401 | 1 | | 3 | wupeiqi | male | 81 | 2013-03-05 | teacher | NULL | 8300.00 | 401 | 1 | | 4 | yuanhao | male | 73 | 2014-07-01 | teacher | NULL | 3500.00 | 401 | 1 | | 5 | liwenzhou | male | 28 | 2012-11-01 | teacher | NULL | 2100.00 | 401 | 1 | +----+-----------+------+-----+------------+-----------------------------------------+--------------+------------+--------+-----------+ rows in set (0.00 sec) mysql> select * from employee limit 5,5; +----+------------+--------+-----+------------+---------+--------------+----------+--------+-----------+ | id | name | sex | age | hire_date | post | post_comment | salary | office | depart_id | +----+------------+--------+-----+------------+---------+--------------+----------+--------+-----------+ | 6 | jingliyang | female | 18 | 2011-02-11 | teacher | NULL | 9000.00 | 401 | 1 | | 7 | jinxin | male | 18 | 1900-03-01 | teacher | NULL | 30000.00 | 401 | 1 | | 8 | 成龙 | male | 48 | 2010-11-11 | teacher | NULL | 10000.00 | 401 | 1 | | 9 | 歪歪 | female | 48 | 2015-03-11 | sale | NULL | 3000.13 | 402 | 2 | | 10 | 丫丫 | female | 38 | 2010-11-01 | sale | NULL | 2000.35 | 402 | 2 | +----+------------+--------+-----+------------+---------+--------------+----------+--------+-----------+ rows in set (0.00 sec) mysql> select * from employee limit 10,5; +----+-----------+--------+-----+------------+-----------+--------------+----------+--------+-----------+ | id | name | sex | age | hire_date | post | post_comment | salary | office | depart_id | +----+-----------+--------+-----+------------+-----------+--------------+----------+--------+-----------+ | 11 | 丁丁 | female | 18 | 2011-03-12 | sale | NULL | 1000.37 | 402 | 2 | | 12 | 星星 | female | 18 | 2016-05-13 | sale | NULL | 3000.29 | 402 | 2 | | 13 | 格格 | female | 28 | 2017-01-27 | sale | NULL | 4000.33 | 402 | 2 | | 14 | 张野 | male | 28 | 2016-03-11 | operation | NULL | 10000.13 | 403 | 3 | | 15 | 程咬金 | male | 18 | 1997-03-12 | operation | NULL | 20000.00 | 403 | 3 | +----+-----------+--------+-----+------------+-----------+--------------+----------+--------+-----------+ rows in set (0.00 sec)
使用正则表达式进行查询:
SELECT * FROM employee WHERE name REGEXP ‘^ale‘; SELECT * FROM employee WHERE name REGEXP ‘on$‘; SELECT * FROM employee WHERE name REGEXP ‘m{2}‘; 小结:对字符串匹配的方式 WHERE name = ‘egon‘; WHERE name LIKE ‘yua%‘; WHERE name REGEXP ‘on$‘;
小练习: 查看所有员工中名字是jin开头,n或者g结果的员工信息 select * from employee where name regexp ‘^jin.*[gn]$‘;
小练习题:
首先我们来建库,然后建表,再然后往表格里面添加数据,就可以演示简单的查询的基本用法了
create database TestDB;
CREATE TABLE table1 ( customer_id VARCHAR(10) NOT NULL, city VARCHAR(10) NOT NULL, PRIMARY KEY(customer_id) )ENGINE=INNODB DEFAULT CHARSET=UTF8; CREATE TABLE table2 ( order_id INT NOT NULL auto_increment, customer_id VARCHAR(10), PRIMARY KEY(order_id) )ENGINE=INNODB DEFAULT CHARSET=UTF8; INSERT INTO table1(customer_id,city) VALUES(‘163‘,‘hangzhou‘); INSERT INTO table1(customer_id,city) VALUES(‘9you‘,‘shanghai‘); INSERT INTO table1(customer_id,city) VALUES(‘tx‘,‘hangzhou‘); INSERT INTO table1(customer_id,city) VALUES(‘baidu‘,‘hangzhou‘); INSERT INTO table2(customer_id) VALUES(‘163‘); INSERT INTO table2(customer_id) VALUES(‘163‘); INSERT INTO table2(customer_id) VALUES(‘9you‘); INSERT INTO table2(customer_id) VALUES(‘9you‘); INSERT INTO table2(customer_id) VALUES(‘9you‘); INSERT INTO table2(customer_id) VALUES(‘tx‘); INSERT INTO table2(customer_id) VALUES(NULL);
#查询来自杭州,并且订单数少于2的客户。 SELECT a.customer_id, COUNT(b.order_id) as total_orders FROM table1 AS a LEFT JOIN table2 AS b ON a.customer_id = b.customer_id WHERE a.city = ‘hangzhou‘ GROUP BY a.customer_id HAVING count(b.order_id) < 2 ORDER BY total_orders DESC;
下面我们来具体剖析一下,程序的执行:
在这些SQL语句的执行过程中,都会产生一个虚拟表,用来保存SQL语句的执行结果(这是重点),我现在就来跟踪这个虚拟表的变化,得到最终的查询结果的过程,来分析整个SQL逻辑查询的执行顺序和过程。 执行FROM语句 第一步,执行FROM语句。我们首先需要知道最开始从哪个表开始的,这就是FROM告诉我们的。现在有了<left_table>和<right_table>两个表,我们到底从哪个表开始,还是从两个表进行某种联系以后再开始呢?它们之间如何产生联系呢?——笛卡尔积 关于什么是笛卡尔积,请自行Google补脑。经过FROM语句对两个表执行笛卡尔积,会得到一个虚拟表,暂且叫VT1(vitual table 1),内容如下:
+-------------+----------+----------+-------------+ | customer_id | city | order_id | customer_id | +-------------+----------+----------+-------------+ | 163 | hangzhou | 1 | 163 | | 9you | shanghai | 1 | 163 | | baidu | hangzhou | 1 | 163 | | tx | hangzhou | 1 | 163 | | 163 | hangzhou | 2 | 163 | | 9you | shanghai | 2 | 163 | | baidu | hangzhou | 2 | 163 | | tx | hangzhou | 2 | 163 | | 163 | hangzhou | 3 | 9you | | 9you | shanghai | 3 | 9you | | baidu | hangzhou | 3 | 9you | | tx | hangzhou | 3 | 9you | | 163 | hangzhou | 4 | 9you | | 9you | shanghai | 4 | 9you | | baidu | hangzhou | 4 | 9you | | tx | hangzhou | 4 | 9you | | 163 | hangzhou | 5 | 9you | | 9you | shanghai | 5 | 9you | | baidu | hangzhou | 5 | 9you | | tx | hangzhou | 5 | 9you | | 163 | hangzhou | 6 | tx | | 9you | shanghai | 6 | tx | | baidu | hangzhou | 6 | tx | | tx | hangzhou | 6 | tx | | 163 | hangzhou | 7 | NULL | | 9you | shanghai | 7 | NULL | | baidu | hangzhou | 7 | NULL | | tx | hangzhou | 7 | NULL | +-------------+----------+----------+-------------+
执行ON过滤
执行完笛卡尔积以后,接着就进行ON a.customer_id = b.customer_id
条件过滤,根据ON
中指定的条件,去掉那些不符合条件的数据,得到VT2表,内容如下:
+-------------+----------+----------+-------------+ | customer_id | city | order_id | customer_id | +-------------+----------+----------+-------------+ | 163 | hangzhou | 1 | 163 | | 163 | hangzhou | 2 | 163 | | 9you | shanghai | 3 | 9you | | 9you | shanghai | 4 | 9you | | 9you | shanghai | 5 | 9you | | tx | hangzhou | 6 | tx | +-------------+----------+----------+-------------+
添加外部行
这一步只有在连接类型为OUTER JOIN
时才发生,如LEFT OUTER JOIN
、RIGHT OUTER JOIN
和FULL OUTER JOIN
。在大多数的时候,我们都是会省略掉OUTER
关键字的,但OUTER
表示的就是外部行的概念。
LEFT OUTER JOIN
把左表记为保留表,得到的结果为:
+-------------+----------+----------+-------------+ | customer_id | city | order_id | customer_id | +-------------+----------+----------+-------------+ | 163 | hangzhou | 1 | 163 | | 163 | hangzhou | 2 | 163 | | 9you | shanghai | 3 | 9you | | 9you | shanghai | 4 | 9you | | 9you | shanghai | 5 | 9you | | tx | hangzhou | 6 | tx | | baidu | hangzhou | NULL | NULL | +-------------+----------+----------+-------------+
RIGHT OUTER JOIN
把右表记为保留表,得到的结果为:
+-------------+----------+----------+-------------+ | customer_id | city | order_id | customer_id | +-------------+----------+----------+-------------+ | 163 | hangzhou | 1 | 163 | | 163 | hangzhou | 2 | 163 | | 9you | shanghai | 3 | 9you | | 9you | shanghai | 4 | 9you | | 9you | shanghai | 5 | 9you | | tx | hangzhou | 6 | tx | | NULL | NULL | 7 | NULL | +-------------+----------+----------+-------------+
FULL OUTER JOIN
把左右表都作为保留表,得到的结果为:
+-------------+----------+----------+-------------+ | customer_id | city | order_id | customer_id | +-------------+----------+----------+-------------+ | 163 | hangzhou | 1 | 163 | | 163 | hangzhou | 2 | 163 | | 9you | shanghai | 3 | 9you | | 9you | shanghai | 4 | 9you | | 9you | shanghai | 5 | 9you | | tx | hangzhou | 6 | tx | | baidu | hangzhou | NULL | NULL | | NULL | NULL | 7 | NULL | +-------------+----------+----------+-------------+
添加外部行的工作就是在VT2表的基础上添加保留表中被过滤条件过滤掉的数据,非保留表中的数据被赋予NULL值,最后生成虚拟表VT3。
由于我在准备的测试SQL查询逻辑语句中使用的是LEFT JOIN
,过滤掉了以下这条数据:
| baidu | hangzhou | NULL | NULL |
现在就把这条数据添加到VT2表中,得到的VT3表如下:
+-------------+----------+----------+-------------+ | customer_id | city | order_id | customer_id | +-------------+----------+----------+-------------+ | 163 | hangzhou | 1 | 163 | | 163 | hangzhou | 2 | 163 | | 9you | shanghai | 3 | 9you | | 9you | shanghai | 4 | 9you | | 9you | shanghai | 5 | 9you | | tx | hangzhou | 6 | tx | | baidu | hangzhou | NULL | NULL | +-------------+----------+----------+-------------+
执行WHERE过滤
对添加外部行得到的VT3进行WHERE过滤,只有符合<where_condition>的记录才会输出到虚拟表VT4中。当我们执行WHERE a.city = ‘hangzhou‘
的时候,就会得到以下内容,并存在虚拟表VT4中:
+-------------+----------+----------+-------------+ | customer_id | city | order_id | customer_id | +-------------+----------+----------+-------------+ | 163 | hangzhou | 1 | 163 | | 163 | hangzhou | 2 | 163 | | tx | hangzhou | 6 | tx | | baidu | hangzhou | NULL | NULL | +-------------+----------+----------+-------------+
但是在使用WHERE子句时,需要注意以下两点:
- 由于数据还没有分组,因此现在还不能在WHERE过滤器中使用
where_condition=MIN(col)
这类对分组统计的过滤; - 由于还没有进行列的选取操作,因此在SELECT中使用列的别名也是不被允许的,如:
SELECT city as c FROM t WHERE c=‘shanghai‘;
是不允许出现的。
执行GROUP BY分组
GROU BY
子句主要是对使用WHERE
子句得到的虚拟表进行分组操作。我们执行测试语句中的GROUP BY a.customer_id
,就会得到以下内容(默认只显示组内第一条):
+-------------+----------+----------+-------------+ | customer_id | city | order_id | customer_id | +-------------+----------+----------+-------------+ | 163 | hangzhou | 1 | 163 | | baidu | hangzhou | NULL | NULL | | tx | hangzhou | 6 | tx | +-------------+----------+----------+-------------+
得到的内容会存入虚拟表VT5中,此时,我们就得到了一个VT5虚拟表,接下来的操作都会在该表上完成。
执行HAVING过滤
HAVING
子句主要和GROUP BY
子句配合使用,对分组得到的VT5虚拟表进行条件过滤。当我执行测试语句中的HAVING count(b.order_id) < 2
时,将得到以下内容:
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| baidu | hangzhou | NULL | NULL |
| tx | hangzhou | 6 | tx |
+-------------+----------+----------+-------------+
SELECT列表
现在才会执行到SELECT
子句,不要以为SELECT
子句被写在第一行,就是第一个被执行的。
我们执行测试语句中的SELECT a.customer_id, COUNT(b.order_id) as total_orders
,从虚拟表VT6中选择出我们需要的内容。我们将得到以下内容:
+-------------+--------------+ | customer_id | total_orders | +-------------+--------------+ | baidu | 0 | | tx | 1 | +-------------+--------------+
执行ORDER BY子句
对虚拟表中的内容按照指定的列进行排序,然后返回一个新的虚拟表,我们执行测试SQL语句中的ORDER BY total_orders DESC
,就会得到以下内容:
+-------------+--------------+ | customer_id | total_orders | +-------------+--------------+ | tx | 1 | | baidu | 0 | +-------------+--------------+
LIMIT n, m
表示从第n条记录开始选择m条记录。而很多开发人员喜欢使用该语句来解决分页问题。对于小数据,使用LIMIT子句没有任何问题,当数据量非常大的时候,使用LIMIT n, m
是非常低效的。因为LIMIT的机制是每次都是从头开始扫描,如果需要从第60万行开始,读取3条数据,就需要先扫描定位到60万行,然后再进行读取,而扫描的过程是一个非常低效的过程。所以,对于大数据处理时,是非常有必要在应用层建立一定的缓存机制(现在的大数据处理,大都使用缓存)