HackerRank SQL problem: 15 Days of SQL, how does a SELECT query work with AND?
【Posted】: 2021-02-15 18:48:51
【Question】:
Consider the following:
create table submissions (
submission_date date,
submission_id int,
hacker_id int,
score int
);
create table hackers (
hacker_id int,
name varchar(20)
);
insert into submissions values
('2016-03-01',8494,20703,0),('2016-03-01',22403,53473,15),
('2016-03-01',23965,79722,60),('2016-03-01',30173,36396,70),
('2016-03-02',34928,20703,0),('2016-03-02',38740,15758,60),
('2016-03-02',42769,79722,25),('2016-03-02',44364,79722,60),
('2016-03-03',45440,20703,0),('2016-03-03',49050,36396,70),
('2016-03-03',50273,79722,5),('2016-03-04',50344,20703,0),
('2016-03-04',51360,44065,90),('2016-03-04',54404,53473,65),
('2016-03-04',61533,79722,45),('2016-03-05',72852,20703,0),
('2016-03-05',74546,38289,0),('2016-03-05',76487,62529,0),
('2016-03-05',82439,36396,10),('2016-03-05',90006,36396,40),
('2016-03-06',90404,20703,0);
create table colleges (
college_id int,
contest_id int
);
insert into hackers values
(15758, 'Rose'),(20703, 'Angela'),
(36396,'Frank'),(38289, 'Patrick'),
(44065, 'Lisa'),(53473,'Kimberly'),
(62529, 'Bonnie'),(79722, 'Michael');
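For this HackerRank quiz:
Julia conducted a "15 Days of Learning SQL" contest. The contest started on March 1, 2016 and ended on March 15, 2016.
Write a query to print the total number of unique hackers who made at least 1 submission each day (starting on the first day of the contest), and find the hacker_id and name of the hacker who made the maximum number of submissions each day. If more than one such hacker has the maximum number of submissions, print the lowest hacker_id. The query should print this information for each day of the contest, sorted by date.
Here is the solution I am trying to understand: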
SELECT submission_date,
(
SELECT COUNT(DISTINCT hacker_id)
FROM Submissions AS SUB2
WHERE SUB2.submission_date = SUB1.submission_date AND
(SELECT COUNT(DISTINCT submission_date)
FROM Submissions AS SUB3
WHERE (SUB3.hacker_id = SUB2.hacker_id) AND
(SUB3.submission_date < SUB1.submission_date))
= DATEDIFF(SUB1.submission_date, '2016-03-01' )
),
(SELECT hacker_id FROM Submissions SUB4
WHERE SUB4.submission_date = SUB1.submission_date
GROUP BY hacker_id
ORDER BY COUNT(submission_id) DESC, hacker_id LIMIT 1) AS HID,
(SELECT name FROM Hackers
WHERE hacker_id = HID)
FROM
(SELECT DISTINCT(submission_date)
FROM Submissions) AS SUB1
There are 2 parts I cannot understand:
Part 1
SELECT COUNT(DISTINCT hacker_id)
FROM Submissions AS SUB2
WHERE SUB2.submission_date = SUB1.submission_date AND
(SELECT COUNT(DISTINCT submission_date)
FROM Submissions AS SUB3
WHERE (SUB3.hacker_id = SUB2.hacker_id) AND
(SUB3.submission_date < SUB1.submission_date))
= DATEDIFF(SUB1.submission_date, '2016-03-01' )
)
My question about the code above is how this part:
SELECT COUNT(DISTINCT hacker_id) FROM Submissions AS SUB2 WHERE
SUB2.submission_date = SUB1.submission_date
works together with
(SELECT
COUNT(DISTINCT submission_date) FROM Submissions AS SUB3 WHERE
(SUB3.hacker_id = SUB2.hacker_id) AND (SUB3.submission_date <
SUB1.submission_date)) = DATEDIFF(SUB1.submission_date,
'2016-03-01' ))
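The first part returns all the unique hacker_ids for a given submission date, and the second part checks whether a hacker_id has submitted on every day before that date. But how does SQL ensure that only the hacker_ids found by the first part (before the AND) are checked in the second part (after the AND)?
Could you give an example of how these two queries work together?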
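One way to see the interaction is to print, for a single day, the two values that the solution compares after the AND. The sketch below is an illustration, not part of the original solution: it assumes the question's sample submissions table is loaded, fixes the date to 2016-03-03, and uses the made-up column aliases earlier_days_active and earlier_days_required. Because the inner subquery refers to SUB2.hacker_id, it is a correlated subquery and is evaluated once for each candidate SUB2 row, so both conditions joined by AND are always tested against the same row.

```sql
-- Illustrative sketch: expose the two sides of the "=" comparison per hacker.
-- Table name is lowercase to match the CREATE TABLE statement in the question.
SELECT DISTINCT
       SUB2.submission_date,
       SUB2.hacker_id,
       -- Distinct earlier days on which THIS row's hacker submitted.
       (SELECT COUNT(DISTINCT SUB3.submission_date)
          FROM submissions AS SUB3
         WHERE SUB3.hacker_id = SUB2.hacker_id
           AND SUB3.submission_date < SUB2.submission_date) AS earlier_days_active,
       -- Number of contest days that precede this date.
       DATEDIFF(SUB2.submission_date, '2016-03-01')          AS earlier_days_required
FROM submissions AS SUB2
WHERE SUB2.submission_date = '2016-03-03'
ORDER BY SUB2.hacker_id;

-- With the question's sample data:
-- submission_date | hacker_id | earlier_days_active | earlier_days_required
-- 2016-03-03      |     20703 |                   2 |                     2
-- 2016-03-03      |     36396 |                   1 |                     2
-- 2016-03-03      |     79722 |                   2 |                     2
```

Only rows where the two numbers match survive the WHERE clause in the solution, so COUNT(DISTINCT hacker_id) for 2016-03-03 would be 2 here. Hacker 36396 is dropped because there is no submission from that hacker on 2016-03-02.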
Part 2: for this part
(SELECT hacker_id FROM Submissions SUB4 WHERE
SUB4.submission_date = SUB1.submission_date GROUP BY hacker_id
ORDER BY COUNT(submission_id) DESC, hacker_id LIMIT 1) AS HID
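How does it consider only the hacker_ids that have submitted on every date up to the current date, then group those hackers' submissions, and then pick the lowest hacker_id with the most submissions?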
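Running the subquery on its own, without LIMIT 1, may help here. The sketch below assumes the question's sample data, fixes the date to 2016-03-02, and adds a hypothetical alias submissions_that_day. As written in the solution, this subquery filters only on the date; the consistency condition from Part 1 does not appear in it, which seems to match the problem statement, since that asks for the hacker with the most submissions on each day.

```sql
-- Illustrative sketch: per-hacker submission counts for one day,
-- ordered the same way as the HID subquery in the solution.
SELECT SUB4.hacker_id,
       COUNT(SUB4.submission_id) AS submissions_that_day
FROM submissions AS SUB4
WHERE SUB4.submission_date = '2016-03-02'
GROUP BY SUB4.hacker_id
ORDER BY submissions_that_day DESC, SUB4.hacker_id;

-- With the question's sample data:
-- hacker_id | submissions_that_day
--     79722 |                    2
--     15758 |                    1
--     20703 |                    1
```

LIMIT 1 would keep only the first row, 79722. Hacker 15758 is still listed even though that hacker has no submission on 2016-03-01, which shows that no streak filter is applied in this part. The secondary sort on hacker_id is what breaks ties in favor of the lowest id.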
MRE for this question.
【Comments】:
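DISTINCT is not a function, so SELECT DISTINCT(submission_date) is a bit silly.
Here we are selecting the distinct/unique values from the "submission_date" column.
Then that would be SELECT DISTINCT submission_date
Now I get it! Since it is not a function, the column name does not need to be inside parentheses. Thanks @Strawberry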
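A short sketch of the point made in these comments, assuming the question's sample submissions table: DISTINCT is a modifier on the whole SELECT list, and the parentheses only wrap the column expression, so they change nothing.

```sql
-- These two statements are equivalent; the parentheses just group
-- the expression submission_date.
SELECT DISTINCT(submission_date) FROM submissions;   -- 6 rows with the sample data
SELECT DISTINCT submission_date  FROM submissions;   -- the same 6 rows

-- DISTINCT still applies to the entire row, not to the parenthesized column:
-- this returns each distinct (submission_date, hacker_id) pair.
SELECT DISTINCT (submission_date), hacker_id FROM submissions;  -- 19 rows with the sample data
```

The last statement is where the function-like spelling becomes misleading, because it can look as if only submission_date is being de-duplicated.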
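【Answer 1】:
This is not what you asked, but these days we would probably do something like this...
Note that my date range and data set differ slightly from yours...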
drop table if exists hackers;
create table hackers (
hacker_id serial primary key,
name varchar(20)
);
insert into hackers values
(15758, 'Rose'),
(20703, 'Angela'),
(36396,'Frank'),
(38289, 'Patrick'),
(44065, 'Lisa'),
(53473,'Kimberly'),
(62529, 'Bonnie'),
(79722, 'Michael'),
(10101, 'Geoff');
drop table if exists submissions;
create table submissions (
submission_date date,
submission_id serial primary key,
hacker_id int,
score int
);
insert into submissions values
('2016-03-01',8494,20703,0),
('2016-03-01',22403,53473,15),
('2016-03-01',23965,79722,60),
('2016-03-01',30173,36396,70),
('2016-03-02',34928,20703,0),
('2016-03-02',38740,15758,60),
('2016-03-02',42769,79722,25),
('2016-03-02',44364,79722,60),
('2016-03-03',45440,20703,0),
('2016-03-03',49050,36396,70),
('2016-03-03',50273,79722,5),
('2016-03-04',50344,20703,0),
('2016-03-04',51360,44065,90),
('2016-03-04',54404,53473,65),
('2016-03-04',61533,79722,45),
('2016-03-05',72852,20703,0),
('2016-03-05',74546,38289,0),
('2016-03-05',76487,62529,0),
('2016-03-05',82439,36396,10),
('2016-03-05',90006,36396,40),
('2016-03-06',90404,20703,0),
('2016-03-01',90405,10101,12),
('2016-03-02',90406,10101,0),
('2016-03-03',90407,10101,15),
('2016-03-04',90409,10101,60),
('2016-03-05',90410,10101,70),
('2016-03-06',90411,10101,0);
WITH RECURSIVE cte (n) AS
(
SELECT '2016-03-01'
UNION ALL
SELECT n + INTERVAL 1 DAY FROM cte WHERE n < '2016-03-06'
)
SELECT h.*
FROM cte
JOIN hackers h
JOIN submissions s
ON s.submission_date = n
AND s.hacker_id = h.hacker_id
GROUP
BY hacker_id
HAVING COUNT(DISTINCT s.submission_date) = 6
ORDER
BY hacker_id
LIMIT 1
;
+-----------+-------+
| hacker_id | name |
+-----------+-------+
| 10101 | Geoff |
+-----------+-------+