COUNT() returns the total number of rows in the grouped table
Posted: 2021-12-31 09:35:27
I have two tables:
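Job table:
[![enter image description here][1]][1]
The FailedReason table, which references the Job table:
[![enter image description here][2]][2]
My goal is to calculate the failure ratio per failure reason.
I expect a result where the first column contains the failure reason name, the second column the total number of all jobs ('successful' + 'failed'), the third column the total number of failed jobs, and the fourth column the failure ratio, calculated as: failed count (column 3) / total count (column 2) * 100.
My SQL query: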
SELECT
FailedReason.main_reason as "Failure reason",
COUNT(job.name) AS "Total jobs",
SUM(CASE WHEN job.status='failed' THEN 1 ELSE 0 END) AS "Total failed jobs",
SUM(CASE WHEN job.status='failed' THEN 1 ELSE 0 END) / COUNT(job.name) * 100 AS "Failure ratio"
FROM job
LEFT JOIN FailedReason
ON job.id=FailedReason.job_id
GROUP BY 1
ORDER BY 3 DESC
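With this query, the job total is counted within each group of the grouped table rather than across all jobs. As a result, the failure ratio is always one hundred percent.
[![enter image description here][3]][3]
What should I modify to get the correct number of jobs ('successful' + 'failed') and calculate the correct ratio?
Sample data: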
CREATE TABLE failedreason (id INT PRIMARY KEY AUTO_INCREMENT,
job_id INT REFERENCES job(id),
main_reason varchar(255)
);
INSERT INTO failedreason (job_id, main_reason) VALUES (13095427, 'test case failure'),
(13095407, 'test case failure'),
(13095533, 'connection error'),
(13095546, 'connection error'),
(13098367, 'runner connection error'),
(13101522, 'script error');
CREATE TABLE job (id INT PRIMARY KEY,
created_at date,
finished_at date,
status varchar(255)
);
INSERT INTO job (id,
created_at ,
finished_at ,
status
)
VALUES (13095427, '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
(13095407, '2021-05-03 02:50:39', '2021-05-03 03:46:41', 'failed'),
(13095533, '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
(13095546, '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
(13098367, '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
(13101522, '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
(13101444, '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success'),
(13101445, '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success'),
(13101446, '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success');
[1]: https://i.stack.imgur.com/CYnPg.png
[2]: https://i.stack.imgur.com/t3baS.png
[3]: https://i.stack.imgur.com/LC7Hp.png
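As a rough illustration of why every reason group reports 100%, here is a minimal sketch that loads the sample data into an in-memory SQLite database from Python. SQLite and the Python wrapper are assumptions made to keep the example self-contained (the question itself appears to target MySQL), and `COUNT(j.id)` stands in for `COUNT(job.name)` because the sample `job` table has no `name` column.

```python
import sqlite3

# Sample rows from the question, reduced to the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE failedreason (id INTEGER PRIMARY KEY, job_id INTEGER, main_reason TEXT);
INSERT INTO job (id, status) VALUES
  (13095427, 'failed'), (13095407, 'failed'), (13095533, 'failed'),
  (13095546, 'failed'), (13098367, 'failed'), (13101522, 'failed'),
  (13101444, 'success'), (13101445, 'success'), (13101446, 'success');
INSERT INTO failedreason (job_id, main_reason) VALUES
  (13095427, 'test case failure'), (13095407, 'test case failure'),
  (13095533, 'connection error'), (13095546, 'connection error'),
  (13098367, 'runner connection error'), (13101522, 'script error');
""")

# Same join and grouping as the question's query.
# COUNT(j.id) replaces COUNT(job.name): an assumption, see the note above.
rows = conn.execute("""
SELECT f.main_reason,
       COUNT(j.id) AS total_jobs,
       SUM(CASE WHEN j.status = 'failed' THEN 1 ELSE 0 END) AS failed_jobs
FROM job j
LEFT JOIN failedreason f ON j.id = f.job_id
GROUP BY f.main_reason
""").fetchall()

# Each row is (main_reason, total_jobs, failed_jobs).
for reason, total_jobs, failed_jobs in rows:
    print(reason, total_jobs, failed_jobs)
```

Successful jobs have no matching reason row, so the LEFT JOIN puts them all in a single group whose reason is NULL. Every named reason group therefore contains only failed jobs, and the per-group total equals the per-group failed count.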
Answer 1: Your query does not produce the correct totals, so the percentage is wrong.
Please try this:
select reason, alltotal, failed,
failedtotal,
cast( (100.00 * failed /failedtotal) as numeric(10,2)) failedRatio ,
cast( (100.00 * failed /alltotal) as numeric(10,2)) failedPercent from
(SELECT
FailedReason.[main_reason] reason,
MAX(tots.total_jobs) alltotal,
MAX(ftots.total_failed) failedtotal,
COUNT(DISTINCT CASE WHEN job.status='failed' THEN job.id END) failed
FROM job, FailedReason,
(SELECT COUNT(*) AS total_failed FROM FailedReason) ftots ,
(SELECT COUNT(*) AS total_jobs FROM job) tots
where job.id = FailedReason.job_id
GROUP BY FailedReason.[main_reason]) finaltable
Result of the query on the shared sample data.
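The idea of dividing each per-reason count by totals computed over the whole tables can be sketched as follows. This is not the answer's exact query: it is a simplified variant run against in-memory SQLite from Python (both assumptions), it uses scalar subqueries for the two totals, and it assumes one reason row per failed job, as in the sample data.

```python
import sqlite3

# Sample rows from the question, reduced to the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE failedreason (id INTEGER PRIMARY KEY, job_id INTEGER, main_reason TEXT);
INSERT INTO job (id, status) VALUES
  (13095427, 'failed'), (13095407, 'failed'), (13095533, 'failed'),
  (13095546, 'failed'), (13098367, 'failed'), (13101522, 'failed'),
  (13101444, 'success'), (13101445, 'success'), (13101446, 'success');
INSERT INTO failedreason (job_id, main_reason) VALUES
  (13095427, 'test case failure'), (13095407, 'test case failure'),
  (13095533, 'connection error'), (13095546, 'connection error'),
  (13098367, 'runner connection error'), (13101522, 'script error');
""")

# The totals come from scalar subqueries over the full tables, so they are
# not affected by the GROUP BY. Multiplying by 100.0 first forces
# floating-point division; SQLite truncates integer / integer.
rows = conn.execute("""
SELECT f.main_reason,
       COUNT(*) AS failed,
       (SELECT COUNT(*) FROM failedreason) AS failed_total,
       (SELECT COUNT(*) FROM job) AS all_total,
       ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM failedreason), 2) AS failed_ratio,
       ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM job), 2) AS failed_percent
FROM failedreason f
JOIN job j ON j.id = f.job_id
WHERE j.status = 'failed'
GROUP BY f.main_reason
""").fetchall()

for row in rows:
    print(row)
```

With the sample data there are 6 failed jobs out of 9, so a reason with 2 failed jobs gets a ratio of 2/6 among failures and 2/9 among all jobs.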
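Comments:
This query returns an error: 'Error in query (1064): Syntax error near 'numeric(10,2))/total) * 100 as numeric(10,2)) failedPercent from (SELECT fa' at line 2'
You can remove the cast and try failed/total * 100.
Without the cast the query returns integer values, so the result is 0; it is better to cast. Or you can use (100.00 * failed/total) failedPercent.
Thank you very much! Is there a way to calculate the ratio connection_error_jobs / total_failed_jobs * 100? That is, the share of failed jobs under each reason group. For example, the connection error jobs count is 2. Total failed jobs here is 5. So the ratio should be 40%.
Updated the query to get the ratio, i.e. 2/6, and the percentage, i.e. 2/9.
Answer 2:
Use SUM instead of COUNT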
SELECT f.main_reason AS "main reason",
COUNT(*) AS "Total jobs",
SUM(CASE WHEN j.status='failed' THEN 1 ELSE 0 END) AS "Total failed jobs",
(SUM(CASE WHEN j.status='failed' THEN 1 ELSE 0 END) / COUNT(*)) * 100 AS "Failure ratio"
FROM job j
LEFT JOIN failedreason f ON j.id = f.job_id
GROUP BY f.main_reason
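Demo in db<>fiddle
Comments:
This query gives me the same result.
Please provide sample data, about 15 records.
Please check the sample data.
@shapale I edited the answer based on the sample data. I think it works now.
Using SUM instead of COUNT returned the same result.
Answer 3: Count the unique jobs, and roll up to get the totals.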
Job Status | Failure Reason          | Total Jobs |    Ratio
:--------- | :---------------------- | ---------: | -------:
success    | null                    |          3 |  33.3333
failed     | connection error        |          2 |  22.2222
failed     | test case failure       |          2 |  22.2222
failed     | runner connection error |          1 |  11.1111
failed     | script error            |          1 |  11.1111
failed     | TOTAL                   |          6 |  66.6667
success    | TOTAL                   |          3 |  33.3333
TOTAL      | TOTAL                   |          9 | 100.0000

SELECT
  CASE WHEN GROUPING(job.status) = 1 THEN 'TOTAL' ELSE job.status END AS `Job Status`
, CASE WHEN GROUPING(fail.main_reason) = 1 THEN 'TOTAL' ELSE fail.main_reason END AS `Failure Reason`
, COUNT(DISTINCT job.id) AS `Total Jobs`
, COUNT(DISTINCT job.id) / MAX(tots.total_jobs) * 100 AS `Ratio`
FROM job
CROSS JOIN (SELECT COUNT(*) AS total_jobs FROM job) tots
LEFT JOIN failedreason fail ON job.id = fail.job_id
GROUP BY job.status, fail.main_reason WITH ROLLUP
ORDER BY GROUPING(job.status), GROUPING(fail.main_reason), `Ratio` DESC;
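The TOTAL rows in this answer come from WITH ROLLUP. For anyone who wants to check the figures without that feature, the sketch below reproduces them in SQLite from Python by stacking three aggregations with UNION ALL. This is an emulation under stated assumptions (SQLite has no WITH ROLLUP or GROUPING(); the `per_job` and `tots` names are made up for the example), not the answer's own method.

```python
import sqlite3

# Sample rows from the question, reduced to the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE failedreason (id INTEGER PRIMARY KEY, job_id INTEGER, main_reason TEXT);
INSERT INTO job (id, status) VALUES
  (13095427, 'failed'), (13095407, 'failed'), (13095533, 'failed'),
  (13095546, 'failed'), (13098367, 'failed'), (13101522, 'failed'),
  (13101444, 'success'), (13101445, 'success'), (13101446, 'success');
INSERT INTO failedreason (job_id, main_reason) VALUES
  (13095427, 'test case failure'), (13095407, 'test case failure'),
  (13095533, 'connection error'), (13095546, 'connection error'),
  (13098367, 'runner connection error'), (13101522, 'script error');
""")

# Three levels of aggregation, stacked: per (status, reason), per status,
# and the grand total. Each ratio is taken against the overall job count.
rows = conn.execute("""
WITH per_job AS (
    SELECT j.id, j.status, f.main_reason
    FROM job j
    LEFT JOIN failedreason f ON j.id = f.job_id
),
tots AS (SELECT COUNT(*) AS total_jobs FROM job)
SELECT status, main_reason, COUNT(DISTINCT id) AS total_jobs,
       100.0 * COUNT(DISTINCT id) / (SELECT total_jobs FROM tots) AS ratio
FROM per_job GROUP BY status, main_reason
UNION ALL
SELECT status, 'TOTAL', COUNT(DISTINCT id),
       100.0 * COUNT(DISTINCT id) / (SELECT total_jobs FROM tots)
FROM per_job GROUP BY status
UNION ALL
SELECT 'TOTAL', 'TOTAL', COUNT(DISTINCT id),
       100.0 * COUNT(DISTINCT id) / (SELECT total_jobs FROM tots)
FROM per_job
""").fetchall()

for status, reason, total_jobs, ratio in rows:
    print(status, reason, total_jobs, round(ratio, 4))
```

COUNT(DISTINCT id) guards against a job being counted twice if it ever had more than one reason row; with the sample data each failed job has exactly one.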
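Demo on dbfiddle here
Comments:
Same result. The count is applied to the grouped table.
Well, okay... if they have a main_reason, then they are failed. So grouping by main_reason will give total jobs = total failures.
See the update: cross join to the overall total and use it in the ratio.
I checked the updated query. Now the total jobs and total failed jobs are still equal, but the ratio looks odd.
But should "Total Jobs" be MAX(tots.total_jobs) then? What result are you expecting?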