COUNT() returns the total number of rows in the grouped table

Posted: 2021-12-31 09:35:27

【Question】:

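I have two tables.

The Job table:

[![enter image description here][1]][1]

The FailedReason table, which references the Job table:

[![enter image description here][2]][2]

My goal is to calculate the failure ratio per failure reason.

I expect a result whose first column contains the failure reason name, the second the total number of all jobs ('successful' + 'failed'), the third the total number of failed jobs, and the fourth the failure ratio, calculated as: failed count (column 3) / total count (column 2) * 100.

My SQL query: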

SELECT
  FailedReason.main_reason as "Failure reason",
  COUNT(job.name) AS "Total jobs",
  SUM(CASE WHEN job.status='failed' THEN 1 ELSE 0 END) AS "Total failed jobs",
  SUM(CASE WHEN job.status='failed' THEN 1 ELSE 0 END) / COUNT(job.name) * 100 AS "Failure ratio"
FROM job
LEFT JOIN FailedReason
ON job.id=FailedReason.job_id
GROUP BY 1
ORDER BY 3 DESC

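What I get instead is a total job count computed over the grouped table, so every failure ratio comes out as 100%.

[![enter image description here][3]][3]

What should I change to get the correct number of jobs ('successful' + 'failed') and the correct ratio?

Sample data: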

CREATE TABLE failedreason (id INT PRIMARY KEY AUTO_INCREMENT,
                  job_id INT REFERENCES job(id),
                  main_reason varchar(255)
                 );



INSERT INTO failedreason (job_id, main_reason) VALUES (13095427, 'test case failure'),
                                    (13095407, 'test case failure'),
                                    (13095533, 'connection error'),
                                    (13095546, 'connection error'),
                                    (13098367, 'runner connection error'),
                                    (13101522, 'script error');

CREATE TABLE job (id INT PRIMARY KEY,
                  created_at date,
                  finished_at date,
                  status varchar(255)
                 );
INSERT INTO job (id,
                  created_at ,
                  finished_at ,
                  status
                 )
VALUES (13095427,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13095407,  '2021-05-03 02:50:39', '2021-05-03 03:46:41', 'failed'),
       (13095533,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13095546,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13098367,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13101522,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13101444,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success'),
       (13101445,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success'),
       (13101446,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success');



  [1]: https://i.stack.imgur.com/CYnPg.png
  [2]: https://i.stack.imgur.com/t3baS.png
  [3]: https://i.stack.imgur.com/LC7Hp.png

【Question comments】:

【Answer 1】:

Your query does not compute the correct total, so the percentage is wrong.

Try this instead:

select reason, alltotal, failed,
    failedtotal,
    cast( (100.00 * failed /failedtotal) as numeric(10,2)) failedRatio , 
    cast( (100.00 * failed /alltotal) as numeric(10,2)) failedPercent from 
    (SELECT
      FailedReason.[main_reason] reason,
      MAX(tots.total_jobs)  alltotal,
      MAX(ftots.total_failed)  failedtotal,
      COUNT(DISTINCT CASE WHEN job.status='failed' THEN job.id END) failed
    FROM job, FailedReason,
    (SELECT COUNT(*) AS total_failed FROM FailedReason) ftots , 
    (SELECT COUNT(*) AS total_jobs FROM job) tots
     where job.id = FailedReason.job_id 
    GROUP BY FailedReason.[main_reason]) finaltable

Result of the query on the shared sample data
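The query above uses bracket-quoted identifiers and `numeric(10,2)` casts, which is SQL Server syntax; the asker appears to be on MySQL (error 1064). A minimal sketch of the same idea in MySQL syntax follows. It assumes the sample schema from the question and at most one `failedreason` row per job; the aliases `t`, `failed_ratio` and `failed_percent` are illustrative names, not from the original answer.

```sql
-- Per-reason failure counts, compared against table-wide totals.
-- The totals come from a one-row derived table, so they are not
-- affected by the GROUP BY on main_reason.
SELECT
  f.main_reason                                          AS reason,
  COUNT(DISTINCT j.id)                                   AS failed,
  t.total_failed,
  t.total_jobs,
  ROUND(100.0 * COUNT(DISTINCT j.id) / t.total_failed, 2) AS failed_ratio,   -- share of failed jobs
  ROUND(100.0 * COUNT(DISTINCT j.id) / t.total_jobs, 2)   AS failed_percent  -- share of all jobs
FROM failedreason AS f
JOIN job AS j
  ON j.id = f.job_id
 AND j.status = 'failed'
CROSS JOIN (
  SELECT COUNT(*)                                          AS total_jobs,
         SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) AS total_failed
  FROM job
) AS t
-- t has exactly one row; listing its columns keeps ONLY_FULL_GROUP_BY satisfied.
GROUP BY f.main_reason, t.total_jobs, t.total_failed
ORDER BY failed DESC;
```

With the sample data (9 jobs, 6 of them failed), "connection error" has 2 failed jobs, so the two ratios work out to 2/6 and 2/9.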

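【Comments】:

This query returns an error: 'Error in query (1064): Syntax error near 'numeric(10,2))/total) * 100 as numeric(10,2)) failedPercent from (SELECT fa' at line 2'

You can remove the cast and try failed/total * 100.

Without the cast the query returns integer values, so the result is 0; it is better to cast. Or you can use (100.00 * failed/total) failedPercent.

Thank you very much! Is there a way to calculate the ratio connection_error_jobs / total_failed_jobs * 100, that is, each reason group's share of the failed jobs? For example, the connection error job count is 2 and the total failed jobs here is 5, so the ratio should be 40%.

Updated the query to return the ratio, i.e. 2/6, and the percentage, i.e. 2/9.

【Answer 2】: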

Use SUM instead of COUNT:

SELECT f.main_reason AS "main reason",
       COUNT(*) AS "Total jobs",
       SUM(CASE WHEN j.status='failed' THEN 1 ELSE 0 END) AS "Total failed jobs",
       (SUM(CASE WHEN j.status='failed' THEN 1 ELSE 0 END) / COUNT(*)) * 100 AS "Failure ratio"
FROM job j
LEFT JOIN failedreason f ON j.id = f.job_id
GROUP BY f.main_reason

Demo on db<>fiddle
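Switching between SUM and COUNT does not change which rows fall into each group. A small diagnostic sketch, assuming the sample schema from the question (the label '(no failure reason)' and the column aliases are made up for illustration), shows how the jobs are distributed after the LEFT JOIN:

```sql
-- Show how many jobs, and how many failed jobs, land in each main_reason group.
-- Jobs without a failedreason row get main_reason = NULL and form their own group.
SELECT
  COALESCE(f.main_reason, '(no failure reason)')         AS grp,
  COUNT(*)                                                AS jobs_in_group,
  SUM(CASE WHEN j.status = 'failed' THEN 1 ELSE 0 END)    AS failed_in_group
FROM job AS j
LEFT JOIN failedreason AS f
  ON j.id = f.job_id
GROUP BY f.main_reason;
```

On the sample data every named reason group contains only failed jobs, while the three successful jobs sit in the NULL group. Any ratio computed inside a named group is therefore 100%, whichever aggregate is used; the denominator has to come from outside the group.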

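【Comments】:

This query gives me the same result.

Please post sample data, about 15 records.

Please see the sample data.

@shapale I edited the answer based on the sample data. I think it works now. Use SUM instead of COUNT.

It returned the same result.

【Answer 3】: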

Count distinct jobs, and use ROLLUP to get the totals.

SELECT
  CASE WHEN GROUPING(job.status) = 1 THEN 'TOTAL' ELSE job.status END AS `Job Status`
, CASE WHEN GROUPING(fail.main_reason) = 1 THEN 'TOTAL' ELSE fail.main_reason END AS `Failure Reason`
, COUNT(DISTINCT job.id) AS `Total Jobs`
, COUNT(DISTINCT job.id) / MAX(tots.total_jobs) * 100 AS `Ratio`
FROM job
CROSS JOIN (SELECT COUNT(*) AS total_jobs FROM job) tots
LEFT JOIN failedreason fail
ON job.id = fail.job_id
GROUP BY job.status, fail.main_reason WITH ROLLUP
ORDER BY GROUPING(job.status), GROUPING(fail.main_reason), `Ratio` DESC;

Job Status | Failure Reason          | Total Jobs |    Ratio
:--------- | :---------------------- | ---------: | -------:
success    |                         |          3 |  33.3333
failed     | connection error        |          2 |  22.2222
failed     | test case failure       |          2 |  22.2222
failed     | runner connection error |          1 |  11.1111
failed     | script error            |          1 |  11.1111
failed     | TOTAL                   |          6 |  66.6667
success    | TOTAL                   |          3 |  33.3333
TOTAL      | TOTAL                   |          9 | 100.0000

Demo on dbfiddle here
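If only the per-reason share of all jobs is needed, the CROSS JOIN can be replaced by a window function over the grouped result. This is a sketch, assuming MySQL 8.0 or later (window functions are not available before that) and at most one `failedreason` row per job; the column aliases are illustrative.

```sql
-- SUM(COUNT(*)) OVER () adds up the per-group counts across all groups,
-- which gives the total number of jobs without a separate subquery.
SELECT
  f.main_reason                                         AS reason,   -- NULL = jobs with no failure reason
  COUNT(*)                                              AS jobs_in_group,
  SUM(COUNT(*)) OVER ()                                 AS total_jobs,
  ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 2)    AS pct_of_all_jobs
FROM job AS j
LEFT JOIN failedreason AS f
  ON f.job_id = j.id
GROUP BY f.main_reason
ORDER BY jobs_in_group DESC;
```

The window is evaluated after grouping, so `total_jobs` is the same on every row. If a job could have several failure reasons, the per-group counts would no longer add up to the number of jobs, and a separate total (as in the CROSS JOIN version) would be the safer choice.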

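【Comments】:

Same result. The count is applied to the grouped table.

Well... if they have a main_reason, then they are failed. So grouping by main_reason will give total jobs = total failed jobs.

See the update: it cross joins the total and uses it in the ratio.

I checked the updated query. Total jobs and total failed jobs are still equal, and the ratio looks odd.

But should "Total Jobs" be MAX(tots.total_jobs)? What result do you expect?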
