BigQuery SQL: how to get the total count when using LIMIT

【Title】BigQuery SQL how to get total count when using LIMIT 【Posted】2016-05-08 05:55:31 【Question】:

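If I use LIMIT 10 in a SQL query (on BigQuery), is there a way to also return the total count?

For example, say 100 rows exist. How can I write a query that returns the first 10 rows and also shows the user how many rows are available in total, without running a separate count(id) aggregate query?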

【Comments】:

You care about the execution plan of your SQL query, right?

【Answer 1】:

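To add to Mikhail's answer, you may want to do this to see the count of unique values in a grouped query. In the example below, r has 10 unique values, but you only want to see the first 4 together with the count of unique rows. I have also added the count for each group and the total count of all rows to every output row. (Standard SQL below.)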

WITH YourTable AS (
  SELECT 1 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 10 AS r    
)
SELECT 
  r,
  SUM(1) OVER (ORDER BY r ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS CountOfAllUniqueRows,
  COUNT(r) AS CountOfEachR,
  SUM(COUNT(r)) OVER (ORDER BY r ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS CountOfAllRows
FROM YourTable
GROUP BY r
ORDER BY r
LIMIT 4

This gives the following result:

r   CountOfAllUniqueRows    CountOfEachR    CountOfAllRows
1   10                      8               68
2   10                      6               68
3   10                      7               68
4   10                      6               68
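
The figures in this table can be checked locally. The following is a minimal sketch, not BigQuery itself: it assumes Python's bundled sqlite3 module (with SQLite 3.25 or newer, which is needed for window functions) as a stand-in engine, and the helper name `grouped_page` and the `OCCURRENCES` summary of the sample data are made up for illustration. The grouping is done in a subquery and the window functions run over the grouped rows, which is a slightly different formulation from the nested `SUM(COUNT(r)) OVER (...)` used above but should yield the same numbers.

```python
import sqlite3

# Occurrences of each r value in the sample data above (68 rows in total).
OCCURRENCES = {1: 8, 2: 6, 3: 7, 4: 6, 5: 8, 6: 8, 7: 8, 8: 8, 9: 8, 10: 1}


def grouped_page(limit):
    """Return the first `limit` groups plus the group count and row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (r INTEGER)")
    rows = [(r,) for r, n in OCCURRENCES.items() for _ in range(n)]
    conn.executemany("INSERT INTO YourTable VALUES (?)", rows)
    query = """
        SELECT r,
               COUNT(*) OVER () AS CountOfAllUniqueRows,  -- number of groups
               CountOfEachR,
               SUM(CountOfEachR) OVER () AS CountOfAllRows  -- rows before grouping
        FROM (SELECT r, COUNT(r) AS CountOfEachR FROM YourTable GROUP BY r)
        ORDER BY r
        LIMIT ?
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result


for row in grouped_page(4):
    print(row)
```

Because the window functions are evaluated over all ten groups before LIMIT trims the output, every returned row carries the same two totals.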

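【Comments】:

【Answer 2】:

I am not sure why you want this (probably cost, so that you avoid a second scan), but the "trick" below might work for you. You get only the number of rows you asked for, and you also get the total row count, but it is repeated inside every output row, so you will need to handle that yourself when presenting the result to the user.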

Using BigQuery Legacy SQL:

SELECT 
  r, cnt
FROM (
  SELECT 
    r,
    COUNT(r) OVER() AS cnt,
    ROW_NUMBER() OVER() AS line
  FROM 
    (SELECT 1 AS r),
    (SELECT 2 AS r),
    (SELECT 3 AS r),
    (SELECT 4 AS r),
    (SELECT 5 AS r),
    (SELECT 6 AS r),
    (SELECT 7 AS r),
    (SELECT 8 AS r),
    (SELECT 9 AS r),
    (SELECT 10 AS r)
)
WHERE line <= 4

Alternatively, cross join against a subquery that computes the count:

SELECT 
  r,
  cnt
FROM (
  SELECT r
  FROM 
    (SELECT 1 AS r),
    (SELECT 2 AS r),
    (SELECT 3 AS r),
    (SELECT 4 AS r),
    (SELECT 5 AS r),
    (SELECT 6 AS r),
    (SELECT 7 AS r),
    (SELECT 8 AS r),
    (SELECT 9 AS r),
    (SELECT 10 AS r)
) AS YourTable
CROSS JOIN (
  SELECT COUNT(1) AS cnt
  FROM 
    (SELECT 1 AS r),
    (SELECT 2 AS r),
    (SELECT 3 AS r),
    (SELECT 4 AS r),
    (SELECT 5 AS r),
    (SELECT 6 AS r),
    (SELECT 7 AS r),
    (SELECT 8 AS r),
    (SELECT 9 AS r),
    (SELECT 10 AS r)
) rows
LIMIT 4 

Using BigQuery Standard SQL:

Don't forget to uncheck the Use Legacy SQL checkbox under Show Options.

WITH YourTable AS (
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 10 AS r    
)
SELECT 
  r,
  (SELECT COUNT(1) FROM YourTable) AS cnt
FROM YourTable
LIMIT 4

In all cases the result is

r   cnt  
1   10   
2   10   
3   10   
4   10   
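
As a rough local check of the idea, the sketch below runs two equivalent formulations against an in-memory SQLite database: a `COUNT(*) OVER ()` window and the scalar subquery from the Standard SQL example. It is only an illustration under assumptions: SQLite (3.25 or newer, via Python's sqlite3 module) stands in for BigQuery, the helper name `first_rows_with_total` is hypothetical, and an `ORDER BY r` has been added so that the rows kept by LIMIT are deterministic.

```python
import sqlite3


def first_rows_with_total(limit):
    """Return the first `limit` rows, each carrying the total row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (r INTEGER)")
    conn.executemany(
        "INSERT INTO YourTable VALUES (?)", [(i,) for i in range(1, 11)]
    )
    # Variant 1: a window function evaluated over all rows, then LIMIT.
    windowed = conn.execute(
        "SELECT r, COUNT(*) OVER () AS cnt "
        "FROM YourTable ORDER BY r LIMIT ?",
        (limit,),
    ).fetchall()
    # Variant 2: a scalar subquery that counts the whole table.
    subquery = conn.execute(
        "SELECT r, (SELECT COUNT(1) FROM YourTable) AS cnt "
        "FROM YourTable ORDER BY r LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return windowed, subquery


windowed, subquery = first_rows_with_total(4)
print(windowed)
print(subquery)
```

Whether either form actually avoids a second scan on BigQuery depends on the query plan, so it is worth checking the bytes processed for a real table.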

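【Comments】:

Beware - this does not run as expected: LIMIT is applied before OVER(), so the count will come out as "4". (Use a subselect instead.)

Found a reference to this behavior - code.google.com/p/google-bigquery/issues/detail?id=424

This works, but in legacy SQL the counting is done on a single machine, which does not scale.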
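
The pitfall raised in these comments comes down to where the count is computed relative to the limit. The sketch below does not reproduce the BigQuery legacy SQL behavior itself; it assumes SQLite (3.25 or newer, via Python's sqlite3 module) as a stand-in and a hypothetical helper `count_placement`, and simply contrasts counting rows that were already limited with counting inside a subselect and limiting outside it.

```python
import sqlite3


def count_placement(limit):
    """Contrast counting after limiting with counting before limiting."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (r INTEGER)")
    conn.executemany(
        "INSERT INTO YourTable VALUES (?)", [(i,) for i in range(1, 11)]
    )
    # The window only sees rows that survived the inner LIMIT,
    # so the "total" is just the page size.
    limited_first = conn.execute(
        "SELECT r, COUNT(*) OVER () AS cnt "
        "FROM (SELECT r FROM YourTable ORDER BY r LIMIT ?) "
        "ORDER BY r",
        (limit,),
    ).fetchall()
    # The count is computed in the subselect over every row,
    # and the limit is applied afterwards.
    counted_first = conn.execute(
        "SELECT r, cnt "
        "FROM (SELECT r, COUNT(*) OVER () AS cnt FROM YourTable) "
        "ORDER BY r LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return limited_first, counted_first


limited_first, counted_first = count_placement(4)
print(limited_first)
print(counted_first)
```

The second form corresponds to the subselect advice in the comment: the total is fixed before any rows are discarded.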
