BigQuery SQL how to get total count when using LIMIT

【Posted】: 2016-05-08 05:55:31 【Question】:

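If I use LIMIT 10 in a SQL query (on BigQuery), is there a way to return the total count at the same time?

For example, suppose 100 rows exist. How can a query return the first 10 rows while also showing the user how many rows are available in total, without running a separate count(id) aggregate query?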

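【Comments】:

You care about the execution plan of your SQL query, right?

【Answer 1】:

To add to Mikhail's answer, you may want to do this to see the count of unique values in a grouped query. In the example below, r has 10 unique values, but you only want to see the first 4 along with the count of unique rows. I have also added the count for each group and the total count of all rows, both shown on every row. (Standard SQL below.)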

WITH YourTable AS (
  SELECT 1 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 10 AS r    
)
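SELECT
  r,
  -- Window functions run after GROUP BY, so this counts the grouped rows (distinct r values)
  SUM(1) OVER (ORDER BY r ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS CountOfAllUniqueRows,
  -- Number of source rows in each group
  COUNT(r) AS CountOfEachR,
  -- Sum of the per-group counts across all groups, i.e. the total number of source rows
  SUM(COUNT(r)) OVER (ORDER BY r ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS CountOfAllRows
FROM YourTable
GROUP BY r
ORDER BY r
-- LIMIT trims the output only after the window totals have been computed
LIMIT 4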

This gives the following result:

r   CountOfAllUniqueRows    CountOfEachR    CountOfAllRows
1   10                      8               68
2   10                      6               68
3   10                      7               68
4   10                      6               68
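
As a possibly shorter way to write the same totals, here is a minimal sketch that assumes an empty OVER () clause is acceptable: with no PARTITION BY or ORDER BY, the window spans every grouped row, which matches the explicit UNBOUNDED PRECEDING ... UNBOUNDED FOLLOWING frame used above. The nine-row sample built with UNNEST is made up for illustration and is not the data from the answer.

```sql
-- Hypothetical sample data: 9 rows, 6 distinct values of r
WITH YourTable AS (
  SELECT r FROM UNNEST([1, 1, 2, 3, 3, 3, 4, 5, 6]) AS r
)
SELECT
  r,
  COUNT(*) OVER () AS CountOfAllUniqueRows,  -- number of groups left after GROUP BY
  COUNT(r) AS CountOfEachR,                  -- source rows in this group
  SUM(COUNT(r)) OVER () AS CountOfAllRows    -- per-group counts summed over all groups
FROM YourTable
GROUP BY r
ORDER BY r
LIMIT 4
```

With this sample, the query should return the rows for r = 1 through 4, each carrying 6 as the unique count and 9 as the overall row count.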

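【Comments】:

【Answer 2】:

Not sure why you would want to do this - possibly because of cost, to avoid a second scan - but in any case the "trick" below might work for you. While fetching only the number of rows you asked for, you also get the total row count, but it is repeated in every output row, so you need to handle that yourself when presenting it to the user.

With BigQuery Legacy SQL: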

SELECT 
  r, cnt
FROM (
  SELECT 
    r,
    COUNT(r) OVER() AS cnt,
    ROW_NUMBER() OVER() AS line
  FROM 
    (SELECT 1 AS r),
    (SELECT 2 AS r),
    (SELECT 3 AS r),
    (SELECT 4 AS r),
    (SELECT 5 AS r),
    (SELECT 6 AS r),
    (SELECT 7 AS r),
    (SELECT 8 AS r),
    (SELECT 9 AS r),
    (SELECT 10 AS r)
)
WHERE line <= 4

Alternatively, still in Legacy SQL, cross join each row with a separately computed count:

SELECT 
  r,
  cnt
FROM (
  SELECT r
  FROM 
    (SELECT 1 AS r),
    (SELECT 2 AS r),
    (SELECT 3 AS r),
    (SELECT 4 AS r),
    (SELECT 5 AS r),
    (SELECT 6 AS r),
    (SELECT 7 AS r),
    (SELECT 8 AS r),
    (SELECT 9 AS r),
    (SELECT 10 AS r)
) AS YourTable
CROSS JOIN (
  SELECT COUNT(1) AS cnt
  FROM 
    (SELECT 1 AS r),
    (SELECT 2 AS r),
    (SELECT 3 AS r),
    (SELECT 4 AS r),
    (SELECT 5 AS r),
    (SELECT 6 AS r),
    (SELECT 7 AS r),
    (SELECT 8 AS r),
    (SELECT 9 AS r),
    (SELECT 10 AS r)
) rows
LIMIT 4 

With BigQuery Standard SQL:

Remember to uncheck the Use Legacy SQL checkbox under Show Options.

WITH YourTable AS (
  SELECT 1 AS r UNION ALL
  SELECT 2 AS r UNION ALL
  SELECT 3 AS r UNION ALL
  SELECT 4 AS r UNION ALL
  SELECT 5 AS r UNION ALL
  SELECT 6 AS r UNION ALL
  SELECT 7 AS r UNION ALL
  SELECT 8 AS r UNION ALL
  SELECT 9 AS r UNION ALL
  SELECT 10 AS r    
)
SELECT 
  r,
  (SELECT COUNT(1) FROM YourTable) AS cnt
FROM YourTable
LIMIT 4

In all cases, the result is:

r   cnt  
1   10   
2   10   
3   10   
4   10   
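
Because Standard SQL evaluates window functions before LIMIT, the same total can also be attached with COUNT(*) OVER (), which avoids naming the table twice. The sketch below is an illustration and not part of the original answer. GENERATE_ARRAY is used only to fabricate ten sample rows, and the ORDER BY is added because LIMIT without ORDER BY does not guarantee which rows come back.

```sql
-- Hypothetical sample data: the integers 1 through 10
WITH YourTable AS (
  SELECT r FROM UNNEST(GENERATE_ARRAY(1, 10)) AS r
)
SELECT
  r,
  COUNT(*) OVER () AS cnt  -- total rows in YourTable, repeated on every output row
FROM YourTable
ORDER BY r                 -- makes the choice of "first 4" deterministic
LIMIT 4
```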

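【Comments】:

Beware - this will not work as expected: LIMIT is applied before OVER(), so the count will be "4". (Use a subselect instead.)

Found a reference to this behavior - code.google.com/p/google-bigquery/issues/detail?id=424

This works, but in Legacy SQL the count is done on a single machine and does not scale.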
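
The subselect suggested in the comment above could look roughly like this in Standard SQL. The inner query attaches the total to every row, and LIMIT appears only in the outer query, so it cannot affect the count. The table name and sample data are placeholders, and this is a sketch of the idea, not the commenter's exact query.

```sql
-- Hypothetical sample data: the integers 1 through 10
WITH YourTable AS (
  SELECT r FROM UNNEST(GENERATE_ARRAY(1, 10)) AS r
)
SELECT r, cnt
FROM (
  -- Inner query: the count is computed over all rows, with no LIMIT in scope
  SELECT r, COUNT(*) OVER () AS cnt
  FROM YourTable
)
ORDER BY r
LIMIT 4  -- applied only to the already-counted rows
```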
