Redshift: combine two queries

Posted 2023-03-30


Asked: 2017-04-14 04:51:14

Question:

Consider the following queries, each of which returns a count:

Query A

SELECT
    COUNT(*)
FROM
    "user_notifications"
WHERE
    "user_notifications"."source_id" = 5196
    AND "user_notifications"."source_type" = 'MassGifting'
    AND "user_notifications"."status" = 'sent'
    AND "user_notifications"."read_at" IS NULL

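Execution details: this takes about 6-10 seconds.

Query B runs against the same table with a slightly different WHERE clause: for the same source_id and source_type, it counts the notifications that have already been read (read_at is set):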

SELECT
    COUNT(*)
FROM
    "user_notifications"
WHERE
    "user_notifications"."source_id" = 5196
    AND "user_notifications"."source_type" = 'MassGifting'
    AND (
        "user_notifications"."read_at" IS NOT NULL
    )

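Execution details: this takes 5-6 seconds. Running both queries and rendering the report on our website takes about 30-60 seconds in total.

Is there any way we can speed this up?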

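Comments:

How much data is in your user_notifications table? Do you have any sort keys or distribution keys? Without a sort key on the columns used in the WHERE clause, speeding this up will be quite difficult.

Answer 1:

Try the query below and let me know how long it takes. It should return the results of both queries in a single table scan.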

SELECT
    SUM(case when
        "user_notifications"."status" = 'sent'
        AND "user_notifications"."read_at" IS NULL
    then 1 else 0 end) as Result1,
    SUM(case when
        "user_notifications"."read_at" IS NOT NULL
    then 1 else 0 end) as Result2
FROM
    "user_notifications"
WHERE
    "user_notifications"."source_id" = 5196
    AND "user_notifications"."source_type" = 'MassGifting'

If I have made a wrong assumption, please leave a comment and I will adjust my answer.
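To check that the combined query really returns the same numbers as Query A and Query B, here is a minimal sketch that runs all three against an in-memory SQLite database. This is only an illustration under stated assumptions: SQLite stands in for Redshift, the sample rows and the column types are made up, and the helper name `run_comparison` is hypothetical. It says nothing about Redshift timings; it only checks that the conditional-aggregation logic is equivalent.

```python
import sqlite3

# Hypothetical sample rows; only the column names come from the question.
# Columns: (source_id, source_type, status, read_at)
ROWS = [
    (5196, "MassGifting", "sent", None),
    (5196, "MassGifting", "sent", None),
    (5196, "MassGifting", "sent", None),
    (5196, "MassGifting", "sent", "2017-04-13 10:00:00"),
    (5196, "MassGifting", "failed", None),
    (5196, "MassGifting", "failed", "2017-04-13 11:00:00"),
    (5196, "Other", "sent", None),                       # wrong source_type
    (42, "MassGifting", "sent", "2017-04-13 12:00:00"),  # wrong source_id
]

QUERY_A = """
SELECT COUNT(*) FROM user_notifications
WHERE source_id = 5196 AND source_type = 'MassGifting'
  AND status = 'sent' AND read_at IS NULL
"""

QUERY_B = """
SELECT COUNT(*) FROM user_notifications
WHERE source_id = 5196 AND source_type = 'MassGifting'
  AND read_at IS NOT NULL
"""

# Same shape as the answer's query: the shared filter stays in WHERE,
# and the differing conditions move into CASE expressions.
COMBINED = """
SELECT
    SUM(CASE WHEN status = 'sent' AND read_at IS NULL THEN 1 ELSE 0 END) AS result1,
    SUM(CASE WHEN read_at IS NOT NULL THEN 1 ELSE 0 END) AS result2
FROM user_notifications
WHERE source_id = 5196 AND source_type = 'MassGifting'
"""


def run_comparison():
    """Return (count_a, count_b, combined_row) for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_notifications ("
        "source_id INTEGER, source_type TEXT, status TEXT, read_at TEXT)"
    )
    conn.executemany("INSERT INTO user_notifications VALUES (?, ?, ?, ?)", ROWS)
    count_a = conn.execute(QUERY_A).fetchone()[0]
    count_b = conn.execute(QUERY_B).fetchone()[0]
    combined = conn.execute(COMBINED).fetchone()
    conn.close()
    return count_a, count_b, combined


if __name__ == "__main__":
    print(run_comparison())
```

One behavioral difference to keep in mind: when no rows match the WHERE clause, COUNT(*) returns 0 but SUM returns NULL, so wrapping each SUM in COALESCE(..., 0) may be worthwhile if the report expects a number.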
