SQL group rows by Top 3 and other. (Revenue by Top 3 cities by state and then other)

Posted: 2019-07-01 17:17:26

Question:

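I have a query that gives me each city, its corresponding state, and the value my customers have spent in that city.

The top 3 should be defined by each city's TotalCustomerValue.

The query is a bit more complex than this, but this is the core of it: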

    SELECT DISTINCT
    --*
    LTRIM(RTRIM(cs.City)) City
    ,LTRIM(RTRIM(cs.State)) State

    ,SUM(cs.TotalCustomerValueOverBase) over (partition by  cs.City, cs.State) TotalCustomerValue
    ,SUM(cs.TotalOrdersBase) over (partition by  cs.City, cs.State) TotalOrders

    FROM ( -- This table gives full customer information per customer.
    ) cs

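What I want to do is create a table containing the top 3 cities for each state, followed by one more row (covering every other row) that is the sum of all the other cities. So the final table will have 200 rows (50 * 4).

I'm trying to do something with a row number, but I can't seem to get it to work: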

        ,ROW_NUMBER() over (partition by  cs.City, cs.State Order by sum(CS.TotalCustomerValueOverBase)) rowNr

Then I could try to sum up all the rows with a row number greater than 3.
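A note on the row-number attempt above: partitioning by both City and State puts every city in its own partition, so the row number is always 1, and without DESC the ordering would rank the lowest value first. The sketch below contrasts the two partitionings on made-up, already aggregated sample data; the CTE `city_totals` and the column aliases `rowNrPerCity` and `rowNrPerState` are hypothetical names used only for illustration.

```sql
-- Hypothetical, pre-aggregated sample data: one row per city.
WITH city_totals AS (
    SELECT *
    FROM (VALUES
        ('Austin', 'TX', 600.0),
        ('Dallas', 'TX', 400.0),
        ('Miami',  'FL', 900.0)
    ) AS v (City, State, TotalCustomerValue)
)
SELECT City,
       State,
       TotalCustomerValue,
       -- One partition per (City, State) pair: each partition holds a single row.
       ROW_NUMBER() OVER (PARTITION BY City, State
                          ORDER BY TotalCustomerValue DESC) AS rowNrPerCity,
       -- One partition per State: cities are ranked against each other.
       ROW_NUMBER() OVER (PARTITION BY State
                          ORDER BY TotalCustomerValue DESC) AS rowNrPerState
FROM city_totals;
```

With this data, rowNrPerCity is 1 on every row, while rowNrPerState is 1 for Austin, 2 for Dallas, and 1 for Miami.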

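Does anyone have any ideas?

Then I assume that in SSRS, when I want to build a visual from this data, I can create a filter on all rows using the keyword "Other", right?

Here is my full query (trying to implement Gordon's solution):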

SELECT 
(case when CSS.seqnum <= 3 then city else 'Others' end) as city,
(case when seqnum <= 3 then state end) as state
sum(TotalCustomerValue) as TotalCustomerValue,
sum(TotalOrders) as TotalOrders

FROM
(
    SELECT DISTINCT
    --*
    LTRIM(RTRIM(cs.City)) City
    ,LTRIM(RTRIM(cs.State)) State

    ,right(left(lat, len(lat) -1),len(lat) -2) lat -- the lat and long are wrapped in quotes
    ,right(left(lng, len(lng) -1),len(lng) -2) lng -- so i have to do the left right to get rid of them. 

    ,SUM(cs.TotalCustomerValueOverBase) over (partition by  cs.City, cs.State) TotalCustomerValue
    ,SUM(cs.TotalOrdersBase) over (partition by  cs.City, cs.State) TotalOrders
    --,SUM(cs.TotalQuantityOverBase) over (partition by  cs.City, cs.State) TotalQuantity

    ,CONVERT(float, right(left(population_proper, len(population_proper) -1),len(population_proper) -2)) population_proper

    ,CAST(
        SUM(cs.TotalOrdersBase) over (partition by  cs.City, cs.State) 
        /
        NULLIF(convert(float, right(left(population_proper, len(population_proper) -1),len(population_proper) -2)),0)--*100 
    as decimal(10,4)) AS OrderDensityPercent

    ,SUM(cs.BrandNewCustomer) over (partition by     cs.City, cs.State) BrandNewCustomers
    ,SUM(cs.RecurringCustomer) over (partition by    cs.City, cs.State) RecurringCustomers
    ,SUM(cs.ReactivatedCustomer) over (partition by  cs.City, cs.State) ReactivatedCustomers

    ,row_number() over (partition by ltrim(rtrim(cs.State)) order by sum(cs.TotalCustomerValueOverBase) desc) as seqnum

    FROM ( -- This table gives full customer information per customer.
        SELECT 
        CC.CustomerEmail
        ,CC.Month Month
        ,CONCAT(CC.Year, '-', CASE WHEN CC.Month < 10 then '0' else '' end, CC.Month) Date


        ,CASE WHEN  
            (   ISNULL(CC.TotalOrdersCustomerBase,0) >= 1
            AND ISNULL(RC.TotalOrdersRecurringBase,0) = 0
            AND ISNULL(LC.TotalOrdersLifetimeBase,0) = 0) 
        THEN 1 ELSE 0 END BrandNewCustomer

        ,CASE WHEN  
            (   ISNULL(CC.TotalOrdersCustomerBase,0)  >= 1 
            AND ISNULL(RC.TotalOrdersRecurringBase,0) >= 1)
        THEN 1 ELSE 0 END RecurringCustomer

        ,CASE WHEN  
            (   ISNULL(CC.TotalOrdersCustomerBase,0) >= 1 
            AND ISNULL(RC.TotalOrdersRecurringBase,0) = 0
            AND ISNULL(LC.TotalOrdersLifetimeBase,0) >= 1) 
        THEN 1 ELSE 0 END ReactivatedCustomer

        ,ISNULL(CC.TotalCustomerValueOverCustomerBase,0) TotalCustomerValueOverCustomerBase
        ,ISNULL(CC.TotalOrdersCustomerBase,0) TotalOrdersCustomerBase
        ,ISNULL(CC.TotalQuantityOverCustomerBase,0) TotalQuantityOverCustomerBase

        ,ISNULL(RC.TotalCustomerValueOverRecurringBase,0) TotalCustomerValueOverRecurringBase
        ,ISNULL(RC.TotalOrdersRecurringBase,0) TotalOrdersRecurringBase
        ,ISNULL(RC.TotalQuantityOverRecurringBase,0) TotalQuantityOverRecurringBase

        ,ISNULL(LC.TotalCustomerValueOverLifetimeBase,0) TotalCustomerValueOverLifetimeBase
        ,ISNULL(LC.TotalOrdersLifetimeBase,0) TotalOrdersLifetimeBase
        ,ISNULL(LC.TotalQuantityOverLifetimeBase,0) TotalQuantityOverLifetimeBase

        ,ISNULL(FC.TotalCustomerValueOverBase,0) TotalCustomerValueOverBase
        ,ISNULL(FC.TotalOrdersBase,0) TotalOrdersBase
        ,ISNULL(FC.TotalQuantityOverBase,0) TotalQuantityOverBase

        ,ISNULL(CC.TotalCustomersOverCustomerBase,0) TotalCustomersOverCustomerBase
        ,ISNULL(RC.TotalCustomersOverRecurringBase,0) TotalCustomersOverRecurringBase
        ,ISNULL(LC.TotalCustomersOverLifetimeBase,0) TotalCustomersOverLifetimeBase
        ,ISNULL(FC.TotalCustomersOverBase,0) TotalCustomersOverBase

        ,CC.City
        ,CC.State
        ,CC.CountryCode

        From
        (
            SELECT 
            C.CustomerEmail
            ,C.Month
            ,C.Year
            ,C.TotalCustomersOverCustomerBase
            ,C.TotalCustomerValueOverCustomerBase
            ,C.TotalOrdersCustomerBase
            ,C.TotalQuantityOverCustomerBase
            ,C.City
            ,C.State
            ,C.CountryCode
            FROM
            #CustomerBase C
            WHERE C.OrderCountCustomerBase = 1 -- This makes it return only the first row of a customer with multiple purchases.
            --and  TotalOrdersCustomerBase = TotalQuantityOverCustomerBase 
        ) CC

        LEFT JOIN 
        (
            SELECT
            R.CustomerEmail
            ,R.TotalCustomersOverRecurringBase
            ,R.TotalCustomerValueOverRecurringBase
            ,R.TotalOrdersRecurringBase
            ,R.TotalQuantityOverRecurringBase
            FROM
            #RecurringBase R
            WHERE R.OrderCountRecurringBase = 1
        ) RC ON CC.CustomerEmail = RC.CustomerEmail

        LEFT JOIN 
        (
            SELECT 
            L.CustomerEmail
            ,L.TotalCustomersOverLifetimeBase
            ,L.TotalCustomerValueOverLifetimeBase
            ,L.TotalOrdersLifetimeBase
            ,L.TotalQuantityOverLifetimeBase
            FROM
            #LifetimeBase L
            WHERE L.OrderCountLifetimeBase = 1
        ) LC ON CC.CustomerEmail = LC.CustomerEmail

        LEFT JOIN 
        (
            SELECT 
            F.CustomerEmail
            ,F.TotalCustomersOverBase
            ,F.TotalCustomerValueOverBase
            ,F.TotalOrdersBase
            ,F.TotalQuantityOverBase
            FROM
            #FullBase F
            WHERE F.OrderCountBase = 1
        ) FC ON CC.CustomerEmail = FC.CustomerEmail

    ) Cs --Customers

    LEFT JOIN [A1Warehouse].[dbo].[uscities] Ci ON cs.City = right(left(ci.city_ascii, len(ci.city_ascii) -1),len(ci.city_ascii) -2) and cs.State = right(left(ci.state_id, len(ci.state_id) -1),len(ci.state_id) -2)

    WHERE 
    LAT IS NOT NULL 
    AND LNG IS NOT NULL

    group by ltrim(rtrim(cs.City)), ltrim(rtrim(cs.State))


)CSS


Where CAST(CSS.LAT AS FLOAT) > 20 AND CAST(CSS.LNG AS FLOAT) > -120

group by (case when seqnum <= 3 then city else 'Others' end),
(case when seqnum <= 3 then state end)

ORDER BY TotalCustomerValue DESC

Comments:

How do you define "top three"?

I've added that information: it is by TotalCustomerValue.

Answer 1:

You can use two levels of aggregation and some conditional logic:

select (case when seqnum <= 3 then city else 'Others' end) as city,
       (case when seqnum <= 3 then state end) as state,
       sum(TotalCustomerValue) as TotalCustomerValue,
       sum(TotalOrders) as TotalOrders
from (select ltrim(rtrim(cs.City)) as City, ltrim(rtrim(cs.State)) as State,
             sum(cs.TotalCustomerValueOverBase) as TotalCustomerValue,
             sum(cs.TotalOrdersBase) as TotalOrders,
             row_number() over (partition by ltrim(rtrim(cs.State)) order by sum(cs.TotalCustomerValueOverBase) desc) as seqnum
      from . . . cs
      group by ltrim(rtrim(cs.City)), ltrim(rtrim(cs.State))
     ) cs
group by (case when seqnum <= 3 then city else 'Others' end),
         (case when seqnum <= 3 then state end)
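
To see what this kind of two-level aggregation returns when the state is folded into the same CASE in both the select list and the GROUP BY, here is a minimal, self-contained T-SQL sketch. The sample rows and the CTE names `cs` and `ranked` are made up for illustration and stand in for the elided source in `from . . . cs`; the trimming calls are left out because the sample values are already clean.

```sql
-- Hypothetical sample data: several rows per city, as in the customer-level source.
WITH cs AS (
    SELECT *
    FROM (VALUES
        ('Austin',  'TX', 500.0, 5),
        ('Austin',  'TX', 100.0, 1),
        ('Dallas',  'TX', 400.0, 4),
        ('Houston', 'TX', 300.0, 3),
        ('El Paso', 'TX', 200.0, 2),
        ('Waco',    'TX', 100.0, 1),
        ('Miami',   'FL', 900.0, 9),
        ('Tampa',   'FL', 800.0, 8),
        ('Orlando', 'FL', 700.0, 7),
        ('Naples',  'FL',  50.0, 1)
    ) AS v (City, State, TotalCustomerValueOverBase, TotalOrdersBase)
),
-- First level: one row per city, ranked within its state by total value.
ranked AS (
    SELECT City,
           State,
           SUM(TotalCustomerValueOverBase) AS TotalCustomerValue,
           SUM(TotalOrdersBase) AS TotalOrders,
           ROW_NUMBER() OVER (PARTITION BY State
                              ORDER BY SUM(TotalCustomerValueOverBase) DESC) AS seqnum
    FROM cs
    GROUP BY City, State
)
-- Second level: keep the top 3 cities per state, fold everything else together.
SELECT (CASE WHEN seqnum <= 3 THEN City ELSE 'Others' END) AS City,
       (CASE WHEN seqnum <= 3 THEN State END) AS State,
       SUM(TotalCustomerValue) AS TotalCustomerValue,
       SUM(TotalOrders) AS TotalOrders
FROM ranked
GROUP BY (CASE WHEN seqnum <= 3 THEN City ELSE 'Others' END),
         (CASE WHEN seqnum <= 3 THEN State END);
```

With this sample data the result has seven rows: three cities for TX, three for FL, and a single Others row with a NULL state and a TotalCustomerValue of 350 (El Paso 200 + Waco 100 + Naples 50). Because the state is nulled out for every city ranked below third, the leftover cities of all states land in the same bucket.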

EDIT:

Based on the comments, you can do:

select (case when seqnum <= 3 then city else 'Others' end) as city,
       state,
       sum(TotalCustomerValue) as TotalCustomerValue,
       sum(TotalOrders) as TotalOrders
from (select ltrim(rtrim(cs.City)) as City, ltrim(rtrim(cs.State)) as State,
             sum(cs.TotalCustomerValueOverBase) as TotalCustomerValue,
             sum(cs.TotalOrdersBase) as TotalOrders,
             row_number() over (partition by ltrim(rtrim(cs.State)) order by sum(cs.TotalCustomerValueOverBase) desc) as seqnum
      from . . . cs
      group by ltrim(rtrim(cs.City)), ltrim(rtrim(cs.State))
     ) cs
group by (case when seqnum <= 3 then city else 'Others' end),
         state
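
Grouping by the state directly gives one Others row per state. A minimal, self-contained T-SQL sketch of that variant follows; the sample rows and the CTE names `cs` and `ranked` are hypothetical, and the ORDER BY is an optional addition (not part of the answer) that lists each state's Others row after its top 3 cities.

```sql
-- Hypothetical sample data: several rows per city, as in the customer-level source.
WITH cs AS (
    SELECT *
    FROM (VALUES
        ('Austin',  'TX', 500.0, 5),
        ('Austin',  'TX', 100.0, 1),
        ('Dallas',  'TX', 400.0, 4),
        ('Houston', 'TX', 300.0, 3),
        ('El Paso', 'TX', 200.0, 2),
        ('Waco',    'TX', 100.0, 1),
        ('Miami',   'FL', 900.0, 9),
        ('Tampa',   'FL', 800.0, 8),
        ('Orlando', 'FL', 700.0, 7),
        ('Naples',  'FL',  50.0, 1)
    ) AS v (City, State, TotalCustomerValueOverBase, TotalOrdersBase)
),
-- First level: one row per city, ranked within its state by total value.
ranked AS (
    SELECT City,
           State,
           SUM(TotalCustomerValueOverBase) AS TotalCustomerValue,
           SUM(TotalOrdersBase) AS TotalOrders,
           ROW_NUMBER() OVER (PARTITION BY State
                              ORDER BY SUM(TotalCustomerValueOverBase) DESC) AS seqnum
    FROM cs
    GROUP BY City, State
)
-- Second level: State stays a plain grouping column, so each state keeps its own bucket.
SELECT (CASE WHEN seqnum <= 3 THEN City ELSE 'Others' END) AS City,
       State,
       SUM(TotalCustomerValue) AS TotalCustomerValue,
       SUM(TotalOrders) AS TotalOrders
FROM ranked
GROUP BY (CASE WHEN seqnum <= 3 THEN City ELSE 'Others' END),
         State
-- Optional: the smallest seqnum in an Others group is 4, so it sorts after ranks 1 to 3.
ORDER BY State, MIN(seqnum);
```

With this sample data the result has eight rows: the Others row for TX totals 300 (El Paso 200 + Waco 100) and the Others row for FL totals 50 (Naples). With at least four cities in each of 50 states, the same shape gives the 200 rows (50 * 4) described in the question.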

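Comments:

Gordon, I've tried to implement your solution, but I can't seem to get it to work. I've edited my original question to include the full query and my attempt at integrating your code. Am I doing something wrong?

The problem seems to be in the GROUP BY, and I can't figure it out.

I got the query working, but it treats all the others as one big bucket; I'm trying to get one Others bucket per state. Is that possible? i.imgur.com/95OYAn7.png

@Natan . . . You just need to adjust the GROUP BY.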
