How to handle repeated entries in a pivot table and Power Query when populating an Excel dashboard

Posted: 2021-11-13 13:06:33

Question:

I have two tables: Capacity and Demand.

The Capacity table looks like this:

RESOURCE NAME | SKILL GROUP | PROJECT | START DATE | END DATE | COST PER HOUR | CAPACITY
Resource 1 | Automation Testing | Project 1 | 1-Oct-2021 | 31-Mar-2022 | 12.0 | 800.0
Resource 2 | DB Testing | Project 1 | 1-Oct-2021 | 31-Mar-2022 | 11.0 | 900.0
Resource 3 | DB Testing | Project 1 | 1-Oct-2021 | 31-Dec-2021 | 12.0 | 800.0
Resource 4 | Report Testing | Project 2 | 1-Oct-2021 | 30-Apr-2022 | 12.0 | 900.0
Resource 5 | CICD and Devops | Project 3 | 1-Oct-2021 | 31-Mar-2022 | 11.0 | 800.0
Resource 6 | Performance Testing | Project 1 | 1-Oct-2021 | 31-Mar-2022 | 12.0 | 900.0
Resource 7 | Automation Testing | Project 2 | 1-Nov-2021 | 31-Mar-2022 | 10.0 | 800.0
Resource 8 | Cloud Testing | Project 3 | 1-Oct-2021 | 31-Mar-2022 | 12.0 | 900.0
Resource 9 | Report Testing | Project 1 | 1-Dec-2021 | 31-Dec-2021 | 11.0 | 800.0
Resource 10 | Cloud Testing | Project 1 | 1-Dec-2021 | 31-Dec-2021 | 11.0 | 900.0
Resource 11 | Report Testing | Project 3 | 1-Dec-2021 | 31-Dec-2021 | 12.0 | 800.0
Resource 12 | Pipeline Testing | Project 1 | 1-Dec-2021 | 31-Dec-2021 | 11.0 | 900.0
Resource 13 | Cloud Testing | Project 3 | 1-Dec-2021 | 31-Dec-2021 | 12.0 | 800.0

The Demand table looks like this:

RESOURCE NAME | SKILL GROUP | PROJECT | START DATE | END DATE | DEMAND
Resource 1 | Automation Testing | Project 2 | 1-Oct-2021 | 25-Oct-2021 | 200.0
Resource 2 | DB Testing | Project 1 | 1-Oct-2021 | 31-Dec-2021 | 300.0
Resource 3 | DB Testing | Project 1 | 1-Oct-2021 | 31-Dec-2021 | 400.0
Resource 1 | Report Testing | Project 1 | 1-Oct-2021 | 31-Dec-2021 | 200.0
Resource 4 | CICD and Devops | Project 3 | 1-Oct-2021 | 31-Mar-2022 | 300.0
Resource 5 | Performance Testing | Project 2 | 1-Oct-2021 | 25-Oct-2021 | 400.0
Resource 6 | Automation Testing | Project 1 | 1-Oct-2021 | 31-Dec-2021 | 200.0
Resource 2 | Cloud Testing | Project 2 | 1-Oct-2021 | 25-Oct-2021 | 300.0
Resource 7 | Report Testing | Project 1 | 1-Oct-2021 | 31-Dec-2021 | 400.0
Resource 8 | Cloud Testing | Project 3 | 1-Oct-2021 | 31-Dec-2021 | 800.0
Resource 9 | Report Testing | Project 2 | 1-Oct-2021 | 31-Dec-2021 | 800.0
Resource 10 | Pipeline Testing | Project 1 | 1-Oct-2021 | 31-Dec-2021 | 600.0
Resource 11 | Cloud Testing | Project 3 | 1-Oct-2021 | 31-Dec-2021 | 700.0
Resource 10 | Performance Testing | Project 2 | 1-Oct-2021 | 31-Dec-2021 | 250.0
Resource 11 | Automation Testing | Project 1 | 1-Oct-2021 | 31-Dec-2021 | 250.0

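I merged the two tables in Power Query on RESOURCE NAME and tried to generate the pivot table below.

[Screenshot: pivot table]

The only field the two tables have in common is RESOURCE NAME. I am trying to build a pivot table on it that will feed my dashboard, which is driven by slicers. The dashboard I am trying to build looks like this:

[Screenshot: dashboard]

Challenge:

I cannot get correct values for Capacity hours and Total Capacity Cost; the capacity figures keep repeating. The other values in the Hours/Cost section, Demand hours and Total Demand, are fine according to the pivot table above.

[Screenshot: Power Query merge]

[Screenshot: final table]

After that, I chose "Close & Load" and selected the option to load to the Data Model.

This is what the final table looks like: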

RESOURCE NAME | SKILL GROUP | PROJECT | START DATE | END DATE | COST PER HOUR | CAPACITY | DemandTable.RESOURCE NAME | DemandTable.SKILL GROUP | DemandTable.PROJECT | DemandTable.DETAIL | DemandTable.START DATE | DemandTable.END DATE | DemandTable.DEMAND
Resource 1 | Automation Testing | Project 1 | 01-10-21 0:00 | 31-03-22 0:00 | 12 | 800 | Resource 1 | Automation Testing | Project 2 | | 01-10-21 0:00 | 25-10-21 0:00 | 200
Resource 1 | Automation Testing | Project 1 | 01-10-21 0:00 | 31-03-22 0:00 | 12 | 800 | Resource 1 | Report Testing | Project 1 | | 01-10-21 0:00 | 31-12-21 0:00 | 200
Resource 2 | DB Testing | Project 1 | 01-10-21 0:00 | 31-03-22 0:00 | 11 | 900 | Resource 2 | DB Testing | Project 1 | | 01-10-21 0:00 | 31-12-21 0:00 | 300
Resource 2 | DB Testing | Project 1 | 01-10-21 0:00 | 31-03-22 0:00 | 11 | 900 | Resource 2 | Cloud Testing | Project 2 | | 01-10-21 0:00 | 25-10-21 0:00 | 300
Resource 3 | DB Testing | Project 1 | 01-10-21 0:00 | 31-12-21 0:00 | 12 | 800 | Resource 3 | DB Testing | Project 1 | | 01-10-21 0:00 | 31-12-21 0:00 | 400
Resource 4 | Report Testing | Project 2 | 01-10-21 0:00 | 30-04-22 0:00 | 12 | 200 | Resource 4 | CICD and Devops | Project 3 | | 01-10-21 0:00 | 31-03-22 0:00 | 300
Resource 5 | CICD and Devops | Project 3 | 01-10-21 0:00 | 31-03-22 0:00 | 11 | 800 | Resource 5 | Performance Testing | Project 2 | | 01-10-21 0:00 | 25-10-21 0:00 | 400
Resource 6 | Performance Testing | Project 1 | 01-10-21 0:00 | 31-03-22 0:00 | 12 | 900 | Resource 6 | Automation Testing | Project 1 | | 01-10-21 0:00 | 31-12-21 0:00 | 200
Resource 7 | Automation Testing | Project 2 | 01-11-21 0:00 | 31-03-22 0:00 | 10 | 250 | Resource 7 | Report Testing | Project 1 | | 01-10-21 0:00 | 31-12-21 0:00 | 400
Resource 8 | Cloud Testing | Project 3 | 01-10-21 0:00 | 31-03-22 0:00 | 12 | 900 | Resource 8 | Cloud Testing | Project 3 | | 01-10-21 0:00 | 31-12-21 0:00 | 800
Resource 9 | Report Testing | Project 1 | 01-12-21 0:00 | 31-12-21 0:00 | 11 | 800 | Resource 9 | Report Testing | Project 2 | | 01-10-21 0:00 | 31-12-21 0:00 | 800
Resource 10 | Cloud Testing | Project 1 | 01-12-21 0:00 | 31-12-21 0:00 | 11 | 900 | Resource 10 | Pipeline Testing | Project 1 | | 01-10-21 0:00 | 31-12-21 0:00 | 600
Resource 10 | Cloud Testing | Project 1 | 01-12-21 0:00 | 31-12-21 0:00 | 11 | 900 | Resource 10 | Performance Testing | Project 2 | | 01-10-21 0:00 | 31-12-21 0:00 | 250
Resource 11 | Report Testing | Project 3 | 01-12-21 0:00 | 31-12-21 0:00 | 12 | 800 | Resource 11 | Cloud Testing | Project 3 | | 01-10-21 0:00 | 31-12-21 0:00 | 700
Resource 11 | Report Testing | Project 3 | 01-12-21 0:00 | 31-12-21 0:00 | 12 | 800 | Resource 11 | Automation Testing | Project 1 | | 01-10-21 0:00 | 31-12-21 0:00 | 250
Resource 12 | Pipeline Testing | Project 1 | 01-12-21 0:00 | 31-12-21 0:00 | 11 | 900 | | | | | | |
Resource 13 | Cloud Testing | Project 3 | 01-12-21 0:00 | 31-12-21 0:00 | 12 | 800 | | | | | | |

Query: CapacityTable

let
    Source = Excel.CurrentWorkbook(){[Name="CapacityTable"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source, {
        {"RESOURCE NAME", type text}, {"SKILL GROUP", type text}, {"PROJECT", type text},
        {"START DATE", type datetime}, {"END DATE", type datetime},
        {"SUN", type any}, {"MON", type number}, {"TUE", type number}, {"WED", type number},
        {"THU", type number}, {"FRI", type number}, {"SAT", type any},
        {"COST PER HOUR", Int64.Type}, {"CAPACITY", Int64.Type}})
in
    #"Changed Type"

Query: DemandTable

let
    Source = Excel.CurrentWorkbook(){[Name="DemandTable"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source, {
        {"RESOURCE NAME", type text}, {"SKILL GROUP", type text}, {"PROJECT", type text},
        {"DETAIL", type any}, {"START DATE", type datetime}, {"END DATE", type datetime},
        {"SUN", type any}, {"MON", type number}, {"TUE", type number}, {"WED", type number},
        {"THU", type number}, {"FRI", type number}, {"SAT", type any},
        {"DEMAND", Int64.Type}})
in
    #"Changed Type"

Query: Final table

let
    Source = Table.NestedJoin(DemandTable, {"RESOURCE NAME"}, CapacityTable, {"RESOURCE NAME"}, "CapacityTable", JoinKind.LeftOuter),
    #"Expanded CapacityTable" = Table.ExpandTableColumn(Source, "CapacityTable",
        {"RESOURCE NAME", "SKILL GROUP", "PROJECT", "COST PER HOUR", "CAPACITY"},
        {"CapacityTable.RESOURCE NAME", "CapacityTable.SKILL GROUP", "CapacityTable.PROJECT", "CapacityTable.COST PER HOUR", "CapacityTable.CAPACITY"})
in
    #"Expanded CapacityTable"

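The problem is that when I try to build capacity versus demand figures for all my projects and resources in a pivot table, my capacity hours are repeated for every matching record in the Demand table. I believe I need to populate my data on the basis of PROJECT, but I am not sure what needs to be done.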
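The repetition described above can be reproduced outside Excel. The following is a minimal sketch in plain Python (not M), using a three-resource subset of the sample tables; the function name `left_join_on_resource` and the lowercase field names are hypothetical and chosen only for illustration. It suggests that joining on the resource name alone emits one output row per matching Demand row, so any sum over the capacity column is inflated.

```python
# Subset of the sample Capacity and Demand tables (field names are illustrative).
capacity = [
    {"resource": "Resource 1", "project": "Project 1", "capacity": 800.0},
    {"resource": "Resource 2", "project": "Project 1", "capacity": 900.0},
    {"resource": "Resource 3", "project": "Project 1", "capacity": 800.0},
]
demand = [
    {"resource": "Resource 1", "project": "Project 2", "demand": 200.0},
    {"resource": "Resource 1", "project": "Project 1", "demand": 200.0},
    {"resource": "Resource 2", "project": "Project 1", "demand": 300.0},
    {"resource": "Resource 2", "project": "Project 2", "demand": 300.0},
    {"resource": "Resource 3", "project": "Project 1", "demand": 400.0},
]


def left_join_on_resource(left, right):
    """Left outer join on the resource name only: one output row per match."""
    out = []
    for left_row in left:
        matches = [r for r in right if r["resource"] == left_row["resource"]]
        if not matches:
            out.append({**left_row, "demand": None})
        for match in matches:
            out.append({**left_row, "demand": match["demand"]})
    return out


merged = left_join_on_resource(capacity, demand)

# Capacity summed from the source table versus from the merged table.
true_capacity = sum(row["capacity"] for row in capacity)
merged_capacity = sum(row["capacity"] for row in merged)
print(true_capacity, merged_capacity)
```

In this subset, Resource 1 and Resource 2 each appear twice in Demand, so their capacity rows are emitted twice in the merged result, which is the same effect the pivot table shows.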

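Comments:

I suggest you add the code you used to try to obtain your desired results, a screenshot of your desired results, and information about what you expect the slicers to do. Reading the help pages on How to Ask a Good Question and How to create a Minimal, Complete, and Verifiable example might be helpful. As it stands, your description of the results you want is too vague.

@RonRosenfeld - I have updated the question based on your comments and suggestions.

I managed to solve everything else needed to create the dashboard above, but I am stuck on capacity hours and cost, which are repeated for every resource present in the Demand table. I tried Power Query joins and other options to resolve this, with no luck. Any suggestions?

You need to make some changes to your process or code. But given what you have posted so far, I have no idea how to help you.

@RonRosenfeld As mentioned in the question, I simply merged my Capacity and Demand tables using Power Query. Because the Demand table repeats the same resource name, merging Demand and Capacity duplicates the capacity figure for every matching resource name in the Demand table. In my sample data above, Resources 1, 2 and 11 are listed more than once in the Demand table, while total capacity is listed at the resource level per project.

Answer 1: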

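To create the pivot table you are showing without duplicating data, you can:

1. Join the two tables on both "RESOURCE NAME" and "PROJECT" using JoinKind.FullOuter.
2. Expand the joined table and "fill in the nulls" in the PROJECT and RESOURCE NAME columns for rows that are missing an entry in either the Capacity or the Demand table.
3. Add a demand*cost column.
4. After cleanup, save and load to a pivot table.

M Code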
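The steps above can be sketched in plain Python (not M) to check the idea on a small subset of the sample data. The function name `full_outer_join` and the lowercase field names are hypothetical, and the sketch assumes each (resource, project) pair appears at most once in Capacity. Under that assumption, each capacity figure should appear exactly once in the result, and rows that exist only in Demand carry no capacity and no cost.

```python
capacity = [
    {"resource": "Resource 1", "project": "Project 1", "cost": 12.0, "capacity": 800.0},
    {"resource": "Resource 2", "project": "Project 1", "cost": 11.0, "capacity": 900.0},
]
demand = [
    {"resource": "Resource 1", "project": "Project 2", "demand": 200.0},
    {"resource": "Resource 1", "project": "Project 1", "demand": 200.0},
    {"resource": "Resource 2", "project": "Project 1", "demand": 300.0},
    {"resource": "Resource 2", "project": "Project 2", "demand": 300.0},
]


def full_outer_join(cap_rows, dem_rows):
    """Full outer join on (resource, project), with the key filled on both sides."""
    # Assumption: at most one Capacity row per (resource, project) key.
    cap_by_key = {(c["resource"], c["project"]): c for c in cap_rows}
    dem_by_key = {}
    for d in dem_rows:
        dem_by_key.setdefault((d["resource"], d["project"]), []).append(d)

    rows = []
    for key in sorted(set(cap_by_key) | set(dem_by_key)):
        cap = cap_by_key.get(key)
        for dem in dem_by_key.get(key, [None]):
            cost = cap["cost"] if cap else None
            hours = dem["demand"] if dem else None
            both_known = cost is not None and hours is not None
            rows.append({
                "resource": key[0],   # key is always filled, never null
                "project": key[1],
                "capacity": cap["capacity"] if cap else None,
                "demand": hours,
                "demand_cost": cost * hours if both_known else None,
            })
    return rows


rows = full_outer_join(capacity, demand)
total_capacity = sum(r["capacity"] for r in rows if r["capacity"] is not None)
for r in rows:
    print(r)
```

If Demand held several rows for the same (resource, project) pair, capacity would still repeat for that pair, so the key choice only helps when it identifies a capacity row uniquely.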

let

//Load and type the Capacity table
    Source1 = Excel.CurrentWorkbook(){[Name="Capacity"]}[Content],
    Capacity = Table.TransformColumnTypes(Source1,{
        {"RESOURCE NAME", type text}, {"SKILL GROUP", type text}, {"PROJECT", type text}, 
        {"START DATE", type date}, {"END DATE", type date}, 
        {"COST PER HOUR", Currency.Type}, {"CAPACITY", Number.Type}}),

//Load and type the Demand table
    Source2 = Excel.CurrentWorkbook(){[Name="Demand"]}[Content],
    Demand = Table.TransformColumnTypes(Source2,{
        {"RESOURCE NAME", type text}, {"SKILL GROUP", type text}, {"PROJECT", type text}, 
        {"START DATE", type date}, {"END DATE", type date}, {"DEMAND", Number.Type}}),

//Join the two tables on both RESOURCE NAME and PROJECT
    joined = Table.NestedJoin(Capacity,{"RESOURCE NAME","PROJECT"},Demand,{"RESOURCE NAME","PROJECT"},"Joined",JoinKind.FullOuter),

//Remove unneeded columns and expand the Joined table
    #"Removed Columns" = Table.RemoveColumns(joined,{"SKILL GROUP", "START DATE", "END DATE"}),
    #"Expanded Joined" = Table.ExpandTableColumn(#"Removed Columns", "Joined", 
        {"RESOURCE NAME", "PROJECT", "DEMAND"}, 
        {"Demand.RESOURCE NAME", "Demand.PROJECT", "Demand.DEMAND"}),

//Transform the null records for those missing from one table or the other:
//copy RESOURCE NAME and PROJECT across from whichever side is populated
    capFN = {"RESOURCE NAME", "PROJECT", "COST PER HOUR","CAPACITY"},
    demFN = {"Demand.RESOURCE NAME", "Demand.PROJECT", "Demand.DEMAND"},
    recs = Table.ToRecords(#"Expanded Joined"),
    xForm = List.Generate(
        ()=>[rec = recs{0}, idx=0],
        each [idx] < List.Count(recs),
        each [rec = if recs{[idx]+1}[RESOURCE NAME] = null or recs{[idx]+1}[Demand.RESOURCE NAME]= null then 
            let 
                rtl = Record.ToList(recs{[idx]+1}),
                xRtl = if rtl{0} = null then List.ReplaceRange(rtl,0,2, List.Range(rtl,4,2)) 
                    else List.ReplaceRange(rtl,4,2, List.Range(rtl,0,2)) 
            in Record.FromList(xRtl, List.Combine({capFN,demFN}))
            else recs{[idx]+1}, idx=[idx]+1],
        each [rec]),
    #"Converted to Table" = Table.FromList(xForm, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", 
        {"RESOURCE NAME", "PROJECT", "COST PER HOUR", "CAPACITY", "Demand.DEMAND"}, 
        {"RESOURCE NAME", "PROJECT", "COST PER HOUR", "CAPACITY", "Demand.DEMAND"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Column1",{
        {"RESOURCE NAME", type text}, {"PROJECT", type text}, 
        {"COST PER HOUR", Currency.Type}, 
        {"CAPACITY", Int64.Type}, {"Demand.DEMAND", Int64.Type}}),

//Add demand*cost column
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Demand Cost", each [COST PER HOUR]*[Demand.DEMAND], Currency.Type),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom","COST PER HOUR")
in
    #"Removed Columns1"

