Azure Stream Analytics ISFIRST and LAST query


Posted: 2020-02-13 06:43:21

Question:

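I have a payload like the one below. For each 1-minute window, I need the first occurrence of each distinct batch value. How can I achieve this in Stream Analytics using ISFIRST together with LAG or LAST?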

The expected output is:

    BATCH=01,"2015-01-01T00:00:01.0000000Z"
    BATCH=02,"2015-01-01T00:00:03.0000000Z"
    BATCH=03,"2015-01-01T00:00:06.0000000Z"
    BATCH=01,"2015-01-01T00:00:14.0000000Z"
    BATCH=02,"2015-01-01T00:00:18.0000000Z"
    BATCH=03,"2015-01-01T00:00:22.0000000Z"
    BATCH=01,"2015-01-01T00:00:27.0000000Z"
    BATCH=01,"2015-01-01T00:00:31.0000000Z"

Payload:

    [
        { "Payload": { "Make": "BATCH1", "VAL": "01", "TS": "2015-01-01T00:00:01.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "01", "TS": "2015-01-01T00:00:02.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "02", "TS": "2015-01-01T00:00:03.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "02", "TS": "2015-01-01T00:00:04.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "02", "TS": "2015-01-01T00:00:05.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:06.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:07.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:10.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:11.0000000Z" } },
        { "Payload": { "Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:12.0000000Z" } },
        { "Payload": { "Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:13.0000000Z" } },
        { "Payload": { "Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:14.0000000Z" } },
        { "Payload": { "Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:15.0000000Z" } },
        { "Payload": { "Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:16.0000000Z" } },
        { "Payload": { "Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:17.0000000Z" } },
        { "Payload": { "Make": "BATCH2", "VAL": "02", "TS": "2015-01-01T00:00:18.0000000Z" } },
        { "Payload": { "Make": "BATCH2", "VAL": "02", "TS": "2015-01-01T00:00:20.0000000Z" } },
        { "Payload": { "Make": "BATCH2", "VAL": "02", "TS": "2015-01-01T00:00:21.0000000Z" } },
        { "Payload": { "Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:22.0000000Z" } },
        { "Payload": { "Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:23.0000000Z" } },
        { "Payload": { "Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:24.0000000Z" } },
        { "Payload": { "Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:25.0000000Z" } },
        { "Payload": { "Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:26.0000000Z" } },
        { "Payload": { "Make": "BATCH4", "VAL": "01", "TS": "2015-01-01T00:00:27.0000000Z" } },
        { "Payload": { "Make": "BATCH4", "VAL": "01", "TS": "2015-01-01T00:00:28.0000000Z" } },
        { "Payload": { "Make": "BATCH4", "VAL": "01", "TS": "2015-01-01T00:00:29.0000000Z" } },
        { "Payload": { "Make": "BATCH4", "VAL": "01", "TS": "2015-01-01T00:00:30.0000000Z" } },
        { "Payload": { "Make": "BATCH5", "VAL": "01", "TS": "2015-01-01T00:00:31.0000000Z" } }
    ]

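Comments:

Hi. I don't quite follow what you are after. Do you want to implement something like TOP within a 1-minute window?

Hi Jay, thanks for your reply, and sorry I wasn't clear. Within one minute, each batch ID can have multiple VAL changes. For example, within a few minutes I could receive: Make:batch1,Val:01, Make:batch1,val:01, Make:batch1,val:02, Make:batch1,val:02 ×××××××××××× Make:batch2,val:01, Make:batch2,val:01, Xxxxxxxxxx. From this I need to keep only the VAL changes for each batch, with no duplicates. The output I need is Make:batch1,val:01 Make:batch1,val:02 Make:batch2,val:01, with the original timestamps of course, as I mentioned in my original post.

Answer 1: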

I have tried to summarize your requirement as follows:

Sample input: within a one-minute window, each batch ID can have multiple VAL changes:

Make:batch1,Val:01, Make:batch1,val:01, Make:batch1,val:02, Make:batch1,val:02 ××××××××××××× Make:batch2,val:01, Make:batch2,val:01, Xxxxxxxxxx

Desired output: only the VAL changes for each batch, with no duplicates:

Make:batch1,val:01 Make:batch1,val:02 Make:batch2,val:01

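The answer has two parts:

1. To collect data over a fixed period, you can use the built-in Tumbling Window function, as the query below does.

2. There is no built-in ASA function like DISTINCT to filter out duplicates. I suggest using GROUP BY, MAX, and an ASA UDF (link) to get close to your result.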
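To make the two parts concrete outside of ASA, here is a minimal JavaScript sketch that imitates grouping by Make and VAL inside fixed one-minute buckets while keeping the largest timestamp per group. The names `parseTs` and `groupByTumblingWindow` are hypothetical, and the bucket boundaries are a simplification (plain floor division), so edge cases at exact window boundaries may differ from what ASA does.

```javascript
// Hypothetical helper: trim fractional seconds to 3 digits so the string
// is a standard ISO 8601 timestamp, then return epoch milliseconds.
function parseTs(ts) {
  return Date.parse(String(ts).replace(/(\.\d{3})\d+/, "$1"));
}

// Hypothetical emulation of GROUP BY Make, VAL over fixed windows with MAX(TS).
// Assumption: windows are aligned to multiples of windowMs since the epoch.
function groupByTumblingWindow(events, windowMs = 60000) {
  const groups = new Map();
  for (const { Make, VAL, TS } of events) {
    const ms = parseTs(TS);
    const windowStart = Math.floor(ms / windowMs) * windowMs;
    const key = `${windowStart}|${Make}|${VAL}`;
    const prev = groups.get(key);
    // Keep the maximum timestamp seen for this (window, Make, VAL) group.
    if (prev === undefined || ms > prev.TS) {
      groups.set(key, { Make, VAL, TS: ms });
    }
  }
  return [...groups.values()];
}

const grouped = groupByTumblingWindow([
  { Make: "BATCH1", VAL: "01", TS: "2015-01-01T00:00:01.0000000Z" },
  { Make: "BATCH1", VAL: "01", TS: "2015-01-01T00:00:02.0000000Z" },
  { Make: "BATCH1", VAL: "02", TS: "2015-01-01T00:00:03.0000000Z" },
]);
console.log(grouped);
```

Each (Make, VAL) pair appears once per bucket, which is the deduplication effect the answer relies on. Swapping the `>` comparison for `<` would keep the earliest timestamp instead of the latest.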

SQL:

    -- One row per (Make, VAL) pair in each 1-minute tumbling window, with the latest TS
    SELECT g.Payload.Make, g.Payload.VAL, MAX(udf.convertdate(g.Payload.TS)) AS TS
    FROM geoinput g TIMESTAMP BY g.Payload.TS
    GROUP BY g.Payload.Make, g.Payload.VAL, TumblingWindow(Duration(minute, 1))

Test output:

By the way, this is the only code I use in the UDF:

    // Body of the convertdate UDF: parse the timestamp and return epoch milliseconds
    var date = new Date(datetime);
    return date.getTime();
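Because the UDF body returns `date.getTime()`, the TS column produced by the query is a number (milliseconds since the epoch) rather than a datetime. The following plain JavaScript sketch shows the round trip back to an ISO 8601 string. The function names `toEpochMs` and `toIsoString` are hypothetical, and whether a second UDF like this can be wired into the ASA query is an assumption the thread does not confirm.

```javascript
// Hypothetical helper mirroring the UDF body: timestamp string -> epoch milliseconds.
// Fractional seconds are trimmed to 3 digits first, since JavaScript dates
// only carry millisecond precision.
function toEpochMs(datetime) {
  const normalized = String(datetime).replace(/(\.\d{3})\d+/, "$1");
  return new Date(normalized).getTime();
}

// Hypothetical inverse: epoch milliseconds -> ISO 8601 string in UTC.
function toIsoString(epochMs) {
  return new Date(epochMs).toISOString();
}

const ms = toEpochMs("2015-01-01T00:00:01.0000000Z");
console.log(toIsoString(ms)); // prints "2015-01-01T00:00:01.000Z"
```

Note that the round trip keeps millisecond precision only, so the seven-digit fraction in the original payload comes back as three digits.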

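As another workaround, you can collect all the data within 1 minute and then use an Azure Function as the output. In the Azure Function, you can process the data however you need, for example by storing it in a JSON object: a key-value structure can filter out duplicate rows.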
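The key-value idea can be sketched in plain JavaScript as follows. This is only the deduplication step under stated assumptions: the function name `dedupeFirstSeen` is hypothetical, the input is assumed to be an array shaped like the payload in the question, and the Azure Function trigger and binding code is left out entirely.

```javascript
// Hypothetical helper: keep the first row seen for each (Make, VAL) pair.
// Assumption: rows arrive in timestamp order, so "first seen" is the earliest.
function dedupeFirstSeen(rows) {
  const seen = Object.create(null); // plain key-value store, no inherited keys
  for (const row of rows) {
    const { Make, VAL, TS } = row.Payload;
    const key = `${Make}|${VAL}`;
    if (!(key in seen)) {
      seen[key] = { Make, VAL, TS }; // later duplicates are ignored
    }
  }
  return Object.values(seen);
}

const batch = [
  { Payload: { Make: "BATCH1", VAL: "01", TS: "2015-01-01T00:00:01.0000000Z" } },
  { Payload: { Make: "BATCH1", VAL: "01", TS: "2015-01-01T00:00:02.0000000Z" } },
  { Payload: { Make: "BATCH1", VAL: "02", TS: "2015-01-01T00:00:03.0000000Z" } },
  { Payload: { Make: "BATCH2", VAL: "01", TS: "2015-01-01T00:00:13.0000000Z" } },
];
console.log(dedupeFirstSeen(batch));
```

Since the original timestamp string is stored untouched, this variant keeps the datetime format of the input, at the cost of doing the work outside Stream Analytics.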

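Comments:

Thanks for the reply, Jay. Your logic works, but with the first approach I need TS in datetime format. We could consider the second option too, but I have to do this in Stream Analytics alone. Do you have any suggestions for getting a proper timestamp with the first approach?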
