Parse an object as string in output in Stream Azure Analytics


【Title】: Parse an object as string in output in Stream Azure Analytics 【Posted】: 2016-02-26 14:45:36 【Question】:

This question is about Stream Analytics. I want to export a blob to SQL. I know the process; my question is about the query I have to use.

{"performanceCounter":[{"available_bytes":{"value":994164736.0},"categoryName":"Memory","instanceName":""}],"internal":{"data":{"id":"459bf840-d259-11e5-a640-1df0b6342362","documentVersion":"1.61"}},"context":{"device":{"type":"PC","network":"Ethernet","screenResolution":{},"locale":"en-US","id":"RD0003FF73B748","roleName":"Sdm.MyGovId.Static.Web","roleInstance":"Sdm.MyGovId.Static.Web_IN_1","oemName":"Microsoft Corporation","deviceName":"Virtual Machine","deviceModel":"Virtual Machine"},"application":{"version":"R2.0_20160205.5"},"location":{"continent":"North America","country":"United States","clientip":"104.41.209.0","province":"Washington","city":"Redmond"},"data":{"isSynthetic":false,"samplingRate":100.0,"eventTime":"2016-02-13T13:53:44.2667669Z"},"user":{"isAuthenticated":false,"anonAcquisitionDate":"0001-01-01T00:00:00Z","authAcquisitionDate":"0001-01-01T00:00:00Z","accountAcquisitionDate":"0001-01-01T00:00:00Z"},"operation":{},"cloud":{},"serverDevice":{},"custom":{"dimensions":[],"metrics":[]},"session":{}}}
{"performanceCounter":[{"percentage_processor_total":{"value":0.0123466420918703},"categoryName":"Processor","instanceName":"_Total"}],"internal":{"data":{"id":"459bf841-d259-11e5-a640-1df0b6342362","documentVersion":"1.61"}},"context":{"device":{"type":"PC","network":"Ethernet","screenResolution":{},"locale":"en-US","id":"RD0003FF73B748","roleName":"Sdm.MyGovId.Static.Web","roleInstance":"Sdm.MyGovId.Static.Web_IN_1","oemName":"Microsoft Corporation","deviceName":"Virtual Machine","deviceModel":"Virtual Machine"},"application":{"version":"R2.0_20160205.5"},"location":{"continent":"North America","country":"United States","clientip":"104.41.209.0","province":"Washington","city":"Redmond"},"data":{"isSynthetic":false,"samplingRate":100.0,"eventTime":"2016-02-13T13:53:44.2668221Z"},"user":{"isAuthenticated":false,"anonAcquisitionDate":"0001-01-01T00:00:00Z","authAcquisitionDate":"0001-01-01T00:00:00Z","accountAcquisitionDate":"0001-01-01T00:00:00Z"},"operation":{},"cloud":{},"serverDevice":{},"custom":{"dimensions":[],"metrics":[]},"session":{}}}
{"performanceCounter":[{"percentage_processor_time":{"value":0.0},"categoryName":"Process","instanceName":"w3wp"}],"internal":{"data":{"id":"459bf842-d259-11e5-a640-1df0b6342362","documentVersion":"1.61"}},"context":{"device":{"type":"PC","network":"Ethernet","screenResolution":{},"locale":"en-US","id":"RD0003FF73B748","roleName":"Sdm.MyGovId.Static.Web","roleInstance":"Sdm.MyGovId.Static.Web_IN_1","oemName":"Microsoft Corporation","deviceName":"Virtual Machine","deviceModel":"Virtual Machine"},"application":{"version":"R2.0_20160205.5"},"location":{"continent":"North America","country":"United States","clientip":"104.41.209.0","province":"Washington","city":"Redmond"},"data":{"isSynthetic":false,"samplingRate":100.0,"eventTime":"2016-02-13T13:53:44.2668342Z"},"user":{"isAuthenticated":false,"anonAcquisitionDate":"0001-01-01T00:00:00Z","authAcquisitionDate":"0001-01-01T00:00:00Z","accountAcquisitionDate":"0001-01-01T00:00:00Z"},"operation":{},"cloud":{},"serverDevice":{},"custom":{"dimensions":[],"metrics":[]},"session":{}}}

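Above you can see three JSON objects. In each one, the object inside the performanceCounter array has a different field: the first key of that object changes from record to record. It is available_bytes in the first record, percentage_processor_total in the second, and percentage_processor_time in the third.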

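Because I am exporting this to a SQL table called performaceCounter, I would need a separate column for each different counter. Instead, I want to store the object as a string and parse it later in my application.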

As a starting point, I have this query, which reads from the input (blob) and writes to the output (SQL):

SELECT GetArrayElement(A.performanceCounter, 0) AS a
INTO
  PerformanceCounterOutput
FROM PerformanceCounterInput A

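GetArrayElement takes the element at index 0 of the performanceCounter array, and the output then gets a separate column for every distinct field found in each object. So I would have to know all the different counters and create a column for each of them. What I have in mind is rather a single column called performanceCounterData that stores a string such as

'{"available_bytes":{"value":994164736.0},"categoryName":"Memory","instanceName":""}'

or these:

'{"percentage_processor_total":{"value":0.0123466420918703},"categoryName":"Processor","instanceName":"_Total"}'

'{"percentage_processor_time":{"value":0.0},"categoryName":"Process","instanceName":"w3wp"}'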

How can I convert an array element like this into a string? I tried CAST(GetArrayElement(A.performanceCounter,0) as nvarchar(max)), but it did not work.

Good help will be rewarded.
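For illustration only, here is a rough Python sketch of the application-side parsing that such a string column would allow. It assumes the column holds valid JSON with the nesting shown in the string literal inside the code (that nesting is an assumption), and `parse_counter` is a hypothetical helper name, not part of any library.

```python
import json

# Hypothetical value read back from the performanceCounterData column.
# The brace nesting is an assumption about how the counter object is stored.
stored = '{"available_bytes":{"value":994164736.0},"categoryName":"Memory","instanceName":""}'


def parse_counter(text):
    """Split a stored counter string into its name, value and metadata."""
    record = json.loads(text)
    meta_keys = {"categoryName", "instanceName"}
    # The counter name is whichever key is not one of the fixed metadata keys.
    name = next(key for key in record if key not in meta_keys)
    return {
        "counter": name,
        "value": record[name]["value"],
        "categoryName": record["categoryName"],
        "instanceName": record["instanceName"],
    }


parsed = parse_counter(stored)
print(parsed)
```

The point of this approach is that the application never needs to know the counter names in advance; it only relies on categoryName and instanceName being present.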

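【Comments】:

【Answer 1】:

With the following solution I get two columns, one with the name of the property and the other with the value of the property, which was my original goal.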

WITH pc AS
(
    SELECT
        GetArrayElement(A.[performanceCounter], 0) AS counter,
        A.context.data.eventTime AS eventTime,
        A.context.location.clientip AS clientIp,
        A.context.location.continent AS continent,
        A.context.location.country AS country,
        A.context.location.province AS province,
        A.context.location.city AS city
    FROM PerformanceCounterInput A
)
SELECT
    props.propertyName,
    props.propertyValue,
    pc.counter.categoryName,
    pc.counter.instanceName,
    pc.eventTime,
    pc.clientIp,
    pc.continent,
    pc.country,
    pc.province,
    pc.city
FROM pc
CROSS APPLY GetRecordProperties(pc.counter) AS props
WHERE props.propertyname <> 'categoryname' AND props.propertyname <> 'instancename'
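To make the shape of the result concrete, here is a small Python sketch that mimics what the query does to a single input record. It is not Stream Analytics code. `get_record_properties` and `flatten` are hypothetical stand-ins, the nesting of the sample event is assumed, and the filter is applied case-insensitively, which is an assumption about how the WHERE clause behaves.

```python
def get_record_properties(record):
    """Rough stand-in for GetRecordProperties: one (name, value) pair per field."""
    return list(record.items())


def flatten(event):
    """Mimic the CTE plus CROSS APPLY for one event; returns a list of output rows."""
    counter = event["performanceCounter"][0]  # GetArrayElement(..., 0)
    location = event["context"]["location"]
    rows = []
    for name, value in get_record_properties(counter):  # CROSS APPLY
        # WHERE clause: skip the two metadata fields (case-insensitive here by assumption).
        if name.lower() in ("categoryname", "instancename"):
            continue
        rows.append({
            "propertyName": name,
            "propertyValue": value,
            "categoryName": counter["categoryName"],
            "instanceName": counter["instanceName"],
            "eventTime": event["context"]["data"]["eventTime"],
            "clientIp": location["clientip"],
            "country": location["country"],
            "city": location["city"],
        })
    return rows


# Trimmed sample event; the brace structure is assumed.
event = {
    "performanceCounter": [{
        "percentage_processor_total": {"value": 0.0123466420918703},
        "categoryName": "Processor",
        "instanceName": "_Total",
    }],
    "context": {
        "data": {"eventTime": "2016-02-13T13:53:44.2668221Z"},
        "location": {"clientip": "104.41.209.0", "country": "United States", "city": "Redmond"},
    },
}

rows = flatten(event)
print(rows)
```

Whatever counter name arrives, it ends up as a value in propertyName rather than as a new column, so the SQL table schema can stay fixed.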

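In any case, if someone finds out how to write an object as plain text in Stream Analytics, that will still be rewarded and appreciated.

【Comments】:

【Answer 2】:

You can do something like the following, which gives you the counters as (propertyName, propertyValue) pairs.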

WITH T1 AS
(
    SELECT
        GetArrayElement(iotInput.performanceCounter, 0) Counter,
        System.Timestamp [EventTime]
    FROM
        iotInput TIMESTAMP BY context.data.eventTime
)

SELECT
    [EventTime],
    Counter.categoryName,
    Counter.available_bytes [Value]
FROM
    T1
WHERE
    Counter.categoryName = 'Memory'

UNION ALL

SELECT
    [EventTime],
    Counter.categoryName,
    Counter.percentage_processor_time [Value]
FROM
    T1
WHERE
    Counter.categoryName = 'Process'

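It is also possible to write a query that produces one column per counter type; for that you would have to either join, or group by with a "case" statement for each counter.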
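As a rough illustration of the one-column-per-counter idea, the Python sketch below plays the role of a "case" expression per counter. `KNOWN_COUNTERS` and `pivot` are hypothetical names, and the nesting of the counter object is assumed. The sketch also shows the limitation of this approach: every counter has to be listed in advance.

```python
# Counters that get their own output column; this list must be known up front.
KNOWN_COUNTERS = ["available_bytes", "percentage_processor_time"]


def pivot(counter):
    """Build one output row with a column per known counter (None when absent)."""
    row = {"categoryName": counter["categoryName"]}
    for name in KNOWN_COUNTERS:
        # Plays the role of: CASE WHEN <counter present> THEN value ELSE NULL END
        row[name] = counter[name]["value"] if name in counter else None
    return row


memory = {"available_bytes": {"value": 994164736.0}, "categoryName": "Memory", "instanceName": ""}
unknown = {"percentage_processor_total": {"value": 0.0123466420918703},
           "categoryName": "Processor", "instanceName": "_Total"}

memory_row = pivot(memory)
unknown_row = pivot(unknown)
print(memory_row)
print(unknown_row)
```

A counter that is not in the list, such as percentage_processor_total here, produces a row in which every counter column is empty, so its value is lost.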

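【Comments】:

In that case I would need to know in advance all the different counters (available_bytes, percentage_processor_time, ...) that will be exported to the blob. Because it is JSON, we have no fixed structure, so I could receive a different counter at any time. Take a look at the solution I found and posted as an answer to see what I really mean. In any case, thank you very much for your time and for the different approach.

The GetRecordProperties() function is a great fit here. Sorry, I did not notice that you were already using it. The query you wrote is currently the best way to do this.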
