Is there any function (change Tuple to Array) or (sum Array by key)?

Posted: 2019-05-06 00:48:50

Question:

Q1 and Q2 are the same problem seen from two sides. If the data is stored as (key, value) tuples, is there any SQL that produces the same result?

(1,3)(2,5)(4,7)
(1,3)(2,5)(3,4)
(2,3)(7,5)(10,4)

Q1: sumMap turns Arrays into a Tuple, but how do I turn the Tuple back into Arrays?

select sumMap(a, b) from (
select array(1,2,4) as a, array(3,5,7) as b
union all
select array(1,2,3) as a, array(3,5,4) as b
union all
select array(2,7,10) as a, array(3,5,4) as b);

│ ([1,2,3,4,7,10],[6,13,4,7,5,4]) │

Wrong SQL (what I tried), followed by the result I want:

select sumMap(a, b).[0], sumMap(a, b).[1] from tbl

[1,2,3,4,7,10]   [6,13,4,7,5,4]

Q2: How do I sum arrays by key, the way sumMap does?

select array(1,2,4) as a, array(3,5,7) as b
union all
select array(1,2,3) as a, array(3,5,4) as b
union all
select array(2,7,10) as a, array(3,5,4) as b

│ [1,2,4] │ [3,5,7] │
│ [2,7,10]│ [3,5,4] │
│ [1,2,3] │ [3,5,4] │

Wrong SQL (what I tried), followed by the result I want:

select sumBykey(a, a), sumBykey(b, a).key2 from tbl

[1,2,3,4,7,10]   [6,13,4,7,5,4]

Answer 1:

You need to use the tuple access operators.

SELECT
    sumMap(a, b) AS summap,
    summap.1 AS a1,
    summap.2 AS a2
FROM
(
    SELECT [1, 2, 4] AS a, [3, 5, 7] AS b
    UNION ALL
    SELECT [1, 2, 3] AS a, [3, 5, 4] AS b
    UNION ALL
    SELECT [2, 7, 10] AS a, [3, 5, 4] AS b
)

/* Result:
    ┌─summap──────────────────────────┬─a1─────────────┬─a2─────────────┐
    │ ([1,2,3,4,7,10],[6,13,4,7,5,4]) │ [1,2,3,4,7,10] │ [6,13,4,7,5,4] │
    └─────────────────────────────────┴────────────────┴────────────────┘
*/
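For reference, the dot operator used above is shorthand for the tupleElement function, and tuple positions are counted from 1. That is why the `.[0]` / `.[1]` attempt in the question fails. The following is a minimal sketch of the same query written with the function form; it reuses the sample data and aliases from the answer.

```sql
-- Same query as above, using tupleElement(tuple, n) instead of tuple.n.
-- Tuple positions are 1-based: 1 is the key array, 2 is the summed values.
SELECT
    sumMap(a, b) AS summap,
    tupleElement(summap, 1) AS a1,
    tupleElement(summap, 2) AS a2
FROM
(
    SELECT [1, 2, 4] AS a, [3, 5, 7] AS b
    UNION ALL
    SELECT [1, 2, 3] AS a, [3, 5, 4] AS b
    UNION ALL
    SELECT [2, 7, 10] AS a, [3, 5, 4] AS b
)
```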

At the moment sumMap supports only numeric keys. For keys of other types, hash them first:

SELECT
    sumMap(arrayMap(x -> xxHash32(x), a), b) AS summap,
    summap.1 AS a1,
    summap.2 AS a2
FROM
(
    SELECT ['1', '2', '4'] AS a, [3, 5, 7] AS b
    UNION ALL
    SELECT ['1', '2', '3'] AS a, [3, 5, 4] AS b
    UNION ALL
    SELECT ['2', '7', '10'] AS a, [3, 5, 4] AS b
)

/* Result:
┌─summap─────────────────────────────────────────────────────────────────────────────┬─a1────────────────────────────────────────────────────────────────┬─a2─────────────┐
│ ([205742900,548432130,1150380693,1842982710,2632741828,3068971186],[13,5,4,7,4,6]) │ [205742900,548432130,1150380693,1842982710,2632741828,3068971186] │ [13,5,4,7,4,6] │
└────────────────────────────────────────────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────┴────────────────┘
*/
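Note that hashing replaces the original keys with their hash values. If the original keys must be kept, one possible alternative (not part of the original answer) is to unnest the arrays with ARRAY JOIN and aggregate with an ordinary GROUP BY. The sketch below assumes the same sample data; the aliases k, v, s, key_arr and sum_arr are illustrative names only.

```sql
-- Sketch: sum values per key without sumMap, keeping the original string keys.
SELECT
    groupArray(k) AS key_arr,   -- one entry per distinct key
    groupArray(s) AS sum_arr    -- sums, aligned position by position with key_arr
FROM
(
    SELECT k, sum(v) AS s
    FROM
    (
        SELECT ['1', '2', '4'] AS a, [3, 5, 7] AS b
        UNION ALL
        SELECT ['1', '2', '3'] AS a, [3, 5, 4] AS b
        UNION ALL
        SELECT ['2', '7', '10'] AS a, [3, 5, 4] AS b
    )
    ARRAY JOIN a AS k, b AS v   -- pairs a[i] with b[i]; both arrays must have equal length
    GROUP BY k
    ORDER BY k                  -- string keys sort lexicographically, not numerically
)
```

The two output arrays line up with each other, but the ordering produced by the inner ORDER BY should not be relied on in the outer groupArray calls; sort explicitly if a particular key order matters.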

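Comments:

sumMap doesn't seem to accept string keys (using Clickhouse 19.13). Is there a workaround?

@Backlin I extended the answer.

In my case I did have to keep the original keys. In the end I used the approach you suggested here, i.e. flatten(groupArray(a)) as str_keys and arrayEnumerateDense(keys) as int_keys (or you could hash here), then arrayReduce('sumMap', [int_keys], ...), and finally arrayFirst to look up the original keys. Thanks a lot for your help!

Is there a way to keep the "intermediate column" out of the output? E.g. if the user only cares about a1, a2, is there a way to drop summap?

@Backlin Glad to help. To hide unused columns, just use a nested query like this: SELECT a1, a2 FROM (SELECT sumMap(arrayMap(x -> xxHash32(x), a), b) AS summap, summap.1 AS a1, summap.2 AS a2 ...).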
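The nested query from the last comment is abbreviated with "...". A complete version might look like the sketch below, which uses the numeric sample data from the answer rather than the hashed string keys. The outer SELECT simply leaves summap out of its column list.

```sql
-- Outer query exposes only a1 and a2; summap stays internal to the subquery.
SELECT a1, a2
FROM
(
    SELECT
        sumMap(a, b) AS summap,
        summap.1 AS a1,
        summap.2 AS a2
    FROM
    (
        SELECT [1, 2, 4] AS a, [3, 5, 7] AS b
        UNION ALL
        SELECT [1, 2, 3] AS a, [3, 5, 4] AS b
        UNION ALL
        SELECT [2, 7, 10] AS a, [3, 5, 4] AS b
    )
)
```

Given the result shown in the answer, this should return the two columns [1,2,3,4,7,10] and [6,13,4,7,5,4].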
