Group rows of data by the leading characters of a column value

Asked: 2011-07-08 15:34:48

Question:

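I have an array like the one shown below. I want to split it into multiple sub-arrays based on the date substring of the timestamp column, for example 2011-02-04: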

Array
(
    [0] => Array
        (
            [avgvalue] => 0
            [maxvalue] => 0
            [minvalue] => 0
            [nrsamples] => 0
            [stddeviation] => 0
            [timestamp] => 2011-02-04T11:00:00.000Z
        )

    [1] => Array
        (
            [avgvalue] => 268.3
            [maxvalue] => 268.3
            [minvalue] => 268.3
            [nrsamples] => 0
            [stddeviation] => 0
            [timestamp] => 2011-02-04T12:00:00.000Z
        )

    [2] => Array
        (
            [avgvalue] => 268.391666667
            [maxvalue] => 268.4
            [minvalue] => 268.3
            [nrsamples] => 0.0288675134595
            [stddeviation] => 0.0288675134595
            [timestamp] => 2011-02-04T13:00:00.000Z
        )

    [3] => Array
        (
            [avgvalue] => 268.433333333
            [maxvalue] => 268.5
            [minvalue] => 268.4
            [nrsamples] => 0.0492365963918
            [stddeviation] => 0.0492365963918
            [timestamp] => 2011-02-04T14:00:00.000Z
        )

    [4] => Array
        (
            [avgvalue] => 268.5
            [maxvalue] => 268.5
            [minvalue] => 268.5
            [nrsamples] => 0
            [stddeviation] => 0
            [timestamp] => 2011-02-04T15:00:00.000Z
        )

    [5] => Array
        (
            [avgvalue] => 268.575
            [maxvalue] => 268.6
            [minvalue] => 268.5
            [nrsamples] => 0.0452267016867
            [stddeviation] => 0.0452267016867
            [timestamp] => 2011-02-04T16:00:00.000Z
        )

    [6] => Array
        (
            [avgvalue] => 268.616666667
            [maxvalue] => 268.7
            [minvalue] => 268.6
            [nrsamples] => 0.0389249472081
            [stddeviation] => 0.0389249472081
            [timestamp] => 2011-02-04T17:00:00.000Z
        )

    [7] => Array
        (
            [avgvalue] => 268.7
            [maxvalue] => 268.7
            [minvalue] => 268.7
            [nrsamples] => 0
            [stddeviation] => 0
            [timestamp] => 2011-02-04T18:00:00.000Z
        )

    [8] => Array
        (
            [avgvalue] => 268.741666667
            [maxvalue] => 268.8
            [minvalue] => 268.7
            [nrsamples] => 0.0514928650545
            [stddeviation] => 0.0514928650545
            [timestamp] => 2011-02-04T19:00:00.000Z
        )

    [9] => Array
        (
            [avgvalue] => 268.8
            [maxvalue] => 268.8
            [minvalue] => 268.8
            [nrsamples] => 0
            [stddeviation] => 0
            [timestamp] => 2011-02-04T20:00:00.000Z
        )

    [10] => Array
        (
            [avgvalue] => 268.883333333
            [maxvalue] => 268.9
            [minvalue] => 268.8
            [nrsamples] => 0.0389249472081
            [stddeviation] => 0.0389249472081
            [timestamp] => 2011-02-04T21:00:00.000Z
        )
 )

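Each sub-array above has a timestamp key. I have already exploded the timestamp value to separate the date from the time, but I am stuck on splitting the array into sub-arrays.

What I want is one array for 2011-02-04 (holding all the rows for that date) and another for 2011-02-05 (holding all the rows for that date). This needs to be dynamic: there may be more dates than these two. How can I do this?

I would like the result to be: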

array[0] => array(... list of all the values for 2011-02-04),
array[1] => array(...list of all values for 2011-02-05)

Answer 1:

Assuming the date format is the same for every entry (and it appears to be), you can simply loop over the array:

$result = array();

foreach ($array as $item) {
    // Everything before the "T" separator is the date, e.g. "2011-02-04"
    $date = strstr($item['timestamp'], 'T', true);
    if (!array_key_exists($date, $result)) {
        $result[$date] = array();
    }
    $result[$date][] = $item;
}

Reference: strstr, array_key_exists

Depending on the order of the items in the original array, you may have to use ksort to sort the $result array chronologically.
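As a minimal sketch of that sorting step: the rows below are hypothetical and deliberately out of order (they are not the asker's full data set). Because the keys are YYYY-MM-DD strings, sorting them as plain strings also sorts them chronologically.

```php
<?php
// Hypothetical sample rows, deliberately out of chronological order.
$array = [
    ['avgvalue' => 268.5, 'timestamp' => '2011-02-05T09:00:00.000Z'],
    ['avgvalue' => 0,     'timestamp' => '2011-02-04T11:00:00.000Z'],
    ['avgvalue' => 268.3, 'timestamp' => '2011-02-04T12:00:00.000Z'],
];

$result = [];
foreach ($array as $item) {
    // Everything before the "T" separator is the date part.
    $date = strstr($item['timestamp'], 'T', true);
    if (!array_key_exists($date, $result)) {
        $result[$date] = [];
    }
    $result[$date][] = $item;
}

// Groups are created in first-seen order, so 2011-02-05 comes first here.
$keysBefore = array_keys($result);

// Sort the groups by their date key, ascending.
ksort($result);
$keysAfter = array_keys($result);

print_r($keysBefore);
print_r($keysAfter);
```

If the source rows already arrive in chronological order, the ksort call changes nothing and can be left out.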

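Answer 2:

There is no need to check whether the array key exists before pushing a new sub-array. PHP creates the date-keyed group automatically the first time you push a row into it.

The solution is simple: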

$result = [];
foreach ($array as $row) {
    // The first 10 characters of the timestamp are the date (YYYY-MM-DD)
    $result[substr($row['timestamp'], 0, 10)][] = $row;
}

After grouping, if you want to sort the groups, use ksort($result) for ascending order or krsort($result) for descending order. If you want to re-index the first-level keys, use $result = array_values($result).
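A rough sketch of those two follow-up steps, using hypothetical sample rows rather than the full data from the question. The variable names $newestFirst and $indexed are illustrative only.

```php
<?php
// Hypothetical sample rows spanning two dates.
$array = [
    ['avgvalue' => 0,     'timestamp' => '2011-02-04T11:00:00.000Z'],
    ['avgvalue' => 268.5, 'timestamp' => '2011-02-05T09:00:00.000Z'],
    ['avgvalue' => 268.3, 'timestamp' => '2011-02-04T12:00:00.000Z'],
];

$result = [];
foreach ($array as $row) {
    // Group by the first 10 characters of the timestamp (YYYY-MM-DD).
    $result[substr($row['timestamp'], 0, 10)][] = $row;
}

// Descending order by date key: the newest date comes first.
krsort($result);
$newestFirst = array_keys($result);

// Replace the date keys with 0, 1, 2, ... while keeping the current order.
$indexed = array_values($result);

print_r($newestFirst);
print_r(array_keys($indexed));
```

Note that array_values discards the date keys, so run it only after any key-based sorting is done; the date is still available inside each row's timestamp.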
