How to group three variable with different averages in MySQL?

Posted: 2016-03-30 17:41:39

Question:
+-----------------+---------------------+
|    date_time    |                vale |
+-----------------+---------------------+
| 12/13/2015 0:00 |        56.75        |
| 12/13/2015 0:15 |        208.75       |
| 12/13/2015 0:30 |        58.8         |
| 12/13/2015 0:45 |        61.79        |
| 12/13/2015 1:00 |        288.65       |
| 12/13/2015 1:15 |        89.1         |
| 12/13/2015 1:30 |        28.9         |
| 12/13/2015 1:45 |        57.04        |
| 12/14/2015 1:00 |        63.87        |
| 12/14/2015 1:15 |        219.83       |
| 12/14/2015 1:30 |        64.95        |
| 12/14/2015 1:45 |        65.24        |
| 12/14/2015 2:00 |        55.67        |
| 12/14/2015 2:15 |        21.63        |
| 12/14/2015 2:30 |        56.75        |
| 12/14/2015 2:45 |        57.04        |
+-----------------+---------------------+

I have date_time values and their respective readings (vale). How can I take the average per hour, per day, and per week, as shown below?

+-----------------+-----------------+-----------+----------+
|    date_time    |        hour_avg |   day_avg | week_avg |
+-----------------+-----------------+-----------+----------+
| 12/13/2015 0:00 | 96.52           |   106.2   |     90.9 |
| 12/13/2015 1:00 | 115.9           |   106.2   |     90.9 |
| 12/14/2015 1:00 | 103.4           |   75.6    |     90.9 |
| 12/14/2015 2:00 | 47.7            |   75.6    |     90.9 |
+-----------------+-----------------+-----------+----------+

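Comments:

If you want to show the hour in the date_time column, then day_avg and week_avg also have to take the time into account... And, by the way, how do you want to calculate day_avg and week_avg?

This does not answer your exact question, but another approach (with a different output style) would be to use WITH ROLLUP: dev.mysql.com/doc/refman/5.7/en/group-by-modifiers.html

Answer 1: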

One way to achieve this is to GROUP BY date and hour, plus correlated subqueries for the whole day and the whole week:

SELECT 
   DATE_ADD(CAST(date_time AS DATE), INTERVAL  HOUR(date_time) HOUR) AS date_time
   ,ROUND(AVG(vale),1) AS hour_avg
   ,ROUND((SELECT AVG(vale) FROM tab t2 WHERE DATE(t2.date_time) = DATE(t.date_time) GROUP BY DATE(date_time)),1) AS  day_avg
   ,ROUND((SELECT AVG(vale) FROM tab t2 WHERE WEEK(t2.date_time) = WEEK(t.date_time) AND YEAR(t.date_time) = YEAR(t2.date_time)  GROUP BY WEEK(date_time)),1) AS  week_avg
FROM tab t
GROUP BY DATE(date_time), HOUR(date_time);

SqlFiddleDemo

Output:

╔═════════════════════════════╦═══════════╦══════════╦══════════╗
║         date_time           ║ hour_avg  ║ day_avg  ║ week_avg ║
╠═════════════════════════════╬═══════════╬══════════╬══════════╣
║ December, 13 2015 00:00:00  ║ 96.5      ║ 106.2    ║ 90.9     ║
║ December, 13 2015 01:00:00  ║ 115.9     ║ 106.2    ║ 90.9     ║
║ December, 14 2015 01:00:00  ║ 103.5     ║ 75.6     ║ 90.9     ║
║ December, 14 2015 02:00:00  ║ 47.8      ║ 75.6     ║ 90.9     ║
╚═════════════════════════════╩═══════════╩══════════╩══════════╝
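
On MySQL 8.0 or later, the same three averages could also be sketched with window functions instead of correlated subqueries. This is a different technique from the query above, not a rewrite of it. It assumes the same table tab with a DATETIME column date_time and a numeric column vale, and it uses YEARWEEK() with its default mode (weeks starting on Sunday) as the week bucket. The alias date_time_hour is illustrative.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
-- Assumes tab(date_time DATETIME, vale <numeric>) as in the answer above.
SELECT DISTINCT
    -- Truncate each timestamp to the start of its hour.
    DATE(date_time) + INTERVAL HOUR(date_time) HOUR AS date_time_hour,
    -- Average over all rows in the same date and hour.
    ROUND(AVG(vale) OVER (PARTITION BY DATE(date_time), HOUR(date_time)), 1) AS hour_avg,
    -- Average over all rows on the same date.
    ROUND(AVG(vale) OVER (PARTITION BY DATE(date_time)), 1) AS day_avg,
    -- Average over all rows in the same year/week bucket.
    ROUND(AVG(vale) OVER (PARTITION BY YEARWEEK(date_time)), 1) AS week_avg
FROM tab
ORDER BY date_time_hour;
```

Every row in a given hour produces the same four values, so DISTINCT collapses them to one row per hour. The table is read once rather than once per correlated subquery, although whether that is faster depends on the data and indexes.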

Answer 2:

Plan

- Calculate the averages at each grain of grouping
- Join the grains together at the hour level

Query

select ha.grain, ha.hour_avg, da.day_avg, wa.week_avg
from
(
select date(date_time) + interval hour(date_time) hour as grain, avg(vale) hour_avg
from temperature
group by date(date_time), hour(date_time)
) ha
inner join
(
select date(date_time) as day, avg(vale) as day_avg
from temperature
group by date(date_time)
) da
on date(grain) = da.day
inner join
(
select year(date_time) as year, week(date_time) as week, avg(vale) as week_avg
from temperature
group by year(date_time), week(date_time)
) wa
on wa.year = year(ha.grain)
and wa.week = week(ha.grain)
;

Output

+----------------------------+----------+----------+----------+
|           grain            | hour_avg | day_avg  | week_avg |
+----------------------------+----------+----------+----------+
| December, 13 2015 00:00:00 | 96.5225  | 106.2225 | 90.9225  |
| December, 13 2015 01:00:00 | 115.9225 | 106.2225 | 90.9225  |
| December, 14 2015 01:00:00 | 103.4725 | 75.6225  | 90.9225  |
| December, 14 2015 02:00:00 | 47.7725  | 75.6225  | 90.9225  |
+----------------------------+----------+----------+----------+

sqlfiddle

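Comments:

You should add YEAR (together with WEEK) to the join on the wa subquery. The week number alone produces incorrect results when the table contains several years of data. Demo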
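
One way to follow this suggestion is to key the weekly subquery on YEARWEEK(), which returns the year and week as a single number. A minimal sketch, assuming the temperature table from the query above; the alias yw is illustrative, and only the hour and week grains are shown to keep it short:

```sql
-- Sketch only: joins hourly averages to weekly averages on a combined
-- year+week key, so the same week number in different years never collides.
SELECT ha.grain, ha.hour_avg, wa.week_avg
FROM (
    -- One row per hour; grouping by the alias keeps SELECT and GROUP BY identical.
    SELECT DATE(date_time) + INTERVAL HOUR(date_time) HOUR AS grain,
           AVG(vale) AS hour_avg
    FROM temperature
    GROUP BY grain
) ha
INNER JOIN (
    -- One row per year/week bucket, e.g. 201550.
    SELECT YEARWEEK(date_time) AS yw,
           AVG(vale) AS week_avg
    FROM temperature
    GROUP BY yw
) wa
    ON wa.yw = YEARWEEK(ha.grain);
```

With its default mode, YEARWEEK() can assign the first days of January to the last week of the previous year, so a week that spans New Year stays in one group, whereas the YEAR() plus WEEK() pair would split it. Which behavior is wanted depends on how a "week" is defined for the report.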
