SQL characteristic function for avg dates

【Title】: SQL characteristic function for avg dates 【Posted】: 2009-10-20 13:14:13 【Question】:

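I have a query that gets the price for each of a set of specific dates, but now I want to use something similar to get the average price for each day of the week.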

Here is my current query, which works for specific dates pulled from a table called availables:

SELECT rooms.name, rooms.roomtype, rooms.id, max(availables.updated_at),
MAX(IF(to_days(availables.bookdate) - to_days('2009-12-10') = 0, (availables.price*0.66795805223432), '')) AS day1,
MAX(IF(to_days(availables.bookdate) - to_days('2009-12-10') = 1, (availables.price*0.66795805223432), '')) AS day2,
MAX(IF(to_days(availables.bookdate) - to_days('2009-12-10') = 2, (availables.price*0.66795805223432), '')) AS day3,
MAX(IF(to_days(availables.bookdate) - to_days('2009-12-10') = 3, (availables.price*0.66795805223432), '')) AS day4,
MAX(IF(to_days(availables.bookdate) - to_days('2009-12-10') = 4, (availables.price*0.66795805223432), '')) AS day5,
MAX(IF(to_days(availables.bookdate) - to_days('2009-12-10') = 5, (availables.price*0.66795805223432), '')) AS day6,
MAX(IF(to_days(availables.bookdate) - to_days('2009-12-10') = 6, (availables.price*0.66795805223432), '')) AS day7,
MIN(spots) as spots
     FROM `availables`
     INNER JOIN rooms
     ON availables.room_id=rooms.id
     WHERE rooms.hotel_id = '5064' AND bookdate
     BETWEEN '2009-12-10' AND DATE_ADD('2009-12-10', INTERVAL 6 DAY)
     GROUP BY rooms.name
     ORDER BY rooms.ppl

My first stab doesn't work, probably because the DAYOFWEEK function behaves very differently from to_days...

SELECT rooms.id, rooms.name,
MAX(IF(DAYOFWEEK(availables.bookdate) - DAYOFWEEK('2009-12-10') = 0, (availables.price*0.66795805223432), '')) AS day1,
MAX(IF(DAYOFWEEK(availables.bookdate) - DAYOFWEEK('2009-12-10') = 1, (availables.price*0.66795805223432), '')) AS day2,
MAX(IF(DAYOFWEEK(availables.bookdate) - DAYOFWEEK('2009-12-10') = 2, (availables.price*0.66795805223432), '')) AS day3,
MAX(IF(DAYOFWEEK(availables.bookdate) - DAYOFWEEK('2009-12-10') = 3, (availables.price*0.66795805223432), '')) AS day4,
MAX(IF(DAYOFWEEK(availables.bookdate) - DAYOFWEEK('2009-12-10') = 4, (availables.price*0.66795805223432), '')) AS day5,
MAX(IF(DAYOFWEEK(availables.bookdate) - DAYOFWEEK('2009-12-10') = 5, (availables.price*0.66795805223432), '')) AS day6,
MAX(IF(DAYOFWEEK(availables.bookdate) - DAYOFWEEK('2009-12-10') = 6, (availables.price*0.66795805223432), '')) AS day7,rooms.ppl AS spots FROM `availables` 
 INNER JOIN `rooms` ON `rooms`.id = `availables`.room_id 
 WHERE (rooms.hotel_id = 5064 AND rooms.ppl > 3 AND availables.price > 0 AND availables.spots > 1) 
 GROUP BY rooms.name
 ORDER BY rooms.ppl

Maybe I'm making this harder than it needs to be and someone knows a simpler way.

It needs to turn data that looks like this:

#Availables
id    room_id   price    spots    bookdate
1     26        $5       5        2009-10-20
2     26        $6       5        2009-10-21

into:

+----+-------+--------------------+---------------------+---------------------+---------------------+------+------+------+------+
| id | spots | name               | day1                | day2                | day3                | day4 | day5 | day6 | day7 |
+----+-------+--------------------+---------------------+---------------------+---------------------+------+------+------+------+
| 25 | 4     | Blue Room          | 14.9889786921381408 | 14.9889786921381408 | 14.9889786921381408 |      |      |      |      |
| 26 | 6     | Whatever           | 13.7398971344599624 | 13.7398971344599624 | 13.7398971344599624 |      |      |      |      |
| 27 | 8     | Some name          | 11.2417340191036056 | 11.2417340191036056 | 11.2417340191036056 |      |      |      |      |
| 28 | 8     | Another            | 9.9926524614254272  | 9.9926524614254272  | 9.9926524614254272  |      |      |      |      |
| 29 | 10    | Stuff              | 7.4944893460690704  | 7.4944893460690704  | 7.4944893460690704  |      |      |      |      |
+----+-------+--------------------+---------------------+---------------------+---------------------+------+------+------+------+

【Comments】:

【Answer 1】:

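If I understand correctly, it looks like you just need to take out the "- DAYOFWEEK('2009-12-10')", since DAYOFWEEK(availables.bookdate) already returns a number representing the day of the week.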
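To see why the subtraction gets in the way, here is a small sketch comparing the two expressions on a few hand-picked dates. The dates are illustrative, not rows from the asker's table. With TO_DAYS the difference is a running offset from the start date, but with DAYOFWEEK it wraps around, so it can go negative and never reaches 3 through 6 when the reference date is a Thursday.

```sql
-- Illustrative dates only; not taken from the availables table.
SELECT d AS bookdate,
       DAYOFWEEK(d) AS dow,                                   -- 1 = Sunday ... 7 = Saturday
       DAYOFWEEK(d) - DAYOFWEEK('2009-12-10') AS dow_diff,    -- wraps around the week
       TO_DAYS(d)   - TO_DAYS('2009-12-10')   AS day_offset   -- running offset in days
FROM (
    SELECT '2009-12-10' AS d   -- Thursday:  dow 5, dow_diff  0, day_offset 0
    UNION ALL
    SELECT '2009-12-12'        -- Saturday:  dow 7, dow_diff  2, day_offset 2
    UNION ALL
    SELECT '2009-12-13'        -- Sunday:    dow 1, dow_diff -4, day_offset 3
    UNION ALL
    SELECT '2009-12-16'        -- Wednesday: dow 4, dow_diff -1, day_offset 6
) AS sample;
```

Because DAYOFWEEK('2009-12-10') is 5, dow_diff can only fall between -4 and 2, which is why the day4 to day7 columns in the attempted query can never match.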

Also, DAYOFWEEK returns a number from 1 to 7 (1 = Sunday, 7 = Saturday), not 0 to 6, so you will need to adjust the comparisons accordingly.
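Putting both points together, the attempted query might be adjusted along these lines. This is a sketch, not a tested fix: it assumes that AVG is what is wanted in place of MAX (the question asks for averages), that NULL can replace the empty string so AVG skips rows for other weekdays, and that day1 to day7 now mean Sunday to Saturday.

```sql
-- Sketch under the assumptions stated above.
-- Each column averages only the rows whose bookdate falls on that weekday;
-- AVG() ignores the NULLs produced for every other weekday.
SELECT rooms.id, rooms.name, rooms.ppl AS spots,
       AVG(IF(DAYOFWEEK(availables.bookdate) = 1, availables.price * 0.66795805223432, NULL)) AS day1,  -- Sunday
       AVG(IF(DAYOFWEEK(availables.bookdate) = 2, availables.price * 0.66795805223432, NULL)) AS day2,
       AVG(IF(DAYOFWEEK(availables.bookdate) = 3, availables.price * 0.66795805223432, NULL)) AS day3,
       AVG(IF(DAYOFWEEK(availables.bookdate) = 4, availables.price * 0.66795805223432, NULL)) AS day4,
       AVG(IF(DAYOFWEEK(availables.bookdate) = 5, availables.price * 0.66795805223432, NULL)) AS day5,
       AVG(IF(DAYOFWEEK(availables.bookdate) = 6, availables.price * 0.66795805223432, NULL)) AS day6,
       AVG(IF(DAYOFWEEK(availables.bookdate) = 7, availables.price * 0.66795805223432, NULL)) AS day7   -- Saturday
FROM availables
INNER JOIN rooms ON rooms.id = availables.room_id
WHERE rooms.hotel_id = 5064
  AND rooms.ppl > 3
  AND availables.price > 0
  AND availables.spots > 1
GROUP BY rooms.id, rooms.name, rooms.ppl
ORDER BY rooms.ppl;
```

A weekday with no matching rows comes back as NULL rather than an empty string, which the application layer would need to handle.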

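It may be more efficient to do a "GROUP BY room_id, DAYOFWEEK(availables.bookdate)" in an inner subquery to group the average prices by room and day of the week, and then do the pivot in the outer query.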
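A rough sketch of that two-step shape follows. The aliases a, r, r2 and d and the column names dow and avg_price are made up for illustration, and the filters are copied from the question's second query. Whether this is actually faster depends on the data and indexes, so treat it as something to measure rather than a guaranteed improvement.

```sql
-- Inner query: one row per (room, weekday) with the average converted price.
-- Outer query: pivot those rows into one column per weekday.
SELECT r.id, r.name, r.ppl AS spots,
       MAX(IF(d.dow = 1, d.avg_price, NULL)) AS day1,  -- Sunday
       MAX(IF(d.dow = 2, d.avg_price, NULL)) AS day2,
       MAX(IF(d.dow = 3, d.avg_price, NULL)) AS day3,
       MAX(IF(d.dow = 4, d.avg_price, NULL)) AS day4,
       MAX(IF(d.dow = 5, d.avg_price, NULL)) AS day5,
       MAX(IF(d.dow = 6, d.avg_price, NULL)) AS day6,
       MAX(IF(d.dow = 7, d.avg_price, NULL)) AS day7   -- Saturday
FROM rooms AS r
INNER JOIN (
    SELECT a.room_id,
           DAYOFWEEK(a.bookdate)             AS dow,
           AVG(a.price * 0.66795805223432)   AS avg_price
    FROM availables AS a
    INNER JOIN rooms AS r2 ON r2.id = a.room_id
    WHERE r2.hotel_id = 5064
      AND a.price > 0
      AND a.spots > 1
    GROUP BY a.room_id, DAYOFWEEK(a.bookdate)
) AS d ON d.room_id = r.id
WHERE r.ppl > 3
GROUP BY r.id, r.name, r.ppl
ORDER BY r.ppl;
```

The outer MAX is only there to collapse the pivot: each (room, weekday) pair has at most one row from the subquery, so MAX simply picks that single average.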
