Date Transformation using Hive or SQL
【Posted】: 2020-04-09 01:56:33
【Question】: Premise: you have a table with a single column, original_date, whose data type is string:
ORIGINAL_DATE
20190825
20190826
20190827
20190828
20190829
20190830
20190831
20190901
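Problem: write a SQL query that computes two additional columns:
- end_of_week: the date of the next Sunday on or after original_date. If original_date is already a Sunday, this field should hold the same date.
- end_of_month: the last day of the month that contains original_date.
An acceptable solution works for any valid date in the original_date string format.
Calculated end_of_week and end_of_month: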
ORIGINAL_DATE END_OF_WEEK END_OF_MONTH
20190825 20190825 20190831
20190826 20190901 20190831
20190827 20190901 20190831
20190828 20190901 20190831
20190829 20190901 20190831
20190830 20190901 20190831
20190831 20190901 20190831
20190901 20190901 20190930
Additional information:
20190825 is a Sunday, so end_of_week for that value stays the same date.
20190827 is a Tuesday, and the next Sunday is 20190901.
CREATE TABLE random_dates ( original_date VARCHAR(8) NOT NULL );
INSERT INTO random_dates(original_date) values('20190825');
INSERT INTO random_dates(original_date) values('20190826');
INSERT INTO random_dates(original_date) values('20190827');
INSERT INTO random_dates(original_date) values('20190828');
INSERT INTO random_dates(original_date) values('20190829');
INSERT INTO random_dates(original_date) values('20190830');
INSERT INTO random_dates(original_date) values('20190831');
INSERT INTO random_dates(original_date) values('20190901');
Expected output:
20190825 2019-08-25 2019-08-31
20190826 2019-09-01 2019-08-31
20190827 2019-09-01 2019-08-31
20190828 2019-09-01 2019-08-31
20190829 2019-09-01 2019-08-31
20190830 2019-09-01 2019-08-31
20190831 2019-09-01 2019-08-31
20190901 2019-09-01 2019-09-30
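As a dialect-independent cross-check of the expected output, the two rules can be sketched in Python. This is only a reference sketch, not part of the original question: the function names `end_of_week` and `end_of_month` are borrowed from the column names, and the input is assumed to always be a valid yyyyMMdd string.

```python
import calendar
from datetime import date, datetime, timedelta


def end_of_week(original_date: str) -> date:
    """Return the Sunday on or after the given yyyyMMdd string."""
    d = datetime.strptime(original_date, "%Y%m%d").date()
    # date.weekday() is 0 for Monday ... 6 for Sunday,
    # so 6 - weekday() is the number of days left until Sunday.
    return d + timedelta(days=6 - d.weekday())


def end_of_month(original_date: str) -> date:
    """Return the last day of the month containing the yyyyMMdd string."""
    d = datetime.strptime(original_date, "%Y%m%d").date()
    # calendar.monthrange returns (weekday of day 1, number of days in month).
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


if __name__ == "__main__":
    sample = ["20190825", "20190826", "20190827", "20190828",
              "20190829", "20190830", "20190831", "20190901"]
    for s in sample:
        print(s, end_of_week(s), end_of_month(s))
```

Running it over the eight sample values is a quick way to compare any SQL answer against the table above.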
【Comments on the question】:
【Answer 1】: Hive solution:
with random_dates as ( -- this is your example dataset
    select stack(8,
        '20190825', '20190826', '20190827', '20190828', '20190829', '20190830', '20190831', '20190901'
    ) as original_date
)
select original_date,
       date_add(date_formatted, 6 - days) end_of_week, -- days is 0 (Monday) .. 6 (Sunday)
       last_day(date_formatted) end_of_month
from
(
    select original_date,
           -- convert yyyyMMdd to yyyy-MM-dd
           regexp_replace(original_date,'^(\\d{4})(\\d{2})(\\d{2})$','$1-$2-$3') date_formatted,
           -- days elapsed since the most recent Monday; 1900-01-08 is a Monday
           pmod(datediff(regexp_replace(original_date,'^(\\d{4})(\\d{2})(\\d{2})$','$1-$2-$3'),'1900-01-08'),7) days
    from random_dates
) s
;
Result:
original_date end_of_week end_of_month
20190825 2019-08-25 2019-08-31
20190826 2019-09-01 2019-08-31
20190827 2019-09-01 2019-08-31
20190828 2019-09-01 2019-08-31
20190829 2019-09-01 2019-08-31
20190830 2019-09-01 2019-08-31
20190831 2019-09-01 2019-08-31
20190901 2019-09-01 2019-09-30
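The weekday arithmetic in the Hive query can be checked outside Hive. The sketch below mirrors the `pmod(datediff(..., '1900-01-08'), 7)` term and the `6 - days` offset in Python; the helper names are hypothetical and chosen only for illustration. It relies on 1900-01-08 being a Monday, so the remainder is 0 for Monday and 6 for Sunday.

```python
from datetime import date

# The reference date used in the Hive query; it falls on a Monday.
ANCHOR_MONDAY = date(1900, 1, 8)


def days_since_monday(d: date) -> int:
    """Mirror pmod(datediff(d, '1900-01-08'), 7): 0 = Monday ... 6 = Sunday."""
    # Python's % yields a non-negative result for a positive modulus,
    # which matches what pmod guarantees in Hive, even before the anchor date.
    return (d - ANCHOR_MONDAY).days % 7


def days_until_sunday(d: date) -> int:
    """Mirror the 6 - days offset that the query passes to date_add."""
    return 6 - days_since_monday(d)


if __name__ == "__main__":
    for d in (date(2019, 8, 25), date(2019, 8, 27), date(2019, 8, 31)):
        print(d, days_since_monday(d), days_until_sunday(d))
```

Using `pmod` rather than a plain remainder matters only for dates earlier than the anchor, where the day difference is negative.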
【Comments】:
【Answer 2】:
-- MySQL-style date functions; DAYOFWEEK returns 1 (Sunday) through 7 (Saturday)
SELECT original_date,
CASE DAYOFWEEK(STR_TO_DATE(original_date,'%Y%m%d'))
WHEN 1 THEN DATE_ADD(STR_TO_DATE(original_date,'%Y%m%d'),INTERVAL 0 DAY)
WHEN 2 THEN DATE_ADD(STR_TO_DATE(original_date,'%Y%m%d'),INTERVAL 6 DAY)
WHEN 3 THEN DATE_ADD(STR_TO_DATE(original_date,'%Y%m%d'),INTERVAL 5 DAY)
WHEN 4 THEN DATE_ADD(STR_TO_DATE(original_date,'%Y%m%d'),INTERVAL 4 DAY)
WHEN 5 THEN DATE_ADD(STR_TO_DATE(original_date,'%Y%m%d'),INTERVAL 3 DAY)
WHEN 6 THEN DATE_ADD(STR_TO_DATE(original_date,'%Y%m%d'),INTERVAL 2 DAY)
WHEN 7 THEN DATE_ADD(STR_TO_DATE(original_date,'%Y%m%d'),INTERVAL 1 DAY)
END AS END_OF_WEEK,
LAST_DAY(STR_TO_DATE(original_date,'%Y%m%d')) AS END_OF_MONTH
FROM random_dates;
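The query maps each DAYOFWEEK value (1 = Sunday through 7 = Saturday) to the number of days left until Sunday. A small Python sketch, with hypothetical helper names, checks that the seven CASE branches are equivalent to the closed form `(8 - dayofweek) % 7`. It is offered only as an explanation of the branch values, not as a replacement for the query.

```python
from datetime import date


def mysql_dayofweek(d: date) -> int:
    """Emulate DAYOFWEEK numbering: 1 = Sunday, 2 = Monday, ..., 7 = Saturday."""
    # date.isoweekday() is 1 = Monday ... 7 = Sunday.
    return d.isoweekday() % 7 + 1


# Offsets copied from the CASE branches: DAYOFWEEK value -> days to add.
CASE_OFFSETS = {1: 0, 2: 6, 3: 5, 4: 4, 5: 3, 6: 2, 7: 1}


def offset_by_formula(dayofweek: int) -> int:
    """Closed form for the CASE: days to add to reach the next Sunday."""
    return (8 - dayofweek) % 7


if __name__ == "__main__":
    for dow, offset in CASE_OFFSETS.items():
        print(dow, offset, offset_by_formula(dow))
```

If the closed form is preferred, the CASE expression could be collapsed into a single date addition, at the cost of being less explicit about each weekday.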
【Comments】:
Could you provide some context? What is your code snippet doing (and why)? :)