Amazon Athena: create a query with partitions
【Title】amazon athena create request with partitions 【Posted】2018-08-06 19:00:00 【Question】I created a table that is partitioned first by year, then by month, then by day.
Question: I want to get the data for 12/2017 and 03/2018. How do I do that? My idea:
where (year='2017' and month='12') and ( year ='2018' and month='03')
Is this correct? Or will Amazon Athena get confused and fetch the following data instead:
12/2017 and 03/2018 and 03/2017 and 12/2018
because of the and operator?
PS: I can't test this myself; I only have a free account. Thanks.
【Answer 1】: In the end I tried it on a small data set and found that Amazon Athena does respect the parentheses.
My test was as follows. DDL of the generated table:
CREATE EXTERNAL TABLE `manyands`(
`years` int COMMENT 'from deserializer',
`months` int COMMENT 'from deserializer',
`days` int COMMENT 'from deserializer')
PARTITIONED BY (
`year` string,
`month` string)
ROW FORMAT SERDE
'org.openx.data.jsonserde.JsonSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION
's3://mybucket/'
My test data set:
My tests:
1-SELECT * FROM "atlasdatabase"."manyands" where month='1';
I got, in CSV format:
"years","months","days","year","month"
"2017","1","21","2017","1"
"2018","1","81","2018","1"
2-SELECT * FROM "atlasdatabase"."manyands" where month='1' and year='2017';
"years","months","days","year","month"
"2017","1","21","2017","1"
3-SELECT * FROM "atlasdatabase"."manyands" where (month='1' and year='2018') and (month='3' and year='2017') ;
empty (zero records returned)
4-SELECT * FROM "atlasdatabase"."manyands" where (month='1' and year='2018') or (month='3' ) ;
"years","months","days","year","month"
"2018","1","81","2018","1"
"2017","3","73","2017","3"
"2018","3","73","2018","3"
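Conclusion: combine the conditions for different partitions with the OR operator, keeping each partition's conditions inside their own parentheses. Joining the groups with AND returns nothing, because no row can belong to two partitions at once.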
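The AND-versus-OR behaviour can also be checked without an Athena account. The following is only a sketch under stated assumptions: SQLite stands in for Athena, so it shows the boolean logic but not partition pruning; the rows are the ones visible in the outputs above (the original data set may have held more); and `partition_filter` and `run` are hypothetical helper names, not part of any Athena API.

```python
import sqlite3

# Rows copied from the query outputs above; the original data set may have held more.
ROWS = [
    (2017, 1, 21, "2017", "1"),
    (2018, 1, 81, "2018", "1"),
    (2017, 3, 73, "2017", "3"),
    (2018, 3, 73, "2018", "3"),
]


def partition_filter(pairs):
    """Build a WHERE-clause body that matches any of the given (year, month) pairs."""
    groups = []
    for year, month in pairs:
        # Accept digits only, since the values are inlined into the SQL text.
        if not (year.isdigit() and month.isdigit()):
            raise ValueError(f"non-numeric partition value: {year!r}, {month!r}")
        groups.append(f"(year='{year}' AND month='{month}')")
    return " OR ".join(groups)


def run(where):
    """Query a local stand-in for the manyands table (SQLite, not Athena)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE manyands "
        "(years INT, months INT, days INT, year TEXT, month TEXT)"
    )
    conn.executemany("INSERT INTO manyands VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(
        f"SELECT * FROM manyands WHERE {where} ORDER BY month, year"
    ).fetchall()
    conn.close()
    return rows


# Test 3 from above: AND between two different partitions matches nothing.
and_rows = run("(month='1' AND year='2018') AND (month='3' AND year='2017')")

# Same two partitions joined with OR instead.
clause = partition_filter([("2018", "1"), ("2017", "3")])
or_rows = run(clause)

print(and_rows)
print(clause)
print(or_rows)
```

For the original question, `partition_filter([("2017", "12"), ("2018", "03")])` produces `(year='2017' AND month='12') OR (year='2018' AND month='03')`. Because the partition columns are strings, '03' and '3' are different values, so the literals have to match however the partitions were actually registered.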