Getting data for the past 20 Wednesdays: AWS Redshift
【Title】: Getting data of past 20 Wednesday: aws redshift 【Posted】: 2019-04-22 13:53:20 【Question】: I have to write this query for AWS Redshift so that it returns data for the 20 most recent Wednesdays. Help!
SELECT
count(user_leads.id) AS lead_count, DATE(user_leads.created)
FROM
user_leads
join courses on user_leads.course_id = courses.id
left join users on user_leads.user_id = users.id
where
user_leads.created >= '2020-01-31'
AND user_leads.created < '2020-03-03'
AND courses.course_type != 4
-- string literals take single quotes; double quotes denote identifiers
AND users.email not like '%edureka%'
AND users.first_name not like '%test%'
-- Redshift has no WEEKDAY(); DATE_PART(dow, ...) counts from Sunday = 0, so Wednesday = 3
AND DATE_PART(dow, user_leads.created) = 3
GROUP BY DATE(user_leads.created)
ORDER BY DATE(user_leads.created) DESC;
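The query filters on a fixed date range, while the question asks for the 20 most recent Wednesdays. As a rough sketch of the date arithmetic only (not a Redshift solution), the following Python computes those 20 dates relative to a given day. The helper name last_wednesdays and the choice to count today when it is itself a Wednesday are assumptions made for illustration.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday(): Monday is 0, so Wednesday is 2


def last_wednesdays(today: date, count: int = 20) -> list[date]:
    """Return the `count` most recent Wednesdays up to and including `today`."""
    # Days to step back from `today` to reach the most recent Wednesday.
    offset = (today.weekday() - WEDNESDAY) % 7
    latest = today - timedelta(days=offset)
    # Walk back one week at a time from the latest Wednesday.
    return [latest - timedelta(weeks=i) for i in range(count)]


if __name__ == "__main__":
    days = last_wednesdays(date.today())
    print("newest:", days[0], "oldest:", days[-1], "count:", len(days))
```

The oldest date in the list could serve as the lower bound of the created filter. Note that Python's weekday() numbers Wednesday as 2 (Monday = 0), whereas Redshift's dow date part numbers it as 3 (Sunday = 0).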
【Comments】:
Convert it to str and use replace('][','],[')?
【Answer 1】:
Use str.replace():
someFile.json:
[
"Date",
"17/04/2019",
"Skill",
"Travis",
"Repository",
"27,699 repository results"
][
"Date",
"17/04/2019",
"Skill",
"Kotlin",
"Repository",
"55,752 repository results"
]
Then:
with open('someFile.json', 'r') as fp:
    content = fp.readlines()
# drop blank lines and surrounding whitespace
content = [l.strip() for l in content if l.strip()]
for line in content:
    if '][' in line:
        # put a comma between the two adjacent arrays
        print(line.replace('][','],['))
    else:
        print(line)
Output:
[
"Date",
"17/04/2019",
"Skill",
"Travis",
"Repository",
"27,699 repository results"
],[
"Date",
"17/04/2019",
"Skill",
"Kotlin",
"Repository",
"55,752 repository results"
]
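The printed result above is two arrays separated by a comma, which json.loads would still reject as a single document. A minimal sketch of one way to finish the job, assuming the file holds only arrays written back to back as in the sample: join them with commas and wrap the whole text in one outer array. The helper name parse_adjacent_arrays is hypothetical.

```python
import json
import re


def parse_adjacent_arrays(text: str) -> list:
    """Parse text holding JSON arrays written back to back, e.g. '[1][2]'."""
    # Insert a comma between a closing and an opening bracket, allowing
    # whitespace between them. Caveat: this would also rewrite a literal
    # "][" inside a string value, so it only suits data like the sample.
    joined = re.sub(r"\]\s*\[", "],[", text)
    # Wrap everything in one outer array so the result is a single document.
    return json.loads("[" + joined + "]")


if __name__ == "__main__":
    sample = '["Skill", "Travis"][\n"Skill", "Kotlin"]'
    print(parse_adjacent_arrays(sample))
```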
Edit:
A file that actually looks like JSON would be:
someFile.json:
[
{
"date": "Date",
"dt": "17/04/2019",
"skill": "Skill",
"travel": "Travis",
"repo": "Repository",
"dat": "27,699 repository results"
}
][
{
"date": "Date",
"dt": "17/04/2019",
"skill": "Skill",
"travel": "Kotlin",
"repo": "Repository",
"dat": "2327,699 repository results"
}
]
Then:
import json

with open('someFile.json', 'r') as file:
    content = file.read()
clean = content.replace('][', ',')  # cleanup here
json_data = json.loads(clean)
print(json_data)
Output:
[
{'date': 'Date', 'dt': '17/04/2019', 'skill': 'Skill', 'travel': 'Travis', 'repo': 'Repository', 'dat': '27,699 repository results'},
{'date': 'Date', 'dt': '17/04/2019', 'skill': 'Skill', 'travel': 'Kotlin', 'repo': 'Repository', 'dat': '2327,699 repository results'}
]
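String replacement depends on the exact characters that sit between the arrays. As an alternative technique (not the approach used in this answer), the standard library's json.JSONDecoder.raw_decode can read one JSON value at a time from a longer string, so back-to-back documents can be parsed without editing the text. A minimal sketch, where the helper name iter_json_documents is hypothetical:

```python
import json


def iter_json_documents(text: str):
    """Yield each JSON value from text that holds several values in a row."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        if text[pos].isspace():
            pos += 1
            continue
        # Returns the parsed value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value


if __name__ == "__main__":
    sample = '[{"travel": "Travis"}][{"travel": "Kotlin"}]'
    print(list(iter_json_documents(sample)))
```

Each top-level array comes back as its own value, so the caller decides whether to keep them separate or merge them into one list.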
【Comments】:
@ashishmishra That is not valid JSON to begin with.