Timeseries plot from CSV data (Timestamp and events): x-label constant

Posted 2023-02-23

【Title】: Timeseries plot from CSV data (Timestamp and events): x-label constant 【Posted】: 2017-09-30 09:24:09 【Question】:

(This question can be read on its own, but it is a follow-up to: Timeseries from CSV data (Timestamp and events))

I want to visualize CSV data (from 2 files) as time series using Python's pandas module (see the links below). The data looks like this.

Sample data for df1:

             TIMESTAMP  eventid
0  2017-03-20 02:38:24        1
1  2017-03-21 05:59:41        1
2  2017-03-23 12:59:58        1
3  2017-03-24 01:00:07        1
4  2017-03-27 03:00:13        1

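The "eventid" column always contains the value 1, and I am trying to show the total number of events for each day in the dataset. The second dataset, df0, has a similar structure but contains only zeros: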

Sample data for df0:

             TIMESTAMP  eventid
0  2017-03-21 01:38:24        0
1  2017-03-21 03:59:41        0
2  2017-03-22 11:59:58        0
3  2017-03-24 01:03:07        0
4  2017-03-26 03:50:13        0

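The x-axis labels only show the same date over and over. My question: how can the different dates be displayed? (What causes the same date to appear multiple times in the x-axis labels?)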

The script so far:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

df1 = pd.read_csv('timestamp01.csv',  parse_dates=True, index_col='TIMESTAMP')
df0 = pd.read_csv('timestamp00.csv',  parse_dates=True, index_col='TIMESTAMP')

f, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot(df0.resample('D').size())

ax1.set_xlim([pd.to_datetime('2017-01-27'), pd.to_datetime('2017-04-30')])  

ax1.xaxis.set_major_formatter(ticker.FixedFormatter
(df0.index.strftime('%Y-%m-%d')))

plt.setp(ax1.xaxis.get_majorticklabels(), rotation=15)

ax2.plot(df1.resample('D').size())
ax2.set_xlim([pd.to_datetime('2017-03-22'), pd.to_datetime('2017-04-29')])
ax2.xaxis.set_major_formatter(ticker.FixedFormatter(df1.index.strftime
('%Y-%m-%d')))
plt.setp(ax2.xaxis.get_majorticklabels(), rotation=15)
plt.show()

Output: (https://www.dropbox.com/s/z21koflkzglm6c3/figure_1.png?dl=0)

Links I have tried:

http://pandas.pydata.org/pandas-docs/stable/visualization.html

Multiple timeseries plots from Pandas Dataframe

Pandas timeseries plot setting x-axis major and minor ticks and labels

Any help is greatly appreciated.

【Answer 1】:

To make the example reproducible, we can create the following text file (data/timestamp01.csv):

TIMESTAMP;eventid
2017-03-20 02:38:24;1
2017-03-21 05:59:41;1
2017-03-23 12:59:58;1
2017-03-24 01:00:07;1
2017-03-27 03:00:13;1

(and likewise for data/timestamp00.csv). We can then read them in with

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

df1 = pd.read_csv('data/timestamp01.csv', parse_dates=True, index_col='TIMESTAMP', sep=";")
df0 = pd.read_csv('data/timestamp00.csv', parse_dates=True, index_col='TIMESTAMP', sep=";")
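Before plotting, it can help to look at what resample('D').size() returns for the sample rows. The following is a minimal sketch, assuming the same semicolon-separated layout as above. The CSV text is embedded with io.StringIO only so the snippet runs without files on disk, and the name daily_counts is hypothetical.

```python
import io

import pandas as pd

# Same rows as data/timestamp01.csv above, embedded inline (assumption:
# done only so the snippet runs without files on disk).
csv_text = """TIMESTAMP;eventid
2017-03-20 02:38:24;1
2017-03-21 05:59:41;1
2017-03-23 12:59:58;1
2017-03-24 01:00:07;1
2017-03-27 03:00:13;1
"""

df1 = pd.read_csv(io.StringIO(csv_text), parse_dates=True,
                  index_col='TIMESTAMP', sep=';')

# One entry per calendar day between the first and last timestamp;
# days without any event get a count of 0.
daily_counts = df1.resample('D').size()
print(daily_counts)
```

The resampled series has a regular daily DatetimeIndex, which is what gets plotted on the x-axis in the next step.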

and plot them

f, (ax1, ax2) = plt.subplots(1, 2)

ax1.plot(df0.resample('D').size())
ax2.plot(df1.resample('D').size())

plt.setp(ax1.xaxis.get_majorticklabels(), rotation=30, ha="right")
plt.setp(ax2.xaxis.get_majorticklabels(), rotation=30, ha="right")
plt.show()

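The result is the desired plot. Unlike the script in the question, this version sets no FixedFormatter, so matplotlib's default date formatter labels each tick with the date at that tick.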
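As background on the repeated labels: ticker.FixedFormatter assigns its strings to ticks by position (first string to the first tick, and so on), regardless of where each tick sits on the axis. Passing the raw index strings therefore gives labels that do not track the tick dates, and if the first rows of the file fall on the same day, that day is repeated. If an explicit date format is wanted, matplotlib.dates.DateFormatter formats each tick's actual value instead. The sketch below is one possible way to do this, not the answer's own method. The inline data, the Agg backend, and the name daily are assumptions made so it runs on its own.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows of df0 from the question, embedded inline (assumption).
csv_text = """TIMESTAMP;eventid
2017-03-21 01:38:24;0
2017-03-21 03:59:41;0
2017-03-22 11:59:58;0
2017-03-24 01:03:07;0
2017-03-26 03:50:13;0
"""
df0 = pd.read_csv(io.StringIO(csv_text), parse_dates=True,
                  index_col='TIMESTAMP', sep=';')
daily = df0.resample('D').size()  # number of events per calendar day

fig, ax = plt.subplots()
ax.plot(daily.index, daily.to_numpy())

# Put one tick on each day and format every tick from its own date value,
# instead of handing matplotlib a fixed list of strings.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.setp(ax.xaxis.get_majorticklabels(), rotation=30, ha="right")

fig.canvas.draw()  # render once so the tick label texts are filled in
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

In an interactive session, plt.show() would replace the Agg backend line and the explicit draw call. For a range of several months, as in the question's set_xlim calls, one tick per day would be crowded, so a coarser locator (for example DayLocator with a larger interval) is likely a better fit.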
