Matplotlib and Numpy - Create a calendar heatmap
[Posted]: 2015-12-05 19:08:54
[Question]: Is it possible to create a calendar heatmap without using pandas? If so, could someone post a simple example?
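I have dates like Aug-16 and count values like 16, and I thought a calendar heatmap would be a quick and easy way to show the intensity of the counts from day to day over a long period of time.
Thank you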
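[Question comments]:
A Seaborn heatmap may be what you are looking for: seaborn.pydata.org/generated/seaborn.heatmap.html
[Answer 1]: Disclaimer: this is a plug for my own package. Although I am a few years late to help the OP, I hope others will find it useful.
I did some digging on a related problem. When I could not find any other package that met all my requirements, I ended up writing a new one.
The package is not polished yet and its documentation is still sparse, but I published it on PyPI anyway so that others can use it. Any feedback is appreciated, either here or on my GitHub.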
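july
The package is named july and can be installed with pip:
$ pip install july
Here are some use cases taken straight from the README:
Import the package and generate data: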
import numpy as np
import july
from july.utils import date_range
dates = date_range("2020-01-01", "2020-12-31")
data = np.random.randint(0, 14, len(dates))
GitHub activity-style plot:
july.heatmap(dates, data, title='Github Activity', cmap="github")
Daily heatmap of continuous data (with a colorbar):
july.heatmap(
    osl_df.date,  # Here, osl_df is a pandas data frame.
    osl_df.temp,
    cmap="golden",
    colorbar=True,
    title="Average temperatures: Oslo, Norway"
)
Outline each month with month_grid=True:
july.heatmap(dates=dates,
             data=data,
             cmap="Pastel1",
             month_grid=True,
             horizontal=True,
             value_label=False,
             date_label=False,
             weekday_label=True,
             month_label=True,
             year_label=True,
             colorbar=False,
             fontfamily="monospace",
             fontsize=12,
             title=None,
             titlesize="large",
             dpi=100)
Finally, you can also create month or calendar plots:
# july.month_plot(dates, data, month=5) # This will plot only May.
july.calendar_plot(dates, data)
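Similar packages:
calplot, by Tom Kwok.
GitHub: Link
Install: pip install calplot
More actively maintained and better documented than july.
Pandas-centric: it takes a pandas Series with dates and values.
A very good choice if you only need the heatmap functionality and do not need month_plot or calendar_plot.
calmap, by Martijn Vermaat.
GitHub: Link
Install: pip install calmap
The package that calplot grew out of.
Appears to have been actively maintained for longer.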
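[Discussion]:
Hi, do you know of any way to make a calendar heatmap with only months and years? I do not have weekly data, and when I try july or calplot I get a single shaded cell per month, because they assume that only one week of the month has data.
[Answer 2]: I wanted to create a calendar heatmap in which each month is displayed separately. I also needed to annotate each day with its day number (day_of_month) and its value label.
I was inspired by the answers posted here, as well as by the following sites:
Here, although in R
Heatmap using pcolormesh
However, I could not find anything that did exactly what I wanted, so I decided to post my solution here, perhaps saving time for others who want the same kind of plot.
My example uses a bit of Pandas to generate some dummy data, so you can easily plug in your own data source. Other than that, it is just matplotlib.
The output of the code is shown below. For my needs, I also wanted to highlight days where the data is 0 (see January 1st).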
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon

# Settings
years = [2018]  # [2018, 2019, 2020]
weeks = [1, 2, 3, 4, 5, 6]
days = ['M', 'T', 'W', 'T', 'F', 'S', 'S']
month_names = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
               'September', 'October', 'November', 'December']


def generate_data():
    idx = pd.date_range('2018-01-01', periods=365, freq='D')
    return pd.Series(range(len(idx)), index=idx)


def split_months(df, year):
    """
    Take a df, slice by year, and produce a list of months,
    where each month is a 2D array in the shape of the calendar
    :param df: dataframe or series
    :return: matrix for daily values and numerals
    """
    df = df[df.index.year == year]
    # Empty matrices
    a = np.empty((6, 7))
    a[:] = np.nan
    day_nums = {m: np.copy(a) for m in range(1, 13)}  # matrix for day numbers
    day_vals = {m: np.copy(a) for m in range(1, 13)}  # matrix for day values
    # Logic to shape datetimes to matrices in calendar layout
    # (Series.items() replaces iteritems(), which was removed in pandas 2.0;
    # use iterrows if you have a DataFrame)
    for d in df.items():
        day = d[0].day
        month = d[0].month
        col = d[0].dayofweek
        if d[0].is_month_start:
            row = 0
        day_nums[month][row, col] = day  # day number (1-31)
        day_vals[month][row, col] = d[1]  # day value (the heatmap data)
        if col == 6:
            row += 1
    return day_nums, day_vals
def create_year_calendar(day_nums, day_vals):
    fig, ax = plt.subplots(3, 4, figsize=(14.85, 10.5))
    for i, axs in enumerate(ax.flat):
        axs.imshow(day_vals[i+1], cmap='viridis', vmin=1, vmax=365)  # heatmap
        axs.set_title(month_names[i])
        # Labels
        axs.set_xticks(np.arange(len(days)))
        axs.set_xticklabels(days, fontsize=10, fontweight='bold', color='#555555')
        axs.set_yticklabels([])
        # Tick marks
        axs.tick_params(axis=u'both', which=u'both', length=0)  # remove tick marks
        axs.xaxis.tick_top()
        # Modify tick locations for proper grid placement
        axs.set_xticks(np.arange(-.5, 6, 1), minor=True)
        axs.set_yticks(np.arange(-.5, 5, 1), minor=True)
        axs.grid(which='minor', color='w', linestyle='-', linewidth=2.1)
        # Despine
        for edge in ['left', 'right', 'bottom', 'top']:
            axs.spines[edge].set_color('#FFFFFF')
        # Annotate
        for w in range(len(weeks)):
            for d in range(len(days)):
                day_val = day_vals[i+1][w, d]
                day_num = day_nums[i+1][w, d]
                # Value label
                axs.text(d, w+0.3, f"{day_val:0.0f}",
                         ha="center", va="center",
                         fontsize=7, color="w", alpha=0.8)
                # If value is 0, draw a grey patch
                if day_val == 0:
                    patch_coords = ((d - 0.5, w - 0.5),
                                    (d - 0.5, w + 0.5),
                                    (d + 0.5, w + 0.5),
                                    (d + 0.5, w - 0.5))
                    square = Polygon(patch_coords, fc='#DDDDDD')
                    axs.add_artist(square)
                # If day number is a valid calendar day, add an annotation
                if not np.isnan(day_num):
                    axs.text(d+0.45, w-0.31, f"{day_num:0.0f}",
                             ha="right", va="center",
                             fontsize=6, color="#003333", alpha=0.8)  # day
                    # Aesthetic background for calendar day number
                    patch_coords = ((d-0.1, w-0.5),
                                    (d+0.5, w-0.5),
                                    (d+0.5, w+0.1))
                    triangle = Polygon(patch_coords, fc='w', alpha=0.7)
                    axs.add_artist(triangle)
    # Final adjustments
    fig.suptitle('Calendar', fontsize=16)
    plt.subplots_adjust(left=0.04, right=0.96, top=0.88, bottom=0.04)
    # Save to file
    plt.savefig('calendar_example.pdf')


for year in years:
    df = generate_data()
    day_nums, day_vals = split_months(df, year)
    create_year_calendar(day_nums, day_vals)
There is probably plenty of room for optimization, but this does what I need.
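One possible simplification of the month-splitting step, offered only as a sketch and not as part of the answer above: the standard library's calendar.monthcalendar already returns a month as a list of Monday-first weeks, so the 6x7 matrices can be filled without pandas. The function name month_matrices and the values mapping (day of month to number) are assumptions made for illustration.

```python
import calendar

import numpy as np


def month_matrices(year, month, values):
    """Return (day_nums, day_vals) as 6x7 float arrays in calendar layout.

    `values` is a hypothetical mapping from day of month (1-31) to a number;
    days missing from it are left as NaN.
    """
    day_nums = np.full((6, 7), np.nan)
    day_vals = np.full((6, 7), np.nan)
    # monthcalendar() yields one list per week (Monday first by default),
    # with 0 in the cells that fall outside the month.
    for row, week in enumerate(calendar.monthcalendar(year, month)):
        for col, day in enumerate(week):
            if day != 0:
                day_nums[row, col] = day
                day_vals[row, col] = values.get(day, np.nan)
    return day_nums, day_vals


# January 2018 starts on a Monday, so day 1 lands in the top-left cell.
nums, vals = month_matrices(2018, 1, {1: 0, 2: 5})
```

The two arrays have the same layout as one entry of day_nums and day_vals above, so they could be passed to the same imshow and annotation code, one month at a time.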
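[Discussion]:
This looks beautiful, I love it! I only needed a few tweaks to use it for my purposes, which was easy because your code is well structured and well commented.
[Answer 3]: Edit: I now see that the question asks for a plot without pandas. Even so, this question is a first-page Google result for "python calendar heatmap", so I will leave this here. I recommend using pandas anyway. You probably already have it as a dependency of another package, and pandas has by far the best APIs for working with datetime data (pandas.Timestamp and pandas.DatetimeIndex).
The only Python package I could find for these plots is calmap, which is unmaintained and incompatible with recent matplotlib. So I decided to write my own. It produces plots like the following:
Here is the code. The input is a series with a datetime index giving the values for the heatmap: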
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

DAYS = ['Sun.', 'Mon.', 'Tues.', 'Wed.', 'Thurs.', 'Fri.', 'Sat.']
MONTHS = ['Jan.', 'Feb.', 'Mar.', 'Apr.', 'May', 'June', 'July', 'Aug.', 'Sept.', 'Oct.', 'Nov.', 'Dec.']


def date_heatmap(series, start=None, end=None, mean=False, ax=None, **kwargs):
    '''Plot a calendar heatmap given a datetime series.

    Arguments:
        series (pd.Series):
            A series of numeric values with a datetime index. Values occurring
            on the same day are combined by sum.
        start (Any):
            The first day to be considered in the plot. The value can be
            anything accepted by :func:`pandas.to_datetime`. The default is the
            earliest date in the data.
        end (Any):
            The last day to be considered in the plot. The value can be
            anything accepted by :func:`pandas.to_datetime`. The default is the
            latest date in the data.
        mean (bool):
            Combine values occurring on the same day by mean instead of sum.
        ax (matplotlib.Axes or None):
            The axes on which to draw the heatmap. The default is the current
            axes in the :module:`~matplotlib.pyplot` API.
        **kwargs:
            Forwarded to :meth:`~matplotlib.Axes.pcolormesh` for drawing the
            heatmap.

    Returns:
        matplotlib.axes.Axes:
            The axes on which the heatmap was drawn. This is set as the current
            axes in the `~matplotlib.pyplot` API.
    '''
    # Combine values occurring on the same day.
    dates = series.index.floor('D')
    group = series.groupby(dates)
    series = group.mean() if mean else group.sum()
    # Parse start/end, defaulting to the min/max of the index.
    start = pd.to_datetime(start or series.index.min())
    end = pd.to_datetime(end or series.index.max())
    # We use [start, end) as a half-open interval below.
    end += np.timedelta64(1, 'D')
    # Get the previous/following Sunday to start/end.
    # Pandas and numpy day-of-week conventions are Monday=0 and Sunday=6.
    start_sun = start - np.timedelta64((start.dayofweek + 1) % 7, 'D')
    end_sun = end + np.timedelta64(7 - end.dayofweek - 1, 'D')
    # Create the heatmap and track ticks.
    num_weeks = (end_sun - start_sun).days // 7
    heatmap = np.zeros((7, num_weeks))
    ticks = {}  # week number -> month name
    for week in range(num_weeks):
        for day in range(7):
            date = start_sun + np.timedelta64(7 * week + day, 'D')
            if date.day == 1:
                ticks[week] = MONTHS[date.month - 1]
            if date.dayofyear == 1:
                ticks[week] += f'\n{date.year}'
            if start <= date < end:
                heatmap[day, week] = series.get(date, 0)
    # Get the coordinates, offset by 0.5 to align the ticks.
    y = np.arange(8) - 0.5
    x = np.arange(num_weeks + 1) - 0.5
    # Plot the heatmap. Prefer pcolormesh over imshow so that the figure can be
    # vectorized when saved to a compatible format. We must invert the axis for
    # pcolormesh, but not for imshow, so that it reads top-bottom, left-right.
    ax = ax or plt.gca()
    mesh = ax.pcolormesh(x, y, heatmap, **kwargs)
    ax.invert_yaxis()
    # Set the ticks.
    ax.set_xticks(list(ticks.keys()))
    ax.set_xticklabels(list(ticks.values()))
    ax.set_yticks(np.arange(7))
    ax.set_yticklabels(DAYS)
    # Set the current image and axes in the pyplot API.
    plt.sca(ax)
    plt.sci(mesh)
    return ax
def date_heatmap_demo():
    '''An example for `date_heatmap`.

    Most of the sizes here are chosen arbitrarily to look nice with 1yr of
    data. You may need to fiddle with the numbers to look right on other data.
    '''
    # Get some data, a series of values with datetime index.
    data = np.random.randint(5, size=365)
    data = pd.Series(data)
    data.index = pd.date_range(start='2017-01-01', end='2017-12-31', freq='1D')
    # Create the figure. For the aspect ratio, one year is 7 days by 53 weeks.
    # We widen it further to account for the tick labels and color bar.
    figsize = plt.figaspect(7 / 56)
    fig = plt.figure(figsize=figsize)
    # Plot the heatmap with a color bar.
    ax = date_heatmap(data, edgecolor='black')
    plt.colorbar(ticks=range(5), pad=0.02)
    # Use a discrete color map with 5 colors (the data ranges from 0 to 4).
    # Extending the color limits by 0.5 aligns the ticks in the color bar.
    # plt.get_cmap is used because mpl.cm.get_cmap was removed in recent matplotlib.
    cmap = plt.get_cmap('Blues', 5)
    plt.set_cmap(cmap)
    plt.clim(-0.5, 4.5)
    # Force the cells to be square. If this is set, the size of the color bar
    # may look weird compared to the size of the heatmap. That can be corrected
    # by the aspect ratio of the figure or scale of the color bar.
    ax.set_aspect('equal')
    # Save to a file. For embedding in a LaTeX doc, consider the PDF backend.
    # http://sbillaudelle.de/2015/02/23/seamlessly-embedding-matplotlib-output-into-latex.html
    fig.savefig('heatmap.pdf', bbox_inches='tight')
    # The figure must be explicitly closed if it was not shown.
    plt.close(fig)
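The Sunday-alignment arithmetic in date_heatmap is the part that is easiest to get wrong, so here is a small sketch that isolates it using only the standard library. The helper name sunday_bounds is an assumption made for illustration; it mirrors the start_sun, end_sun, and num_weeks computation above, with datetime.date.weekday() following the same Monday=0, Sunday=6 convention as pandas.

```python
from datetime import date, timedelta


def sunday_bounds(start, end):
    """Return (start_sun, end_sun, num_weeks) for the closed range [start, end].

    start_sun is the Sunday on or before `start`; end_sun is the first Sunday
    on or after the day following `end`, so [start_sun, end_sun) spans whole weeks.
    """
    end_exclusive = end + timedelta(days=1)
    # weekday(): Monday=0 ... Sunday=6, so (weekday + 1) % 7 is days since Sunday.
    start_sun = start - timedelta(days=(start.weekday() + 1) % 7)
    end_sun = end_exclusive + timedelta(days=6 - end_exclusive.weekday())
    num_weeks = (end_sun - start_sun).days // 7
    return start_sun, end_sun, num_weeks


# The demo year: 2017-01-01 is a Sunday and the plot needs 53 week columns.
start_sun, end_sun, num_weeks = sunday_bounds(date(2017, 1, 1), date(2017, 12, 31))
```

For the 2017 demo data this gives the 7-by-53 grid that the figure-size comment in date_heatmap_demo refers to.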
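[Discussion]:
Hi, does this still work for you with the latest matplotlib and pandas versions? I am having some trouble with the first and last days of the week, which only show at half size. Any ideas? Thanks!
DatetimeIndex: unexpected keyword argument "start" pandas.pydata.org/pandas-docs/stable/reference/api/…
I fixed the demo function by changing pd.DatetimeIndex() to pd.date_range().
This looks really nice! Is there a public repo for it on GitHub or similar?
[Answer 4]: Below is code that can be used to generate a calendar plot of the daily profile of some value.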
"""
Created on Tue Sep 4 11:17:25 2018
@author: woldekidank
"""
import numpy as np
from datetime import date
import datetime
import matplotlib.pyplot as plt
import random

D = date(2016, 1, 1)
Dord = date.toordinal(D)
Dweekday = date.weekday(D)  # Monday=0 ... Sunday=6
Dsnday = Dord - (Dweekday + 1) % 7  # ordinal of the Sunday on or before D
square = np.array([[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]])  # x and y to draw a square
row = 1
count = 0
while row != 0:
    for column in range(1, 7 + 1):  # one week per row
        # Random hourly profile for this day, scaled to fit inside a unit square
        prof = np.ones([24, 1])
        hourly = np.zeros([24, 1])
        for i in range(1, 24 + 1):
            prof[i-1, 0] = prof[i-1, 0] * random.uniform(0, 1)
            hourly[i-1, 0] = i / 24
        plt.title('Temperature Profile')
        plt.plot(square[:, 0] + column - 1, square[:, 1] - row + 1, color='r')  # go right each column, go down each row
        if date.fromordinal(Dsnday).month == D.month:
            if count == 0:
                plt.plot(hourly, prof)
            else:
                plt.plot(hourly + min(square[:, 0] + column - 1), prof + min(square[:, 1] - row + 1))
            plt.text(column - 0.5, 1.8 - row, datetime.datetime.strptime(str(date.fromordinal(Dsnday)), '%Y-%m-%d').strftime('%a'))
            plt.text(column - 0.5, 1.5 - row, date.fromordinal(Dsnday).day)
        Dsnday = Dsnday + 1
        count = count + 1
    if date.fromordinal(Dsnday).month == D.month:
        row = row + 1  # new row
    else:
        row = 0  # stop the while loop
plt.show()
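The output of this code is shown below.
[Discussion]:
[Answer 5]: It is certainly possible, but you will need to jump through a few hoops.
First, I am going to assume you mean a calendar display that looks like a calendar, as opposed to a more linear format (a linear-format "heatmap" is much easier than this).
The key is reshaping your arbitrary-length 1D series into an Nx7 2D array, where each row is a week and the columns are days. That is easy enough, but you also need to label months and days properly, which can get a touch verbose.
Here is an example. It does not even remotely try to handle crossing year boundaries (e.g. Dec 2014 to Jan 2015). However, hopefully it gets you started: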
import datetime as dt
import matplotlib.pyplot as plt
import numpy as np


def main():
    dates, data = generate_data()
    fig, ax = plt.subplots(figsize=(6, 10))
    calendar_heatmap(ax, dates, data)
    plt.show()


def generate_data():
    num = 100
    data = np.random.randint(0, 20, num)
    start = dt.datetime(2015, 3, 13)
    dates = [start + dt.timedelta(days=i) for i in range(num)]
    return dates, data


def calendar_array(dates, data):
    # isocalendar() returns (year, week, weekday); keep week and weekday
    i, j = zip(*[d.isocalendar()[1:] for d in dates])
    i = np.array(i) - min(i)
    j = np.array(j) - 1
    ni = max(i) + 1
    calendar = np.nan * np.zeros((ni, 7))
    calendar[i, j] = data
    return i, j, calendar


def calendar_heatmap(ax, dates, data):
    i, j, calendar = calendar_array(dates, data)
    im = ax.imshow(calendar, interpolation='none', cmap='summer')
    label_days(ax, dates, i, j, calendar)
    label_months(ax, dates, i, j, calendar)
    ax.figure.colorbar(im)


def label_days(ax, dates, i, j, calendar):
    ni, nj = calendar.shape
    day_of_month = np.nan * np.zeros((ni, 7))
    day_of_month[i, j] = [d.day for d in dates]
    for (i, j), day in np.ndenumerate(day_of_month):
        if np.isfinite(day):
            ax.text(j, i, int(day), ha='center', va='center')
    ax.set(xticks=np.arange(7),
           xticklabels=['M', 'T', 'W', 'R', 'F', 'S', 'S'])
    ax.xaxis.tick_top()


def label_months(ax, dates, i, j, calendar):
    month_labels = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul',
                             'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
    months = np.array([d.month for d in dates])
    uniq_months = sorted(set(months))
    yticks = [i[months == m].mean() for m in uniq_months]
    labels = [month_labels[m - 1] for m in uniq_months]
    ax.set(yticks=yticks)
    ax.set_yticklabels(labels, rotation=90)


main()
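The answer says it does not handle year boundaries, because ISO week numbers restart each year. One way around that, sketched here as an assumption rather than as part of the original answer, is to derive the row index from date ordinals instead: find the Monday of each date's week and count whole weeks since the earliest Monday. The name calendar_array_multi_year is hypothetical.

```python
import datetime as dt

import numpy as np


def calendar_array_multi_year(dates, data):
    """Variant of calendar_array whose rows count weeks since the first week.

    Works across year boundaries because it never uses the ISO week number.
    """
    ordinals = np.array([d.toordinal() for d in dates])
    j = np.array([d.weekday() for d in dates])  # Monday=0 ... Sunday=6
    mondays = ordinals - j                      # ordinal of each date's Monday
    i = (mondays - mondays.min()) // 7          # whole weeks since the first Monday
    calendar = np.full((i.max() + 1, 7), np.nan)
    calendar[i, j] = data
    return i, j, calendar


# Three full weeks straddling New Year: Mon 2014-12-22 through Sun 2015-01-11.
start = dt.datetime(2014, 12, 22)
dates = [start + dt.timedelta(days=k) for k in range(21)]
i, j, cal = calendar_array_multi_year(dates, np.arange(21))
```

The return value has the same shape and meaning as that of calendar_array, so it could be swapped in inside calendar_heatmap. The month labels in label_months would still need adjusting, since a month number alone no longer identifies a unique month.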
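[Discussion]:
Thanks for this example, it works really well. I do have one question: does the shape of the numpy array affect the shape of the figure, or what would I change if I wanted the figure to be horizontal?
Yes, the shape of the array directly affects the shape of the figure. To change it, you can transpose the array (i.e. imshow(calendar.T, ...)) and swap x and y elsewhere. I will post an example later, but I may not have time for a while.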
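A minimal sketch of the transposed layout described in that reply, under the assumption that "swap x and y elsewhere" means weekdays go on the y axis and weeks on the x axis. The function names are hypothetical, the plotting function expects an existing matplotlib Axes, and like the answer's code it does not handle year boundaries.

```python
import datetime as dt

import numpy as np


def horizontal_calendar_array(dates, data):
    """Return a 7 x n_weeks array: rows are weekdays (Monday first), columns are weeks."""
    week, weekday = zip(*[d.isocalendar()[1:] for d in dates])
    week = np.array(week) - min(week)
    weekday = np.array(weekday) - 1
    calendar = np.full((7, week.max() + 1), np.nan)
    calendar[weekday, week] = data
    return calendar


def horizontal_calendar_heatmap(ax, dates, data):
    """Draw the transposed heatmap on an existing matplotlib Axes `ax`."""
    calendar = horizontal_calendar_array(dates, data)
    im = ax.imshow(calendar, interpolation='none', cmap='summer')
    # Weekday labels move from the x axis to the y axis.
    ax.set(yticks=np.arange(7),
           yticklabels=['M', 'T', 'W', 'R', 'F', 'S', 'S'])
    ax.figure.colorbar(im, orientation='horizontal')
    return im


# Same sample range as the answer: 100 days starting Friday 2015-03-13.
start = dt.datetime(2015, 3, 13)
dates = [start + dt.timedelta(days=k) for k in range(100)]
cal = horizontal_calendar_array(dates, np.arange(100))
```

To draw it, something like fig, ax = plt.subplots(figsize=(10, 4)) followed by horizontal_calendar_heatmap(ax, dates, data) should work. Month labels would then go on the x axis, computed from the week index the same way label_months computes them from the row index.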
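Hi @JoeKington. Thanks a lot for this code, it is very handy! However, when I run it on Python 3.7.3 and matplotlib 3.1.1, there is a problem with the dimensions on the y axis (see: result image). I do not know how to fix it. Any help is much appreciated. Thanks a lot!
This is a great solution! Following up on the comments, has there been any progress on rotating it clockwise so that it displays horizontally?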