How to Structure Three-Dimensional Lag Time Steps for an LSTM in Keras?

Posted 2023-02-16

Asked: 2021-09-27 04:00:37

Question:

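I know that LSTMs need a three-dimensional dataset in the format N_samples x TimeSteps x Variables. I want to restructure my data from a single time step per row into lagged time steps grouped by the hour. The idea is that the LSTM would then train in batches, hour by hour (going from 310033 rows x 1 time step x 83 variables to 310033 rows x 60 time steps x 83 variables).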

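However, my model's loss behaves strangely (the training loss increases with the number of epochs), and the training accuracy dropped when I moved from a single time step to lagged time steps. This makes me think I am doing the transformation wrong. Is this the correct way to restructure the data, or is there a better way?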

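The data is a time series recorded at 1-second intervals. It has already been preprocessed: scaled to the 0-1 range, one-hot encoded, cleaned, and so on.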

Current transformation in Python:

import math
import numpy as np
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(scaled, target, train_size=.7, shuffle=False)
#reshape input to be 3D [samples, timesteps, features]
#X_train = X_train.reshape((X_train.shape[0], 1, X_train.shape[1])) - Old method for 1 timestep
#X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1])) - Old method for 1 timestep

#Generate Lag time Steps 3D framework for LSTM
#As required for LSTM networks, we must reshape the input data into N_samples x TimeSteps x Variables
hours = len(X_train)/3600
hours = math.floor(hours) #Number of whole 60-minute hours available in this subset of the data
temp = []
# Pull hours into the three-dimensional field
for hr in range(hours, len(X_train) + hours):
    temp.append(scaled[hr - hours:hr, 0:scaled.shape[1]])
X_train = np.array(temp) #Export Train Features

hours = len(X_test)/3600
hours = math.floor(hours) #Number of whole 60-minute hours available in this subset of the data
temp = []
# Pull hours into the three-dimensional field
for hr in range(hours, len(X_test) + hours):
    temp.append(scaled[hr - hours:hr, 0:scaled.shape[1]])
X_test = np.array(temp) #Export Test Features

Data shape after the transformation:

Model definition and fitting:

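import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

model = Sequential()
#Layer 1
model.add(LSTM(128, return_sequences=True,
               input_shape=(X_train.shape[1], X_train.shape[2])))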
model.add(Dropout(0.15)) #15% drop out layer
#model.add(BatchNormalization())

#Layer 2
model.add(LSTM(128, return_sequences=False))
model.add(Dropout(0.15)) #15% drop out layer

#Layer 3 - return a single vector
model.add(Dense(32))
#Output of 2 because we have 2 classes
model.add(Dense(2, activation= 'sigmoid'))
# Define optimiser
opt = tf.keras.optimizers.Adam(learning_rate=1e-5, decay=1e-6)
# Compile model
model.compile(loss='sparse_categorical_crossentropy', # other options: 'mse' (mean squared error), 'mae' (mean absolute error)
              optimizer=opt,
              metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=epoch, batch_size=batch, validation_data=(X_test, y_test), verbose=2, shuffle=False)

Any advice on how to improve performance or fix the lag time steps?

Answer 1:

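Since you are trying to predict y from the current and lagged values of the x variables, your y_train has to start after the first set of lagged values, that is, y_train should be y_train[59:]. Your X_train must end with the training period, and the last observation of y_train must correspond to the X_train window whose most recent time point is the same as that of y_train. So X_train should end up with shape (y_train[59:].shape[0], 60, 83).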

To elaborate, you need to fit:

X(t), X(t-1), X(t-2), ..., X(t-59) ---- > y(t)

X(t+1), X(t), X(t-1),..., X(t-58) ------> y(t+1)

If I am not mistaken, the code you wrote may be doing the opposite:

X(t), X(t-1), X(t-2), ..., X(t-59) ---- > y(t-59)
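
The intended alignment can be sketched with NumPy as follows. This is only an illustration under stated assumptions, not the asker's pipeline: `make_lag_windows` is a hypothetical helper, the toy arrays stand in for `scaled` and `target`, and each window is assumed to include the current row, which is why the target is sliced at `n_lags - 1` (matching `y_train[59:]` for 60 lags).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_lag_windows(features, target, n_lags=60):
    """Pair each window of n_lags consecutive rows with the target of its last row.

    Hypothetical helper: window i covers rows i .. i + n_lags - 1,
    and its label is target[i + n_lags - 1], i.e. X(t-n_lags+1..t) -> y(t).
    """
    features = np.asarray(features)
    target = np.asarray(target)
    # sliding_window_view appends the window axis last:
    # shape (n_windows, n_features, n_lags)
    windows = sliding_window_view(features, window_shape=n_lags, axis=0)
    # Reorder to the (samples, timesteps, features) layout that Keras LSTMs expect
    X = np.ascontiguousarray(windows.transpose(0, 2, 1))
    # Drop the first n_lags - 1 labels, which have no complete window behind them
    y = target[n_lags - 1:]
    return X, y


# Toy data: 10 rows, 2 features, and a target equal to the row index
features = np.arange(20).reshape(10, 2)
target = np.arange(10)

X, y = make_lag_windows(features, target, n_lags=3)
print(X.shape, y.shape)  # (8, 3, 2) (8,)
```

In the toy run, the first window holds rows 0 to 2 and is paired with `target[2]`, so the label always belongs to the most recent row in its window.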

Comments:

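Thanks for the reply! Can this be done by applying shift() to the target field? Also, if I go from 1 lag time step to 60, do you know how to handle N_samples? Say I use 60 time steps and my initial data, before the split, has 309600 rows. I can do a 70-30 train/test split (216720 and 92880 rows respectively), where each time step is 3870 rows, or roughly an hour of data. Would the test data then be structured as 3870 x 60 x 84 rather than 216720 x 60 x 84?

When you take 60 lagged values, the first X_train row used for fitting (apart from the lagged past values) is X_train(60), so dropping y[:60] is enough; in other words, take y_train[60:]. X_train also should not extend beyond y_train[60:].shape[0], so you need to take care of that as well.

My thinking is: first do the seconds-to-hours conversion, second do the lagging and the X-to-y time mapping along with the required trimming, and only then do the train/test split. Your final training dimensions will be (y_train[lag_hours:].shape[0], lag_hours, variable_numbers).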
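
The "window first, split afterwards" order suggested in the last comment can be sketched like this. It is a minimal illustration, not a definitive implementation: `window_then_split` is a hypothetical helper, the random arrays stand in for the real `scaled` and `target`, and a fixed window of 60 rows is assumed. Note that the question's code derives the window length from the length of each subset and slices the test windows from `scaled` starting at row 0, so windowing once before the split avoids both issues.

```python
import numpy as np


def window_then_split(features, target, n_lags=60, train_frac=0.7):
    """Build lag windows over the whole series, then split chronologically.

    Hypothetical helper: every window has the same length n_lags, and the
    label of a window is the target of its most recent row.
    """
    features = np.asarray(features)
    target = np.asarray(target)
    n_windows = len(features) - n_lags + 1
    # Row i of idx selects source rows i .. i + n_lags - 1
    idx = np.arange(n_lags)[None, :] + np.arange(n_windows)[:, None]
    X = features[idx]          # shape (n_windows, n_lags, n_features)
    y = target[n_lags - 1:]    # one label per window
    # Chronological split without shuffling: earlier windows train, later ones test
    split = int(n_windows * train_frac)
    return X[:split], X[split:], y[:split], y[split:]


# Stand-in data: 1000 rows with 83 features and a binary target
rng = np.random.default_rng(0)
features = rng.random((1000, 83))
target = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = window_then_split(features, target)
print(X_train.shape, X_test.shape)  # (658, 60, 83) (283, 60, 83)
```

Both subsets now share the same 60 time steps, and the sample count is the number of windows rather than the number of hours. At the full size mentioned in the question (310033 x 60 x 83), a materialized float64 array takes roughly 12 GB, so generating the windows in batches may be preferable in practice.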
