How do I calculate the mean squared error for a time series?

Posted 2023-02-16

【Title】: How do I calculate the mean squared error for a time series? 【Posted】: 2020-09-20 20:43:04 【Question】:

univariate_past_history = 100
univariate_future_target = 0

x_train_uni, y_train_uni = univariate_data(uni_data, 0, TRAIN_SPLIT,
                                           univariate_past_history,
                                           univariate_future_target)
x_val_uni, y_val_uni = univariate_data(uni_data, TRAIN_SPLIT, None,
                                       univariate_past_history,
                                       univariate_future_target)

The value to predict is y[0].numpy() and the predicted value is simple_lstm_model.predict(x)[0]. How do I calculate the mean squared error between the two?

BATCH_SIZE = 256
BUFFER_SIZE = 10000

train_univariate = tf.data.Dataset.from_tensor_slices((x_train_uni, y_train_uni))
train_univariate = train_univariate.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()

val_univariate = tf.data.Dataset.from_tensor_slices((x_val_uni, y_val_uni))
val_univariate = val_univariate.batch(BATCH_SIZE).repeat()

simple_lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(8, input_shape=x_train_uni.shape[-2:]),
    tf.keras.layers.Dense(1)
])

simple_lstm_model.compile(optimizer='adam', loss='mae')
for x, y in val_univariate.take(1):
    print(simple_lstm_model.predict(x).shape)
EVALUATION_INTERVAL = 200
EPOCHS = 10

simple_lstm_model.fit(train_univariate, epochs=EPOCHS,
                      steps_per_epoch=EVALUATION_INTERVAL,
                      validation_data=val_univariate, validation_steps=50)

for x, y in val_univariate.take(3):
  plot = show_plot([x[0].numpy(), y[0].numpy(),
                    simple_lstm_model.predict(x)[0]], 0, 'Simple LSTM model')
  plot.show()
print(simple_lstm_model.predict(x)[0])

expected = y[0].numpy()
predicted = simple_lstm_model.predict(x)[0]
print(mean_squared_error(expected,predicted))

If I do it as shown above, I get this error: TypeError: Singleton array 0.05017540446704798 cannot be considered a valid collection.

Thanks in advance.

【Question comments】:

【Answer 1】:

The squared error between two scalars is (y_true - y_pred)**2. The result is itself a scalar, and it tells us how large the error is, or how good the model is.

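The squared error between two vectors is a vector. Since we want a single scalar that summarizes how far apart the two vectors are, we take the mean of that vector, which gives the MSE.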
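As a minimal sketch of the two cases just described, NumPy can compute the squared error of two scalars and the MSE of two vectors. The sample values are made up purely to illustrate the arithmetic.

```python
import numpy as np

# Hypothetical values, chosen only to illustrate the arithmetic.
y_true_scalar, y_pred_scalar = 3.0, 2.5
squared_error = (y_true_scalar - y_pred_scalar) ** 2  # a single scalar

y_true_vec = np.array([1.0, 2.0, 3.0])
y_pred_vec = np.array([1.0, 2.5, 2.0])
squared_errors = (y_true_vec - y_pred_vec) ** 2  # one squared error per element
mse = squared_errors.mean()                      # average -> a single scalar

print(squared_error)
print(squared_errors)
print(mse)
```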

You can use the code below to compute the MSE:

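import numpy as np

# x, y are one validation batch, e.g. from `for x, y in val_univariate.take(1):`
# Keras `predict` already returns a NumPy array, so no `.numpy()` call is needed.
y_pred = simple_lstm_model.predict(x).reshape(-1)  # (batch, 1) -> (batch,)
y_true = y.numpy().reshape(-1)                      # tensor -> array of shape (batch,)

# Squared error for a single sample
print(np.power(y_true[0] - y_pred[0], 2))

# Mean squared error over all samples in the batch
print(np.power(y_true - y_pred, 2).mean())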
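One thing to watch for when computing the error over a whole batch: a model ending in Dense(1) returns predictions of shape (batch, 1), while the targets in this question have shape (batch,). Subtracting the two broadcasts to a (batch, batch) matrix and silently gives the wrong mean. The sketch below uses invented arrays in place of the model output, and the helper name `mse` is hypothetical.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error after flattening both inputs to 1-D."""
    y_true = np.asarray(y_true, dtype=float).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(-1)
    if y_true.shape != y_pred.shape:
        raise ValueError(f"shape mismatch: {y_true.shape} vs {y_pred.shape}")
    return float(np.mean((y_true - y_pred) ** 2))

# Stand-ins for y.numpy() and simple_lstm_model.predict(x); values are invented.
y_true = np.array([0.1, 0.2, 0.3])        # shape (3,)
y_pred = np.array([[0.1], [0.4], [0.0]])  # shape (3, 1)

wrong = np.mean((y_true - y_pred) ** 2)   # broadcasts to shape (3, 3) first
right = mse(y_true, y_pred)               # flattens, then compares element-wise

print((y_true - y_pred).shape)
print(wrong, right)

# A single sample (0-d scalar vs. length-1 array) also works after flattening.
print(mse(0.05, [0.07]))
```

Flattening also sidesteps the "Singleton array" TypeError from the question, which scikit-learn raises when it is handed a 0-d scalar such as y[0].numpy(). Wrapping that value with np.atleast_1d before calling mean_squared_error should have the same effect.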

【Discussion】:

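I get this error, please help: AttributeError: 'numpy.ndarray' object has no attribute 'numpy'. y[0].numpy() is the true future value and simple_lstm_model.predict(x)[0] is the model prediction.

@RichardPhillipsRoy Check the type with type(simple_lstm_model.predict(x)). If it is already a NumPy array, you can use y_pred = simple_lstm_model.predict(x) directly, since no type conversion is needed.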
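The type check suggested in the comment can be folded into a small helper. This is only a sketch: `to_numpy` is a hypothetical name, and it assumes that tensor-like objects expose a `.numpy()` method, as eager TensorFlow tensors do.

```python
import numpy as np

def to_numpy(value):
    """Return `value` as a NumPy array, whether it is a tensor or already an array."""
    if isinstance(value, np.ndarray):
        return value              # e.g. the result of model.predict(x)
    if hasattr(value, "numpy"):
        return value.numpy()      # e.g. an eager tensor such as y
    return np.asarray(value)      # lists, scalars, and other array-likes

already_array = np.array([[0.5], [0.7]])
print(type(to_numpy(already_array)))
print(to_numpy([1, 2, 3]))
```

With such a helper, both y and simple_lstm_model.predict(x) can go through the same call, so the code does not need to know which of the two is a tensor.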
