CNN-1D(Seq2point) Timeseries Prediction using Keras Error in Dimension

Posted: 2020-11-07 07:29:54

Question:

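I am trying to fit a 1D CNN model in Keras using the sequence-to-point (seq2point) architecture.

My data is time series data.

The input feature is a single sensor reading (one value every 6 seconds), and the output is also a continuous number, so this is a regression problem.

But I get an error when running the model:

*ValueError: Error when checking input: expected conv1d_18_input to have 3 dimensions, but got array with shape (176526, 600)*

I know I am getting the dimensions wrong somewhere, but I cannot tell where.

Any help is greatly appreciated.

Link to the full code in Google Colab: GoogleColab

Creating the model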

from keras.models import Sequential
from keras.layers import Conv1D, Dense, Dropout, Flatten

def create_model(n_timesteps, n_features, n_outputs):
    '''Creates and returns the ShortSeq2Point Network
    '''
    model = Sequential()

    # 1D Conv
    model.add(Conv1D(filters=30, kernel_size=10, activation='relu', input_shape=(n_timesteps, n_features), padding="same", strides=1))
    model.add(Dropout(0.5))
    model.add(Conv1D(filters=30, kernel_size=8, activation='relu', padding="same", strides=1))
    model.add(Dropout(0.5))
    model.add(Conv1D(filters=40, kernel_size=6, activation='relu', padding="same", strides=1))
    model.add(Dropout(0.5))
    model.add(Conv1D(filters=50, kernel_size=5, activation='relu', padding="same", strides=1))
    model.add(Dropout(0.5))
    # Fully Connected Layers
    model.add(Flatten())
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(n_outputs, activation='linear'))

    model.compile(loss='mse', optimizer='adam')
    #plot_model(model, to_file='model.png', show_shapes=True)

    return model

n_timesteps, n_features, n_outputs = x_train.shape[1], x_train.shape[2], y_train.shape[1]
# create model
model = create_model(n_timesteps, n_features, n_outputs)
model.summary()

Model

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d_18 (Conv1D)           (None, 600, 30)           330       
_________________________________________________________________
dropout_20 (Dropout)         (None, 600, 30)           0         
_________________________________________________________________
conv1d_19 (Conv1D)           (None, 600, 30)           7230      
_________________________________________________________________
dropout_21 (Dropout)         (None, 600, 30)           0         
_________________________________________________________________
conv1d_20 (Conv1D)           (None, 600, 40)           7240      
_________________________________________________________________
dropout_22 (Dropout)         (None, 600, 40)           0         
_________________________________________________________________
conv1d_21 (Conv1D)           (None, 600, 50)           10050     
_________________________________________________________________
dropout_23 (Dropout)         (None, 600, 50)           0         
_________________________________________________________________
flatten_5 (Flatten)          (None, 30000)             0         
_________________________________________________________________
dense_8 (Dense)              (None, 1024)              30721024  
_________________________________________________________________
dropout_24 (Dropout)         (None, 1024)              0         
_________________________________________________________________
dense_9 (Dense)              (None, 600)               615000    
=================================================================
Total params: 31,360,874
Trainable params: 31,360,874

Dataset

Input_feature   Output_Value        
       4              276
       5              276
              ...
      21              667
      20              672
    177126 rows × 2 columns

Dataset

Comments:

You are reshaping the data twice, which is the cause of the error. Please see this article (machinelearningmastery.com/…), which has comprehensive end-to-end code for univariate and multivariate time series analysis using Conv-1D. Thanks!

Answer 1:

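Yes, as Tensorflow Support correctly pointed out in the comments, you are reshaping the data several times, which changes the input data (i.e. x_train) from 3 dimensions to 2.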
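
The mismatch can be reproduced with NumPy alone. The sketch below is only an illustration: the sample count is shrunk to 8, and `x_2d` and `x_3d` are hypothetical names that do not appear in the original notebook. Conv1D expects input of shape (samples, timesteps, features), so a 2-D array of windows needs a trailing feature axis.

```python
import numpy as np

# Hypothetical stand-in for the windowed data: 8 samples of 600 timesteps.
# The array in the error message has shape (176526, 600).
x_2d = np.zeros((8, 600), dtype=np.float32)

# Conv1D expects (samples, timesteps, features); add a feature axis of size 1.
x_3d = x_2d.reshape(-1, 600, 1)   # equivalent: x_2d[..., np.newaxis]

print(x_2d.shape)  # (8, 600)
print(x_3d.shape)  # (8, 600, 1)
```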

Please refer to the complete working code below.

import math
import warnings
from datetime import datetime, timedelta

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import plotly
import plotly.express as px
import plotly.graph_objs as go
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error

import tensorflow as tf
import keras
from keras import backend
from keras.models import Sequential, load_model
from keras.layers import Conv1D, MaxPooling1D, Dense, Flatten, Dropout, Reshape
from keras.utils import to_categorical, plot_model
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam

warnings.filterwarnings("ignore")
plotly.offline.init_notebook_mode(connected=True)

url = 'https://raw.githubusercontent.com/nitin-barthwal/CNN1D/master/dataset.csv'
rescaled = pd.read_csv(url)
print('Shape : ',rescaled.shape)
rescaled.sample(10)

def Reshape_data(df, time_steps, step):
    '''
    time_steps is the number of observations in one sample.
    step is the number of observations by which the window is shifted to create the next sample.
    For example, if the data is [1,2,3,4,5,6,7,8,9,10,11,12] with time_steps = 5 and step = 3, then
    sample 1: [1,2,3,4,5]
    sample 2: [4,5,6,7,8] 
    sample 3: [7,8,9,10,11]
    '''
        
    N_FEATURES = 1   # As i have only one input feature
    segments = []
    output_segments = []
    labels = []
    for i in range(0, len(df) - time_steps, step):
        sensor_load = df['Input'].values[i: i + time_steps]
        segments.append([sensor_load])
        output= df['Output'].values[i: i + time_steps]
        output_segments.append([output])
    # Reshaping
    reshaped_segments = np.asarray(segments, dtype= np.float32).reshape(-1, time_steps, N_FEATURES)
    output_segments = np.asarray(output_segments, dtype= np.float32).reshape(-1, time_steps, N_FEATURES)
    return reshaped_segments, output_segments

TIME_PERIODS= 600

STEP_DISTANCE = 1 # Shifting just 1 observation

x_train,y_train = Reshape_data(rescaled, TIME_PERIODS, STEP_DISTANCE )
print('x_train.shape : ',x_train.shape)
print('y_train.shape : ',y_train.shape)

num_time_periods, n_features = x_train.shape[1], x_train.shape[2]

input_shape = (num_time_periods*n_features)

def create_model(n_timesteps, n_features,n_outputs):
    '''Creates and returns the ShortSeq2Point Network
    '''
    model = Sequential()

    # 1D Conv
    model.add(Conv1D(filters=30, kernel_size=10, activation='relu', input_shape=(n_timesteps, n_features), padding="same", strides=1))
    model.add(Dropout(0.5))
    model.add(Conv1D(filters=30, kernel_size=8, activation='relu', padding="same", strides=1))
    model.add(Dropout(0.5))
    model.add(Conv1D(filters=40, kernel_size=6, activation='relu', padding="same", strides=1))
    model.add(Dropout(0.5))
    model.add(Conv1D(filters=50, kernel_size=5, activation='relu', padding="same", strides=1))
    model.add(Dropout(0.5))
    # Fully Connected Layers
    model.add(Flatten())
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(n_outputs, activation='linear'))

    model.compile(loss='mse', optimizer='adam')
    #plot_model(model, to_file='model.png', show_shapes=True)

    return model

#n_timesteps, n_features, n_outputs = x_train.shape[1], x_train.shape[2], y_train.shape[1]
n_outputs=y_train.shape[1]
# create model
model = create_model(num_time_periods, n_features,n_outputs)
model.summary()

callbacks_list = [
    keras.callbacks.ModelCheckpoint(
        filepath='best_model.{epoch:02d}-{val_loss:.2f}.h5',
        monitor='val_loss', save_best_only=True),
    # Regression model: monitor the validation loss rather than accuracy.
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=1)
]

model.compile(loss='mse',
                optimizer='adam', metrics=['mse'])

# Hyper-parameters
BATCH_SIZE = 400
EPOCHS = 1

# Enable validation to use ModelCheckpoint and EarlyStopping callbacks.
history = model.fit(x_train,
                      y_train,
                      batch_size=BATCH_SIZE,
                      epochs=EPOCHS,
                      #callbacks=callbacks_list,
                      validation_split=0.2,
                      verbose=1)
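
If the callbacks are enabled, ModelCheckpoint fills named placeholders in `filepath`, such as `{epoch:02d}` and `{val_loss:.2f}`, with the epoch number and the logged metrics using Python string formatting. The sketch below imitates that step with plain `str.format`; the `logs` values are made up for illustration and are not produced by an actual training run.

```python
# Template with named placeholders, in the form ModelCheckpoint accepts.
filepath = 'best_model.{epoch:02d}-{val_loss:.2f}.h5'

# Made-up values standing in for what Keras reports at the end of an epoch.
logs = {'loss': 0.0081, 'val_loss': 0.0081}

# Keys that the template does not mention (here 'loss') are simply ignored.
name = filepath.format(epoch=1, **logs)
print(name)  # best_model.01-0.01.h5
```

Without the braces the file name is taken literally, so every epoch would write to the same file.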

Output:

Shape :  (177126, 2)
           Input      Output
76135   0.014060    0.077409
18217   0.013428    0.043602
127749  0.014060    0.069510
160231  0.013902    0.046603
167690  0.000000    0.058294
170048  0.000000    0.054502
137066  0.012954    0.052607
70656   0.000000    0.056556
166391  0.000000    0.064613
78139   0.000000    0.039494

x_train.shape :  (176526, 600, 1)
y_train.shape :  (176526, 600, 1)
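
The sample count of 176526 follows from how `Reshape_data` slides its window over the 177126 rows. A minimal NumPy sketch of the same windowing, using the toy series from the docstring; `sliding_windows` is a hypothetical helper and not part of the original code.

```python
import numpy as np

def sliding_windows(values, time_steps, step):
    """Stack windows of length time_steps, one starting every `step` items.

    Uses the same start indices as Reshape_data:
    range(0, len(values) - time_steps, step).
    """
    starts = range(0, len(values) - time_steps, step)
    return np.stack([values[i:i + time_steps] for i in starts])

toy = np.arange(1, 13)                     # [1, 2, ..., 12]
windows = sliding_windows(toy, time_steps=5, step=3)
print(windows)
# [[ 1  2  3  4  5]
#  [ 4  5  6  7  8]
#  [ 7  8  9 10 11]]

# Same arithmetic for the real dataset: 177126 rows, 600 steps, shift of 1.
print(len(range(0, 177126 - 600, 1)))      # 176526
```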

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d (Conv1D)              (None, 600, 30)           330       
_________________________________________________________________
dropout (Dropout)            (None, 600, 30)           0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 600, 30)           7230      
_________________________________________________________________
dropout_1 (Dropout)          (None, 600, 30)           0         
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 600, 40)           7240      
_________________________________________________________________
dropout_2 (Dropout)          (None, 600, 40)           0         
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 600, 50)           10050     
_________________________________________________________________
dropout_3 (Dropout)          (None, 600, 50)           0         
_________________________________________________________________
flatten (Flatten)            (None, 30000)             0         
_________________________________________________________________
dense (Dense)                (None, 1024)              30721024  
_________________________________________________________________
dropout_4 (Dropout)          (None, 1024)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 600)               615000    
=================================================================
Total params: 31,360,874
Trainable params: 31,360,874
Non-trainable params: 0
_________________________________________________________________
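
The parameter counts in the summary can be checked by hand. A Conv1D layer has kernel_size × input_channels × filters weights plus one bias per filter, and a Dense layer has inputs × units weights plus one bias per unit. The helper names below are hypothetical and exist only for this check.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # One kernel of shape (kernel_size, in_channels) per filter, plus a bias each.
    return kernel_size * in_channels * filters + filters

def dense_params(inputs, units):
    # Full weight matrix plus one bias per unit.
    return inputs * units + units

counts = [
    conv1d_params(10, 1, 30),      # conv1d   -> 330
    conv1d_params(8, 30, 30),      # conv1d_1 -> 7230
    conv1d_params(6, 30, 40),      # conv1d_2 -> 7240
    conv1d_params(5, 40, 50),      # conv1d_3 -> 10050
    dense_params(600 * 50, 1024),  # dense    -> 30721024 (Flatten yields 30000 inputs)
    dense_params(1024, 600),       # dense_1  -> 615000
]
print(counts)
print(sum(counts))  # 31360874
```

Almost all of the weights sit in the first Dense layer, because it is fed by the 30000-element flattened output.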

354/354 [==============================] - 14s 39ms/step - loss: 0.0081 - mse: 0.0081 - val_loss: 0.0081 - val_mse: 0.0081
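
The 354 steps in the progress bar are consistent with the batch size and the validation split. A quick check, assuming Keras trains on the first floor(N × 0.8) samples and holds out the rest for validation:

```python
import math

n_samples = 176526        # x_train.shape[0]
validation_split = 0.2
batch_size = 400

# Assumption: the training portion is the first floor(N * (1 - split)) samples.
n_train = math.floor(n_samples * (1 - validation_split))
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train)            # 141220
print(steps_per_epoch)    # 354
```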
