Keras LSTM: feeding input with the right shape
[Posted]: 2019-01-29 14:52:01
[Question]: I am reading some data from a pandas DataFrame that has the following shape
df.head()
>>>
Value USD Drop 7 Up 7 Mean Change 7 Change Predict
0.06480 2.0 4.0 -0.000429 -0.00420 4
0.06900 1.0 5.0 0.000274 0.00403 2
0.06497 1.0 5.0 0.000229 0.00007 2
0.06490 1.0 5.0 0.000514 0.00200 2
0.06290 2.0 4.0 0.000229 -0.00050 3
The first 5 columns are intended as X and the Predict column as y. This is how I preprocess the data for the model:
from keras.models import Sequential
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.metrics import accuracy_score
from keras.layers import LSTM, Dense
from sklearn import preprocessing
# Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)
    # find out the type of the target column. Is it really this hard? :(
    target_type = df[target].dtypes
    target_type = target_type[0] if hasattr(target_type, '__iter__') else target_type
    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    # Note: .values is used here because DataFrame.as_matrix was removed in pandas 1.0.
    if target_type in (np.int64, np.int32):
        # Classification: one-hot encode the target column
        dummies = pd.get_dummies(df[target])
        return df[result].values.astype(np.float32), dummies.values.astype(np.float32)
    else:
        # Regression: keep the target as a single float column
        return df[result].values.astype(np.float32), df[[target]].values.astype(np.float32)
# Encode text values to indexes(i.e. [1],[2],[3] for red,green,blue).
def encode_text_index(df, name):
    le = preprocessing.LabelEncoder()
    df[name] = le.fit_transform(df[name])
    return le.classes_
df['Predict'].value_counts()
>>>
4 1194
3 664
2 623
0 405
1 14
Name: Predict, dtype: int64
predictions = encode_text_index(df, "Predict")
predictions
>>>
array([0, 1, 2, 3, 4], dtype=int64)
X,y = to_xy(df,"Predict")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, shuffle=False)
X_train
>>>
array([[ 6.4800002e-02, 2.0000000e+00, 4.0000000e+00, -4.2857142e-04,
-4.1999999e-03],
[ 6.8999998e-02, 1.0000000e+00, 5.0000000e+00, 2.7414286e-04,
4.0300000e-03],
[ 6.4970002e-02, 1.0000000e+00, 5.0000000e+00, 2.2857143e-04,
7.0000002e-05],
...,
[ 9.5987000e+02, 5.0000000e+00, 2.0000000e+00, -1.5831429e+01,
-3.7849998e+01],
[ 9.9771997e+02, 5.0000000e+00, 2.0000000e+00, -1.6948572e+01,
-1.8250000e+01],
[ 1.0159700e+03, 5.0000000e+00, 2.0000000e+00, -1.3252857e+01,
-7.1700001e+00]], dtype=float32)
y_train
>>>
array([[0., 0., 0., 0., 1.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
...,
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.]], dtype=float32)
X_train[1]
>>>
array([6.8999998e-02, 1.0000000e+00, 5.0000000e+00, 2.7414286e-04,
4.0300000e-03], dtype=float32)
X_train.shape
>>>
(2320, 5)
X_train[1].shape
>>>
(5,)
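Finally, here is the LSTM model (it may not be written in the best way, so if that is the case I would also appreciate a rewrite of the inner layers):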
model = Sequential()
#model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, input_shape=(None, 1)))
model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=X_train.shape))
model.add(LSTM(50, dropout=0.2, return_sequences=True))
model.add(LSTM(50, dropout=0.2, return_sequences=True))
model.add(LSTM(50, dropout=0.2, return_sequences=True))
#model.add(Dense(50, activation='relu'))
model.add(Dense(y_train.shape[1], activation='softmax'))
#model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
#model.fit(X_train, y_train, epochs=1000)
model.compile(loss='categorical_crossentropy', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-2, patience=15, verbose=1, mode='auto')
checkpointer = ModelCheckpoint(filepath="best_weights.hdf5", verbose=0, save_best_only=True) # save best model
model.fit(X_train, y_train, validation_data=(X_test, y_test), callbacks=[monitor,checkpointer], verbose=2, epochs=1000)
model.load_weights('best_weights.hdf5') # load weights from best model
Running this throws the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-67-a17835a382f6> in <module>()
15 checkpointer = ModelCheckpoint(filepath="best_weights.hdf5", verbose=0, save_best_only=True) # save best model
16
---> 17 model.fit(X_train, y_train, validation_data=(X_test, y_test), callbacks=[monitor,checkpointer], verbose=2, epochs=1000)
18 model.load_weights('best_weights.hdf5') # load weights from best model
c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
948 sample_weight=sample_weight,
949 class_weight=class_weight,
--> 950 batch_size=batch_size)
951 # Prepare validation data.
952 do_validation = False
c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
747 feed_input_shapes,
748 check_batch_axis=False, # Don't enforce the batch size.
--> 749 exception_prefix='input')
750
751 if y is not None:
c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
125 ': expected ' + names[i] + ' to have ' +
126 str(len(shape)) + ' dimensions, but got array '
--> 127 'with shape ' + str(data_shape))
128 if not check_batch_axis:
129 data_shape = data_shape[1:]
ValueError: Error when checking input: expected lstm_48_input to have 3 dimensions, but got array with shape (2320, 5)
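I tried many variants of the X_train input shape, but every one of them raises some error. I also checked the Keras docs, but it is not clear to me how the data should be fed to the model.
Attempt 1, from the suggestions
First, reshaping X_train: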
data = np.resize(X_train,(X_train.shape[0],1,X_train.shape[1]))
model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=data.shape))
This fails with the error
ValueError: Input 0 is incompatible with layer lstm_52: expected ndim=3, found ndim=4
It was suggested that I pass it as
model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=X_train.shape[1:]))
which throws a similar error
ValueError: Input 0 is incompatible with layer lstm_63: expected ndim=3, found ndim=2
Suggestion 2
Using plain X, y taken directly from pandas
y = df['Predict']
X = df[['Value USD', 'Drop 7', 'Up 7', 'Mean Change 7', 'Change']]
X = np.array(X)
y = np.array(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, shuffle=False)
The LSTM also expects input in the form (batch_size, timesteps, input_dim),
so I tried this
model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=(100, 100, X_train.shape)))
which throws this error
TypeError: Error converting shape to a TensorShape: int() argument must be a string, a bytes-like object or a number, not 'tuple'.
Another way
model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=(100, 100, X_train[1].shape)))
returns the same error
TypeError: Error converting shape to a TensorShape: int() argument must be a string, a bytes-like object or a number, not 'tuple'.
[Question comments]:
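[Answer 1]: You want to set up an LSTM (stateful or stateless?) with multiple features; the features are the columns Value USD, Drop 7, Up 7, Mean Change 7 and Change in your dataframe. There is a similar question at https://github.com/keras-team/keras/issues/6471
A Keras LSTM accepts input of shape (batch_size, timesteps, features) = (batch_size, timesteps, input_dim), where batch_size is the number of samples processed at a time. Since you have 5 features, input_dim = features = 5. I don't know your full data, so I can't say more.
The relationship between number_of_samples (the number of rows in the dataframe) and batch_size is described at http://philipperemy.github.io/keras-stateful-lstm/. batch_size is the number of samples (rows) processed at a time (see also "doubts regarding batch size and time steps in RNN"):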
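In other words, whenever you train or test your LSTM, you first have to build your input matrix X of shape nb_samples, timesteps, input_dim, where your batch size divides nb_samples. For instance, if nb_samples=1024 and batch_size=64, it means that your model will receive blocks of 64 samples, compute each output (whatever the number of timesteps is for every sample), average the gradients and propagate it to update the parameters vector.
Source: http://philipperemy.github.io/keras-stateful-lstm/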
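To make the arithmetic in the quote concrete, here is a minimal NumPy sketch of how 1024 samples break into blocks of 64. The value timesteps=1 and the zero-filled array are assumptions for illustration only; Keras does this batching internally during model.fit.

```python
import numpy as np

# Sizes from the quoted example; timesteps=1 and input_dim=5 are assumed here.
nb_samples, timesteps, input_dim = 1024, 1, 5
batch_size = 64

# Placeholder input matrix in the (nb_samples, timesteps, input_dim) layout.
X = np.zeros((nb_samples, timesteps, input_dim), dtype=np.float32)

# Split along the sample axis into consecutive blocks of batch_size rows.
batches = [X[i:i + batch_size] for i in range(0, nb_samples, batch_size)]

num_batches = len(batches)
print(num_batches)        # number of parameter updates per epoch
print(batches[0].shape)   # shape of one block fed to the model
```

With these numbers each epoch performs 16 updates, and every block has shape (64, 1, 5).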
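Batch size matters for training:
A batch size of 1 means that the model will be fit using online training (as opposed to batch training or mini-batch training). As a result, it is expected that the model fit will have some variance.
Source: https://machinelearningmastery.com/stateful-stateless-lstm-time-series-forecasting-python/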
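timesteps is the number of time steps (past network states) you want to look back over. For an LSTM there is a practical upper limit of about 200-500 because of the vanishing gradient problem, and for performance reasons the maximum is about 200.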
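If you want more than one timestep, the 2D table has to be cut into overlapping windows of consecutive rows. The following is a rough sketch of one way to do that; make_windows is a hypothetical helper (not part of Keras or NumPy), and pairing each window with the label of its last row is an assumption that may not match your prediction target.

```python
import numpy as np

def make_windows(features, labels, timesteps):
    """Stack `timesteps` consecutive rows into one sample per window.

    Returns X with shape (n_windows, timesteps, n_features) and, for each
    window, the label of its last row (an assumed labelling convention).
    """
    n_rows = features.shape[0]
    n_windows = n_rows - timesteps + 1
    X = np.stack([features[i:i + timesteps] for i in range(n_windows)])
    y = labels[timesteps - 1:]
    return X, y

# Toy data: 10 rows with 5 features each, and one integer label per row.
features = np.arange(50, dtype=np.float32).reshape(10, 5)
labels = np.arange(10)

X_seq, y_seq = make_windows(features, labels, timesteps=3)
print(X_seq.shape, y_seq.shape)
```

Ten rows with timesteps=3 give 8 windows, so X_seq has shape (8, 3, 5) and the matching input_shape for the first LSTM layer would be (3, 5).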
Splitting is easier this way (see "Selecting multiple columns in a pandas dataframe"):
y = df['Predict']
X = df[['Value USD','Drop 7','Up 7','Mean Change 7', 'Change']]
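Code for changing the data types can be found at https://www.kaggle.com/mknorps/titanic-with-decision-trees
Update:
To get rid of these errors you have to reshape the training data as in "Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (339732, 29)" (which also contains reshape code for more than 1 timestep). I am posting the entire code that worked for me, because this question is not as simple as it seems at first sight (note the number of [ and ], which indicate the dimensions of the array when reshaping):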
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from keras.layers import LSTM
from sklearn import preprocessing
df = pd.read_csv('/path/data_lstm.dat')
y = df['Predict']
X = df[['Value USD', 'Drop 7', 'Up 7', 'Mean Change 7', 'Change']]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, shuffle=False)
X_train_array = X_train.values  # see https://***.com/questions/13187778/convert-pandas-dataframe-to-numpy-array-preserving-index
y_train_array = y_train.values.reshape(4,1)
X_test_array = X_test.values
y_test_array = y_test.values
# reshaping to fit batch_input_shape=(4,1,5) batch_size, timesteps, number_of_features , batch_size can be varied batch_input_shape=(2,1,5), = (1,1,5),... is also working
X_train_array = np.reshape(X_train_array, (X_train_array.shape[0], 1, X_train_array.shape[1]))
#>>> X_train_array NOTE THE NUMBER OF [ and ] !!
#array([[[ 6.480e-02, 2.000e+00, 4.000e+00, -4.290e-04, -4.200e-03]],
# [[ 6.900e-02, 1.000e+00, 5.000e+00, 2.740e-04, 4.030e-03]],
# [[ 6.497e-02, 1.000e+00, 5.000e+00, 2.290e-04, 7.000e-05]],
# [[ 6.490e-02, 1.000e+00, 5.000e+00, 5.140e-04, 2.000e-03]]])
y_train_array = np.reshape(y_train_array, (y_train_array.shape[0], 1, y_train_array.shape[1]))
#>>> y_train_array NOTE THE NUMBER OF [ and ] !!
#array([[[4]],
# [[2]],
# [[2]],
# [[2]]])
model = Sequential()
model.add(LSTM(32, return_sequences=True, batch_input_shape=(4,1,5) ))
model.add(LSTM(32, return_sequences=True ))
model.add(Dense(1, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
[Comments]:
Thanks for the update, though reshaping y_train after the split fails partway through: y_train.shape
>>> (2320,)
... y_train.values
>>> array([4, 2, 2, ..., 4, 4, 4], dtype=int64)
... y_train_array = y_train.values.reshape(4,1)
>>> ValueError: cannot reshape array of size 2320 into shape (4,1)
I am also not sure why this particular shape is used, since y takes the values 0-4, i.e. 5 values.
You can adapt it to your own data, i.e. y_train_array = y_train.values.reshape(2320,1)
That reshape works, but the fit line still raises a shape error: model.fit(X_train_array, y_train_array, validation_data=(X_test_array, y_test_array), callbacks=[monitor,checkpointer], verbose=2, epochs=1000)
>>> ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (580, 5)
This is with the first LSTM layer model.add(LSTM(32, return_sequences=True, batch_input_shape=(4,1,5) ))
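You also have to adapt X_train_array to your data and reshape it. In batch_input_shape you can choose the batch size, while timesteps = 1 and input_dim = 5 must stay as above.
If you pass validation_data to model.fit, X_test_array and y_test_array have to be reshaped in the same way.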
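On the question raised in the comments about labels with values 0-4: an alternative to reshaping the integer labels (this is not the approach used in the answer's code) is to one-hot encode them into 5 columns, which is what pd.get_dummies does in the question's to_xy. A minimal NumPy sketch, assuming the classes are exactly 0..4 and using toy labels:

```python
import numpy as np

y = np.array([4, 2, 2, 2, 3, 0, 1])   # toy integer labels in the range 0-4
num_classes = 5                        # assumed: classes are 0..4 as in value_counts()

# Row k of the identity matrix is the one-hot vector for class k,
# so indexing the identity matrix with y encodes every label at once.
y_onehot = np.eye(num_classes, dtype=np.float32)[y]
print(y_onehot.shape)
```

Labels in this form have one column per class, which matches a final Dense layer with 5 softmax units trained with categorical_crossentropy. The same encoding would need to be applied to the test labels.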
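[Answer 2]:
From the Keras docs on recurrent layers:
Input shape
3D tensor with shape (batch_size, timesteps, input_dim).
In other words, your model expects your input to have an explicit timesteps dimension. Try using np.expand_dims().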
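As a small illustration of that suggestion, the sketch below adds a timesteps axis of length 1 with np.expand_dims. The zero-filled array is only a stand-in with the same shape as the X_train in the question.

```python
import numpy as np

# Stand-in array with the shape reported in the question: (samples, features).
X_train = np.zeros((2320, 5), dtype=np.float32)

# Insert a length-1 timesteps axis between the sample and feature axes.
X_train_3d = np.expand_dims(X_train, axis=1)
print(X_train_3d.shape)   # (2320, 1, 5)
```

The values are unchanged; each row simply becomes a sequence of one timestep.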
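[Comments]:
I updated the question based on this, showing the error I get.
[Answer 3]: The input shape is assumed to be in the format (no_of_samples, no_of_timesteps, features)
Here you only have (no_of_samples, features)
You can reshape the training data with numpy before building the network
data = np.resize(X_train,(X_train.shape[0],1,X_train.shape[1]))
Hope this helps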
[Comments]:
This creates data with shape (2320, 1, 5), but feeding it to the model with model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=data.shape)) throws ValueError: Input 0 is incompatible with layer lstm_52: expected ndim=3, found ndim=4
The input_shape argument of the LSTM should be X_train.shape[1:], i.e. input_shape=X_train.shape[1:]
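The two ndim errors in this thread are consistent with how input_shape works: Keras adds the batch axis itself, so passing the full (2320, 1, 5) describes a 4D input, while taking shape[1:] of the unreshaped 2D array gives (5,) and describes a 2D input. The slice has to be taken from the reshaped 3D array. Below is a minimal sketch under stated assumptions: the data is synthetic, the layer sizes are arbitrary, and it assumes a Keras version that still accepts input_shape on the first layer (newer versions may emit a warning and recommend an Input layer instead).

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Synthetic stand-ins for the question's data (assumed: 5 features, 5 classes).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(64, 5)).astype(np.float32)
y_train = np.eye(5, dtype=np.float32)[rng.integers(0, 5, size=64)]

# Add the timesteps axis: (samples, features) -> (samples, 1, features).
X_train_3d = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])

model = Sequential()
# input_shape excludes the batch axis: (timesteps, features) == (1, 5).
model.add(LSTM(50, dropout=0.2, return_sequences=True,
               input_shape=X_train_3d.shape[1:]))
# The last recurrent layer returns only its final output (2D), so the
# Dense layer produces one class distribution per sample.
model.add(LSTM(50, dropout=0.2))
model.add(Dense(y_train.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

model.fit(X_train_3d, y_train, epochs=1, batch_size=16, verbose=0)
probs = model.predict(X_train_3d, verbose=0)
print(probs.shape)
```

If every LSTM layer kept return_sequences=True, as in the question, the Dense output would be 3D (samples, timesteps, classes) and would no longer match 2D one-hot labels. Any validation data passed to model.fit needs the same reshape as X_train.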
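machinelearningmastery.com/… I get the error: Error when checking input: expected lstm_70_input to have 3 dimensions, but got array with shape (38105, 30), when using data = np.resize(X_train,(X_train.shape[0],1,X_train.shape[1]))
I am trying to implement an attention model on a numerical (non-NLP) dataset, and my input and output arrays have dimensions (1335, 5, 5) and (1335, 3, 5) respectively. The batch size, or total number of samples, is 1335; the input has 5 features and the output has 3 features to predict. Each input and output feature in turn consists of 5 values (as a list), hence the dimensions above. But even for a basic LSTM encoder-decoder model I keep getting dimension errors.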