Model fit and dimension size error for a Keras LSTM network in Python

【Title】: Model fit and dimension size error for Keras LSTM network in Python 【Posted】: 2020-01-24 03:29:56 【Question】:

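Hello, I have been developing an LSTM network in Python using Keras. I created a one-dimensional array for each of my training and test sets. When I try to fit the model, I get the following error:

ValueError: Error when checking input: expected lstm_31_input to have 3 dimensions, but got array with shape (599, 1)

I have tried reshaping the arrays and adding a Flatten layer. Neither worked. My code is below: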

#Setup
import pandas as pd
import numpy as np
from numpy import array, zeros, newaxis
from numpy import argmax
from keras.layers.core import Dense, Activation, Dropout
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding, Flatten
from keras.layers import LSTM

#Used to ignore warnings about some of the tensor commands being deprecated
#Code from: 
#https://***.com/questions/43819820/how-to-disable-keras-warnings
#import warnings
#with warnings.catch_warnings():
#    warnings.simplefilter("ignore")


"""
#Allow use of modules from the Common_Functions Folder
import sys
sys.path.append('../_Common_Functions')
import Hello_World as helloWorld
"""

#Creates dataset of random numbers
#import numpy as np
from random import random
def generateDatset(n):
    val = np.array([])
    typ = np.array([])
    for i in range (1, n):
        val = np.append(val, round(random()*10, 2))

        if val[i-1] < 3 or val[i-1] > 7:
            typ = np.append(typ, 'm')
        else:
            typ = np.append(typ, 'f')
    return val, typ


# Encode the output labels
def lable_encoding(gender_series):
    labels = np.empty((0, 2))
    for i in gender_series:
        if i == 'm':
            labels = np.append(labels, [[1,0]], axis=0)
        else:
            labels = np.append(labels, [[0,1]], axis=0)
    return labels

#Gets dataset in proper format for this program
val, typ = generateDatset(1000)
df = pd.DataFrame({"first_name": val[:], "gender": typ[:]})

# Split dataset in 60% train, 20% test and 20% validation
train, validate, test = np.split(df.sample(frac=1), [int(.6*len(df)), int(.8*len(df))])

# Convert both the input names and the output labels into the machine-readable vector format discussed above
train_x = np.asarray(train.first_name)
#train_x = np.reshape(train_x, train_x.shape + (1,))
#train_x = np.reshape(train_x, (train_x.shape[0], 1, train_x.shape[1]))

train_y = lable_encoding(train.gender)
#train_y = np.reshape(train_y, train_y.shape + (1,))
#train_y = np.reshape(train_y, (train_y.shape[0], 1, train_y.shape[1]))

validate_x =  np.asarray(validate.first_name)
#validate_x = np.reshape(validate_x, validate_x.shape + (1,))
validate_y = lable_encoding(validate.gender)
#validate_y = np.reshape(validate_y, validate_y.shape + (1,))

test_x =  np.asarray(test.first_name)
#test_x = np.reshape(test_x, test_x.shape + (1,))
test_y = lable_encoding(test.gender)
#test_x = np.reshape(test_x, test_x.shape + (1,))

"""
The number of hidden nodes can be determined by the following equation: 
Nh = (Ns/ (alpha * Ni + No ) )
Where  Ni --> number of input neurons
       No --> number of output neurons 
       Ns --> number of samples
       alpha --> scaling factor

Alternatively the following equation can be used: 
    Nh = (2/3)*(Ni + No)
As a note, this equation is simpler but may not provide the best performance

"""
#Set a value for the scaling factor. 
#This typically ranges between 2 and 10
alpha = 8
hidden_nodes = int(np.size(train_x) / (alpha * ((len(df.columns)-1)+ 4)))

input_length = train_x.shape # Length of the character vector
output_labels = 2 # Number of output labels

from keras import optimizers
# Build the model
print('Building model...')
model = Sequential()

#print(train_x.shape)
#
df = np.expand_dims(df, axis=2)


model.add(LSTM(hidden_nodes, return_sequences=True, input_shape=(599, 1)))

model.add(Dropout(0.2))

model.add(Flatten())
model.add(Dense(units=output_labels))

model.add(Activation('softmax'))
sgd = optimizers.SGD(lr=0.5, clipnorm=10.)
model.compile(loss='categorical_crossentropy', optimizer= sgd, metrics=['acc'])
#
batch_size=1000

#x = train_x[..., newaxis, newaxis]
#x.shape
#y = train_y[..., newaxis, newaxis]
#y.shape
model.fit(train_x, train_y, batch_size=batch_size, epochs=10)

#http://45.76.113.195/?questions/46616674/expected-ndim-3-found-ndim-2-how-to-feed-sparse-matrix-to-lstm-layer-in-keras

【Comments】:

【Answer 1】:

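input_shape=(599, 1) specifies the shape of one sample.

Here, your training set contains 599 samples, and each sample has shape (1,). Because the first layer is an LSTM layer, it expects 3-dimensional input of shape (batch_size, number_of_time_steps_of_a_sample, dimensionality_of_one_time_step). The batch size, however, is not part of input_shape, so the input shape for the LSTM layer should be

input_shape=(number_of_time_steps_of_a_sample, dimensionality_of_one_time_step)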
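
As a minimal sketch of this convention, using only NumPy (the array `x` is a zero-filled placeholder, not the asker's data): the array passed to fit carries all three axes, while input_shape is that same shape with the leading sample axis dropped.

```python
import numpy as np

# Placeholder data: 599 samples, 1 time step per sample, 1 feature per time step.
x = np.zeros((599, 1, 1))

# Full shape of the array passed to fit: (samples, time steps, features).
n_samples, n_time_steps, n_features = x.shape

# input_shape leaves out the sample (batch) axis.
input_shape = x.shape[1:]

print(x.ndim)       # 3
print(input_shape)  # (1, 1)
```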

So you should:

1) Replace input_shape=(599, 1) with input_shape=(1, 1)

2) Add the following line before training: train_x = train_x.reshape(599, 1, 1)
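
A hedged sketch of step 2 that avoids hard-coding 599: the helper name `to_lstm_input` is hypothetical (it comes neither from the question nor from Keras), and it assumes each sample is a single number treated as one time step with one feature. Passing -1 lets NumPy infer the sample count, so the same call also works for the validation and test arrays. The random arrays below only stand in for the question's data.

```python
import numpy as np

def to_lstm_input(values):
    """Reshape a 1-D array of scalars to (samples, 1 time step, 1 feature)."""
    arr = np.asarray(values, dtype="float32")
    if arr.ndim != 1:
        raise ValueError(f"expected a 1-D array, got shape {arr.shape}")
    # -1 tells NumPy to infer the number of samples from the array length.
    return arr.reshape(-1, 1, 1)

# Stand-in arrays sized like the question's 60/20/20 split of 999 rows.
train_x = to_lstm_input(np.random.rand(599) * 10)
validate_x = to_lstm_input(np.random.rand(200) * 10)
test_x = to_lstm_input(np.random.rand(200) * 10)

print(train_x.shape)     # (599, 1, 1)
print(validate_x.shape)  # (200, 1, 1)
print(test_x.shape)      # (200, 1, 1)
```

With the inputs in this shape, the LSTM layer would be declared with input_shape=(1, 1) as in step 1, and train_y can keep its one-hot shape of (599, 2).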

For more clarity, please refer to this video I uploaded ;-)

【Comments】:
