RuntimeError: Expected hidden[0] size (1, 1, 512), got (1, 128, 512) for LSTM pytorch
Posted: 2021-02-14 04:01:21
Question: I train an LSTM with a batch size of 128, and during testing my batch size is 1. Why do I get this error? Am I supposed to re-initialize the hidden state at test time?
Here is the code I am using. My init_hidden function initializes the hidden state with shape (number_of_layers, batch_size, hidden_size) because batch_first=True.
import torch
import torch.nn as nn

class ImageLSTM(nn.Module):
    def __init__(self, n_inputs:int=49,
                 n_outputs:int=4096,
                 n_hidden:int=256,
                 n_layers:int=1,
                 bidirectional:bool=False):
        """
        Takes 1D flattened images.
        """
        super(ImageLSTM, self).__init__()
        self.n_inputs = n_inputs
        self.n_hidden = n_hidden
        self.n_outputs = n_outputs
        self.n_layers = n_layers
        self.bidirectional = bidirectional
        self.lstm = nn.LSTM(input_size=self.n_inputs,
                            hidden_size=self.n_hidden,
                            num_layers=self.n_layers,
                            dropout=0.5 if self.n_layers > 1 else 0,
                            bidirectional=self.bidirectional,
                            batch_first=True)
        if self.bidirectional:
            self.FC = nn.Sequential(
                nn.Linear(self.n_hidden*2, self.n_outputs),
                nn.Dropout(p=0.5),
                nn.Sigmoid()
            )
        else:
            self.FC = nn.Sequential(
                nn.Linear(self.n_hidden, self.n_outputs),
                # nn.Dropout(p=0.5),
                nn.Sigmoid()
            )

    def init_hidden(self, batch_size, device=None):
        # batch_size must match the batch dimension of the input later passed to forward
        # initialize the hidden and cell state to zero
        # vectors: (number of layers, batch size, number of hidden nodes)
        if self.bidirectional:
            h0 = torch.zeros(2*self.n_layers, batch_size, self.n_hidden)
            c0 = torch.zeros(2*self.n_layers, batch_size, self.n_hidden)
        else:
            h0 = torch.zeros(self.n_layers, batch_size, self.n_hidden)
            c0 = torch.zeros(self.n_layers, batch_size, self.n_hidden)
        if device is not None:
            h0 = h0.to(device)
            c0 = c0.to(device)
        self.hidden = (h0, c0)

    def forward(self, X):  # X: tensor of shape (batch_size, seq_length, n_inputs)
        # forward propagate LSTM
        lstm_out, self.hidden = self.lstm(X, self.hidden)  # lstm_out: tensor of shape (batch_size, seq_length, hidden_size)
        # Decode the hidden state of the last time step
        out = self.FC(lstm_out[:, -1, :])
        return out
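Answer 1: Please edit your post and add the code. How did you initialize the hidden state? What does your model look like?
hidden[0] is not your hidden size; it is the hidden state of the LSTM. The hidden state must be initialized with this shape: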
hidden = (torch.zeros((layers, batch_size, hidden_size)), torch.zeros((layers, batch_size, hidden_size)))
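You appear to have done that correctly. But the error tells you that you supplied a batch of size 1 (because, as you said, you want to test with a single sample), while the hidden state was initialized with batch size 128.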
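The mismatch described above can be reproduced in a few lines. This is a minimal sketch, not the asker's model: `input_size=49` mirrors the question, `hidden_size=512` is taken from the error message, and the sequence length of 10 and the helper name `try_forward` are assumptions made for illustration.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; seq_len=10 is an arbitrary assumption.
lstm = nn.LSTM(input_size=49, hidden_size=512, num_layers=1, batch_first=True)

# State created for the training batch size (128).
# Its shape is (num_layers, batch_size, hidden_size) even with batch_first=True;
# batch_first only changes the layout of the input and output tensors.
stale_hidden = (torch.zeros(1, 128, 512), torch.zeros(1, 128, 512))

# A single test sample: (batch_size=1, seq_len=10, input_size=49).
x_test = torch.randn(1, 10, 49)

def try_forward(x, hidden):
    """Hypothetical helper: return the error message, or None on success."""
    try:
        lstm(x, hidden)
        return None
    except RuntimeError as err:
        return str(err)

# The stale state has batch dimension 128 but the input has batch dimension 1.
message = try_forward(x_test, stale_hidden)
print(message)

# A state sized for the current batch passes the shape check.
fresh_hidden = (torch.zeros(1, 1, 512), torch.zeros(1, 1, 512))
assert try_forward(x_test, fresh_hidden) is None
```

The first call fails because the state tuple still carries the training batch size, and the second succeeds because the state was rebuilt for the batch actually being passed in.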
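So my guess (please add the code) is that you hard-coded batch size = 128. Don't do that. Since you have to re-initialize the hidden state on every forward pass, you can do this: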
...
def forward(self, x):
    batch_size = x.shape[0]
    # Build fresh zero states sized to this batch, on the same device as the input
    hidden = (torch.zeros(self.layers, batch_size, self.hidden_size, device=x.device),
              torch.zeros(self.layers, batch_size, self.hidden_size, device=x.device))
    output, hidden = self.lstm(x, hidden)
    # then do whatever you want with the output
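For reference, the fragment above can be fleshed out into a small self-contained module. This is only a sketch under stated assumptions: the class name `TinyLSTM`, the output size of 8 and the sequence length of 10 are hypothetical and chosen to keep the example light; they are not the asker's settings.

```python
import torch
import torch.nn as nn

class TinyLSTM(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, n_inputs=49, n_hidden=256, n_layers=1, n_outputs=8):
        super().__init__()
        self.n_hidden = n_hidden
        self.n_layers = n_layers
        self.lstm = nn.LSTM(n_inputs, n_hidden, n_layers, batch_first=True)
        self.fc = nn.Linear(n_hidden, n_outputs)

    def forward(self, x):
        # x: (batch_size, seq_len, n_inputs)
        batch_size = x.shape[0]
        # Fresh zero states sized from the incoming batch, on the input's device.
        h0 = torch.zeros(self.n_layers, batch_size, self.n_hidden,
                         device=x.device, dtype=x.dtype)
        c0 = torch.zeros_like(h0)
        lstm_out, _ = self.lstm(x, (h0, c0))
        # Decode the output of the last time step.
        return self.fc(lstm_out[:, -1, :])

model = TinyLSTM()
model.eval()
with torch.no_grad():
    train_like = model(torch.randn(128, 10, 49))  # training-sized batch
    single = model(torch.randn(1, 10, 49))        # single test sample
print(train_like.shape, single.shape)
```

Because the state is derived from `x.shape[0]` inside `forward`, the same model accepts a batch of 128 and a batch of 1 without any separate call to an initialization method.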
I'd guess this is what causes the error, but please post your code too!
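A related simplification, offered as a sketch rather than as part of the answer: when the state should start at zero for every batch, the state argument to `nn.LSTM` can be left out, since the module then defaults both the hidden and the cell state to zeros of the right batch size. The sizes below (hidden size 32, sequence length 10) are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=49, hidden_size=32, num_layers=1, batch_first=True)
x = torch.randn(1, 10, 49)  # (batch_size=1, seq_len=10, input_size=49)

with torch.no_grad():
    # No state passed: h_0 and c_0 default to zeros for this batch size.
    out_default, (h_n, c_n) = lstm(x)
    # Explicit zero states of shape (num_layers, batch_size, hidden_size).
    zeros = (torch.zeros(1, 1, 32), torch.zeros(1, 1, 32))
    out_explicit, _ = lstm(x, zeros)

# Both calls start from the same zero state, so the outputs agree.
same = torch.allclose(out_default, out_explicit)
print(same, out_default.shape, h_n.shape)
```

This only applies when no state needs to be carried from one batch to the next; a model that deliberately keeps state across calls still has to manage its shape explicitly.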