How do you use PyTorch PackedSequence in code?
Posted: 2017-11-22 09:53:27

【Question】: Can someone provide full working code (not a snippet, but code that actually runs on a variable-length recurrent neural network) showing how you would use the PackedSequence method in PyTorch?
There do not seem to be any such examples in the documentation, on GitHub, or anywhere else on the Internet.
https://github.com/pytorch/pytorch/releases/tag/v0.1.10
【Comments】:
Thanks for asking this question... I kept looking for an "unpack padded sequence" function, when the name is actually "pad_packed_sequence"...

【Answer 1】: Not the prettiest piece of code, but this is what I pieced together for my own use after digging through the PyTorch forums and docs. There are surely better ways to handle the sort-and-restore part, but I chose to do it inside the network itself.
import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable  # pre-0.4 API, kept as in the original answer


class Encoder(nn.Module):
    def __init__(self, vocab_size, embedding_size, embedding_vectors=None, tune_embeddings=True, use_gru=True,
                 hidden_size=128, num_layers=1, bidirectional=True, dropout=0.6):
        super(Encoder, self).__init__()
        self.embed = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
        self.embed.weight.requires_grad = tune_embeddings
        if embedding_vectors is not None:
            assert embedding_vectors.shape[0] == vocab_size and embedding_vectors.shape[1] == embedding_size
            self.embed.weight = nn.Parameter(torch.FloatTensor(embedding_vectors))
        cell = nn.GRU if use_gru else nn.LSTM
        self.rnn = cell(input_size=embedding_size, hidden_size=hidden_size, num_layers=num_layers,
                        batch_first=True, bidirectional=bidirectional, dropout=dropout)

    def forward(self, x, x_lengths):
        # Sort the batch by decreasing sequence length, as pack_padded_sequence requires.
        sorted_seq_lens, original_ordering = torch.sort(torch.LongTensor(x_lengths), dim=0, descending=True)
        ex = self.embed(x[original_ordering])
        pack = torch.nn.utils.rnn.pack_padded_sequence(ex, sorted_seq_lens.tolist(), batch_first=True)
        out, _ = self.rnn(pack)
        unpacked, unpacked_len = torch.nn.utils.rnn.pad_packed_sequence(out, batch_first=True)
        # Pick out the hidden state at the last valid timestep of each (sorted) sequence.
        indices = Variable(torch.LongTensor(np.array(unpacked_len) - 1).view(-1, 1)
                           .expand(unpacked.size(0), unpacked.size(2))
                           .unsqueeze(1))
        last_encoded_states = unpacked.gather(dim=1, index=indices).squeeze(dim=1)
        # Scatter the rows back into the original (pre-sort) batch order.
        scatter_indices = Variable(original_ordering.view(-1, 1).expand_as(last_encoded_states))
        encoded_reordered = last_encoded_states.clone().scatter_(dim=0, index=scatter_indices, src=last_encoded_states)
        return encoded_reordered
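For context, here is a minimal, hypothetical usage sketch. The vocabulary size, token ids, and lengths below are all made up for illustration, and the batch is deliberately not sorted by length:

# Hypothetical usage: a batch of 3 padded sequences (padding index 0),
# with the true lengths passed alongside. All numbers are illustrative.
encoder = Encoder(vocab_size=100, embedding_size=50, hidden_size=128)
x = Variable(torch.LongTensor([[ 4,  7,  0,  0],
                               [11,  5,  9,  2],
                               [ 8,  0,  0,  0]]))
x_lengths = [2, 4, 1]
encoded = encoder(x, x_lengths)
# With the default bidirectional=True, each row is a (2 * hidden_size)-dim
# encoding, returned in the original batch order.
print(encoded.size())  # torch.Size([3, 256])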
【Comments】:
Just realized I answered a 6-month-old question =(

【Answer 2】: Actually, you don't have to handle the sort-and-restore problem yourself: let the torch.nn.utils.rnn.pack_padded_sequence function do all the work, by passing the argument enforce_sorted=False. The returned PackedSequence object then carries the sorting information in its sorted_indices and unsorted_indices attributes, which a downstream nn.GRU or nn.LSTM uses to restore the original index order correctly.
A runnable code example:
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

# Four sequences of different lengths, deliberately not sorted by length.
data = [torch.tensor([1]),
        torch.tensor([2, 3, 4, 5]),
        torch.tensor([6, 7]),
        torch.tensor([8, 9, 10])]
lengths = [d.size(0) for d in data]

padded_data = pad_sequence(data, batch_first=True, padding_value=0)
embedding = nn.Embedding(20, 5, padding_idx=0)
embedded_data = embedding(padded_data)
packed_data = pack_padded_sequence(embedded_data, lengths, batch_first=True, enforce_sorted=False)

lstm = nn.LSTM(5, 5, batch_first=True)
o, (h, c) = lstm(packed_data)
# (h, c) is the final hidden and cell state, with the original batch order
# already restored correctly by the LSTM.
# o is still a PackedSequence object; to restore it to the original order:
unpacked_o, unpacked_lengths = pad_packed_sequence(o, batch_first=True)
# now unpacked_o, (h, c) is just like the normal output you would expect from an LSTM layer
print(unpacked_o, unpacked_lengths)
The output for unpacked_o, unpacked_lengths looks like this:
# output (unpacked_o, unpacked_lengths):
tensor([[[ 1.5230, -1.7530,  0.5462,  0.6078,  0.9440],
         [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000]],

        [[ 1.8888, -0.5465,  0.5404,  0.4132, -0.3266],
         [ 0.1657,  0.5875,  0.4556, -0.8858,  1.1443],
         [ 0.8957,  0.8676, -0.6614,  0.6751, -1.2377],
         [-1.8999,  2.8260,  0.1650, -0.6244,  1.0599]],

        [[ 0.0637,  0.3936, -0.4396, -0.2788,  0.1282],
         [ 0.5443,  0.7401,  1.0287, -0.1538, -0.2202],
         [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000]],

        [[-0.5008,  2.1262, -0.3623,  0.5864,  0.9871],
         [-0.6996, -0.3984,  0.4890, -0.8122, -1.0739],
         [ 0.3392,  1.1305, -0.6669,  0.5054, -1.7222],
         [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]],
       grad_fn=<IndexSelectBackward>) tensor([1, 4, 2, 3])
Comparing this against the original data and lengths, we can see that the sort-and-restore problem has been handled for us.
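If you want to see the bookkeeping described above, you can inspect the PackedSequence attributes directly. A small sketch, reusing the packed_data object from the example (the printed values follow from lengths = [1, 4, 2, 3]):

# sorted_indices maps positions in the length-sorted batch back to the
# original batch; unsorted_indices is the inverse permutation the RNN uses
# to restore the original order.
print(packed_data.sorted_indices)    # tensor([1, 3, 2, 0])
print(packed_data.unsorted_indices)  # tensor([3, 0, 2, 1])
print(packed_data.batch_sizes)       # tensor([4, 3, 2, 1])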