Converting GRU layer from PyTorch to TensorFlow
【Posted】2021-12-08 09:45:49 【Question】I am trying to convert the following GRU layer from PyTorch (1.9.1) to TensorFlow (2.6.0):
# GRU layer
self.gru = nn.GRU(64, 32, bidirectional=True, num_layers=2, dropout=0.25, batch_first=True)
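I am not sure about my current implementation, especially how the parameters bidirectional and num_layers should be converted. My current reconstruction is as follows: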
# GRU Layer
model.add(Bidirectional(GRU(32, return_sequences=True, dropout=0.25, time_major=False)))
model.add(Bidirectional(GRU(32, return_sequences=True, dropout=0.25, time_major=False)))
Am I missing something? Thanks in advance for your help!
【Answer 1】Yes, the two models are equivalent, at least in terms of parameter count and output shape. In PyTorch:
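import torch
from torchinfo import summary
# Same GRU as in the question, wrapped in a Sequential container
model = torch.nn.Sequential(torch.nn.GRU(64, 32, bidirectional=True, num_layers=2, dropout=0.25, batch_first=True))
batch_size = 16
# Input shape: (batch, sequence length, features)
summary(model, input_size=(batch_size, 100, 64))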
> ==========================================================================================
> Layer (type:depth-idx)                   Output Shape              Param #
> ==========================================================================================
> Sequential                               --                        --
> ├─GRU: 1-1                               [16, 100, 64]             37,632
> ==========================================================================================
> Total params: 37,632
> Trainable params: 37,632
> Non-trainable params: 0
> Total mult-adds (M): 60.21
> ==========================================================================================
> Input size (MB): 0.41
> Forward/backward pass size (MB): 0.82
> Params size (MB): 0.15
> Estimated Total Size (MB): 1.38
> ==========================================================================================
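The 37,632 figure can also be reproduced by hand, which helps confirm that the two frameworks are counting the same weights. The sketch below assumes the layout PyTorch's nn.GRU uses (input-to-hidden weights, hidden-to-hidden weights, and two bias vectors per direction), which Keras' GRU matches with its default reset_after=True. The helper name gru_param_count is made up for this illustration and is not part of either library.

```python
def gru_param_count(input_size, hidden_size, num_layers=1, bidirectional=False):
    """Count weights and biases of a stacked GRU that keeps two bias vectors."""
    directions = 2 if bidirectional else 1
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        per_direction = (
            3 * hidden_size * layer_input    # input-to-hidden weights (3 gates)
            + 3 * hidden_size * hidden_size  # hidden-to-hidden weights (3 gates)
            + 2 * 3 * hidden_size            # two bias vectors of size 3 * hidden
        )
        total += directions * per_direction
        # The next layer sees the concatenated outputs of all directions
        layer_input = directions * hidden_size
    return total


per_layer = gru_param_count(64, 32, num_layers=1, bidirectional=True)
total = gru_param_count(64, 32, num_layers=2, bidirectional=True)
print(per_layer, total)  # 18816 37632
```

Because the second layer receives 2 * 32 = 64 features, the same as the original input size, both bidirectional layers contribute 18,816 parameters each.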
In TensorFlow:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Bidirectional, GRU
# GRU Layer: two stacked bidirectional GRUs stand in for num_layers=2
model = Sequential()
model.add(Bidirectional(GRU(32, return_sequences=True, dropout=0.25, time_major=False)))
model.add(Bidirectional(GRU(32, return_sequences=True, dropout=0.25, time_major=False)))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss='mse')
# Run one batch of shape (batch, sequence length, features) through the model to build it
a = model(tf.random.normal(shape=(16, 100, 64)))
model.summary()
Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
bidirectional_8 (Bidirection (16, 100, 64)             18816
_________________________________________________________________
bidirectional_9 (Bidirection (16, 100, 64)             18816
=================================================================
Total params: 37,632
Trainable params: 37,632
Non-trainable params: 0
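Matching parameter counts and output shapes does not cover where dropout is applied. According to the PyTorch documentation, the dropout argument of nn.GRU adds dropout on the outputs of every layer except the last, whereas the Keras GRU dropout argument is applied to the inputs of the layer it is set on. A closer mapping might therefore look like the following sketch, which is an assumption about the intended behaviour rather than a verified numerical equivalent: dropout is taken out of the GRU layers and placed between them as a separate Dropout layer.

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Bidirectional, Dropout, GRU

# Assumption: dropout sits between the two GRU layers only, mirroring
# PyTorch's "dropout on the outputs of every layer except the last".
model = Sequential([
    tf.keras.Input(shape=(100, 64)),                # (sequence length, features)
    Bidirectional(GRU(32, return_sequences=True)),  # layer 1, both directions
    Dropout(0.25),                                  # applied to layer 1 outputs
    Bidirectional(GRU(32, return_sequences=True)),  # layer 2, no dropout after it
])

print(model.output_shape)    # (None, 100, 64)
print(model.count_params())  # 37632
```

The Dropout layer has no weights, so the parameter count stays at 37,632 and the output shape is unchanged.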