How can I solve this PyTorch two devices error
Asked: 2022-01-05 13:32:52
Question:
I ran into a problem when using PyTorch:
Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_addmm)
model = nn.Sequential(
    nn.Linear(622, 512),
    nn.ReLU(),
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 5),
).to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
train_loader = Data.DataLoader(
    dataset=train_dataset,
    batch_size=32,
    shuffle=True,
    num_workers=0,
)
test_loader = Data.DataLoader(
    dataset=test_dataset,
    batch_size=100,
    shuffle=True,
    num_workers=0,
)
best_acc = 0
best_model = model.cpu().state_dict().copy()
# train_acc = 0
# test_acc = 0
for epoch in range(20):
    for step, (batch_x, batch_y) in enumerate(train_loader):
        batch_x = batch_x.to(device)
        batch_y = batch_y.to(device)
        print(batch_x)
        print(batch_x.device, 0)
        out = model(batch_x.to(device)).cuda()
        print(out.device, 1)
        loss = loss_fn(out, batch_y.long())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        train_acc = np.mean((torch.argmax(out, 1) == batch_y).cpu().numpy())
    with torch.no_grad():
        for batch_x, batch_y in test_loader:
            batch_x = batch_x.to(device)
            batch_y = batch_y.to(device)
            print(batch_x.device, 2)
            out = model(batch_x)
            print(batch_x.device, 3)
            test_acc = np.mean((torch.argmax(out, 1) == batch_y).cpu().numpy())
    if test_acc > best_acc:
        best_acc = test_acc
        best_model = model.cpu().state_dict().copy()
Can anyone help explain this? I have been working on it all day...
Comments:
The original code was 'out = model(batch_x)', which triggered this error, so I changed it to 'out = model(batch_x.to(device)).cuda()', but I still get the same error.

Answer 1:
Note that .to() behaves differently on an nn.Module and on a torch.Tensor: for a torch.Tensor, .to(device) returns a copy of the tensor on the device and leaves the original where it was, whereas for an nn.Module, .to(device) operates in place.
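The difference can be checked with a small sketch. This is only an illustration, not code from the question: the names layer and x are made up here, and the device falls back to the CPU when no GPU is present, so on a CPU-only machine the moves are no-ops.

```python
import torch
from torch import nn

# Assumption: use the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# nn.Module: .to() moves the parameters in place and returns the module itself.
layer = nn.Linear(4, 2)
returned = layer.to(device)  # the return value can be ignored
layer_device = next(layer.parameters()).device

# torch.Tensor: .to() returns a tensor on the target device; x itself is untouched.
x = torch.ones(3, 4)
x.to(device)            # result discarded, so x is still on the CPU
x_device_before = x.device
x = x.to(device)        # tensors have to be reassigned
x_device_after = x.device

print(returned is layer, layer_device, x_device_before, x_device_after)
```

The same holds for the shortcuts .cpu() and .cuda(): on a module they move the module itself, so calling model.cpu() anywhere changes where the model lives from then on.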
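In your code, you move the model to the CPU:
best_model = model.cpu().state_dict().copy()
Because model.cpu() also operates in place, the model's weights are on the CPU when the training loop runs, while the batches are on cuda:0. Make sure to move the model back to device after moving it to the CPU.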
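A minimal sketch of the fix, assuming a smaller stand-in model and random input instead of the asker's data. Option 1 is what the answer describes. Option 2 is an alternative that is not part of the answer: it copies the weights to the CPU without moving the model at all.

```python
import torch
from torch import nn

# Assumption: use the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the model in the question.
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2)).to(device)

# Option 1: snapshot on the CPU, then move the model back to the device.
best_model = model.cpu().state_dict().copy()
model.to(device)

# Option 2 (alternative): copy each tensor to the CPU and leave the model in place.
# clone() keeps the snapshot independent of later training, even on a CPU-only machine.
best_model_alt = {k: v.detach().cpu().clone() for k, v in model.state_dict().items()}

# The forward pass works because the model and the batch share a device again.
batch_x = torch.randn(3, 8).to(device)
out = model(batch_x)
print(out.device, tuple(out.shape))
```

The same applies to the model.cpu() call inside the epoch loop of the question: each time a new best model is saved, the model has to go back to device before the next batch.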