Deep Knowledge Tracing (DKT) in PyTorch (and Common Pitfalls)
Posted by sereasuesue
I ran the reference code several times, but nothing made it as clear as rewriting it once myself following the underlying ideas.
Reference code
GitHub - dxywill/deepknowledgetracing: Pytorch implementation for Deep Knowledge tracing
Paper: https://web.stanford.edu/~cpiech/bio/papers/deepKnowledgeTracing.pdf
Dataset
Each student record spans three lines
(line 1: the number of answers;
line 2: question ids, numbered from 0;
line 3: answer results, 0 for wrong, 1 for correct).
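To make the three-line format concrete, here is a minimal parsing sketch (the student record itself is made up):

```python
import csv
import io

# A hypothetical student who answered four questions, in the three-line format:
# line 1: answer count, line 2: question ids, line 3: correctness (0/1)
sample = "4\n12,7,7,3\n0,1,1,1\n"

rows = list(csv.reader(io.StringIO(sample)))
num_answers = int(rows[0][0])        # 4
skill_ids = list(map(int, rows[1]))  # [12, 7, 7, 3]
results = list(map(int, rows[2]))    # [0, 1, 1, 1]
```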
data.py
import csv
import random

def load_data(fileName):
    rows = []
    max_skill_num = 0     # max_skill_num: highest skill (question) id seen
    max_num_problems = 0  # max_num_problems: longest answer sequence
    with open(fileName, "r") as csvfile:  # open the file
        reader = csv.reader(csvfile, delimiter=',')
        for row in reader:
            rows.append(row)
    index = 0
    print("the number of rows is " + str(len(rows)))
    tuple_rows = []
    # turn the list into tuples
    while index < len(rows) - 1:
        problems_num = int(rows[index][0])
        tmp_max_skill = max(map(int, rows[index + 1]))
        if tmp_max_skill > max_skill_num:
            max_skill_num = tmp_max_skill
        if problems_num <= 2:  # drop students with two or fewer answers
            index += 3
        else:
            if problems_num > max_num_problems:
                max_num_problems = problems_num
            # tup: (answer count, question sequence, correctness sequence)
            tup = (rows[index], rows[index + 1], rows[index + 2])
            tuple_rows.append(tup)
            index += 3
    # shuffle the tuples
    random.shuffle(tuple_rows)
    print("The number of students is ", len(tuple_rows))
    print("max_skill_num ", max_skill_num)
    print("max_num_problems ", max_num_problems)
    print("Finish reading data")
    return tuple_rows, max_num_problems, max_skill_num + 1
    # skill ids start from 0, so the number of skills is max_skill_num + 1

# if __name__ == '__main__':
#     train_data_path = 'data/test1.csv'
#     load_data(train_data_path)
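load_data only parses the file; before the sequences reach the model, each (skill, correctness) pair is usually one-hot encoded as in the DKT paper, at index skill_id + num_skills * correct in a vector of length 2 * num_skills (so input_size = 2 * num_skills). A minimal sketch of that step, assuming this encoding (the function name is my own):

```python
import torch

def encode_interactions(skill_ids, results, num_skills):
    """One-hot encode a (skill, correctness) sequence as in the DKT paper:
    index = skill_id + num_skills * correct, vector length 2 * num_skills."""
    seq_len = len(skill_ids)
    x = torch.zeros(seq_len, 2 * num_skills)
    for t, (s, c) in enumerate(zip(skill_ids, results)):
        x[t, s + num_skills * c] = 1.0
    return x

# e.g. 3 skills; a student answers skill 1 wrong, then skill 2 right
x = encode_interactions([1, 2], [0, 1], num_skills=3)
# x has shape [2, 6]: row 0 is hot at index 1, row 1 at index 2 + 3 = 5
```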
Model
import torch.nn as nn

class DeepKnowledgeTracing(nn.Module):
    def __init__(self, input_size, hidden_size, num_skills, nlayers, dropout=0.6, tie_weights=False):
        super(DeepKnowledgeTracing, self).__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, nlayers, batch_first=True, dropout=dropout)
        # self.rnn = nn.RNN(input_size, hidden_size, nlayers, nonlinearity='tanh', dropout=dropout)
        # nn.Linear is a fully connected layer: hidden_size is its input dimension,
        # num_skills its output dimension.
        # decoder maps the hidden layer (self.rnn) to the output layer.
        self.decoder = nn.Linear(hidden_size, num_skills)
        self.nhid = hidden_size
        self.nlayers = nlayers

    # Forward pass; the network is: input --> hidden (self.rnn) --> decoder (output layer).
    # Note that the input format of PyTorch RNNs differs from TensorFlow; see the docs:
    # https://pytorch.org/docs/stable/generated/torch.nn.RNN.html?highlight=rnn#torch.nn.RNN
    # By default torch.nn.RNN expects input of shape [steps, batch, features], but since
    # batch_first=True here, input is [batch, steps, features]; hidden is the previous hidden state.
    def forward(self, input, hidden):
        # output: the hidden states at every time step,
        # shape [batch, steps, hidden_size] (because batch_first=True)
        output, hidden = self.rnn(input, hidden)
        # decoded: shape [batch * steps, num_skills]
        decoded = self.decoder(output.contiguous().view(output.size(0) * output.size(1), output.size(2)))
        return decoded, hidden
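To verify the batch_first shapes described in the comments, here is a standalone sketch using bare nn.LSTM and nn.Linear layers with arbitrary dummy sizes (all numbers are illustrative):

```python
import torch
import torch.nn as nn

batch, steps, num_skills, hidden = 4, 10, 50, 64
rnn = nn.LSTM(input_size=2 * num_skills, hidden_size=hidden, num_layers=1, batch_first=True)
decoder = nn.Linear(hidden, num_skills)

x = torch.zeros(batch, steps, 2 * num_skills)  # batch_first=True: [batch, steps, features]
h0 = torch.zeros(1, batch, hidden)             # hidden state is [layers, batch, hidden] regardless
c0 = torch.zeros(1, batch, hidden)

output, (hn, cn) = rnn(x, (h0, c0))            # output: [batch, steps, hidden]
decoded = decoder(output.contiguous().view(batch * steps, hidden))
# decoded: [batch * steps, num_skills]
```

Note that the hidden-state tensors h0/c0 keep the [layers, batch, hidden] layout even when batch_first=True; only the input and output tensors are batch-first.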
Possible improvements
Where other implementations differ; see:
DKT学习_qq_40282662的博客-CSDN博客
Deep Knowledge Tracing [Pytorch] | 一切皆可解读 (chsong.live)
torch.optim.Adam(model.parameters(), lr=lr, eps=args.epsilon)
Factors that affect model performance
The learning rate; the batch size, usually set to a multiple of 8; and Adam's epsilon, which should not be too large (0.01 or 0.001 both work).
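Putting the pieces together, a single hypothetical training step might look like the sketch below (sizes, lr, and eps are illustrative). In DKT the logits at step t predict the correctness of the question answered at step t+1, so we gather the logit for the next question's skill and apply binary cross entropy:

```python
import torch
import torch.nn as nn

num_skills, hidden = 3, 16
rnn = nn.LSTM(2 * num_skills, hidden, batch_first=True)
decoder = nn.Linear(hidden, num_skills)
params = list(rnn.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=0.001, eps=0.001)  # eps as suggested above

x = torch.zeros(1, 4, 2 * num_skills)      # one student, 4 interactions (all-zero dummy input)
next_skills = torch.tensor([[1, 2, 0]])    # skills of the questions answered at steps 2..4
next_correct = torch.tensor([[1., 0., 1.]])

logits = decoder(rnn(x)[0])                # [1, 4, num_skills]
# drop the last step's prediction; pick each next question's logit
pred = logits[:, :-1, :].gather(2, next_skills.unsqueeze(2)).squeeze(2)  # [1, 3]
loss = nn.functional.binary_cross_entropy_with_logits(pred, next_correct)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```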