A Small Example of Using a Restricted Boltzmann Machine (RBM) for Movie Recommendation
Posted luchi007
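Introduction
The previous post briefly introduced the principles of the restricted Boltzmann machine (RBM). RBMs have many applications; this post follows the example in this blog post and builds a small movie recommendation system with an RBM.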
Data source
Data source: MovieLens movie dataset, http://grouplens.org/datasets/movielens/1m/
Acknowledgement: F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4, Article 19 (December 2015), 19 pages. DOI=http://dx.doi.org/10.1145/2827872
The dataset contains more than 3,000 movies and 6,000 users, with over 1 million ratings.
Algorithm model
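Suppose there are $N$ movies and $U$ users. The users' movie ratings can then be represented by a $U \times N$ matrix $M$, where $M_{i,j}$ is user $i$'s rating of movie $j$. A rating below 3 is taken to mean the user dislikes the movie, and a rating above 3 means the user likes it. If the user likes the movie, $M_{i,j}$ is 1; if not, it is 0.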
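The binarization described above can be sketched as follows. This is a minimal illustration, not the author's preprocessing code: the function name `build_preference_matrix`, the 0-based index triples, and the toy data are hypothetical. The text does not say how a rating of exactly 3 is handled, so the sketch assumes it is stored as 0, the same as an unrated movie.

```python
import numpy as np


def build_preference_matrix(ratings, num_users, num_movies, threshold=3):
    """Build a U x N 0/1 matrix from (user_index, movie_index, rating) triples.

    Assumption: only ratings strictly above `threshold` count as "like" (1).
    Unrated movies and ratings at or below the threshold are stored as 0.
    """
    M = np.zeros((num_users, num_movies), dtype=np.float64)
    for user, movie, rating in ratings:
        if rating > threshold:
            M[user, movie] = 1.0
    return M


# Hypothetical toy data: 0-based (user, movie, rating) triples.
toy_ratings = [(0, 0, 5), (0, 2, 2), (1, 1, 4), (1, 2, 3)]
M = build_preference_matrix(toy_ratings, num_users=2, num_movies=3)
print(M)
```

Note that with this encoding a 0 does not distinguish "disliked" from "never rated", which may be one reason the model is sensitive to tuning.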
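Once the user-movie rating matrix is built, the next step is to train the RBM parameters. After training, feeding in a user's historical rating vector yields the indices of the movies to recommend, which corresponds to $v_1$ in this blog post.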
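For reference, the recommendation step can be written out as below. This follows the code listing in this post rather than a general RBM derivation: $v_0$ is the user's 0/1 history row of $M$, the symbols $W$, $b_v$, $b_h$ are the weight matrix and the visible and hidden biases as named in that listing, and the 0.5 threshold on the hidden units is the deterministic choice made there (a stochastic sample is the more common textbook variant).

```latex
\begin{aligned}
h_0 &= \sigma(v_0 W + b_h), \qquad \sigma(x) = \frac{1}{1 + e^{-x}} \\
\hat{h}_0 &= \mathbb{1}[\,h_0 > 0.5\,] \\
v_1 &= \sigma(\hat{h}_0 W^{\top} + b_v) \\
\text{recommendations} &= \text{indices of the } K \text{ largest entries of } v_1
\end{aligned}
```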
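Main code and comments
The main RBM model code is shown below. For the detailed derivation, see the previous post; this is a concrete implementation of it.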
# coding: utf-8
"""
desc: RBM model
author:luchi
date:9/3/17
"""
import numpy as np


class RBM_Model(object):

    def __init__(self, visible_size, hidden_size, lr):
        self.visible_size = visible_size
        self.hidden_size = hidden_size
        self.lr = lr
        np.random.seed(10)
        # Both biases are multiplied by 0, so they start at zero.
        self.b_v = np.random.uniform(-1, 1, size=[self.visible_size]) * 0
        self.W = np.random.uniform(-1, 1, size=[self.visible_size, self.hidden_size])
        self.b_h = np.random.uniform(-1, 1, size=[self.hidden_size]) * 0

    def sampling(self, data):
        """
        sampling h_0 using v_0
        """
        h_0 = self.logist_fun(np.dot(data, self.W) + self.b_h)
        # print(h_0)
        h_shape = np.shape(h_0)
        # Stochastic alternative (disabled):
        # h_0_state = h_0 > (np.random.rand(h_shape[0], h_shape[1]))
        # Deterministic hidden state: threshold the probabilities at 0.5.
        h_0_state = h_0 > (np.ones_like(h_0) * 0.5)
        # building contrastive sampling
        v_1 = self.logist_fun(np.dot(h_0_state, np.transpose(self.W)) + self.b_v)
        v_shape = np.shape(v_1)
        # Stochastic alternative (disabled):
        # v_1_state = v_1 > (np.random.rand(v_shape[0], v_shape[1]))
        v_1_state = v_1 > (np.ones_like(v_1) * 0.5)
        # h_1 is computed from the probabilities v_1, not from v_1_state.
        h_1 = self.logist_fun(np.dot(v_1, self.W) + self.b_h)
        return h_0, v_1, h_1, v_1_state

    def train(self, data, iter_time):
        h_0, v_1, h_1, v_1_state = self.sampling(data)
        # Every 100 iterations, report the reconstruction error: the mean
        # squared error of each visible unit over the batch, summed over units.
        if iter_time % 100 == 0:
            error = np.sum(np.mean((data - v_1) ** 2, axis=0))
            print("the %i iter_time error is %s" % (iter_time, error))
        # updating weight: average v_0 h_0^T - v_1 h_1^T over the batch
        updating_matrix = []
        size = len(data)
        for i in range(size):
            w_v0 = np.reshape(data[i], [self.visible_size, 1])
            w_h0 = np.reshape(h_0[i], [1, self.hidden_size])
            w_u0 = np.dot(w_v0, w_h0)
            w_v1 = np.reshape(v_1[i], [self.visible_size, 1])
            w_h1 = np.reshape(h_1[i], [1, self.hidden_size])
            w_u1 = np.dot(w_v1, w_h1)
            updating_matrix.append(w_u0 - w_u1)
        updating_matrix = np.mean(np.array(updating_matrix), axis=0)
        self.W = self.W + self.lr * updating_matrix
        self.b_v = self.b_v + self.lr * np.mean((data - v_1), axis=0)
        self.b_h = self.b_h + self.lr * np.mean((h_0 - h_1), axis=0)

    def logist_fun(self, narray):
        # Clip the input so that np.exp cannot overflow.
        narray = np.clip(narray, -100, 100)
        return 1.0 / (1 + np.exp(-1 * narray))

    def softmax(self, narray):
        # Row-wise softmax; not called by the other methods in this listing.
        narray = np.clip(narray, -100, 100)
        num_a = np.exp(narray)
        num_b = np.sum(num_a, axis=1)
        return num_a * 1.0 / num_b[:, None]

    def recomendation(self, test_data, topK):
        h_0, v_1, h_1, _ = self.sampling(test_data)
        # Rank movies by reconstructed probability v_1, highest first,
        # and keep the topK indices for each user.
        sorted_index = np.argsort(-1 * v_1, axis=1)
        return sorted_index[:, :topK]
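The weight update in `train` builds one visible-by-hidden outer product per sample and then averages them, which can use a lot of memory when the visible layer has thousands of movies. A possible alternative, sketched below, computes the same averaged quantity with two matrix products. The two function names and the random toy arrays are hypothetical and only serve to check that both forms agree; this is not part of the original code.

```python
import numpy as np


def cd1_weight_gradient_loop(v0, h0, v1, h1):
    """Per-sample outer products averaged over the batch, as in the listing."""
    grads = [np.outer(v0[i], h0[i]) - np.outer(v1[i], h1[i])
             for i in range(len(v0))]
    return np.mean(np.array(grads), axis=0)


def cd1_weight_gradient_vectorized(v0, h0, v1, h1):
    """The same averaged quantity computed with two matrix products."""
    batch_size = v0.shape[0]
    return (v0.T @ h0 - v1.T @ h1) / batch_size


# Hypothetical toy batch: 4 samples, 6 visible units, 3 hidden units.
rng = np.random.default_rng(0)
v0 = (rng.random((4, 6)) > 0.5).astype(float)
h0 = rng.random((4, 3))
v1 = rng.random((4, 6))
h1 = rng.random((4, 3))

loop_grad = cd1_weight_gradient_loop(v0, h0, v1, h1)
vec_grad = cd1_weight_gradient_vectorized(v0, h0, v1, h1)
print(np.allclose(loop_grad, vec_grad))
```

The vectorized form never materializes the per-sample matrices, so its memory use does not grow with the batch size beyond the inputs themselves.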
Result analysis
An example follows:
history of user watch movies
[ Pulp Fiction (1994) type: Crime ] [ Client, The (1994) type: Drama ] [ Flesh and Bone (1993) type: Drama ] [ Vertigo (1958) type: Mystery ] [ Rear Window (1954) type: Mystery ] [ Charade (1963) type: Comedy ] [ That Darn Cat! (1965) type: Children's ] [ One Flew Over the Cuckoo's Nest (1975) type: Drama ] [ Princess Bride, The (1987) type: Action ] [ Lawrence of Arabia (1962) type: Adventure ] [ Annie Hall (1977) type: Comedy ] [ Chinatown (1974) type: Film-Noir ] [ Unforgiven (1992) type: Western ] [ Arsenic and Old Lace (1944) type: Comedy ] [ Butch Cassidy and the Sundance Kid (1969) type: Action ] [ Star Trek III: The Search for Spock (1984) type: Action ] [ Game, The (1997) type: Mystery ] [ Devil's Advocate, The (1997) type: Crime ] [ Man Who Knew Too Little, The (1997) type: Comedy ] [ Fallen (1998) type: Action ] [ Perfect Murder, A (1998) type: Mystery ] [ X-Files: Fight the Future, The (1998) type: Mystery ] [ Lady Vanishes, The (1938) type: Comedy ] [ Name of the Rose, The (1986) type: Mystery ] [ Little Big Man (1970) type: Comedy ] [ Dead Again (1991) type: Mystery ] [ Agnes of God (1985) type: Drama ] [ Searchers, The (1956) type: Western ] [ Hud (1963) type: Drama ] [ Place in the Sun, A (1951) type: Drama ] [ Anatomy of a Murder (1959) type: Drama ] [ Shane (1953) type: Drama ]
recommendation movie:
[ Godfather, The (1972) type: Action ] [ Men in Black (1997) type: Action ] [ Shakespeare in Love (1998) type: Comedy ] [ Princess Bride, The (1987) type: Action ] [ Schindler's List (1993) type: Drama ] [ Star Wars: Episode IV - A New Hope (1977) type: Action ] [ American Beauty (1999) type: Comedy ] [ Pulp Fiction (1994) type: Crime ] [ Shawshank Redemption, The (1994) type: Drama ] [ Star Wars: Episode VI - Return of the Jedi (1983) type: Action ] [ Silence of the Lambs, The (1991) type: Drama ] [ GoodFellas (1990) type: Crime ] [ Stand by Me (1986) type: Adventure ] [ One Flew Over the Cuckoo's Nest (1975) type: Drama ] [ Braveheart (1995) type: Action ] [ Usual Suspects, The (1995) type: Crime ] [ Good Will Hunting (1997) type: Drama ] [ Election (1999) type: Comedy ] [ Saving Private Ryan (1998) type: Action ] [ Sixth Sense, The (1999) type: Thriller ] [ Back to the Future (1985) type: Comedy ] [ Groundhog Day (1993) type: Comedy ] [ Fugitive, The (1993) type: Action ] [ Apollo 13 (1995) type: Drama ] [ Bug's Life, A (1998) type: Animation ] [ When Harry Met Sally... (1989) type: Comedy ] [ Jurassic Park (1993) type: Action ] [ American Pie (1999) type: Comedy ] [ E.T. the Extra-Terrestrial (1982) type: Children's ] [ North by Northwest (1959) type: Drama ]
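The result looks so-so, not great. I tried a few approaches and found the RBM somewhat hard to tune. The source code for this example is on GitHub.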
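In the sample output, several recommended titles (for example Pulp Fiction, The Princess Bride and One Flew Over the Cuckoo's Nest) already appear in the user's history. One possible tweak, which is not part of the original code, is to mask out the movies in the history before taking the top K. The function name `recommend_unseen` and the toy arrays below are hypothetical.

```python
import numpy as np


def recommend_unseen(scores, history, top_k):
    """Return top_k movie indices per user, skipping movies in the history.

    scores:  (users, movies) reconstruction probabilities, e.g. v_1
    history: (users, movies) 0/1 matrix fed to the model as input
    """
    # Movies already marked 1 in the history get a score of -inf,
    # so they sort to the end of the ranking.
    masked = np.where(history > 0, -np.inf, scores)
    order = np.argsort(-masked, axis=1)
    return order[:, :top_k]


# Hypothetical toy example: one user, four movies, movie 0 already liked.
scores = np.array([[0.9, 0.8, 0.1, 0.7]])
history = np.array([[1, 0, 0, 0]])
top = recommend_unseen(scores, history, top_k=2)
print(top)
```

Under the 0/1 encoding used in this post, the history marks only liked movies, so a movie the user rated poorly would still not be filtered by this mask.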