一个简单的单层感知器实现,用于具有 sigmoid 激活函数的二进制分类
Posted
技术标签:
【中文标题】一个简单的单层感知器实现,用于具有 sigmoid 激活函数的二进制分类【英文标题】:a simple single layer perceptron implementation for binary classification with sigmoid activation function 【发布时间】:2019-07-30 04:59:33 【问题描述】:您好,我正在尝试基于训练感知器(没有任何隐藏层)以使用 sigmoid 激活函数执行二进制分类来完成分配。但由于某种原因,我的代码无法正常工作。尽管每个 epoch 后误差都在减少,但准确度并没有增加。我有目标标签 1 和 0,但我预测的标签几乎都接近 1。我预测的标签都没有代表 0 类。 下面是我的代码。任何人都请告诉我我做错了什么。
<# Create a Neural_Network class
class Neural_Network(object):
def __init__(self,inputSize = 2,outputSize = 1 ):
# size of layers
self.inputSize = inputSize
self.outputSize = outputSize
#weights
self.W1 = 0.01*np.random.randn(inputSize+1, outputSize) # randomly initialize W1 using random function of numpy
# size of the wieght will be (inputSize +1, outputSize) that +1 is for bias
def feedforward(self, X): #forward propagation through our network
n,m=X.shape
Xbias = np.ones((n,1)) #bias term in input
Xnew = np.hstack((Xbias,X)) #adding biasterm in input to match the dimension with the weigth
self.product=np.dot(Xnew,self.W1) # dot product of X (input) and set of weights
output=self.sigmoid(self.product) # apply activation function (i.e. sigmoid)
return output # return your answer with as a final output of the network
def sigmoid(self, s):# apply sigmoid function on s and return its value
return (1./(1. + np.exp(-s))) #activation sigmoid function
def sigmoid_derivative(self, s):#derivative of sigmoid
#derivative of sigmoid = sigmoid(x)*(1-sigmoid(x))
return s*(1-s) # here s will be sigmoid(x)
def backwardpropagate(self,X, Y, y_pred, lr):
# backward propagate through the network
# compute error in output which is loss, compute cross entropy loss function
self.output_error=self.crossentropy(Y,y_pred) #output error
# applying derivative of sigmoid to the error
self.error_deriv=self.output_error*self.sigmoid_derivative(y_pred)
# adjust set of weights
n,m=X.shape
Xbias = np.ones((n,1)) #bias term in input
Xnew = np.hstack((Xbias,X)) #adding biasterm in input to match the dimension with the weigth
self.W1 += lr*(Xnew.T.dot(self.error_deriv)) # W1=W1+ learningrate*errorderiv*input
#self.W1 += X.T.dot(self.z2_delta)
def crossentropy(self, Y, Y_pred):
# compute error based on crossentropy loss
#Cross entropy= sum(Y_actual*log(y_predicted))/N. here 1e-6 is used to avoid log 0
N = Y_pred.shape[0]
#cr_entropy=-np.sum(((Y*np.log(Y_pred+1e-6))+((1-Y)*np.log(1-Y_pred+1e-6))))/N
cr_entropy=-np.sum(Y*np.log(Y_pred+1e-6))/N
return cr_entropy #error
Null=None
def train(self, trainX, trainY,epochs = 100, learningRate = 0.001, plot_err = True ,validationX = Null, validationY = Null):
tr_error=[]
for i in range(epochs):
# feed forward trainX and trainY and recievce predicted value
y_predicted=self.feedforward(trainX)
print(i,y_predicted)
# backpropagation with trainX, trainY, predicted value and learning rate.
self.backwardpropagate(trainX,trainY,y_predicted,learningRate)
tr_error.append(self.output_error)
print(i,self.output_error)
print(i,self.W1)
# """"""if validationX and validationY are not null than show validation accuracy and error of the model.""""""
# plot error of the model if plot_err is true
epocharray=range(0,epochs)
plt.plot(epocharray,tr_error,'r',linewidth=3.0) #plotting error vs. no. of epochs
plt.xlabel('No. of Epochs')
plt.ylabel('Cross Entropy Error')
plt.title('Error Vs. Epoch')
def predict(self, testX):
# predict the value of testX
self.ytest_pred=self.feedforward(testX)
def accuracy(self, testX, testY):
import math
# predict the value of trainX
self.ytest_pred1=self.feedforward(testX)
acc=0
# compare it with testY
for j in range(len(testY)):
q=math.ceil(self.ytest_pred1[j])
#p=round(q)
if testY[j] == q:
acc +=1
accuracy=acc/float(len(testX))*100
print("Percentage Accuracy is", accuracy,"%")
# compute accuracy, print it and """"""show in the form of picture""""""
return accuracy # return accuracy>
# generating dataset point
np.random.seed(1)
no_of_samples = 2000
dims = 2
#Generating random points of values between 0 to 1
class1=np.random.rand(no_of_samples,dims)
#To add separability we will add a bias of 1.1
class2=np.random.rand(no_of_samples,dims)+1.1
class_1_label=np.array([1 for n in range(no_of_samples)])
class_2_label=np.array([0 for n in range(no_of_samples)])
#Lets visualize the dataset
plt.scatter(class1[:,0],class1[:,1], marker='^', label="class 1")
plt.scatter(class2[:,0],class2[:,1], marker='o', label="class 2")
plt.xlabel('Dimension 1')
plt.ylabel('Dimension 2')
plt.legend(loc='best')
plt.show()
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
# Data concatenation
data = np.concatenate((class1,class2),axis=0)
label = np.concatenate((class_1_label,class_2_label),axis=0)
#Note: shuffle this dataset before dividing it into three parts
data,label=shuffle(data,label)
#print(data)
# now using train_test_split command to split data into 60% training data, 20% testing data and 20% validation data
trainX, testX, trainY, testY = train_test_split(data, label, test_size=0.2, random_state=1)
trainX, validX, trainY, validY = train_test_split(trainX, trainY, test_size=0.25, random_state=1)
model = Neural_Network(2,1)
# try different combinations of epochs and learning rate
model.train(trainX, trainY, epochs = 100, learningRate = 0.000001, validationX = validX, validationY = validY)
model.accuracy( testX,testY)
结果是这样来的(0附近没有标签)
0 [[0.49670809]
[0.4958389 ]
[0.4966064 ]
...
[0.49537492]
[0.49566927]
[0.4961255 ]]
0 828.1069658303942
0 [[0.48311074]
[0.51907406]
[0.52764299]]
1 [[0.69813116]
[0.91746189]
[0.80408611]
...
[0.74821077]
[0.87150079]
[0.75187736]]
1 250.96538025031356
1 [[0.56983781]
[0.59205773]
[0.60057486]]
2 [[0.72602796]
[0.94067579]
[0.83591236]
...
[0.77916283]
[0.90032058]
[0.78291184]]
2 210.645081151866
2 [[0.63353102]
[0.64265939]
[0.65118627]]
3 [[0.74507968]
[0.95318096]
[0.85588864]
...
[0.79953834]
[0.91705918]
[0.80329027]]
3 186.2933734713245
3 [[0.6846678 ]
[0.68164316]
[0.69020355]]
4 [[0.75952936]
[0.96114086]
[0.87010085]
...
[0.81456476]
[0.92830628]
[0.81829009]]
4 169.32091332021724
4 [[0.72771826]
[0.71342293]
[0.72202744]]
5 [[0.77112943]
[0.96669774]
[0.88093323]
...
[0.82635507]
[0.93649788]
[0.83004119]]
5 156.53923256347372
请帮我解决这个问题
【问题讨论】:
您的代码无法在model.train
运行,因为trainX
未在该位置定义。
@lincr 是的,我忘了添加那部分,我现在已经添加了。请看一下。
您在反向传播函数中缺少对y_pred
的损失衍生项。此外,通常使用cross-entropy
函数,softmax
作为最后一个输出层。如果cross-entroy
不是很需要,你可以尝试mse
之类的东西。我修改了您的代码以使用mse
loss,epochs=1000,lr=1e-4,我得到了 0f 98% 的准确率。
@lincr 非常感谢我现在也要去 mse 和它的工作
【参考方案1】:
我看到你设置的学习率太小了。将其设置为 0.001 并将 epoch 增加到 20k,您将看到您的模型学习良好。
绘制错误与纪元应该让您更好地了解在哪里停止。
【讨论】:
我的模型仍然不能正常工作...还有什么建议吗?以上是关于一个简单的单层感知器实现,用于具有 sigmoid 激活函数的二进制分类的主要内容,如果未能解决你的问题,请参考以下文章