A simple single-layer perceptron implementation for binary classification with a sigmoid activation function


【Title】: a simple single layer perceptron implementation for binary classification with sigmoid activation function 【Posted】: 2019-07-30 04:59:33 【Question】:

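Hello, I am working on an assignment that requires training a perceptron (with no hidden layers) to perform binary classification using a sigmoid activation function. For some reason my code does not work properly. Although the error decreases after every epoch, the accuracy does not increase. My target labels are 1 and 0, but almost all of my predicted values are close to 1; none of them represents class 0. My code is below. Could anyone tell me what I am doing wrong?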

import numpy as np
import matplotlib.pyplot as plt

# Create a Neural_Network class
class Neural_Network(object):    
    def __init__(self,inputSize = 2,outputSize = 1 ):        
        # size of layers
        self.inputSize = inputSize
        self.outputSize = outputSize    
        #weights
        self.W1 = 0.01*np.random.randn(inputSize+1, outputSize) # randomly initialize W1 using random function of numpy
        # size of the weight matrix is (inputSize + 1, outputSize); the +1 row is for the bias

    def feedforward(self, X): #forward propagation through our network
        n,m=X.shape
        Xbias = np.ones((n,1))    #bias term in input
        Xnew = np.hstack((Xbias,X))   # prepend the bias column so the input matches the weight dimensions
        self.product=np.dot(Xnew,self.W1) # dot product of X (input) and set of weights
        output=self.sigmoid(self.product) # apply activation function (i.e. sigmoid)
        return output # final output of the network

    def sigmoid(self, s):# apply sigmoid function on s and return its value
        return (1./(1. + np.exp(-s)))     #activation sigmoid function

    def sigmoid_derivative(self, s):#derivative of sigmoid
        #derivative of sigmoid = sigmoid(x)*(1-sigmoid(x)) 
        return s*(1-s) # here s will be sigmoid(x) 


    def backwardpropagate(self,X, Y, y_pred, lr):
        # backward propagate through the network

        # compute error in output which is loss, compute cross entropy loss function
        self.output_error=self.crossentropy(Y,y_pred)   #output error

        # applying derivative of sigmoid to the error
        self.error_deriv=self.output_error*self.sigmoid_derivative(y_pred)
        # adjust set of weights
        n,m=X.shape
        Xbias = np.ones((n,1))    #bias term in input
        Xnew = np.hstack((Xbias,X))   # prepend the bias column so the input matches the weight dimensions
        self.W1 += lr*(Xnew.T.dot(self.error_deriv))   # W1=W1+ learningrate*errorderiv*input
        #self.W1 += X.T.dot(self.z2_delta)

    def crossentropy(self, Y, Y_pred):
        # compute error based on crossentropy loss 
        #Cross entropy= sum(Y_actual*log(y_predicted))/N. here 1e-6 is used to avoid log 0
        N = Y_pred.shape[0]
        #cr_entropy=-np.sum(((Y*np.log(Y_pred+1e-6))+((1-Y)*np.log(1-Y_pred+1e-6))))/N
        cr_entropy=-np.sum(Y*np.log(Y_pred+1e-6))/N 
        return cr_entropy #error

    Null=None
    def train(self, trainX, trainY,epochs = 100, learningRate = 0.001, plot_err = True ,validationX = Null, validationY = Null):
        tr_error=[]
        for i in range(epochs):
            # feed forward trainX and receive the predicted values
            y_predicted=self.feedforward(trainX)
            print(i,y_predicted)
            # backpropagation with trainX, trainY, predicted value and learning rate.
            self.backwardpropagate(trainX,trainY,y_predicted,learningRate)
            tr_error.append(self.output_error)
            print(i,self.output_error)
            print(i,self.W1)
            # if validationX and validationY are not None, show the validation accuracy and error of the model

        # plot error of the model if plot_err is true
        epocharray=range(0,epochs)
        plt.plot(epocharray,tr_error,'r',linewidth=3.0)    #plotting error vs. no. of epochs
        plt.xlabel('No. of Epochs')
        plt.ylabel('Cross Entropy Error')
        plt.title('Error Vs. Epoch')

    def predict(self, testX):
        # predict the value of testX
        self.ytest_pred=self.feedforward(testX)

    def accuracy(self, testX, testY):
        import math
        # predict the value of trainX
        self.ytest_pred1=self.feedforward(testX)
        acc=0
        # compare it with testY
        for j in range(len(testY)):
            q=math.ceil(self.ytest_pred1[j])  
            #p=round(q)
            if testY[j] == q:
                acc +=1
        accuracy=acc/float(len(testX))*100
        print("Percentage Accuracy is", accuracy,"%")
        # compute accuracy and print it (optionally, show it as a plot)
        return accuracy # return accuracy




# generating dataset points
np.random.seed(1)
no_of_samples = 2000
dims = 2
#Generating random points of values between 0 to 1
class1=np.random.rand(no_of_samples,dims)
#To add separability we will add a bias of 1.1
class2=np.random.rand(no_of_samples,dims)+1.1
class_1_label=np.array([1 for n in range(no_of_samples)])
class_2_label=np.array([0 for n in range(no_of_samples)])
#Lets visualize the dataset
plt.scatter(class1[:,0],class1[:,1], marker='^', label="class 1")
plt.scatter(class2[:,0],class2[:,1], marker='o', label="class 2")
plt.xlabel('Dimension 1')
plt.ylabel('Dimension 2')
plt.legend(loc='best')
plt.show()
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split

# Data concatenation
data = np.concatenate((class1,class2),axis=0)
label = np.concatenate((class_1_label,class_2_label),axis=0)
#Note: shuffle this dataset before dividing it into three parts
data,label=shuffle(data,label)
#print(data)

# now using train_test_split command to split data into 60% training data, 20% testing data and 20% validation data
trainX, testX, trainY, testY = train_test_split(data, label, test_size=0.2, random_state=1)  
trainX, validX, trainY, validY = train_test_split(trainX, trainY, test_size=0.25, random_state=1)
model = Neural_Network(2,1)
# try different combinations of epochs and learning rate
model.train(trainX, trainY, epochs = 100, learningRate = 0.000001, validationX = validX, validationY = validY)
model.accuracy( testX,testY)
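One thing worth checking in the `accuracy` method above: `math.ceil` rounds any value strictly between 0 and 1 up to 1, so every sigmoid output is counted as class 1 regardless of the weights. This is a likely contributor to the missing class 0 predictions. The sketch below illustrates the effect and contrasts it with a 0.5 threshold; the sample probabilities and the threshold value are illustrative assumptions, not part of the original post.

```python
import math
import numpy as np

# Hypothetical sigmoid outputs, shaped (n, 1) like feedforward() returns.
probs = np.array([[0.03], [0.49], [0.51], [0.97]])

# math.ceil maps every value in (0, 1) to 1, so class 0 is never predicted.
ceil_labels = [math.ceil(p[0]) for p in probs]

# Thresholding at 0.5 (an assumed, conventional cut-off) separates the classes.
threshold_labels = (probs.ravel() >= 0.5).astype(int)

print(ceil_labels)
print(threshold_labels.tolist())
```

With a threshold, values below 0.5 map to class 0 and the rest to class 1, which is what the label comparison in `accuracy` expects.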

The output looks like this (no predictions near 0):

0 [[0.49670809]
 [0.4958389 ]
 [0.4966064 ]
 ...
 [0.49537492]
 [0.49566927]
 [0.4961255 ]]
0 828.1069658303942
0 [[0.48311074]
 [0.51907406]
 [0.52764299]]
1 [[0.69813116]
 [0.91746189]
 [0.80408611]
 ...
 [0.74821077]
 [0.87150079]
 [0.75187736]]
1 250.96538025031356
1 [[0.56983781]
 [0.59205773]
 [0.60057486]]
2 [[0.72602796]
 [0.94067579]
 [0.83591236]
 ...
 [0.77916283]
 [0.90032058]
 [0.78291184]]
2 210.645081151866
2 [[0.63353102]
 [0.64265939]
 [0.65118627]]
3 [[0.74507968]
 [0.95318096]
 [0.85588864]
 ...
 [0.79953834]
 [0.91705918]
 [0.80329027]]
3 186.2933734713245
3 [[0.6846678 ]
 [0.68164316]
 [0.69020355]]
4 [[0.75952936]
 [0.96114086]
 [0.87010085]
 ...
 [0.81456476]
 [0.92830628]
 [0.81829009]]
4 169.32091332021724
4 [[0.72771826]
 [0.71342293]
 [0.72202744]]
5 [[0.77112943]
 [0.96669774]
 [0.88093323]
 ...
 [0.82635507]
 [0.93649788]
 [0.83004119]]
5 156.53923256347372
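The size of the printed error (828 at epoch 0, where a per-sample mean well below 1 would be expected for predictions around 0.5) is consistent with a shape mismatch. `label` is a 1-D array of shape `(n,)`, while `feedforward` returns shape `(n, 1)`, so `Y*np.log(Y_pred+1e-6)` broadcasts to an `(n, n)` matrix before it is summed. The sketch below reproduces the effect with made-up values, assuming that is what happens in the posted code.

```python
import numpy as np

n = 4
Y = np.array([1, 0, 1, 0])            # labels, shape (n,), as produced in the post
Y_pred = np.full((n, 1), 0.5)         # predictions, shape (n, 1), like feedforward()

# (n,) * (n, 1) broadcasts to (n, n): every label is paired with every prediction.
mismatched = Y * np.log(Y_pred + 1e-6)
loss_mismatched = -np.sum(mismatched) / n

# Reshaping the labels to a column keeps the product element-wise, shape (n, 1).
Y_col = Y.reshape(-1, 1)
matched = Y_col * np.log(Y_pred + 1e-6)
loss_matched = -np.sum(matched) / n

print(mismatched.shape, loss_mismatched)
print(matched.shape, loss_matched)
```

When all predictions are equal, as here, the mismatched loss is exactly `n` times the matched one. With the 2400 training samples in the post (60% of 4000), the same inflation would turn a per-sample loss of roughly 0.35 into a value in the 800s, which matches the order of magnitude of the printed error.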

Please help me solve this problem.

【Comments】:

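Your code cannot run at model.train, because trainX is not defined at that point.

@lincr Yes, I forgot to add that part; I have added it now. Please take a look.

You are missing the derivative of the loss with respect to y_pred in your backpropagation function. Also, cross-entropy is usually used with softmax as the final output layer. If cross-entropy is not strictly required, you could try something like mse. I modified your code to use mse loss with epochs=1000 and lr=1e-4, and I got an accuracy of 98%.

@lincr Thank you very much. I have switched to mse as well and it works now.

【Answer 1】: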

I see that the learning rate you set is too small. Set it to 0.001 and increase the number of epochs to 20k, and you will see your model learn well.
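To see why a very small learning rate stalls training, the sketch below runs the same number of epochs with two step sizes and compares the final mean cross-entropy. It assumes a corrected gradient (prediction minus label, averaged over samples) rather than the update in the question. The step sizes are therefore illustrative and not on the same scale as the values quoted above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(y, p, eps=1e-6):
    # Mean binary cross-entropy; eps guards against log(0).
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def final_loss(Xb, y, lr, epochs):
    W = np.zeros((Xb.shape[1], 1))            # zero init keeps the runs comparable
    for _ in range(epochs):
        p = sigmoid(Xb @ W)
        W -= lr * Xb.T @ (p - y) / len(y)     # averaged gradient of the loss
    return bce(y, sigmoid(Xb @ W))

# Same construction as the post, with fewer points to keep the example quick.
rng = np.random.default_rng(1)
X = np.vstack((rng.random((200, 2)), rng.random((200, 2)) + 1.1))
y = np.concatenate((np.ones(200), np.zeros(200))).reshape(-1, 1)
Xb = np.hstack((np.ones((len(X), 1)), X))     # bias column

loss_small = final_loss(Xb, y, lr=1e-6, epochs=100)
loss_large = final_loss(Xb, y, lr=0.5, epochs=100)
print("lr=1e-6:", loss_small)
print("lr=0.5: ", loss_large)
```

With the tiny step size the loss stays essentially at its starting value of about ln 2, while the larger step size moves it downward within the same 100 epochs.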

Plotting the error against the epoch should give you a better idea of where to stop.
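One way to act on the error-versus-epoch curve is to record the loss on both the training and validation splits and look for the epoch after which the validation loss stops improving. The sketch below does this numerically as a simple heuristic. The helper name `first_plateau_epoch`, the tolerance `tol`, the split sizes, and the step size are all hypothetical choices, and the gradient step is an assumed correction rather than the update from the question. The recorded lists can be passed to `plt.plot` in the same way as `tr_error` in the question's `train` method.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(y, p, eps=1e-6):
    # Mean binary cross-entropy; eps guards against log(0).
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def with_bias(X):
    # Prepend a column of ones so the first weight acts as the bias.
    return np.hstack((np.ones((len(X), 1)), X))

def first_plateau_epoch(losses, tol=1e-4):
    # Hypothetical helper: first epoch whose improvement over the previous one is below tol.
    for epoch in range(1, len(losses)):
        if losses[epoch - 1] - losses[epoch] < tol:
            return epoch
    return len(losses) - 1

rng = np.random.default_rng(1)
X = np.vstack((rng.random((300, 2)), rng.random((300, 2)) + 1.1))
y = np.concatenate((np.ones(300), np.zeros(300))).reshape(-1, 1)
order = rng.permutation(len(X))               # shuffle before splitting
X, y = X[order], y[order]
Xtr, ytr, Xva, yva = with_bias(X[:480]), y[:480], with_bias(X[480:]), y[480:]

W = np.zeros((3, 1))
train_losses, valid_losses = [], []
for _ in range(3000):
    p = sigmoid(Xtr @ W)
    W -= 0.5 * Xtr.T @ (p - ytr) / len(ytr)   # corrected gradient step (assumed fix)
    train_losses.append(bce(ytr, sigmoid(Xtr @ W)))
    valid_losses.append(bce(yva, sigmoid(Xva @ W)))

stop = first_plateau_epoch(valid_losses)
print("validation loss plateaus around epoch", stop)
```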

【Comments】:

My model still does not work properly... Any other suggestions?
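The earlier comment about the missing loss derivative can be made concrete. For a sigmoid output p = sigmoid(z) trained with binary cross-entropy, the gradient of the loss with respect to the pre-activation z simplifies to p - y. The weight update is then driven by the per-sample difference between prediction and label rather than by the scalar loss value. The sketch below is one possible correction under that assumption, not the commenter's MSE version. The function names (`train_perceptron`, `predict_labels`), the hyperparameters, and the smaller dataset are illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Prepend a column of ones so the first weight acts as the bias.
    return np.hstack((np.ones((X.shape[0], 1)), X))

def train_perceptron(X, y, epochs=2000, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    Xb = add_bias(X)
    y = y.reshape(-1, 1)                      # column vector, avoids (n, n) broadcasting
    W = 0.01 * rng.standard_normal((Xb.shape[1], 1))
    for _ in range(epochs):
        p = sigmoid(Xb @ W)
        grad = Xb.T @ (p - y) / len(y)        # d(mean BCE)/dW for a sigmoid output
        W -= lr * grad                        # gradient descent step
    return W

def predict_labels(X, W, threshold=0.5):
    # Threshold the probabilities instead of rounding them up.
    return (sigmoid(add_bias(X) @ W) >= threshold).astype(int).ravel()

# Same construction as the post, with fewer points to keep the example quick.
rng = np.random.default_rng(1)
class1 = rng.random((200, 2))                 # label 1, values in [0, 1)
class2 = rng.random((200, 2)) + 1.1           # label 0, values in [1.1, 2.1)
X = np.vstack((class1, class2))
y = np.concatenate((np.ones(200, dtype=int), np.zeros(200, dtype=int)))

W = train_perceptron(X, y)
accuracy = np.mean(predict_labels(X, W) == y) * 100
print("Training accuracy: %.1f%%" % accuracy)
```

Compared with the question's code, this sketch changes three things: the labels are reshaped to a column, the update uses `p - y` instead of the scalar loss times the sigmoid derivative, and predictions are thresholded at 0.5.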
