Neural Network (No Hidden Layers) vs. Logistic Regression?

Posted 2023-02-23


【Title】: Neural Network (No hidden layers) vs Logistic Regression? 【Posted】: 2018-03-05 06:39:33 【Question】:

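I have been taking a course on neural networks, and I don't really understand why I get different accuracy scores from logistic regression and from a two-layer neural network (input layer and output layer). The output layer uses the sigmoid activation function. From what I have learned, the sigmoid activation function in a neural network can be used to compute a probability, which should be very similar, if not identical, to what logistic regression is trying to accomplish. From there, backpropagation minimizes the error using gradient descent. There is probably a simple explanation, but I don't understand why the accuracy scores differ so much. In this example I am not using any training or test sets, just simple data to show what I don't understand.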

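My logistic regression gets 71.4% accuracy. In the example below, I simply made up the numbers for the "X" and outcome "y" arrays. I deliberately made the "X" values higher when the outcome equals "1" so that a linear classifier can achieve some accuracy.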

import numpy as np
from sklearn.linear_model import LogisticRegression
X = np.array([[200, 100], [320, 90], [150, 60], [170, 20], [169, 75], [190, 65], [212, 132]])
y = np.array([[1], [1], [0], [0], [0], [0], [1]])

clf = LogisticRegression()
clf.fit(X,y)
clf.score(X,y) ##This results in a 71.4% accuracy score for logistic regression

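However, when I implement a neural network with no hidden layers, using only a sigmoid activation function on the single-node output layer (so two layers in total, input and output), my accuracy is around 42.9%. Why is this so different from the logistic regression accuracy score? And why is it so low?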

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()

#Create a neural network with 2 input nodes for the input layer and one node for the output layer. Using the sigmoid activation function
model.add(Dense(units=1, activation='sigmoid', input_dim=2))
model.summary()
model.compile(loss="binary_crossentropy", optimizer="adam", metrics = ['accuracy'])
model.fit(X,y, epochs=12)

model.evaluate(X,y) #The accuracy score will now show 42.9% for the neural network
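For reference, the intuition in the question can be sketched in a few lines of plain Python: a single dense unit with a sigmoid activation computes the same function as logistic regression's probability estimate, so any difference in accuracy has to come from how the parameters are fitted, not from the model form. The function names and the parameter values below are hypothetical and chosen only for illustration; they are not fitted values from either library.

```python
import math

def sigmoid(z):
    """Logistic function, the activation used by the output node."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_regression_proba(x, coef, intercept):
    """P(y=1 | x) as logistic regression defines it."""
    return sigmoid(sum(c * xi for c, xi in zip(coef, x)) + intercept)

def dense_sigmoid_forward(x, kernel, bias):
    """Forward pass of one dense unit with sigmoid activation (no hidden layer)."""
    z = bias
    for k, xi in zip(kernel, x):
        z += k * xi
    return sigmoid(z)

# Hypothetical parameter values, chosen only for illustration.
weights, offset = [0.01, -0.02], -0.5
sample = [200, 100]  # first row of X from the question

p_lr = logistic_regression_proba(sample, weights, offset)
p_nn = dense_sigmoid_forward(sample, weights, offset)
print(p_lr, p_nn)  # the two probabilities agree
```

Given identical weights and bias, the two functions return the same probability for every input, which is why the training setup is the place to look for the discrepancy.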


【Answer 1】:

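You are not comparing the same thing. Sklearn's LogisticRegression sets a number of defaults that you are not using in your Keras implementation. When I account for these differences, I actually get accuracies within 1e-8 of each other. The main differences are: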

Number of iterations

In Keras, this is the epochs argument passed to fit(). You set it to 12. In Sklearn, this is the max_iter argument passed to LogisticRegression's __init__(). It defaults to 100.

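Optimizer

You use the adam optimizer in Keras, whereas LogisticRegression uses the liblinear optimizer by default (this was the default when the answer was written; scikit-learn 0.22 and later default to lbfgs). Sklearn calls it the solver.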

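Regularization

Sklearn's LogisticRegression uses L2 regularization by default, and you are not doing any weight regularization in Keras. In Sklearn this is the penalty parameter; in Keras you can regularize the weights with each layer's kernel_regularizer.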
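To make the effect of these knobs concrete, here is a minimal pure-Python sketch of logistic regression trained with full-batch gradient descent on the data from the question. It is not what adam, liblinear, or sag actually do; the function names, the learning rate of 1e-5 (kept small because the raw features are large), and the choice to penalise only the weights and not the bias are all assumptions made for illustration.

```python
import math

X = [[200, 100], [320, 90], [150, 60], [170, 20], [169, 75], [190, 65], [212, 132]]
y = [1, 1, 0, 0, 0, 0, 1]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

def log_loss(w, b):
    """Mean binary cross-entropy over the data set (penalty term excluded)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, t in zip(X, y):
        p = min(max(predict_proba(w, b, x), eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(X)

def fit(n_iter, lr=1e-5, l2=0.0):
    # n_iter plays the role of epochs / max_iter.
    # l2 is the penalty strength, applied to the weights only (assumption).
    w, b = [0.0, 0.0], 0.0
    n = len(X)
    for _ in range(n_iter):
        gw, gb = [0.0, 0.0], 0.0
        for x, t in zip(X, y):
            err = predict_proba(w, b, x) - t
            gw[0] += err * x[0] / n
            gw[1] += err * x[1] / n
            gb += err / n
        w = [w[j] - lr * (gw[j] + l2 * w[j]) for j in range(2)]
        b -= lr * gb
    return w, b

def accuracy(w, b):
    hits = sum((predict_proba(w, b, x) >= 0.5) == bool(t) for x, t in zip(X, y))
    return hits / len(X)

# Same model, same data, different iteration budgets.
for n_iter in (0, 12, 100):
    w, b = fit(n_iter)
    print(n_iter, round(log_loss(w, b), 4), round(accuracy(w, b), 4))

# Same iteration budget, with an L2 penalty switched on.
w_reg, b_reg = fit(100, l2=1.0)
print("l2=1.0", round(log_loss(w_reg, b_reg), 4), round(accuracy(w_reg, b_reg), 4))
```

Stopping after 12 steps leaves the model at a different point than stopping after 100, and changing the update rule or the penalty moves it again, so two runs of the "same" model can score differently.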

The following two implementations both reach an accuracy of 0.5714 (57.14%):

import numpy as np

X = np.array([
  [200, 100], 
  [320, 90], 
  [150, 60], 
  [170, 20], 
  [169, 75], 
  [190, 65], 
  [212, 132]
])
y = np.array([[1], [1], [0], [0], [0], [0], [1]])

Logistic regression

from sklearn.linear_model import LogisticRegression

# 'sag' is stochastic average gradient descent
lr = LogisticRegression(penalty='l2', solver='sag', max_iter=100)

lr.fit(X, y.ravel())  # ravel() gives the 1-D label array that sklearn expects
lr.score(X, y.ravel())
# 0.5714285714285714

Neural network

from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2

# l2(0.) is a zero-strength penalty, so it adds nothing to the loss
model = Sequential([
  Dense(units=1, activation='sigmoid', kernel_regularizer=l2(0.), input_shape=(2,))
])

model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X, y, epochs=100)
model.evaluate(X, y)  # returns [loss, accuracy]
# accuracy: 0.57142859697341919
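The scores quoted in this thread can be sanity-checked with a little arithmetic. With seven samples, accuracy can only take the values k/7, so 0.5714 is 4/7, and the 42.9% and 71.4% from the question are 3/7 and 5/7. The sketch below uses a hypothetical `accuracy` helper and two constant predictors to show which baselines those fractions correspond to; the thread does not say whether the fitted models actually predict this way.

```python
y = [1, 1, 0, 0, 0, 0, 1]  # labels from the question
n = len(y)

def accuracy(predictions):
    """Fraction of predictions that match the labels."""
    return sum(p == t for p, t in zip(predictions, y)) / n

always_zero = accuracy([0] * n)  # 4 of the 7 labels are 0
always_one = accuracy([1] * n)   # 3 of the 7 labels are 1
print(round(always_zero, 4), round(always_one, 4))  # 0.5714 0.4286
```

A score of 4/7 is therefore no better than always predicting the majority class, which is worth keeping in mind when reading the matching 0.5714 results.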

【Comments】:

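Thank you very much! I didn't realize these parameters would make such a big difference. We learned about both models, and I assumed the accuracy scores would be nearly identical from the start. I really want to understand how neural networks work, and this definitely helps clarify things. It looks like I still need to spend a few hours reviewing neural network parameters to fully understand everything. Thanks again!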
