Keras accuracy does not change
【Title】Keras accuracy does not change 【Posted】2016-09-09 20:24:47 【Question】: I have a few thousand audio files and I want to classify them using Keras and Theano. So far, I generated a 28x28 spectrogram of each audio file (bigger is probably better, but I am just trying to get the algorithm working at this point) and read the images into a matrix. So in the end I get this big image matrix to feed into the network for image classification.
In a tutorial I found this MNIST classification code:
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense
from keras.utils import np_utils
batch_size = 128
nb_classes = 10
nb_epochs = 2
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255
print(X_train.shape[0], "train samples")
print(X_test.shape[0], "test samples")
y_train = np_utils.to_categorical(y_train, nb_classes)
y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Dense(output_dim = 100, input_dim = 784, activation= "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = nb_classes, activation = "softmax"))
model.compile(optimizer = "adam", loss = "categorical_crossentropy")
model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epochs, show_accuracy = True, verbose = 2, validation_data = (X_test, y_test))
score = model.evaluate(X_test, y_test, show_accuracy = True, verbose = 0)
print("Test score: ", score[0])
print("Test accuracy: ", score[1])
This code runs, and I get the expected result:
(60000L, 'train samples')
(10000L, 'test samples')
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
2s - loss: 0.2988 - acc: 0.9131 - val_loss: 0.1314 - val_acc: 0.9607
Epoch 2/2
2s - loss: 0.1144 - acc: 0.9651 - val_loss: 0.0995 - val_acc: 0.9673
('Test score: ', 0.099454972004890438)
('Test accuracy: ', 0.96730000000000005)
So far everything runs perfectly, but when I apply the above algorithm to my own dataset, the accuracy gets stuck.
My code is as follows:
import os
import pandas as pd
from sklearn.cross_validation import train_test_split
from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.core import Dense, Activation, Dropout, Flatten
from keras.utils import np_utils
import AudioProcessing as ap
import ImageTools as it

batch_size = 128
nb_classes = 2
nb_epoch = 10

for i in range(20):
    print "\n"

# Generate spectrograms if necessary
if(len(os.listdir("./AudioNormalPathalogicClassification/Image")) > 0):
    print "Audio files are already processed. Skipping..."
else:
    print "Generating spectrograms for the audio files..."
    ap.audio_2_image("./AudioNormalPathalogicClassification/Audio/", "./AudioNormalPathalogicClassification/Image/", ".wav", ".png", (28, 28))

# Read the result csv
df = pd.read_csv('./AudioNormalPathalogicClassification/Result/result.csv', header = None)
df.columns = ["RegionName", "IsNormal"]

bool_mapping = {True : 1, False : 0}

nb_classes = 2

for col in df:
    if(col == "RegionName"):
        a = 3  # no-op: skip the name column
    else:
        df[col] = df[col].map(bool_mapping)

y = df.iloc[:,1:].values
y = np_utils.to_categorical(y, nb_classes)

# Load images into memory
print "Loading images into memory..."
X = it.load_images("./AudioNormalPathalogicClassification/Image/", ".png")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)

X_train = X_train.reshape(X_train.shape[0], 784)
X_test = X_test.reshape(X_test.shape[0], 784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255

print("X_train shape: " + str(X_train.shape))
print(str(X_train.shape[0]) + " train samples")
print(str(X_test.shape[0]) + " test samples")

model = Sequential()
model.add(Dense(output_dim = 100, input_dim = 784, activation = "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = nb_classes, activation = "softmax"))

model.compile(loss = "categorical_crossentropy", optimizer = "adam")
print model.summary()
model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epoch, show_accuracy = True, verbose = 1, validation_data = (X_test, y_test))
score = model.evaluate(X_test, y_test, show_accuracy = True, verbose = 1)
print("Test score: ", score[0])
print("Test accuracy: ", score[1])
AudioProcessing.py
import os
import scipy as sp
import scipy.io.wavfile as wav
import matplotlib.pylab as pylab
import Image

def save_spectrogram_scipy(source_filename, destination_filename, size):
    dt = 0.0005
    NFFT = 1024
    Fs = int(1.0 / dt)
    fs, audio = wav.read(source_filename)
    if(len(audio.shape) >= 2):
        audio = sp.mean(audio, axis = 1)
    fig = pylab.figure()
    ax = pylab.Axes(fig, [0, 0, 1, 1])
    ax.set_axis_off()
    fig.add_axes(ax)
    pylab.specgram(audio, NFFT = NFFT, Fs = Fs, noverlap = 900, cmap = "gray")
    pylab.savefig(destination_filename)
    img = Image.open(destination_filename).convert("L")
    img = img.resize(size)
    img.save(destination_filename)
    pylab.clf()
    del img

def audio_2_image(source_directory, destination_directory, audio_extension, image_extension, size):
    nb_files = len(os.listdir(source_directory))
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(audio_extension):
            destinationName = file[:-4]
            save_spectrogram_scipy(source_directory + file, destination_directory + destinationName + image_extension, size)
            count += 1
            print ("Generating spectrogram for files " + str(count) + " / " + str(nb_files) + ".")
ImageTools.py
import os
import numpy as np
import matplotlib.image as mpimg

def load_images(source_directory, image_extension):
    image_matrix = []
    nb_files = len(os.listdir(source_directory))
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(image_extension):
            with open(source_directory + file, "r+b") as f:
                img = mpimg.imread(f)
                img = img.flatten()
                image_matrix.append(img)
                del img
            count += 1
            #print ("File " + str(count) + " / " + str(nb_files) + " loaded.")
    return np.asarray(image_matrix)
So I run the code above and receive:
Audio files are already processed. Skipping...
Loading images into memory...
X_train shape: (2394L, 784L)
2394 train samples
1027 test samples
--------------------------------------------------------------------------------
Initial input shape: (None, 784)
--------------------------------------------------------------------------------
Layer (name) Output Shape Param #
--------------------------------------------------------------------------------
Dense (dense) (None, 100) 78500
Dense (dense) (None, 200) 20200
Dense (dense) (None, 200) 40200
Dense (dense) (None, 2) 402
--------------------------------------------------------------------------------
Total params: 139302
--------------------------------------------------------------------------------
None
Train on 2394 samples, validate on 1027 samples
Epoch 1/10
2394/2394 [==============================] - 0s - loss: 0.6898 - acc: 0.5455 - val_loss: 0.6835 - val_acc: 0.5716
Epoch 2/10
2394/2394 [==============================] - 0s - loss: 0.6879 - acc: 0.5522 - val_loss: 0.6901 - val_acc: 0.5716
Epoch 3/10
2394/2394 [==============================] - 0s - loss: 0.6880 - acc: 0.5522 - val_loss: 0.6842 - val_acc: 0.5716
Epoch 4/10
2394/2394 [==============================] - 0s - loss: 0.6883 - acc: 0.5522 - val_loss: 0.6829 - val_acc: 0.5716
Epoch 5/10
2394/2394 [==============================] - 0s - loss: 0.6885 - acc: 0.5522 - val_loss: 0.6836 - val_acc: 0.5716
Epoch 6/10
2394/2394 [==============================] - 0s - loss: 0.6887 - acc: 0.5522 - val_loss: 0.6832 - val_acc: 0.5716
Epoch 7/10
2394/2394 [==============================] - 0s - loss: 0.6882 - acc: 0.5522 - val_loss: 0.6859 - val_acc: 0.5716
Epoch 8/10
2394/2394 [==============================] - 0s - loss: 0.6882 - acc: 0.5522 - val_loss: 0.6849 - val_acc: 0.5716
Epoch 9/10
2394/2394 [==============================] - 0s - loss: 0.6885 - acc: 0.5522 - val_loss: 0.6836 - val_acc: 0.5716
Epoch 10/10
2394/2394 [==============================] - 0s - loss: 0.6877 - acc: 0.5522 - val_loss: 0.6849 - val_acc: 0.5716
1027/1027 [==============================] - 0s
('Test score: ', 0.68490593621422047)
('Test accuracy: ', 0.57156767283349563)
I tried changing the network and adding more epochs, but I always get the same result no matter what. I don't understand why I keep getting the same result. Any help would be greatly appreciated. Thanks.
EDIT: I found a mistake: the pixel values were not being read correctly. I fixed ImageTools.py as below:
import os
import numpy as np
from scipy.misc import imread

def load_images(source_directory, image_extension):
    image_matrix = []
    nb_files = len(os.listdir(source_directory))
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(image_extension):
            with open(source_directory + file, "r+b") as f:
                img = imread(f)
                img = img.flatten()
                image_matrix.append(img)
                del img
            count += 1
            #print ("File " + str(count) + " / " + str(nb_files) + " loaded.")
    return np.asarray(image_matrix)
Now I actually get grayscale pixel values from 0 to 255, so dividing by 255 makes sense now. However, I still get the same result.
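One quick sanity check (a minimal sketch, assuming y_train is the one-hot label matrix produced above) is whether the stuck accuracy is simply the majority-class fraction, i.e. whether the network is predicting a single class for every sample:

import numpy as np

labels = y_train.argmax(axis = 1)  # undo the one-hot encoding
counts = np.bincount(labels)
print("Class counts: " + str(counts))
print("Majority-class baseline: " + str(counts.max() / float(len(labels))))

A validation accuracy frozen at exactly 0.5716 for every epoch would be consistent with such a baseline.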
【Question comments】:

【Solution 1】: The most likely reason is that the optimizer is not suited to your dataset. Here is the list of Keras optimizers from the documentation.

I suggest you first try SGD with its default parameter values. If it still doesn't work, divide the learning rate by 10. Repeat that a few times if necessary. If your learning rate reaches 1e-6 and it still doesn't work, then you have another problem.

In summary, replace this line:
model.compile(loss = "categorical_crossentropy", optimizer = "adam")
with this one:
from keras.optimizers import SGD
opt = SGD(lr=0.01)
model.compile(loss = "categorical_crossentropy", optimizer = opt)
If it doesn't work, change the learning rate a few times. If this was the problem, you should see the loss getting lower after just a few epochs.
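For instance, a minimal sweep over learning rates could look like this (a sketch: build_model() is a hypothetical helper that rebuilds the Sequential network from the question; it is not part of the original answer):

from keras.optimizers import SGD

for lr in [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]:
    model = build_model()  # hypothetical helper that recreates the network
    model.compile(loss = "categorical_crossentropy", optimizer = SGD(lr = lr))
    history = model.fit(X_train, y_train, batch_size = 128, nb_epoch = 5,
                        verbose = 0, validation_data = (X_test, y_test))
    print(str(lr) + " -> final loss: " + str(history.history["loss"][-1]))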
【Comments】:
When I try 10^-5 the accuracy becomes 0.53, and at 10^-6 it becomes 0.43; everything else stays at the same 0.57. I also tried the other optimizers from your link, but the result is the same.
Another thing you could try is changing how the data is normalized. Try scikit-learn's StandardScaler. If it still doesn't work, you will need a more complex model.
Yes, but it's not an RNN, just a few fully connected layers.
Recurrent networks usually give good results on sequence data such as audio. See the Keras examples on RNNs and LSTMs.
What could be the reason that adam doesn't suit the data?

【Solution 2】: After some inspection, I found that the problem was the data itself. It was very dirty: the same input had 2 different outputs, which caused confusion. After cleaning the data, my accuracy went up to 69%. Still not good enough, but at least I can go on from here now that the data is clean.
I used the code below to test:
import os
import sys
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.core import Dense, Activation, Dropout, Flatten
from keras.utils import np_utils

sys.path.append("./")
import AudioProcessing as ap
import ImageTools as it

# input image dimensions
img_rows, img_cols = 28, 28
dim = 1
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3

batch_size = 128
nb_classes = 2
nb_epoch = 200

for i in range(20):
    print "\n"

## Generate spectrograms if necessary
if(len(os.listdir("./AudioNormalPathalogicClassification/Image")) > 0):
    print "Audio files are already processed. Skipping..."
else:
    # Read the result csv
    df = pd.read_csv('./AudioNormalPathalogicClassification/Result/AudioNormalPathalogicClassification_result.csv', header = None, encoding = "utf-8")
    df.columns = ["RegionName", "Filepath", "IsNormal"]
    bool_mapping = {True : 1, False : 0}
    for col in df:
        if(col == "RegionName" or col == "Filepath"):
            a = 3  # no-op: skip the non-label columns
        else:
            df[col] = df[col].map(bool_mapping)
    region_names = df.iloc[:,0].values
    filepaths = df.iloc[:,1].values
    y = df.iloc[:,2].values
    # Generate spectrograms and make a new CSV file
    print "Generating spectrograms for the audio files..."
    result = ap.audio_2_image(filepaths, region_names, y, "./AudioNormalPathalogicClassification/Image/", ".png", (img_rows, img_cols))
    df = pd.DataFrame(data = result)
    df.to_csv("NormalVsPathalogic.csv", header = False, index = False, encoding = "utf-8")

# Load images into memory
print "Loading images into memory..."
df = pd.read_csv('NormalVsPathalogic.csv', header = None, encoding = "utf-8")
y = df.iloc[:,0].values
y = np_utils.to_categorical(y, nb_classes)
y = np.asarray(y)
X = df.iloc[:,1:].values
X = np.asarray(X)
X = X.reshape(X.shape[0], dim, img_rows, img_cols)
X = X.astype("float32")
X /= 255
print X.shape

model = Sequential()
model.add(Convolution2D(64, nb_conv, nb_conv,
                        border_mode = 'valid',
                        input_shape = (1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(32, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (nb_pool, nb_pool)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss = 'categorical_crossentropy', optimizer = 'adadelta')
print model.summary()
model.fit(X, y, batch_size = batch_size, nb_epoch = nb_epoch, show_accuracy = True, verbose = 1)
【Comments】:
"It was very dirty: the same input had 2 different outputs" -> What do you mean? That is the confusing part.
I mean there were mistakes in the labeling of the data. In other words, some inputs that should have been labeled 1 were labeled 0.
@MuratAykanat Try increasing your number of epochs much more, e.g. 1000 or 5000.
@MuratAykanat Why are you using a softmax activation in the last layer here: model.add(Dense(nb_classes)) model.add(Activation('softmax'))? Shouldn't it be sigmoid if you only have 2 classes?
@bit_scientist If you change the last activation to sigmoid, you also need to change the last Dense layer to have only 1 neuron. That would bring some improvement, although a very small one. It is wiser to keep the code as it is, in case it is ever reused for more than 2 classes.

【Solution 3】: Take a look at this:
from keras import optimizers

sgd = optimizers.SGD(lr = 0.01, decay = 1e-6, momentum = 0.9, nesterov = True)
model.compile(loss = "categorical_crossentropy",
              optimizer = sgd,
              metrics = ['accuracy'])
See the documentation.

I got better results with MNIST.

【Comments】:
【Solution 4】: If the accuracy is not changing, it means the optimizer has found a local minimum for the loss. This may be an undesirable minimum. One common local minimum is to always predict the class with the most data points. You should use weights on the classes to avoid this minimum.
from sklearn.utils.class_weight import compute_class_weight

# outputLabels: the distinct class labels, e.g. [0, 1]; outputs: the label of every sample
classWeight = compute_class_weight('balanced', outputLabels, outputs)
classWeight = dict(enumerate(classWeight))
model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epochs, show_accuracy = True, verbose = 2, validation_data = (X_test, y_test), class_weight = classWeight)
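With the roughly 55/45 class split visible in the question's training log, the 'balanced' weights would come out to something like the following (hypothetical per-class counts, shown only to illustrate the formula sklearn uses):

import numpy as np

counts = np.array([1322, 1072])          # hypothetical split of the 2394 training samples
weights = counts.sum() / (2.0 * counts)  # n_samples / (n_classes * count_c)
print(dict(enumerate(weights)))          # roughly {0: 0.91, 1: 1.12}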
【Comments】:

【Solution 5】: I had a similar problem. One-hot encoding the target variable with np_utils in Keras solved the problem of accuracy and validation loss being stuck. Using weights to balance the target classes further improved the performance.
Solution:
from keras.utils.np_utils import to_categorical
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)
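As a quick illustration of what the encoding does (a hypothetical example, not from the original answer):

import numpy as np
from keras.utils.np_utils import to_categorical

print(to_categorical(np.array([0, 1, 1])))
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]]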
【Comments】:
It would be better to paste the snippet in the post instead of a link to an image.

【Solution 6】: I had the same problem as you. My solution was a loop instead of epochs:
for i in range(10):
    history = model.fit_generator(generator = training_generator,
                                  validation_data = validation_generator,
                                  use_multiprocessing = True,
                                  workers = 6,
                                  epochs = 1)
You can also save the model at every epoch, so you can pause the training after any epoch you want:
for i in range(10):
    history = model.fit_generator(generator = training_generator,
                                  validation_data = validation_generator,
                                  use_multiprocessing = True,
                                  workers = 6,
                                  epochs = 1)
    # save the model after each epoch
    model.save('drive/My Drive/vggnet10epochs.h5')
    model = load_model('drive/My Drive/vggnet10epochs.h5')
【Comments】:

【Solution 7】: Another solution I have not seen mentioned here, but which caused a similar problem for me, is the activation function of the last neuron, especially if it is relu instead of something like sigmoid. In other words, it may help to use a sigmoid activation in the last layer.

Last layer:
model.add(keras.layers.Dense(1, activation='relu'))
Output:
7996/7996 [==============================] - 1s 76us/sample - loss: 6.3474 - accuracy: 0.5860
Epoch 2/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 3/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 4/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 5/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 6/30
7996/7996 [==============================] - 0s 60us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 7/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 8/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860
Now, using a sigmoid activation instead:
model.add(keras.layers.Dense(1, activation='sigmoid'))
Output:
7996/7996 [==============================] - 1s 74us/sample - loss: 0.7663 - accuracy: 0.5899
Epoch 2/30
7996/7996 [==============================] - 0s 59us/sample - loss: 0.6243 - accuracy: 0.5860
Epoch 3/30
7996/7996 [==============================] - 0s 56us/sample - loss: 0.5399 - accuracy: 0.7580
Epoch 4/30
7996/7996 [==============================] - 0s 56us/sample - loss: 0.4694 - accuracy: 0.7905
Epoch 5/30
7996/7996 [==============================] - 0s 57us/sample - loss: 0.4363 - accuracy: 0.8040
Epoch 6/30
7996/7996 [==============================] - 0s 60us/sample - loss: 0.4139 - accuracy: 0.8099
Epoch 7/30
7996/7996 [==============================] - 0s 58us/sample - loss: 0.3967 - accuracy: 0.8228
Epoch 8/30
7996/7996 [==============================] - 0s 61us/sample - loss: 0.3826 - accuracy: 0.8260
This is not a direct solution to the original question, but since this answer is #1 on Google when searching for this problem, it might benefit someone.
【Comments】:

【Solution 8】: I had a similar problem. I had a binary class which was labeled as 1 and 2. After testing different kinds of optimizers and activation functions, I found that the root of the problem was my labeling of the classes. That is, once I changed the labels to 0 and 1 instead of 1 and 2, the problem was solved!
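For example (a minimal sketch, assuming the labels live in a NumPy array):

import numpy as np

y = np.array([1, 2, 2, 1, 2])  # hypothetical labels in {1, 2}
y = y - 1                      # remap to {0, 1} before training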
【Comments】:

【Solution 9】: I got a 13% accuracy increment using this sigmoid activation:
model = Sequential()
model.add(Dense(3072, input_shape=(3072,), activation="sigmoid"))
model.add(Dense(512, activation="sigmoid"))
model.add(Dense(1, activation="sigmoid"))
Or you can also test the following, with relu in the first and hidden layers:
model = Sequential()
model.add(Dense(3072, input_shape=(3072,), activation="relu"))
model.add(Dense(512, activation="sigmoid"))
model.add(Dense(1, activation="sigmoid"))
【Comments】:

【Solution 10】: I had mistakenly added a softmax at the end instead of a sigmoid. Try doing the latter. When I did that, it worked as expected. With a single output neuron, softmax always gives a value of 1, and that is what was happening.
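The failure is easy to reproduce in isolation (a hypothetical illustration): softmax normalizes its inputs to sum to 1, so over a single output it always returns exactly 1, no matter what the logit is.

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([-3.7])))  # [1.] for any single logit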
【Comments】:

【Solution 11】: I ran into the same problem with a multi-class task. Try changing the default optimizer from Adam to sgd:
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
You can also try different activation functions, e.g. relu, sigmoid, softmax, softplus, etc.

Some links to look through:
Optimizers
Activations
【Comments】:
【Solution 12】: As others have pointed out, the optimizer may not suit your data/model and may be stuck in a local minimum. A neural network should at least be able to overfit the data (training accuracy close to 1). I once had a similar problem and solved it by trying different optimizers (in my case, switching from SGD to RMSprop).
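For example (a sketch, assuming the fully connected model from the question):

from keras.optimizers import RMSprop

model.compile(loss = "categorical_crossentropy",
              optimizer = RMSprop(lr = 0.001),
              metrics = ["accuracy"])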
【Comments】:

【Solution 13】: As mentioned above, the problem mostly lies in the type of optimizer chosen. However, it can also be driven by the top 2 Dense layers having the same activation function (e.g. softmax). In that case the NN finds a local minimum and is unable to descend any further from that point, circling around the same acc (val_acc) values. Hope this helps.
【Comments】:

【Solution 14】: I know this is an old question, but as of today (June 14, 2021), @theTechGuy's answer works fine on tf 2.3. The code is:
from tensorflow.keras.optimizers import SGD

sgd = SGD(lr = 0.01, decay = 1e-6, momentum = 0.9, nesterov = True)
model.compile(loss = "categorical_crossentropy",
              optimizer = sgd,
              metrics = ['accuracy'])
【Comments】:

【Solution 15】: In my case the problem was binary, and I was using the softmax activation function, which did not work. I changed it to sigmoid, and it works properly for me.
【Comments】:
As it's currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.

【Solution 16】: I tried many optimizers and activation functions, but the only thing that worked was BatchNormalization. I suppose it is also good practice. You can import it as:
from tensorflow.keras.layers import BatchNormalization
Then just add it before each hidden layer:
model.add(BatchNormalization())
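Applied to the fully connected model from the question, the placement might look like this (a sketch, not the answerer's exact code):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

model = Sequential()
model.add(Dense(100, input_dim = 784, activation = "relu"))
model.add(BatchNormalization())
model.add(Dense(200, activation = "relu"))
model.add(BatchNormalization())
model.add(Dense(200, activation = "relu"))
model.add(Dense(2, activation = "softmax"))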
【Comments】: